[ 
https://issues.apache.org/jira/browse/HBASE-4435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nichole Treadway updated HBASE-4435:
------------------------------------

    Description: 
Adds Group By-like functionality to HBase, using the Coprocessor framework.

It groups the result set on one or more columns (groupBy families) and, for 
each group, computes statistics (max, min, sum, count, sum of squares, number 
missing) for a second column, called the stats column. 
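
To make the per-group statistics concrete, here is a minimal in-memory sketch 
in plain Java. It does not use HBase or the attached patch: GroupStatsSketch is 
a hypothetical stand-in for the patch's GroupByStatsValues, and the field names, 
the choice to exclude missing values from count, and the merge step are all 
assumptions for illustration only.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for GroupByStatsValues; not the class from the patch.
class GroupStatsSketch {
    long count;          // rows in the group that had a stats value (assumption)
    long missing;        // rows in the group with no stats value
    double sum;
    double sumOfSquares;
    double min = Double.POSITIVE_INFINITY;
    double max = Double.NEGATIVE_INFINITY;

    // Fold one stats-column value into the running statistics.
    void add(Double value) {
        if (value == null) {
            missing++;
            return;
        }
        count++;
        sum += value;
        sumOfSquares += value * value;
        min = Math.min(min, value);
        max = Math.max(max, value);
    }

    // Combine two partial results, as a client might do with per-region results.
    void merge(GroupStatsSketch other) {
        count += other.count;
        missing += other.missing;
        sum += other.sum;
        sumOfSquares += other.sumOfSquares;
        min = Math.min(min, other.min);
        max = Math.max(max, other.max);
    }

    // Group parallel arrays of group-by values and stats values.
    static Map<String, GroupStatsSketch> groupBy(String[] groupValues, Double[] statsValues) {
        Map<String, GroupStatsSketch> result = new LinkedHashMap<>();
        for (int i = 0; i < groupValues.length; i++) {
            result.computeIfAbsent(groupValues[i], k -> new GroupStatsSketch())
                  .add(statsValues[i]);
        }
        return result;
    }
}
```

All six statistics are simple running values, so partial results can be 
combined by addition (or min/max), which is what makes them a natural fit for 
computing per region and merging afterwards.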

I've provided two implementations.

1. In the first, you specify a single group-by column and a stats field:

      statsMap = gbc.getStats(tableName, scan, groupByFamily, groupByQualifier, 
          statsFamily, statsQualifier, statsFieldColumnInterpreter);

The result is a map from the group-by column value (as a String) to a 
GroupByStatsValues object, which holds the max, min, sum, etc. of the stats 
column for that group.

2. The second implementation lets you specify a list of group-by columns and a 
stats field. The list of group-by columns is expected to contain 
{column family, qualifier} pairs. 

      statsMap = gbc.getStats(tableName, scan, listOfGroupByColumns, 
          statsFamily, statsQualifier, statsFieldColumnInterpreter);
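
As a rough illustration of grouping on several columns, the sketch below builds 
one composite key per row from a list of {column family, qualifier} pairs and 
counts rows per key. It is plain Java, not the patch's code: the class, the 
"family:qualifier" row representation, the comma-joined key, and the treatment 
of a missing value as an empty string are all assumptions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper; the patch may represent rows and keys differently.
class CompositeGroupKeySketch {
    // Each String[] is a {column family, qualifier} pair; a row maps
    // "family:qualifier" to the cell value as a String.
    static String groupKey(Map<String, String> row, List<String[]> groupByColumns) {
        List<String> parts = new ArrayList<>();
        for (String[] column : groupByColumns) {
            String value = row.get(column[0] + ":" + column[1]);
            parts.add(value == null ? "" : value); // missing cell -> empty part
        }
        return String.join(",", parts);
    }

    // Count rows per composite key; a full version would keep statistics
    // for the stats column instead of a bare count.
    static Map<String, Integer> countByGroup(List<Map<String, String>> rows,
                                             List<String[]> groupByColumns) {
        Map<String, Integer> counts = new TreeMap<>();
        for (Map<String, String> row : rows) {
            counts.merge(groupKey(row, groupByColumns), 1, Integer::sum);
        }
        return counts;
    }
}
```

A joined String key keeps the result map in the same shape as the 
single-column case, at the cost of needing a separator that cannot occur in 
the column values.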



  was:Adds in a Group By -like fucntionality to HBase using coprocessors


> Add Group By functionality using Coprocessors
> ---------------------------------------------
>
>                 Key: HBASE-4435
>                 URL: https://issues.apache.org/jira/browse/HBASE-4435
>             Project: HBase
>          Issue Type: Improvement
>          Components: coprocessors
>            Reporter: Nichole Treadway
>            Priority: Minor
>         Attachments: HBase-4435.patch
>

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
