[ https://issues.apache.org/jira/browse/HBASE-1512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13024893#comment-13024893 ]
jirapos...@reviews.apache.org commented on HBASE-1512: ------------------------------------------------------ ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/585/#review535 ----------------------------------------------------------- I think its almost there. This patch won't compile (see below for why). I'd be game for applying the next version. This patch has come on a long way. Lets make new issues after applying it for issues found in it (This patch does include a nice set of tests). /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java <https://reviews.apache.org/r/585/#comment1116> I agree w/ the review that suggested we spell out 'agg' rather than use the abbreviation, especially in javadoc. /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java <https://reviews.apache.org/r/585/#comment1117> if should be 'where'. Should we throw an exception if multiple families supplied so users are not surprised when they don't get answers for multiple families? /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java <https://reviews.apache.org/r/585/#comment1118> I'd say leave implementation details out of the public javadoc (the bit about calling private methods) /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java <https://reviews.apache.org/r/585/#comment1119> Does Scan do this test? Internally? (I'm not sure) /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java <https://reviews.apache.org/r/585/#comment1120> 'should' or 'does'? I think you want to say the latter? /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java <https://reviews.apache.org/r/585/#comment1121> Why this javadoc? Don't we inherit javadoc from the Interface? /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java <https://reviews.apache.org/r/585/#comment1122> Whats this? We do nothing on serialization? Is that right? It could be. It just strikes me as a little odd. Maybe put a comment in here to say 'nothing to serialize'? /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java <https://reviews.apache.org/r/585/#comment1123> Do we agree that AggregateCpProtocol was not a good name, that rather it should be AggregateProtocol since cp is in the package name? I see you have a AP later in this patch. Let me go look at it. I think I see whats going on... you didn't mean to include this in the patch? /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java <https://reviews.apache.org/r/585/#comment1124> Otherwise, this Interface looks good. /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java <https://reviews.apache.org/r/585/#comment1125> Yeah, this class shouldn't be included either. - Michael On 2011-04-23 16:39:37, Ted Yu wrote: bq. bq. ----------------------------------------------------------- bq. This is an automatically generated e-mail. To reply, visit: bq. https://reviews.apache.org/r/585/ bq. ----------------------------------------------------------- bq. bq. (Updated 2011-04-23 16:39:37) bq. bq. bq. Review request for hbase and Gary Helmling. bq. bq. bq. Summary bq. ------- bq. bq. This patch provides reference implementation for aggregate function support through Coprocessor framework. bq. ColumnInterpreter interface allows client to specify how the value's byte array is interpreted. bq. Some of the thoughts are summarized at http://zhihongyu.blogspot.com/2011/03/genericizing-endpointcoprocessor.html bq. bq. Himanshu Vashishtha started the work. I provided some review comments and some of the code. bq. bq. bq. This addresses bug HBASE-1512. bq. https://issues.apache.org/jira/browse/HBASE-1512 bq. bq. bq. Diffs bq. ----- bq. bq. /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java PRE-CREATION bq. /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java PRE-CREATION bq. /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateImplementation.java PRE-CREATION bq. /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocol.java PRE-CREATION bq. /src/main/java/org/apache/hadoop/hbase/coprocessor/ColumnInterpreter.java PRE-CREATION bq. /src/test/java/org/apache/hadoop/hbase/coprocessor/TestAggregateProtocol.java PRE-CREATION bq. bq. Diff: https://reviews.apache.org/r/585/diff bq. bq. bq. Testing bq. ------- bq. bq. TestAggFunctions passes. bq. bq. bq. Thanks, bq. bq. Ted bq. bq. > Coprocessors: Support aggregate functions > ----------------------------------------- > > Key: HBASE-1512 > URL: https://issues.apache.org/jira/browse/HBASE-1512 > Project: HBase > Issue Type: Sub-task > Components: coprocessors > Reporter: stack > Attachments: 1512.zip, AggregateCpProtocol.java, > AggregateProtocolImpl.java, AggregationClient.java, ColumnInterpreter.java, > patch-1512-2.txt, patch-1512-3.txt, patch-1512-4.txt, patch-1512-5.txt, > patch-1512-6.txt, patch-1512-7.txt, patch-1512.txt > > > Chatting with jgray and holstad at the kitchen table about counts, sums, and > other aggregating facility, facility generally where you want to calculate > some meta info on your table, it seems like it wouldn't be too hard making a > filter type that could run a function server-side and return the result ONLY > of the aggregation or whatever. > For example, say you just want to count rows, currently you scan, server > returns all data to client and count is done by client counting up row keys. > A bunch of time and resources have been wasted returning data that we're not > interested in. With this new filter type, the counting would be done > server-side and then it would make up a new result that was the count only > (kinda like mysql when you ask it to count, it returns a 'table' with a count > column whose value is count of rows). We could have it so the count was > just done per region and return that. Or we could maybe make a small change > in scanner too so that it aggregated the per-region counts. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira