[
https://issues.apache.org/jira/browse/HBASE-7474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13549452#comment-13549452
]
Anil Gupta commented on HBASE-7474:
-----------------------------------
License headers in SortingClient.java and
BigDecimalSortingColumnInterpreter.java are not properly formatted.
Anil: Pending
Some log statements, such as the following, can be at debug level.
+ log.info("Querying only one region for sorting");
Anil: Done
+ if (sortDecreasing) return instance.sortDecreasing(scan,
columnFamily, columnQualifier,
+ colInterpreter, startIndex, pageSize, true);
+ else return instance.sortIncreasing(scan, columnFamily,
columnQualifier,
'else' keyword is not needed above.
Anil: Done. But, personally i like to explicitly use the "else" keyword for
readability purpose. I am curious to know if there is any technical reason for
not using "else" in the above case?
+ * This method is used to do the merge sort the rows from multiple regions
and produce the final output
Remove 'do the'. Wrap long line.
Anil: Done
+ for (Map.Entry<byte[], Result[]> regionResultsEntryMap :
regionResultMap.entrySet()) {
regionResultsEntryMap -> regionResultsEntry or regionResultsMapEntry
Anil: Done
+ if(totalNoOfRows < startIndex)
+ {
Normally left brace is on the same line as if statement. Insert a space between
if and (.
Anil: Done
currentMaxorMinValueRegion and maxOrMin are used in the if / else blocks. You
can move them inside if / else block and give them names that are clearer in
meaning.
Anil: Done
+ for (Result[] regionResult : regionResults) {
+ if ((regionResult.length - 1) < arrayIndex[regionNum]) {
regionResults and arrayIndex are both arrays. So you can use the same index to
access them - in my opinion the code is more readable.
Anil:Pending
+ finalResult[finalResultCurrentSize++] =
regionResults[currentMaxorMinValueRegion][arrayIndex[currentMaxorMinValueRegion]];
Wrap long line above.
Anil: Done
+ if (colInterpreter.compare(tmp, maxOrMin) > 0) {
If I read the code correctly, the above comparison is the major difference
between ascending and descending sorting. A little abstraction would allow you
to unify the two cases.
Anil: IMHO, the only way to do that is to put an If(sortDecresing) condition
and then either do ">" or "<" comparison on the basis of sortDecreasing. I am
worried that this abstraction will make the implementation a tab more slow
since the worst case complexity of this sorting is O(n*n). I would prefer
performance over few extra lines of code. Let me know your views.
Looking at SortingColumnInterpreter, this is the only method which is not
present in ColumnInterpreter:
+ T getValue(KeyValue kv) throws IOException;
The following method is already provided by ColumnInterpreter:
public abstract T getValue(byte[] colFamily, byte[] colQualifier, KeyValue kv)
throws IOException;
Anil: I am thinking of adding the missing method "T getValue(KeyValue kv)
throws IOException;" in ColumnInterpreter. Is that fine? I dont understand why
we need colFamily and colQualifier in getValue method when only a KeyValue is
passed.
Please consider dropping SortingColumnInterpreter
Thanks a lot for doing the code review.
~Anil Gupta
Software Engineer II, Intuit, Inc
> Endpoint Implementation to support Scans with Sorting of Rows based on column
> values(similar to "order by" clause of RDBMS)
> ---------------------------------------------------------------------------------------------------------------------------
>
> Key: HBASE-7474
> URL: https://issues.apache.org/jira/browse/HBASE-7474
> Project: HBase
> Issue Type: New Feature
> Components: Coprocessors, Scanners
> Affects Versions: 0.94.3
> Reporter: Anil Gupta
> Assignee: Anil Gupta
> Priority: Minor
> Labels: coprocessors, scan, sort
> Fix For: 0.94.5
>
> Attachments: hbase-7474.patch, hbase-7474-v2.patch,
> SortingEndpoint_high_level_flowchart.pdf
>
>
> Recently, i have developed an Endpoint which can sort the Results(rows) on
> the basis of column values. This functionality is similar to "order by"
> clause of RDBMS. I will be submitting this Patch for HBase0.94.3
> I am almost done with the initial development and testing of feature. But, i
> need to write the JUnits for this. I will also try to make design doc.
> Thanks,
> Anil Gupta
> Software Engineer II, Intuit, inc
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira