[
https://issues.apache.org/jira/browse/HBASE-7826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13633620#comment-13633620
]
Jean-Daniel Cryans commented on HBASE-7826:
-------------------------------------------
I had a follow-up discussion with [~shiven] and his team, and the issue they
have is that they need to stream through fat rows with millions of columns so
they cannot sort client-side. Their own testing with the original patch shows
much better performance if the Thrift server returns the data already sorted.
Furthermore, it's silly that we support sorted columns everywhere except in
Thrift. Right now we're stuck with what was written years ago.
[~shiven] suggested that we try to find a way to add this functionality while
keeping things compatible. In the worst case this could be done by adding a
whole different set of methods that return a different RowResult object that
contains a list.
But here's my proposal that should not involve a whole lot of duplicated
methods:
- Have {{TScan}} carry a new optional {{boolean}} to specify if the user wants
sorted columns back.
- Have {{RowResult}} carry a new optional list of {{TCell}}s that will contain
the sorted KVs.
- Change {{RowResult}}'s {{columns}} map to be also optional.
- Add a new wrapper class in {{ThriftServerRunner}} that will contain both a
{{ResultScanner}} and the boolean passed in {{TScan}} and put this in
{{scannerMap}} instead of the {{ResultScanner}}.
- Change {{ThriftServerRunner.scannerGetList}} methods to check the boolean
from the wrapper class to see if it should populate {{RowResult}}'s list or map.
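A rough sketch of the last two bullets, in plain Java with no HBase or Thrift
dependencies. Everything here is a hypothetical stand-in rather than the
generated Thrift classes or the actual patch: {{StoredRow}}, {{CellSketch}},
{{RowResultSketch}} and {{ScannerAndFlag}} are made-up names, the field name
{{sortedColumns}} is an assumption, and a plain {{Iterator}} stands in for
{{ResultScanner}}.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for one row as the server sees it: cells sorted by column.
final class StoredRow {
    final String key;
    final TreeMap<String, String> cells;
    StoredRow(String key, TreeMap<String, String> cells) {
        this.key = key;
        this.cells = cells;
    }
}

// Hypothetical stand-in for a column/value pair returned to the client.
final class CellSketch {
    final String column;
    final String value;
    CellSketch(String column, String value) {
        this.column = column;
        this.value = value;
    }
}

// Hypothetical stand-in for RowResult; null models an unset optional field.
final class RowResultSketch {
    String row;
    Map<String, String> columns;     // existing map, made optional
    List<CellSketch> sortedColumns;  // proposed optional list (assumed name)
}

// The wrapper that would go into scannerMap instead of the bare scanner.
final class ScannerAndFlag {
    final Iterator<StoredRow> scanner; // stands in for ResultScanner
    final boolean sortColumns;         // the boolean carried by TScan
    ScannerAndFlag(Iterator<StoredRow> scanner, boolean sortColumns) {
        this.scanner = scanner;
        this.sortColumns = sortColumns;
    }
}

final class SortedScanSketch {
    // Fills either the list or the map of each result, depending on the flag.
    static List<RowResultSketch> scannerGetList(ScannerAndFlag wrapper, int nbRows) {
        List<RowResultSketch> out = new ArrayList<>();
        while (out.size() < nbRows && wrapper.scanner.hasNext()) {
            StoredRow stored = wrapper.scanner.next();
            RowResultSketch result = new RowResultSketch();
            result.row = stored.key;
            if (wrapper.sortColumns) {
                // A list keeps the TreeMap's iteration order on the wire.
                result.sortedColumns = new ArrayList<>();
                for (Map.Entry<String, String> e : stored.cells.entrySet()) {
                    result.sortedColumns.add(new CellSketch(e.getKey(), e.getValue()));
                }
            } else {
                // Existing behaviour: an unordered map.
                result.columns = new HashMap<>(stored.cells);
            }
            out.add(result);
        }
        return out;
    }

    public static void main(String[] args) {
        TreeMap<String, String> cells = new TreeMap<>();
        cells.put("cf:b", "2");
        cells.put("cf:a", "1");
        List<StoredRow> rows = new ArrayList<>();
        rows.add(new StoredRow("row-1", cells));
        ScannerAndFlag wrapper = new ScannerAndFlag(rows.iterator(), true);
        for (RowResultSketch r : scannerGetList(wrapper, 10)) {
            for (CellSketch c : r.sortedColumns) {
                System.out.println(r.row + " " + c.column + "=" + c.value);
            }
        }
    }
}
```

Only one of the two fields is populated per result, which is why both need to
be optional in the struct.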
The end result is that existing Thrift client code won't need to be recompiled
and will still get its map, while new clients that talk to a new server will be
able to pass a boolean when creating a scanner to request that results be
returned in a list.
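The compatibility argument rests on the server treating an unset flag as
"false". A minimal model of that check, assuming a nullable {{Boolean}} stands
in for an optional Thrift field ({{ScanRequestSketch}} and
{{wantsSortedColumns}} are hypothetical names, not generated code):

```java
// Hypothetical model of a TScan carrying the proposed optional flag.
final class ScanRequestSketch {
    final String startRow;
    final Boolean sortColumns; // null models "field not set", as an existing client would send
    ScanRequestSketch(String startRow, Boolean sortColumns) {
        this.startRow = startRow;
        this.sortColumns = sortColumns;
    }
}

final class CompatSketch {
    // An unset flag is treated as false, so existing clients keep getting the map.
    static boolean wantsSortedColumns(ScanRequestSketch scan) {
        return Boolean.TRUE.equals(scan.sortColumns);
    }

    public static void main(String[] args) {
        System.out.println(wantsSortedColumns(new ScanRequestSketch("row-0", null))); // false
        System.out.println(wantsSortedColumns(new ScanRequestSketch("row-0", true))); // true
    }
}
```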
There's also the question of whether we want to change the {{getRows}} methods
to take a new optional {{boolean}}.
> Improve Hbase Thrift v1 to return results in sorted order
> ---------------------------------------------------------
>
> Key: HBASE-7826
> URL: https://issues.apache.org/jira/browse/HBASE-7826
> Project: HBase
> Issue Type: New Feature
> Components: Thrift
> Affects Versions: 0.94.0
> Reporter: Shivendra Pratap Singh
> Assignee: Shivendra Pratap Singh
> Priority: Minor
> Labels: Hbase, Thrift
> Attachments: hbase_7826.patch, hbase_7826.patch
>
>
> Hbase natively stores columns sorted based on the column qualifier. A scan is
> guaranteed to return sorted columns. The Java API works fine but the Thrift
> API is broken. Hbase uses TreeMap that ensures that sort order is maintained.
> However Hbase thrift specification uses a simple Map to store the data. A
> map, since it is unordered doesn't result in columns being returned in a sort
> order that is consistent with their storage in Hbase.