[jira] Commented: (HBASE-521) Improve client scanner interface

stack (JIRA) Mon, 24 Mar 2008 14:55:31 -0700

    [ 
https://issues.apache.org/jira/browse/HBASE-521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12581690#action_12581690
 ]


stack commented on HBASE-521:
-----------------------------

I suppose RowResults could easily be so big that they'd blow out memory on the 
client or server.  What's our defense?  That when designing your request/MR, you 
make sure you are not selecting too much?  (I suppose we've always had this 
problem.  This patch does not introduce it.)

This issue addresses one of the items raised in our plan for 0.2.

Should we bite the bullet and rename the methods in HTable to getScanner 
instead of obtainScanner, and just deprecate the old ones?  In fact, we 
probably should do this since we're breaking the methods anyway (mark the old 
obtainScanner methods as deprecated).
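
A minimal sketch of the rename-and-deprecate idea, assuming for illustration that the old and new methods can share a return type (which the next paragraph questions, since next differs). TableSketch and its row list are hypothetical stand-ins, not the real HTable:

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for the client table class; NOT the real HTable.
class TableSketch {
    private final List<String> rows;

    TableSketch(List<String> rows) {
        this.rows = rows;
    }

    // New method name, as suggested in the comment.
    Iterator<String> getScanner() {
        return rows.iterator();
    }

    /** @deprecated use {@link #getScanner()} instead. */
    @Deprecated
    Iterator<String> obtainScanner() {
        // The old name is kept only as a thin forwarder to the new one.
        return getScanner();
    }
}

class DeprecationDemo {
    public static void main(String[] args) {
        TableSketch table = new TableSketch(Arrays.asList("row1", "row2"));
        // Callers of the old name still compile, but get a deprecation warning.
        Iterator<String> it = table.obtainScanner();
        while (it.hasNext()) {
            System.out.println(it.next());
        }
    }
}
```

If the two scanners cannot share a type, the old method would have to keep its old return type and simply carry the deprecation marker until it is removed.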

Hmm... but next is completely different.  Maybe we should just say that 
HTable has changed completely in 0.2, along with TableMap, etc.

I like changing the name from HScannerInterface to Scanner.  Change 
HInternalScannerInterface to InternalScanner as well?

The change in 'Index: src/java/org/apache/hadoop/hbase/util/Migrate.java' is 
odd; you just add imports?  Is that right?

For IdentityTableReduce, should the interface be <Long, BatchUpdate> rather 
than <Text, BatchUpdate>?  The Long would be an index of some kind.  That seems 
to be the model for identity mappers... same for the TOF/TIF.  Should the key 
be a Long rather than a duplicate of the info in RowResult/BatchUpdate?


> Improve client scanner interface
> --------------------------------
>
>                 Key: HBASE-521
>                 URL: https://issues.apache.org/jira/browse/HBASE-521
>             Project: Hadoop HBase
>          Issue Type: Improvement
>          Components: client
>            Reporter: Bryan Duxbury
>            Assignee: Bryan Duxbury
>            Priority: Minor
>             Fix For: 0.2.0
>
>         Attachments: 521.patch
>
>
> The current client scanner interface is pretty ugly. You need to instantiate 
> an HStoreKey and SortedMap<Text, byte[]> externally and then pass them into 
> next. This is pretty bad, because for starters, the client has to choose the 
> implementation of the map when they create it, so it's extra brain cycles to 
> figure that out. HStoreKey doesn't show up anywhere else in the entire client 
> side API, but here it bubbles out of next as a way to get the row and 
> presumably the timestamp of the columns.
> I propose that we supplant HScannerInterface with Scanner, an easier-to-use 
> version for clients. Its next method would look something like:
> {code}
> public RowResult next() throws IOException;
> {code}
> This packs the data up much more cleanly, including using Cells as values 
> instead of raw byte[], meaning you have much more granular timestamp 
> information. You also don't need HStoreKey anymore.
> By breaking Scanner away from HScannerInterface, we can leave the internal 
> scanning code completely alone (keep using HStoreKeys and such) but make the 
> client cleaner.
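
To make the proposal above concrete, here is a rough sketch of how a client loop might look with a next() that returns a RowResult. Everything in it is an assumption for illustration: Cell and RowResult are simplified stand-ins (String keys instead of Text), ScannerSketch stands in for the proposed Scanner, InMemoryScanner is a toy implementation, and returning null at end of scan is assumed rather than taken from the patch.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical stand-in: a value paired with its own timestamp.
final class Cell {
    final byte[] value;
    final long timestamp;

    Cell(byte[] value, long timestamp) {
        this.value = value;
        this.timestamp = timestamp;
    }
}

// Hypothetical stand-in: one row key plus its columns.
final class RowResult {
    final String row;
    final SortedMap<String, Cell> columns;

    RowResult(String row, SortedMap<String, Cell> columns) {
        this.row = row;
        this.columns = columns;
    }
}

// Stand-in for the proposed Scanner interface.
interface ScannerSketch {
    // Assumption: returns null once the scan is exhausted.
    RowResult next() throws IOException;
}

// Toy implementation backed by a list, only to exercise the interface.
final class InMemoryScanner implements ScannerSketch {
    private final Iterator<RowResult> rows;

    InMemoryScanner(List<RowResult> rows) {
        this.rows = rows.iterator();
    }

    @Override
    public RowResult next() {
        return rows.hasNext() ? rows.next() : null;
    }
}

final class ScanDemo {
    // Builds two sample rows, each with a single column at timestamp 1.
    static List<RowResult> sampleRows() {
        List<RowResult> rows = new ArrayList<>();
        for (String key : new String[] {"row1", "row2"}) {
            SortedMap<String, Cell> cols = new TreeMap<>();
            cols.put("info:a", new Cell(key.getBytes(StandardCharsets.UTF_8), 1L));
            rows.add(new RowResult(key, cols));
        }
        return rows;
    }

    public static void main(String[] args) throws IOException {
        ScannerSketch scanner = new InMemoryScanner(sampleRows());
        RowResult r;
        // The caller no longer allocates a key object or a map up front.
        while ((r = scanner.next()) != null) {
            System.out.println(r.row + " -> " + r.columns.keySet());
        }
    }
}
```

The point of the sketch is that the row key and the per-cell timestamps both arrive inside the returned object, so nothing like HStoreKey has to appear in client code.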

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
