[jira] Commented: (HBASE-2793) Add ability to extract a specified list of versions of a column in a single roundtrip

ryan rawson (JIRA) Sat, 26 Jun 2010 11:55:15 -0700

    [ 
https://issues.apache.org/jira/browse/HBASE-2793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12882880#action_12882880
 ]


ryan rawson commented on HBASE-2793:
------------------------------------

#2 won't be so bad... filters are pretty deep and will be just as efficient as
hacking ScanQueryMatcher, I think.

On Jun 26, 2010 11:45 AM, "Kannan Muthukkaruppan (JIRA)" <[email protected]>
https://issues.apache.org/jira/browse/HBASE-2793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12882878#action_12882878]
to be:
be nice if the fix for this issue also takes advantage of that optimization
and avoids a full row scan).
seems like in this approach, you'll end up doing a full scan-- and check
against the filter for each row. There wouldn't be a way to early exit.
passed in set of versions; use the code setTimeRange() to trim down the set
of columns we look at; and apply the filter against those columns. Still not
a great approach if the versions passed are spread out too much.
the same server roundtrip of course). I think it is still important to
preserve row-level consistency-- i.e. we should do a consistent read of
all the versions within a row. The stuff Ryan has done should probably make
it easy. But I don't know this too well yet.
objects, all for the same row, and use setTimeStamp() to set the version
explicitly in each Get object. The trouble though is that the general case
of the Batch Get[] API doesn't have to support a consistency read across all
Gets in a batch; but for this case a consistent read would be the desired
semantics.
and you are interested in version 1 and 10000 ones, then point lookups will
be as good as it gets-- and should fetch just the minimal blocks needed. If
the versions happen to be on same block, even better-- the blocks should be
warm in the LRU cache. The case where this approach might not be as CPU
efficient is if the versions are fairly densely packed together, and a range
scan (#2) might have worked better. But for that case the app should probably
be using setTimeRange() API instead.
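
To make the trade-off discussed above concrete, here is a minimal sketch in
plain Java. It is a toy model, not HBase code: a single column's versions are
held in a sorted map of timestamp -> value, and the class and method names
(VersionSetSketch, rangeThenFilter, pointLookups, cellsExamined) are
hypothetical. It compares the range-plus-filter idea (what the thread appears
to call #2) with one point lookup per requested version.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.SortedSet;

// Toy model only: one column's versions as timestamp -> value.
// Nothing here is an HBase class or API.
final class VersionSetSketch {

    // Range-then-filter: trim to [min, max] of the wanted timestamps,
    // then keep only the cells whose timestamp is in the wanted set.
    static List<String> rangeThenFilter(NavigableMap<Long, String> versions,
                                        SortedSet<Long> wanted) {
        List<String> out = new ArrayList<>();
        if (wanted.isEmpty()) {
            return out;
        }
        // Every cell in this window is examined, wanted or not.
        NavigableMap<Long, String> window =
                versions.subMap(wanted.first(), true, wanted.last(), true);
        for (Map.Entry<Long, String> cell : window.entrySet()) {
            if (wanted.contains(cell.getKey())) {
                out.add(cell.getValue());
            }
        }
        return out;
    }

    // Point lookups: one probe per wanted timestamp, in ascending order.
    static List<String> pointLookups(NavigableMap<Long, String> versions,
                                     SortedSet<Long> wanted) {
        List<String> out = new ArrayList<>();
        for (Long ts : wanted) {
            String value = versions.get(ts);
            if (value != null) {
                out.add(value);
            }
        }
        return out;
    }

    // Number of cells the range-then-filter variant has to examine.
    static int cellsExamined(NavigableMap<Long, String> versions,
                             SortedSet<Long> wanted) {
        if (wanted.isEmpty()) {
            return 0;
        }
        return versions.subMap(wanted.first(), true, wanted.last(), true).size();
    }
}
```

In this model, with 10,000 versions and only timestamps 1 and 10000 requested,
both methods return the same two values, but the range window covers all
10,000 cells while the point-lookup variant makes two probes. When the
requested versions are densely packed, the window is small and the
range-based variant examines little beyond what was asked for.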


> Add ability to extract a specified list of versions of a column in a single 
> roundtrip
> -------------------------------------------------------------------------------------
>
>                 Key: HBASE-2793
>                 URL: https://issues.apache.org/jira/browse/HBASE-2793
>             Project: HBase
>          Issue Type: New Feature
>            Reporter: Kannan Muthukkaruppan
>            Assignee: Kannan Muthukkaruppan
>
> In one of the use cases we were looking at, each row contains a single 
> column, but with several versions (e.g., each version representing an event 
> in a log), and we want to be able to extract a specific set of versions from 
> the row in a single round-trip.
> Currently, on a Get, one can retrieve a specific version of a column using 
> setTimeStamp(ts) or a range of versions using setTimeRange(min, max). But not 
> a set of specified versions. It would be useful to add this ability.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
