[ https://issues.apache.org/jira/browse/HBASE-14355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
churro morales updated HBASE-14355:
-----------------------------------
Attachment: HBASE-14355.patch
The first patch is for trunk. I'd like to know what you think before I
prepare patches for the other branches.
The StoreFileScanner now has a columnFamily field, which is passed into the
constructor. To keep the patch small, I get the column family from the reader
instead of parameterizing every caller: I call reader.getFirstKey() and take
the column family from the returned Cell.
If the first key is null, the store file has no entries and we can skip it
anyway. If it is not null, I take the column family from that first KeyValue.
I am assuming that a store file never contains KeyValues from different
column families, which should hold true.
Would love a review on this patch.
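
For readers without the patch open, the idea described above can be sketched
roughly as follows. This is a minimal illustration, not the patch itself: the
Cell, Reader and StoreFileScanner types below are simplified, hypothetical
stand-ins rather than the real HBase classes, and the forReader and isEmpty
names are assumptions made for the example.

```java
import java.util.Arrays;

// Hypothetical stand-ins for illustration; these are not the real HBase classes.
final class StoreFileFamilySketch {

    /** Minimal cell: only the column family matters for this sketch. */
    static final class Cell {
        private final byte[] family;
        Cell(byte[] family) { this.family = family.clone(); }
        byte[] getFamily() { return family.clone(); }
    }

    /** Minimal reader: returns the first cell of the store file, or null if the file is empty. */
    interface Reader {
        Cell getFirstKey();
    }

    static final class StoreFileScanner {
        private final Reader reader;
        private final byte[] columnFamily; // null when the store file has no entries

        StoreFileScanner(Reader reader, byte[] columnFamily) {
            this.reader = reader;
            this.columnFamily = columnFamily;
        }

        /** Derives the family from the reader so existing callers need no new argument. */
        static StoreFileScanner forReader(Reader reader) {
            Cell first = reader.getFirstKey();
            return new StoreFileScanner(reader, first == null ? null : first.getFamily());
        }

        /** An empty store file has no family and can be skipped outright. */
        boolean isEmpty() { return columnFamily == null; }

        byte[] getColumnFamily() { return columnFamily == null ? null : columnFamily.clone(); }
    }

    public static void main(String[] args) {
        Reader empty = () -> null;
        Reader withData = () -> new Cell(new byte[] {'c', 'f'});
        System.out.println(StoreFileScanner.forReader(empty).isEmpty());
        System.out.println(Arrays.toString(StoreFileScanner.forReader(withData).getColumnFamily()));
    }
}
```

The sketch relies on the same assumption stated above: every entry in a store
file belongs to one column family, so the first key is representative.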
> Scan different TimeRange for each column family
> -----------------------------------------------
>
> Key: HBASE-14355
> URL: https://issues.apache.org/jira/browse/HBASE-14355
> Project: HBase
> Issue Type: New Feature
> Components: Client, regionserver, Scanners
> Reporter: Dave Latham
> Fix For: 2.0.0, 1.3.0, 0.98.16
>
> Attachments: HBASE-14355.patch
>
>
> At present the Scan API supports only table level time range. We have
> specific use cases that will benefit from per column family time range. (See
> background discussion at
> https://mail-archives.apache.org/mod_mbox/hbase-user/201508.mbox/%3ccaa4mzom00ef5eoxstk0hetxeby8mqss61gbvgttgpaspmhq...@mail.gmail.com%3E)
> There are a couple of choices that would be good to validate. First - how to
> update the Scan API to support family and table level updates. One proposal
> would be to add Scan.setTimeRange(byte[] family, long minTime, long maxTime),
> then store it in a Map<byte[], TimeRange>. When executing the scan, if a
> family has a specified TimeRange, then use it, otherwise fall back to using
> the table level TimeRange. Clients using the new API against old region
> servers would not get the families correctly filtered. Old clients sending
> scans to new region servers would work correctly.
> The other question is how to get StoreFileScanner.shouldUseScanner to match
> up the proper family and time range. It has the Scan available but does not
> currently know which family it belongs to. One option would be
> to try to pass down the column family in each constructor path. Another
> would be to instead alter shouldUseScanner to pass down the specific
> TimeRange to use (similar to how it currently passes down the columns to use
> which also appears to be a workaround for not having the family available).
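
The proposal in the description (a per-family time range with fallback to the
table-level one) might look roughly like the sketch below. It is only an
illustration under stated assumptions: TimeRange and Scan here are simplified,
hypothetical stand-ins and not the HBase classes, the timeRangeFor and
shouldUseScanner signatures are invented for the example, the range is treated
as half-open [min, max), and Arrays.compare requires Java 9 or later.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-ins for illustration; these are not the real HBase classes.
final class FamilyTimeRangeSketch {

    /** Half-open timestamp interval [min, max) (an assumption of this sketch). */
    static final class TimeRange {
        final long min;
        final long max;

        TimeRange(long min, long max) {
            if (min > max) throw new IllegalArgumentException("min " + min + " > max " + max);
            this.min = min;
            this.max = max;
        }

        /** True if a store file whose timestamps span [fileMin, fileMax] may hold matching cells. */
        boolean overlaps(long fileMin, long fileMax) {
            return fileMax >= min && fileMin < max;
        }
    }

    static final class Scan {
        private TimeRange tableRange = new TimeRange(0L, Long.MAX_VALUE);

        // byte[] uses identity equals/hashCode, so a TreeMap with a content-based
        // comparator is used instead of a HashMap.
        private final Map<byte[], TimeRange> familyRanges =
            new TreeMap<byte[], TimeRange>((a, b) -> Arrays.compare(a, b));

        /** Table-level time range, used for every family without its own range. */
        Scan setTimeRange(long minTime, long maxTime) {
            tableRange = new TimeRange(minTime, maxTime);
            return this;
        }

        /** Per-family time range, overriding the table-level one for that family. */
        Scan setTimeRange(byte[] family, long minTime, long maxTime) {
            familyRanges.put(family.clone(), new TimeRange(minTime, maxTime));
            return this;
        }

        /** Family-specific range if one was set, otherwise the table-level range. */
        TimeRange timeRangeFor(byte[] family) {
            return familyRanges.getOrDefault(family, tableRange);
        }
    }

    /** Sketch of the second option: hand the check the range that applies to this family. */
    static boolean shouldUseScanner(Scan scan, byte[] family, long fileMin, long fileMax) {
        return scan.timeRangeFor(family).overlaps(fileMin, fileMax);
    }
}
```

With a table-level range of [0, 100) and a range of [50, 60) for family "a", a
store file covering timestamps 0 to 40 would be skipped for family "a" but still
scanned for any other family.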
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)