[
https://issues.apache.org/jira/browse/HBASE-5229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13190891#comment-13190891
]
Lars Hofhansl commented on HBASE-5229:
--------------------------------------
Yes, it is the 2nd time :)
This is the way to make large Rows (ala Megastore) possible as Matt suggests,
because a scan can be efficiently done inside a row (well actually a column
family).
Re: KV ordering. Looking around at the code... It is causing no problems,
because we never have to compare KVs between families. It makes sense,
RegionScannerImpl just deals with rows, and StoreScanners just deal with
StoreFileScanners, which are all of the same family... Just something I somehow
had not grokked before today (I had assumed there is a consistent global
ordering between *all* KeyValues).
Instead of adding cruft to the Scan API that does not provide any new features,
I think that ColumnRangeFilter rather should just be documented more
prominently. I'll add something to the book (it's only tersely mentioned). And
I'll definitely blog about it. :)
Might still be worth exploring some outside grouping of rows (like the prefix I
mention above), because that would be more in line with the API that a client
expects. Implementing that would actually be simple: We'd just calculate the
midKey as we do now, and then we take a prefix of the midKey to do the actual
split (if the table declares - say - a four byte prefix, then we always split
on the first four bytes of the midKey instead of the full row-key).
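
To make the prefix split concrete, here is a minimal sketch of the truncation
step only. The class and method names are hypothetical and this is not code
from any of the attached patches; it assumes the midKey has already been
calculated the usual way and that the table declares a fixed prefix length.

```java
import java.util.Arrays;

final class PrefixSplitSketch {
    /**
     * Hypothetical helper: truncates a region's midKey to the table's declared
     * grouping prefix, so that a split never falls inside a group of rows.
     * A non-positive prefixLength means "no grouping declared".
     */
    static byte[] splitPoint(byte[] midKey, int prefixLength) {
        if (prefixLength <= 0 || midKey.length <= prefixLength) {
            // Nothing to truncate: split on the full row-key.
            return midKey.clone();
        }
        // Keep only the first prefixLength bytes of the midKey.
        return Arrays.copyOf(midKey, prefixLength);
    }
}
```

Because row-keys sort lexicographically, the truncated key sorts at or before
every row that shares the prefix, so all rows of that group would end up in
the same daughter region. A real implementation would presumably also have to
handle the case where the truncated key equals the region's start key.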
Using wide rows with ColumnRangeFilter forces the application to reinvent many
concepts (like selecting a prefix of the column to declare the "inner grouping"
and then use the remaining suffix to identify the "inner columns".)
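
As an illustration of what the application has to reinvent, here is a small
sketch that computes the column range covering one "inner grouping". The
class and method names are hypothetical. The idea is that the prefix
(inclusive) and the computed stop column (exclusive) form the kind of
min/max pair that ColumnRangeFilter takes.

```java
import java.util.Arrays;

final class InnerGroupRange {
    /**
     * Hypothetical helper: returns the smallest column name that sorts after
     * every column starting with the given prefix, or null if no such name
     * exists (empty prefix, or a prefix consisting only of 0xFF bytes).
     */
    static byte[] stopColumn(byte[] prefix) {
        int i = prefix.length - 1;
        // Trailing 0xFF bytes cannot be incremented; drop them.
        while (i >= 0 && prefix[i] == (byte) 0xFF) {
            i--;
        }
        if (i < 0) {
            return null; // caller must treat this as "no upper bound"
        }
        byte[] stop = Arrays.copyOf(prefix, i + 1);
        stop[i]++; // increment the last byte that is not 0xFF
        return stop;
    }
}
```

For example, for the prefix "a:" the range would be from "a:" (inclusive) to
"a;" (exclusive), and the remaining suffix of each returned column would
identify the "inner column".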
Additionally there might be work to have HBase perform with very wide rows
(although ColumnRangeFilter can efficiently retrieve a subset of columns).
It looks like ColumnRangeFilter might be ineffective if there are many versions of
the cells, as version elimination during scanning happens after Filters are
evaluated (see ScanQueryMatcher).
Maybe that's one thing to change: Allow a filter to declare whether it is
evaluated pre or post version elimination... There are use cases for both. Kannan
faces a similar problem in HBASE-5104, which I think could be solved if filters
could be optionally evaluated after the version handling. If that's something
we want to do, I'll create a jira for that and work out a patch.
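
To show why the order matters, here is a toy model of the two evaluation
orders. This is not ScanQueryMatcher code; Cell, limitVersions and the other
names are made up for illustration, and the cells of a row are assumed to
arrive sorted by column with the newest version first.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

final class FilterOrderSketch {
    /** Simplified stand-in for a KeyValue: column, timestamp, value. */
    static final class Cell {
        final String column;
        final long ts;
        final String value;

        Cell(String column, long ts, String value) {
            this.column = column;
            this.ts = ts;
            this.value = value;
        }
    }

    /** Keeps at most maxVersions cells per column, in arrival order. */
    static List<Cell> limitVersions(List<Cell> cells, int maxVersions) {
        Map<String, Integer> seen = new HashMap<>();
        List<Cell> out = new ArrayList<>();
        for (Cell c : cells) {
            int count = seen.merge(c.column, 1, Integer::sum);
            if (count <= maxVersions) {
                out.add(c);
            }
        }
        return out;
    }

    /** Keeps only the cells accepted by the filter. */
    static List<Cell> applyFilter(List<Cell> cells, Predicate<Cell> filter) {
        List<Cell> out = new ArrayList<>();
        for (Cell c : cells) {
            if (filter.test(c)) {
                out.add(c);
            }
        }
        return out;
    }

    /** Filter sees every version; versions are limited afterwards. */
    static List<Cell> filterThenVersions(List<Cell> cells, Predicate<Cell> f, int max) {
        return limitVersions(applyFilter(cells, f), max);
    }

    /** Versions are limited first; filter only sees the surviving cells. */
    static List<Cell> versionsThenFilter(List<Cell> cells, Predicate<Cell> f, int max) {
        return applyFilter(limitVersions(cells, max), f);
    }
}
```

In this model the first order calls the filter once per stored version, which
is the extra work described above when cells have many versions. The two
orders can also return different cells: a filter on the value can let an
older version through when it runs first, but not when it runs after the
version limit.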
> Explore building blocks for "multi-row" local transactions.
> -----------------------------------------------------------
>
> Key: HBASE-5229
> URL: https://issues.apache.org/jira/browse/HBASE-5229
> Project: HBase
> Issue Type: New Feature
> Components: client, regionserver
> Reporter: Lars Hofhansl
> Assignee: Lars Hofhansl
> Fix For: 0.94.0
>
> Attachments: 5229-seekto-v2.txt, 5229-seekto.txt, 5229.txt
>
>
> HBase should provide basic building blocks for multi-row local transactions.
> Local means that we do this by co-locating the data. Global (cross region)
> transactions are not discussed here.
> After a bit of discussion two solutions have emerged:
> 1. Keep the row-key for determining grouping and location and allow efficient
> intra-row scanning. A client application would then model tables as
> HBase-rows.
> 2. Define a prefix-length in HTableDescriptor that defines a grouping of
> rows. Regions will then never be split inside a grouping prefix.
> #1 is true to the current storage paradigm of HBase.
> #2 is true to the current client side API.
> I will explore these two with sample patches here.
> --------------------
> Was:
> As discussed (at length) on the dev mailing list with the HBASE-3584 and
> HBASE-5203 committed, supporting atomic cross row transactions within a
> region becomes simple.
> I am aware of the hesitation about the usefulness of this feature, but we
> have to start somewhere.
> Let's use this jira for discussion, I'll attach a patch (with tests)
> momentarily to make this concrete.