[
https://issues.apache.org/jira/browse/HBASE-9203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13744928#comment-13744928
]
Anoop Sam John commented on HBASE-9203:
---------------------------------------
bq. Since the midpoint for index table region may not be chosen for the split,
it is possible that the daughter regions of index region may have (quite)
different amount of data. How can we mitigate this effect?
I don't think this will happen. The daughter regions of the index region will
have the same size proportions as those of the actual table region. Suppose an
actual table region has 10 entries and is split 6/4. Consider one index on
that data. The index region will contain 10 entries before the split, and
after the split its daughters will have 6 and 4 entries respectively. The only
difference is in how the half-file reading happens. For a normal table there
is a clear split point with respect to the row key (RK), so the readers can
either read up to the split point or read from the split point. For the index
region, however, both daughter region readers need to start from the beginning
of the file, check whether each entry belongs to their daughter, and traverse
the whole file. After a split, compaction will read through the
HalfFileReader and rewrite the data into 2 physical files, so the reader
overhead is only temporary.
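
To make the 6/4 example concrete, here is a minimal sketch in plain Java. It
does not use any HBase classes. The IndexEntry type, the helper names, and the
assumption that an index entry is assigned to a daughter by the data row key
it points to are all hypothetical, chosen only to illustrate the reasoning
above. It is not the hindex implementation.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

public class IndexSplitSketch {

    /** Hypothetical index entry: an indexed column value pointing at a data row key. */
    static final class IndexEntry {
        final String indexedValue;
        final String rowKey;

        IndexEntry(String indexedValue, String rowKey) {
            this.indexedValue = indexedValue;
            this.rowKey = rowKey;
        }
    }

    /** Builds n row keys row00, row01, ... already in sorted order. */
    static List<String> buildRowKeys(int n) {
        List<String> rows = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            rows.add(String.format("row%02d", i));
        }
        return rows;
    }

    /** Builds one index entry per row, sorted by indexed value rather than by row key. */
    static List<IndexEntry> buildIndex(List<String> rowKeys) {
        List<IndexEntry> index = new ArrayList<>();
        for (int i = 0; i < rowKeys.size(); i++) {
            // Arbitrary values chosen so that index order differs from row key order.
            index.add(new IndexEntry("v" + ((i * 7) % 10), rowKeys.get(i)));
        }
        index.sort(Comparator.comparing((IndexEntry e) -> e.indexedValue));
        return index;
    }

    /** Data region: keys are sorted by row key, so each daughter is one contiguous half. */
    static List<String> dataDaughter(List<String> sortedRowKeys, String splitKey, boolean top) {
        int pos = Collections.binarySearch(sortedRowKeys, splitKey);
        int split = pos >= 0 ? pos : -(pos + 1);
        return top
            ? new ArrayList<>(sortedRowKeys.subList(split, sortedRowKeys.size()))
            : new ArrayList<>(sortedRowKeys.subList(0, split));
    }

    /** Index region: no contiguous half exists, so scan from the start and test every entry. */
    static List<IndexEntry> indexDaughter(List<IndexEntry> sortedIndex, String splitKey, boolean top) {
        List<IndexEntry> result = new ArrayList<>();
        for (IndexEntry e : sortedIndex) {
            boolean belongsToTop = e.rowKey.compareTo(splitKey) >= 0;
            if (belongsToTop == top) {
                result.add(e);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> rows = buildRowKeys(10);
        List<IndexEntry> index = buildIndex(rows);
        String splitKey = "row06"; // not the midpoint: gives a 6/4 split

        System.out.println("data:  " + dataDaughter(rows, splitKey, false).size()
            + "/" + dataDaughter(rows, splitKey, true).size());
        System.out.println("index: " + indexDaughter(index, splitKey, false).size()
            + "/" + indexDaughter(index, splitKey, true).size());
    }
}
```

Both lines print 6/4. The index daughters end up with the same proportions as
the data daughters, but each index daughter has to walk all 10 entries to find
its own, which is the temporary reader overhead described above.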
> Secondary index support through coprocessors
> --------------------------------------------
>
> Key: HBASE-9203
> URL: https://issues.apache.org/jira/browse/HBASE-9203
> Project: HBase
> Issue Type: New Feature
> Affects Versions: 0.98.0
> Reporter: rajeshbabu
> Assignee: rajeshbabu
> Attachments: SecondaryIndex Design.pdf
>
>
> We have been working on implementing secondary indexes in HBase and have
> open sourced the work, based on HBase 0.94.8.
> The project is available on github.
> https://github.com/Huawei-Hadoop/hindex
> This Jira is to support secondary indexes on trunk (0.98).
> The following features will be supported:
> - multiple indexes on table,
> - multi column index,
> - index based on part of a column value,
> - equals and range condition scans using index, and
> - bulk loading data to indexed table (Indexing done with bulk load)
> Most of the kernel changes needed for secondary indexes are already
> available in trunk. Only minimal further changes are needed.