[
https://issues.apache.org/jira/browse/HBASE-1212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
stack updated HBASE-1212:
-------------------------
Fix Version/s: (was: 0.20.0)
Thinking on it, this event should be extremely rare. Sequence ids are
monotonically increasing in a running regionserver. Across a cluster, two
files of the same family would have to end with the same sequenceid. Then
what's the likelihood that, of all the regions on the cluster, these are the
two to merge (Merge is a little-used tool to date)?
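To make the failure mode concrete, here is a minimal sketch of what goes wrong
when it does happen. It only assumes what the issue description quoted below
says: that on region open the files are keyed by their sequenceid. The class,
method, file names and ids are made up for illustration; this is not the HBase
code.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; not HBase code.
final class SequenceIdCollision {
    // Key each store file by its sequence id, the way the issue description
    // says region open does.
    static Map<Long, String> keyBySequenceId(long[] sequenceIds, String[] fileNames) {
        Map<Long, String> files = new TreeMap<>();
        for (int i = 0; i < sequenceIds.length; i++) {
            // put() silently replaces any earlier file that has the same id.
            files.put(sequenceIds[i], fileNames[i]);
        }
        return files;
    }

    public static void main(String[] args) {
        Map<Long, String> files = keyBySequenceId(
            new long[] {41L, 42L, 42L},
            new String[] {"regionA/file1", "regionA/file2", "regionB/file1"});
        // Three files went in, but only two survive: regionA/file2 is gone.
        System.out.println(files);
    }
}
```

Running main prints {41=regionA/file1, 42=regionB/file1}, so the edits in
regionA/file2 would be lost without any error.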
To fix, we would need to look at the content of the two files and make a
judgement as to which should come before the other -- which has the most
recent edits. Maybe we could do something basic like let the file with the
largest size prevail over the smaller. Once we've figured out which file to
bring to the fore, we need to rewrite the hfile so we can change the sequence
id. Since we're rewriting at least one of the files, we might as well compact
them.
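The size-based tie-break could be sketched roughly as below. This is an
assumption-laden sketch, not a proposal for the actual patch: StoreFileInfo
and MergeOrdering are hypothetical names, and "prevail" is read here as
"treated as holding the more recent edits", so the larger file sorts last.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical stand-in for a store file; not an HBase class.
final class StoreFileInfo {
    final String name;
    final long sequenceId;
    final long sizeBytes;

    StoreFileInfo(String name, long sequenceId, long sizeBytes) {
        this.name = name;
        this.sequenceId = sequenceId;
        this.sizeBytes = sizeBytes;
    }
}

final class MergeOrdering {
    // Oldest first: lower sequence id, then, on a tie, the smaller file.
    // The larger of two tied files therefore sorts last and prevails.
    static final Comparator<StoreFileInfo> OLDEST_FIRST =
        Comparator.comparingLong((StoreFileInfo f) -> f.sequenceId)
                  .thenComparingLong(f -> f.sizeBytes);

    // Returns a sorted copy; the input list is left untouched.
    static List<StoreFileInfo> order(List<StoreFileInfo> files) {
        List<StoreFileInfo> sorted = new ArrayList<>(files);
        sorted.sort(OLDEST_FIRST);
        return sorted;
    }
}
```

Two files with the same sequence id and the same size would still tie, and
file size is only a guess at recency, so this decides the order of the
rewrite/compaction and nothing more.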
We could move to modification times. That should simplify this sequenceid
story, but it wouldn't remove this issue. We'd still have to figure out which
store file to favor if two happened to have the same mod time.
In bigtable, chubby owns the storefiles/sstables. Maybe that's where we should
go so we don't have sequenceids anymore?
Moving out of 0.20.0 because this issue is rare and the amount of work to
address it is large.
> merge tool expects regions all have different sequence ids
> ----------------------------------------------------------
>
> Key: HBASE-1212
> URL: https://issues.apache.org/jira/browse/HBASE-1212
> Project: Hadoop HBase
> Issue Type: Bug
> Reporter: stack
>
> Currently, when merging two regions, the merge tool compares their sequence
> ids. If they are the same, it decrements one. It needs to do this because on
> region open, files are keyed by their sequenceid; if two are the same, one
> will erase the other.
> Well, with the move to the aggregating hfile format, the sequenceid is
> written when the file is created, and it's no longer written into an aside
> file but as metadata at the end of the file. Changing the sequenceid is no
> longer an option.
> This issue is about figuring out a solution for the rare case where two store
> files have the same sequence id AND we want to merge the two regions.