[
https://issues.apache.org/jira/browse/HBASE-7339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13533300#comment-13533300
]
Jonathan Hsieh commented on HBASE-7339:
---------------------------------------
Encoding all this info into the filename was a trick used to avoid the costly
operations of opening and reading the contents of the files -- now we can just
do a single dir listing and have enough information to link to the original file.
The problem now is that a valid table name and a valid region name
concatenated with a '.' get interpreted incorrectly.
true hfile name constraints: [0-9]+(?:_SeqID_[0-9]+)?
region name constraints : [a-f0-9]{16} (but we currently just use [a-f0-9]+)
table name constraints : [a-zA-Z0-9_][a-zA-Z0-9_.-]*
Notice that the table name constraints completely cover all region name
constraints and true hfile name constraints (a valid hfile name is a valid
part of a table name, and a valid encoded region name is a valid part of a
table name).
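To make the overlap concrete, here is a minimal sketch that checks sample hfile
and region names against the table name pattern. The regexes are copied from
the constraints above; the sample names and the helper is_table_name are made
up for illustration and are not HBase code.

```python
import re

# Patterns taken from the constraints listed in this comment.
HFILE = r"[0-9]+(?:_SeqID_[0-9]+)?"
REGION = r"[a-f0-9]+"
TABLE = r"[a-zA-Z0-9_][a-zA-Z0-9_.-]*"

def is_table_name(s):
    """True if s, taken as a whole, satisfies the table name pattern."""
    return re.fullmatch(TABLE, s) is not None

# Hypothetical sample names: two hfile names and one encoded region name.
hfile_samples = ["1234567890", "1234_SeqID_42"]
region_samples = ["0123456789abcdef"]

for name in hfile_samples:
    assert re.fullmatch(HFILE, name)
    print(name, "is also a valid table name:", is_table_name(name))

for name in region_samples:
    assert re.fullmatch(REGION, name)
    print(name, "is also a valid table name:", is_table_name(name))
```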
Currently the hfilelink filename convention is <hfile>-<region>-<table>.
Unfortunately, making a ref to this uses the name
<hfile>-<region>-<table>.<parentregion> -- the concatenation
<table>.<parentregion> is a valid table name and so gets interpreted as one.
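The misparse can be reproduced with a small sketch. LINK_RE below is a
hypothetical parser for the <hfile>-<region>-<table> convention built from the
constraints above, not the actual HBase regex, and the file name is invented.

```python
import re

HFILE = r"[0-9]+(?:_SeqID_[0-9]+)?"
REGION = r"[a-f0-9]+"
TABLE = r"[a-zA-Z0-9_][a-zA-Z0-9_.-]*"

# Hypothetical parser for <hfile>-<region>-<table>.
LINK_RE = re.compile(rf"({HFILE})-({REGION})-({TABLE})")

def parse_link(name):
    """Return (hfile, region, table) if name looks like an hfilelink, else None."""
    m = LINK_RE.fullmatch(name)
    return m.groups() if m else None

# A reference to an hfilelink: <hfile>-<region>-<table>.<parentregion>
ref_name = "12345-0123456789abcdef-mytable.fedcba9876543210"

parsed = parse_link(ref_name)
# The reference is accepted as a plain link, and the parent region is
# swallowed into the table name.
print(parsed)
```

The third group comes back as "mytable.fedcba9876543210" rather than
"mytable", which is the wrong grouping.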
My first thought on #3 was that if we introduced a char such as '@' to separate
the <table> from the <parentregion>, our regex parsers would simply work.
We need to add some char invalid for table names to the hfilelink or reference
filename convention.
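A rough sketch of why a char like '@' helps, assuming the reference name
becomes <hfile>-<region>-<table>@<parentregion>. LINK_RE and REF_RE are
hypothetical parsers built from the constraints above, and the file name is
invented.

```python
import re

HFILE = r"[0-9]+(?:_SeqID_[0-9]+)?"
REGION = r"[a-f0-9]+"
TABLE = r"[a-zA-Z0-9_][a-zA-Z0-9_.-]*"

# Hypothetical parsers; '@' is not in the table name character class.
LINK_RE = re.compile(rf"({HFILE})-({REGION})-({TABLE})")
REF_RE = re.compile(rf"({HFILE})-({REGION})-({TABLE})@({REGION})")

ref_name = "12345-0123456789abcdef-mytable@fedcba9876543210"

# The link parser can no longer mistake the reference for a plain link.
link_match = LINK_RE.fullmatch(ref_name)
print("parsed as plain link:", link_match is not None)

# The reference parser recovers all four parts unambiguously.
ref_parts = REF_RE.fullmatch(ref_name).groups()
print(ref_parts)
```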
hm.. it seems like if we changed the order of the hfile-link name we could
avoid some of the confusion -- <table>@<region>-<hfile>.<parentregion> (or some
other separator char than '@') could be used to avoid handling on the initial
filenotfoundexception but I think we'd still need a good chunk of the logic to
handle opening a half-storefile reader through an hfilelink.
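For the reordered form, a sketch of a single parser that handles both the
plain link and the reference to it. REORDERED_RE and parse are hypothetical
names, '@' is just the example separator from above, and the file names are
invented.

```python
import re

HFILE = r"[0-9]+(?:_SeqID_[0-9]+)?"
REGION = r"[a-f0-9]+"
TABLE = r"[a-zA-Z0-9_][a-zA-Z0-9_.-]*"

# Hypothetical parser for <table>@<region>-<hfile>[.<parentregion>].
# The table name ends at '@', so '.' and '-' inside it are harmless.
REORDERED_RE = re.compile(rf"({TABLE})@({REGION})-({HFILE})(?:\.({REGION}))?")

def parse(name):
    """Split a reordered link or reference name into its parts, or return None."""
    m = REORDERED_RE.fullmatch(name)
    if m is None:
        return None
    table, region, hfile, parent = m.groups()
    return {"table": table, "region": region, "hfile": hfile, "parent": parent}

link = parse("my.table-1@0123456789abcdef-12345")
ref = parse("my.table-1@0123456789abcdef-12345.fedcba9876543210")
print(link)
print(ref)
```

This only covers name parsing; it does not address opening the half-storefile
reader.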
> Splitting a hfilelink causes region servers to go down.
> -------------------------------------------------------
>
> Key: HBASE-7339
> URL: https://issues.apache.org/jira/browse/HBASE-7339
> Project: HBase
> Issue Type: Sub-task
> Components: snapshots
> Affects Versions: hbase-6055
> Reporter: Jonathan Hsieh
> Assignee: Jonathan Hsieh
> Priority: Blocker
> Fix For: hbase-6055
>
> Attachments: hbase-7339.patch, pre-hbase-7339.patch
>
>
> Steps:
> - Have a single region table t with 15 hfiles in it.
> - Snapshot it. (was done using online snapshot from HBASE-7321)
> - Clone a snapshot to table t'.
> - t' has its region do a post-open task that attempts to compact the region.
> The policy does not compact all files. (default seems to be 10)
> - after compaction we have hfile links and real hfiles mixed in the region
> - t' starts splitting
> - creating split references, opening daughters fails
> - hfile links are "split", creating hfile link daughter refs.
> {{<<hfile>\-<region>\-<table>>.<parentregion>}}
> - these "split" hfile links are interpreted as hfile links with table
> {{<table>.<parentregion>}} ->
> {{<<hfile>\-<region>>\-<<table>.<parentregion>>}} (groupings interpreted
> incorrectly)
> - Since this is after the splitting PONR, this aborts the server. It then
> spreads to the next server.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira