[ 
https://issues.apache.org/jira/browse/HBASE-7339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13533300#comment-13533300
 ] 

Jonathan Hsieh commented on HBASE-7339:
---------------------------------------

Encoding all this info into the filename was a trick used to avoid the costly 
operations of opening and reading the contents of the files -- now we can just 
do a single dir listing and have enough information to link to the original file.

The problem now is that a valid table name concatenated with a valid region 
name using a '.' gets interpreted incorrectly.

true hfile name constraints: [0-9]+(?:_SeqID_[0-9]+)?
region name constraints    : [a-f0-9]{16}  (but we currently just use [a-f0-9]+)
table name constraints     : [a-zA-Z0-9_][a-zA-Z0-9_.-]*

Notice that the table name constraints completely cover both the region name 
constraints and the true hfile name constraints (a valid hfile name is a valid 
part of a table name, and a valid encoded region name is a valid part of a 
table name).
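
As a quick check of the overlap described above, here is a minimal Python sketch 
that tests a few names against the table name pattern. The patterns are copied 
from this comment; the sample names and the helper is_valid_table_name are made 
up for illustration and are not HBase code.

```python
import re

# Patterns as quoted in this comment.
HFILE = r"[0-9]+(?:_SeqID_[0-9]+)?"
REGION = r"[a-f0-9]+"  # the looser form currently in use
TABLE = r"[a-zA-Z0-9_][a-zA-Z0-9_.-]*"


def is_valid_table_name(name):
    """Return True if the whole string satisfies the table name pattern."""
    return re.fullmatch(TABLE, name) is not None


# Hypothetical sample names, chosen only to exercise the patterns.
hfile_name = "1234567890_SeqID_42"
region_name = "0123456789abcdef"
table_dot_region = "mytable." + region_name

for name in (hfile_name, region_name, table_dot_region):
    print(name, is_valid_table_name(name))
```

All three sample names are accepted as table names, which is why a parser 
cannot tell them apart by character set alone.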

Currently the hfilelink filename convention is <hfile>-<region>-<table>.  
Unfortunately, making a ref to this uses the name 
<hfile>-<region>-<table>.<parentregion> -- the concatenation 
<table>.<parentregion> is itself a valid table name and gets interpreted as such.  
My first thought on #3 was that if we introduced a char such as '@' to separate 
the <table> from the <parentregion>, our regex parsers would simply work.
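
To make the misparse concrete, here is a rough Python sketch. It assumes a 
parser that is a single regex built from the constraints listed earlier; the 
names LINK, REF_AT, parse_link and parse_ref_at, and the sample file names, are 
hypothetical and do not come from the HBase source.

```python
import re

HFILE = r"[0-9]+(?:_SeqID_[0-9]+)?"
REGION = r"[a-f0-9]+"
TABLE = r"[a-zA-Z0-9_][a-zA-Z0-9_.-]*"

# Assumed parser for the current convention: <hfile>-<region>-<table>
LINK = re.compile(rf"({HFILE})-({REGION})-({TABLE})")
# Assumed parser for a reference that uses '@': <hfile>-<region>-<table>@<parentregion>
REF_AT = re.compile(rf"({HFILE})-({REGION})-({TABLE})@({REGION})")


def parse_link(name):
    m = LINK.fullmatch(name)
    return m.groups() if m else None


def parse_ref_at(name):
    m = REF_AT.fullmatch(name)
    return m.groups() if m else None


link_name = "12345-abcdef0123456789-mytable"
parent = "fedcba9876543210"

# With '.', the reference still parses as a link, and the table group
# swallows the parent region.
print(parse_link(link_name + "." + parent))

# With '@', the link parser rejects the name and the reference parser
# recovers all four parts.
print(parse_link(link_name + "@" + parent))
print(parse_ref_at(link_name + "@" + parent))
```

The first call returns a table of "mytable.fedcba9876543210", which is the 
wrong grouping. Because '@' is not a legal table name character, the second 
call returns None and the third splits the table from the parent region.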

We need to add some char invalid for table names to the hfilelink or reference 
filename convention.

hm.. it seems like if we changed the order of the hfile-link name we could 
avoid some of the confusion -- <table>@<region>-<hfile>.<parentregion> (or some 
separator char other than '@') could be used to avoid handling the initial 
FileNotFoundException, but I think we'd still need a good chunk of the logic to 
handle opening a half-storefile reader through an hfilelink.
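
For the reordered name, a parser could look roughly like the Python sketch 
below. This is only an illustration under the constraints listed earlier, not a 
proposed patch; REORDERED and parse_reordered are hypothetical names, and '@' 
stands in for whatever separator ends up being chosen.

```python
import re

HFILE = r"[0-9]+(?:_SeqID_[0-9]+)?"
REGION = r"[a-f0-9]+"
TABLE = r"[a-zA-Z0-9_][a-zA-Z0-9_.-]*"

# Assumed layout: <table>@<region>-<hfile>, optionally followed by .<parentregion>
REORDERED = re.compile(rf"({TABLE})@({REGION})-({HFILE})(?:\.({REGION}))?")


def parse_reordered(name):
    """Return (table, region, hfile, parent_region_or_None), or None if no match."""
    m = REORDERED.fullmatch(name)
    return m.groups() if m else None


# Hypothetical names; the table deliberately contains '.' and '-'.
print(parse_reordered("my.table-1@abcdef0123456789-12345"))
print(parse_reordered("my.table-1@abcdef0123456789-12345.fedcba9876543210"))
```

With the table name first and terminated by a char it cannot contain, the 
trailing .<parentregion> can no longer be absorbed into the table group, so one 
pattern covers both the plain link and the reference to it.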


                
> Splitting a hfilelink causes region servers to go down.
> -------------------------------------------------------
>
>                 Key: HBASE-7339
>                 URL: https://issues.apache.org/jira/browse/HBASE-7339
>             Project: HBase
>          Issue Type: Sub-task
>          Components: snapshots
>    Affects Versions: hbase-6055
>            Reporter: Jonathan Hsieh
>            Assignee: Jonathan Hsieh
>            Priority: Blocker
>             Fix For: hbase-6055
>
>         Attachments: hbase-7339.patch, pre-hbase-7339.patch
>
>
> Steps:
> - Have a single region table t with 15 hfiles in it.
> - Snapshot it. (was done using online snapshot from HBASE-7321)
> - Clone a snapshot to table t'. 
> - t' has its region do a post-open task that attempts to compact region.  
> policy does not compact all files. (default seems to be 10)
> - after compaction we have hfile links and real hfiles mixed in the region
> - t' starts splitting
> - creating split references, opening daughters fails 
> - hfile links are "split", creating hfile link daughter refs.  
> {{<<hfile>\-<region>\-<table>>.<parentregion>}}
> - these "split" hfile links are interpreted as hfile links with table 
> {{<table>.<parentregion>}} -> 
> {{<<hfile>\-<region>>\-<<table>.<parentregion>>}}  (groupings interpreted 
> incorrectly)
> - Since this is after the splitting PONR, this aborts the server.  It then 
> spreads to the next server.

