[
https://issues.apache.org/jira/browse/HBASE-7419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13546132#comment-13546132
]
stack commented on HBASE-7419:
------------------------------
In hdfs, a path is a URI (http://en.wikipedia.org/wiki/URI_scheme). The rules
for the scheme-specific part of a URI are usually protocol-specific, and URI
parsers just pass it through (they are supposed to). Looking at the
Path.java class to see how it handles the scheme, I see no special casing for
'@' in Path.java. I was afraid it'd get interpreted somewhere in our parsing
code as the delimiter between "username" and "hostname" (as in the mailto: and
http: protocols). Probably best to just avoid it though if possible. '=' sounds
like it will work but is hard to read without thinking of equality or assignment.
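
To illustrate the '@' concern, here is a minimal sketch using the JDK's
java.net.URI rather than Hadoop's Path.java (which it does not exercise). The
class name, host name, and file names are made up for illustration. It shows
'@' acting as the user-info delimiter in the authority but passing through
untouched when it sits in the path component.

```java
import java.net.URI;
import java.net.URISyntaxException;

final class UriAtSignDemo {
    /** Returns the user-info component ("user" in scheme://user@host/...), or null if absent. */
    static String userInfoOf(String uri) {
        return URI.create(uri).getUserInfo();
    }

    /** Builds an hdfs URI from separate components, so any '@' can only land in the path. */
    static URI withPath(String host, String path) {
        try {
            return new URI("hdfs", host, path, null);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // '@' in the authority separates user-info from the host.
        System.out.println(userInfoOf("hdfs://stack@namenode:8020/hbase"));

        // '@' inside the path is not treated as a delimiter (hypothetical file name).
        URI link = withPath("namenode", "/hbase/t1/cf/mytable@abcdef0123456789-0123456789");
        System.out.println(link.getUserInfo());
        System.out.println(link.getPath());
    }
}
```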
Regarding the patch, it looks good. Are there enough tests of different filename
combinations? Regexes can surprise in interesting ways.
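
As an example of the kind of combination worth testing, here is a minimal
sketch built only from the regexes quoted in the issue description below. The
class, the method names, and the sample names are hypothetical and are not
taken from the patch. It assumes the suggested <table>@<region>-<hfile> order
with '@' as the separator, and contrasts it with the current
<hfile>-<region>-<table> convention.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class HFileLinkNameDemo {
    // Constraints as quoted in the issue description.
    static final String HFILE = "[0-9]+(?:_SeqID_[0-9]+)?";
    static final String REGION = "[a-f0-9]+";
    static final String TABLE = "[a-zA-Z0-9_][a-zA-Z0-9_.-]*";

    // Current convention: <hfile>-<region>-<table>
    static final Pattern OLD_LINK =
        Pattern.compile("(" + HFILE + ")-(" + REGION + ")-(" + TABLE + ")");

    // Suggested convention (assumed here): <table>@<region>-<hfile>[.<parentregion>]
    static final Pattern NEW_LINK =
        Pattern.compile("(" + TABLE + ")@(" + REGION + ")-(" + HFILE + ")(?:\\.(" + REGION + "))?");

    /** Table name as the old convention would read it, or null if the name does not match. */
    static String tableFromOld(String name) {
        Matcher m = OLD_LINK.matcher(name);
        return m.matches() ? m.group(3) : null;
    }

    /** Returns {table, region, hfile, parentRegionOrNull}, or null if the name does not match. */
    static String[] parseNew(String name) {
        Matcher m = NEW_LINK.matcher(name);
        if (!m.matches()) {
            return null;
        }
        return new String[] {m.group(1), m.group(2), m.group(3), m.group(4)};
    }

    public static void main(String[] args) {
        // A reference to a link under the old convention: the parent region is
        // swallowed into the table name because '.' is a valid table-name char.
        System.out.println(tableFromOld("0123456789-abcdef0123456789-mytable.fedcba9876543210"));

        // Same pieces in the suggested order: '@' cannot appear in a table name,
        // so the table ends at '@' and the parent region is split off.
        String[] parts = parseNew("mytable@abcdef0123456789-0123456789.fedcba9876543210");
        System.out.println(String.join(" | ", parts));
    }
}
```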
> revisit hfilelink file name format.
> -----------------------------------
>
> Key: HBASE-7419
> URL: https://issues.apache.org/jira/browse/HBASE-7419
> Project: HBase
> Issue Type: Sub-task
> Components: Client, master, regionserver, snapshots, Zookeeper
> Reporter: Jonathan Hsieh
> Assignee: Matteo Bertozzi
> Fix For: hbase-6055, 0.96.0
>
> Attachments: HBASE-7419-v0.patch, HBASE-7419-v1.patch,
> HBASE-7419-v2.patch
>
>
> A valid table name concatenated with a '.' and a valid region name is also a
> valid table name, which leads to incorrect interpretation.
> {code}
> true hfile name constraints: [0-9]+(?:_SeqID_[0-9]+)?
> region name constraints : [a-f0-9]{16} (but we currently just use
> [a-f0-9]+.)
> table name constraints : [a-zA-Z0-9_][a-zA-Z0-9_.-]*
> {code}
> Notice that the table name constraints completely cover all region name
> constraints and true hfile name constraints (a valid hfile name is a valid
> part of a table name, and a valid encoded region name is a valid part of a
> table name).
> Currently the hfilelink filename convention is <hfile>-<region>-<table>.
> Unfortunately, making a ref to this uses the name
> <hfile>-<region>-<table>.<parentregion> -- the concatenation
> <table>.<parentregion> is a valid table name and used to get interpreted as
> such. The fix in HBASE-7339 requires a FileNotFoundException before going
> down the hfile link resolution path.
> Regardless of what we do, we need to add some char invalid for table names to
> the hfilelink or reference filename convention.
> Suggestion: if we changed the order of the hfile-link name we could avoid
> some of the confusion -- <table>@<region>-<hfile>.<parentregion> (or some
> other separator char than '@') could be used to avoid the handling on the
> initial FileNotFoundException, but I think we'd still need a good chunk of
> the logic to handle opening a half-storefile reader through an hfilelink.