[
https://issues.apache.org/jira/browse/HBASE-7419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13544072#comment-13544072
]
Jonathan Hsieh commented on HBASE-7419:
---------------------------------------
I'm actually not clear on what the concerns about special characters are outside
of the hdfs context, i.e. as they pertain to Windows or web contexts. I assume
that since we are on hdfs, characters valid there would be valid regardless of
whether they are valid in the underlying file system.
In the web case, I buy having to escape filenames for web lookups (no
&%\/@). Are these problems at the tooling level (hdfs dfs -ls), the web level,
or at testing? What are the characters that are valid in hdfs that we should
avoid, and more importantly, why?
[~eclark], [~stack] you brought up the concerns about web stuff, comments?
[~enis] any quick pointers for hdfs file name and windows compat issues?
> revisit hfilelink file name format.
> -----------------------------------
>
> Key: HBASE-7419
> URL: https://issues.apache.org/jira/browse/HBASE-7419
> Project: HBase
> Issue Type: Sub-task
> Components: Client, master, regionserver, snapshots, Zookeeper
> Reporter: Jonathan Hsieh
> Assignee: Matteo Bertozzi
> Fix For: hbase-6055, 0.96.0
>
> Attachments: HBASE-7419-v0.patch
>
>
> A valid table name concatenated with a '.' and a valid region name is also a
> valid table name, which leads to incorrect interpretation.
> {code}
> true hfile name constraints: [0-9]+(?:_SeqID_[0-9]+)?
> region name constraints : [a-f0-9]{16} (but we currently just use
> [a-f0-9]+.)
> table name constraints : [a-zA-Z0-9_][a-zA-Z0-9_.-]*
> {code}
> Notice that the table name constraints completely cover all region name
> constraints and true hfile name constraints (a valid hfile name is a valid
> part of a table name, and a valid encoded region name is a valid part of a
> table name).
> Currently the hfilelink filename convention is <hfile>-<region>-<table>.
> Unfortunately, making a ref to this uses the name
> <hfile>-<region>-<table>.<parentregion> -- the concatenation
> <table>.<parentregion> is a valid table name and used to get interpreted as
> such.
> The fix in HBASE-7339 requires a FileNotFoundException before going down the
> hfile link resolution path.
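
The ambiguity described above can be reproduced with the three patterns from the description. This is only a sketch: the names LINK and REF_TO_LINK and the sample file name are hypothetical and are not taken from the HBase code base, and the looser [a-f0-9]+ region pattern is assumed because the description says it is the one currently in use.

```python
import re

# Patterns copied from the issue description.
HFILE = r"[0-9]+(?:_SeqID_[0-9]+)?"
REGION = r"[a-f0-9]+"                    # looser form "currently" in use
TABLE = r"[a-zA-Z0-9_][a-zA-Z0-9_.-]*"

# Hypothetical parsers for the current naming convention.
LINK = re.compile(rf"({HFILE})-({REGION})-({TABLE})")
REF_TO_LINK = re.compile(rf"({HFILE})-({REGION})-({TABLE})\.({REGION})")

# Made-up name of a reference to an hfilelink.
name = "1234-abcdef0123456789-mytable.fedcba9876543210"

as_link = LINK.fullmatch(name)        # table = "mytable.fedcba9876543210"
as_ref = REF_TO_LINK.fullmatch(name)  # table = "mytable", parent = "fedcba9876543210"

# Both interpretations succeed, so the name alone cannot tell them apart.
print(as_link.groups())
print(as_ref.groups())

# hfile names and region names are themselves valid table names.
print(bool(re.fullmatch(TABLE, "1234")), bool(re.fullmatch(TABLE, "abcdef0123456789")))
```

Because both patterns accept the same string, a reader of the file name has to fall back on something other than the name itself to decide which interpretation applies.
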
> Regardless of what we do, we need to add some char invalid for table names to
> the hfilelink or reference filename convention.
> Suggestion: if we changed the order of the hfile-link name we could avoid
> some of the confusion -- <table>@<region>-<hfile>.<parentregion> (or some
> other separator char than '@') could be used to avoid the handling on the
> initial FileNotFoundException, but I think we'd still need a good chunk of the
> logic to handle opening a half-storefile reader through an hfilelink.
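
As a rough illustration of the suggested layout, the sketch below parses <table>@<region>-<hfile> with an optional .<parentregion> suffix. The pattern NEW_NAME, the helper parse() and the sample names are hypothetical; '@' is used only because the description proposes it, and whether that character is acceptable in hdfs, Windows or web contexts is the open question raised in the comment.

```python
import re

# Patterns copied from the issue description.
HFILE = r"[0-9]+(?:_SeqID_[0-9]+)?"
REGION = r"[a-f0-9]+"
TABLE = r"[a-zA-Z0-9_][a-zA-Z0-9_.-]*"

# Hypothetical parser for <table>@<region>-<hfile>[.<parentregion>].
# '@' is outside the table-name character set, so the table part ends
# at the first '@' and cannot absorb the rest of the name.
NEW_NAME = re.compile(
    rf"(?P<table>{TABLE})@(?P<region>{REGION})-(?P<hfile>{HFILE})"
    rf"(?:\.(?P<parent>{REGION}))?"
)

def parse(name):
    """Return the name's components as a dict, or None if it does not match."""
    m = NEW_NAME.fullmatch(name)
    return m.groupdict() if m else None

# A plain link: no parent region.
link = parse("mytable@abcdef0123456789-1234")

# A reference to a link: the parent region is split off unambiguously.
ref = parse("mytable@abcdef0123456789-1234.fedcba9876543210")

# A table whose name looks like "<table>.<region>" still parses as a table.
tricky = parse("mytable.fedcba9876543210@abcdef0123456789-1234")

print(link, ref, tricky, sep="\n")
```

Since neither the region pattern nor the hfile pattern admits '-' or '.', every component after the '@' is also delimited unambiguously under these assumptions.
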