[ https://issues.apache.org/jira/browse/HBASE-3417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12978122#action_12978122 ]
stack commented on HBASE-3417:
------------------------------
As discussed up on IRC, this is not backward compatible:
{code}
+ Pattern.compile("^(\\w{32})(?:\\.(.+))?$");
{code}
IIRC you can give the quantifier a length range instead, e.g. {20,32} (was the old length 20 chars?)
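A rough sketch of the range idea, not the patch itself. It assumes old names were 20 word characters, which is unverified (see the question above), and the class and method names are made up for illustration:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the HBASE-3417 patch.
class StoreFileNamePattern {
    // Assumption: legacy names are at least 20 word characters long,
    // new UUID-based names are exactly 32. The optional suffix follows a dot.
    static final Pattern NAME = Pattern.compile("^(\\w{20,32})(?:\\.(.+))?$");

    // True if the whole file name fits the pattern.
    static boolean matches(String name) {
        return NAME.matcher(name).matches();
    }

    // Returns the suffix after the first dot, or null if there is none
    // or the name does not match at all.
    static String suffixOf(String name) {
        Matcher m = NAME.matcher(name);
        return m.matches() ? m.group(2) : null;
    }

    public static void main(String[] args) {
        String oldStyle = "12345678901234567890";             // 20 chars
        String newStyle = "0123456789abcdef0123456789abcdef"; // 32 chars
        System.out.println(matches(oldStyle));
        System.out.println(matches(newStyle));
        System.out.println(suffixOf(newStyle + ".top"));
    }
}
```

If the old names turn out to be shorter than 20 characters, only the lower bound of the quantifier needs to change.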
The below is a little bit messy:
{code}
+ return new Path(dir, UUID.randomUUID().toString().replaceAll("-", "")
+ + ((suffix == null || suffix.length() <= 0) ? "" : suffix));
{code}
Up on IRC, I was thinking we should base64 it because then it'd be more compact. See
http://stackoverflow.com/questions/772802/storing-uuid-as-base64-string. There
is also a Base64#encodeBytes method in hbase util that will take the 128 UUID
bits and emit them as base64 (possible to get it all down to 22 chars). But
looking at the base64 vocabulary, http://en.wikipedia.org/wiki/Base64, it
includes '+' and '/', which are illegal in a URL or an hdfs filepath. Base32?
http://en.wikipedia.org/wiki/Base32. But that won't work cleanly either: the
input has to be a multiple of 40 bits, and 128 is not.
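For comparing lengths, here is a minimal sketch of the encodings. It assumes Java 8+ and uses the JDK's java.util.Base64 URL-safe encoder (which swaps '+' and '/' for '-' and '_') rather than the hbase util Base64 class; the class and method names are hypothetical:

```java
import java.nio.ByteBuffer;
import java.util.Base64;
import java.util.UUID;

// Hypothetical helper for comparing UUID-based file name encodings.
class UuidFileNames {
    // 32 hex characters: UUID.toString() with the four hyphens removed.
    static String hex(UUID id) {
        return id.toString().replace("-", "");
    }

    // 22 characters: the 16 UUID bytes in the URL-safe base64 alphabet,
    // with the trailing "==" padding dropped.
    static String urlSafeBase64(UUID id) {
        ByteBuffer buf = ByteBuffer.allocate(16);
        buf.putLong(id.getMostSignificantBits());
        buf.putLong(id.getLeastSignificantBits());
        return Base64.getUrlEncoder().withoutPadding().encodeToString(buf.array());
    }

    public static void main(String[] args) {
        UUID id = UUID.randomUUID();
        System.out.println(id + " (" + id.toString().length() + " chars)");
        System.out.println(hex(id) + " (" + hex(id).length() + " chars)");
        System.out.println(urlSafeBase64(id) + " (" + urlSafeBase64(id).length() + " chars)");
    }
}
```

Note the URL-safe alphabet still contains '-' and '_', and it mixes upper and lower case, so the result is shorter but harder to read than plain hex.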
Maybe leave it as it comes out of UUID.toString, with hyphens. Then it's plain
it's a UUID and it's easier to read?
> CacheOnWrite is using the temporary output path for block names, need to use
> a more consistent block naming scheme
> ------------------------------------------------------------------------------------------------------------------
>
> Key: HBASE-3417
> URL: https://issues.apache.org/jira/browse/HBASE-3417
> Project: HBase
> Issue Type: Bug
> Components: io, regionserver
> Affects Versions: 0.92.0
> Reporter: Jonathan Gray
> Assignee: Jonathan Gray
> Priority: Critical
> Fix For: 0.92.0
>
> Attachments: HBASE-3417-v1.patch, HBASE-3417-v2.patch
>
>
> Currently the block names used in the block cache are built using the
> filesystem path. However, for cache on write, the path is a temporary output
> file.
> The original COW patch actually made some modifications to block naming stuff
> to make it more consistent but did not do enough. Should add a separate
> method somewhere for generating block names using some more easily mocked
> scheme (rather than just raw path as we generate a random unique file name
> twice, once for tmp and then again when moved into place).