[
https://issues.apache.org/jira/browse/HBASE-8156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13608028#comment-13608028
]
stack commented on HBASE-8156:
------------------------------
Put this define beside where it is used?
DEFAULT_HA_HDFS_CLIENT_RETRIES_NUMBER
Put it in FSUtils or probably better, in HRegionFileSystem.
Is it explicitly HA retries or just 'retries'? If the latter, drop the HA from
the name.
You think methods like below belong better in FSUtils rather than in
HRegionFileSystem?
{code}
+ /**
+ * makes a call to create a directory; doesn't check whether directory
already exists.
+ * @param fs
+ * @param dirPath
+ * @return
+ * @throws IOException
+ */
+ public static boolean createDirectory(final FileSystem fs, Path dirPath)
throws IOException {
+ IOException lastIOE = null;
+ int i = 0;
+ do {
+ try {
+ return fs.mkdirs(dirPath);
+ } catch (IOException ioe) {
+ lastIOE = ioe;
+ if (fs.exists(dirPath)) return true;
+ // dir is not there, retry after some time.
+ sleepBeforeRetry("Create Directory", i);
+ }
+ } while (++i < haHdfsClientRetriesNumber);
+ throw lastIOE;
+ }
{code}
Should you say this method will retry on exception?
Is the below from the above a bit broad for a catch? Could it be narrowed?
+ } catch (IOException ioe) {
Why is the method createDirectory when it is a wrapper around mkdir? Should it
not be named the same?
Same for
- if (!fs.delete(splitdir, true)) {
+ if (!FSUtils.deleteDirectory(fs, splitdir)) {
Regards the below:
+ * With High Availability HDFS, one needs to retry non-idempotent operation
at the client level.
I would say it not specific to HA. Also, it is not a retry -- it is more an
allowance that fs ops could hiccup....
So haHdfsClientRetriesNumber should just be hdfsClientRetriesNumber
Is this a good idea:
+ static {
+ haHdfsClientRetriesNumber =
HBaseConfiguration.create().getInt("ha.hdfs.client.retries.number",
+ HConstants.DEFAULT_HA_HDFS_CLIENT_RETRIES_NUMBER);
+ if (haHdfsClientRetriesNumber <= 0) {
+ LOG.warn("The property ha.hdfs.client.retries.number should be set to a
postive value");
+ }
+ }
You are creating a whole COnfiguration -- pretty expensive -- to read a single
value. Passing in this config would be a bit bit much maybe? Would it make
sense instantiating a FSUtil? If these methods were in HRegionFileSystem
instead of here, then they'd have the config?
In rename, should you check src is still there too?
Use Threads.sleep instead of this: + Thread.sleep(baseSleepBeforeRetries *
sleepMultiplier); ? (Threads is an hbase class).
Check out the sleepBeforeRetry method. Add spaces around commas in messages....
This stuff is hard to test but we need tests.... perhaps mock up a fs?
This is good stuff Himanshu
> Support for Namenode HA for non-idempotent operations
> -----------------------------------------------------
>
> Key: HBASE-8156
> URL: https://issues.apache.org/jira/browse/HBASE-8156
> Project: HBase
> Issue Type: Sub-task
> Components: Filesystem Integration
> Affects Versions: 0.95.0
> Reporter: Himanshu Vashishtha
> Assignee: Himanshu Vashishtha
> Fix For: 0.98.0
>
> Attachments: HBase-8156-trunk-v1.patch
>
>
> In hadoop 2 HA, non-idempotent operations are not retried at the hdfs side.
> This is by design as retrying a non-idempotent operation might not be a good
> design choice for some use case.
> HBase needs to handle the retries for such operations at its end.
> With HBase-7806, there is already some work going on for file system
> abstractions. There, HReginFileSystem sits as an abstraction between region
> and FS. This jira is a move in the same direction, where it adds retry
> functionality for non-idempotent calls such as create, rename and delete in
> HRegionFileSystem class.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira