[ https://issues.apache.org/jira/browse/HBASE-20668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Gavin updated HBASE-20668: -------------------------- Comment: was deleted (was: A comment with security level 'jira-users' was removed.) > Avoid permission change if ExportSnapshot's copy fails > ------------------------------------------------------ > > Key: HBASE-20668 > URL: https://issues.apache.org/jira/browse/HBASE-20668 > Project: HBase > Issue Type: Bug > Reporter: Ted Yu > Assignee: Ted Yu > Priority: Major > Fix For: 2.1.0 > > Attachments: 20668.v1.txt, 20668.v2.txt > > > The exception from FileSystem operation in the finally block of > ExportSnapshot#doWork may hide the real exception from FileUtil.copy call. > I was debugging the following error [~romil.choksi] saw during testing > ExportSnapshot : > {code:java} > 2018-06-01 02:40:52,363|INFO|MainThread|machine.py:167 - > run()||GUID=1cacb7bc-f7cc-4710-82e0-4a4513f0c1f9|2018-06-01 02:40:52,358 > ERROR [main] util.AbstractHBaseTool: Error running command-line tool > 2018-06-01 02:40:52,363|INFO|MainThread|machine.py:167 - > run()||GUID=1cacb7bc-f7cc-4710-82e0-4a4513f0c1f9|java.io.FileNotFoundException: > Directory/File does not exist /apps/ > hbase/data/.hbase-snapshot/.tmp/snapshot_table_334546 > 2018-06-01 02:40:52,364|INFO|MainThread|machine.py:167 - > run()||GUID=1cacb7bc-f7cc-4710-82e0-4a4513f0c1f9|at > org.apache.hadoop.hdfs.server.namenode.FSDirectory. > checkOwner(FSDirectory.java:1777) > 2018-06-01 02:40:52,364|INFO|MainThread|machine.py:167 - > run()||GUID=1cacb7bc-f7cc-4710-82e0-4a4513f0c1f9|at > org.apache.hadoop.hdfs.server.namenode.FSDirAttrOp. > setOwner(FSDirAttrOp.java:82) > {code} > Here is corresponding code (with extra log added): > {code:java} > try { > LOG.info("Copy Snapshot Manifest from " + snapshotDir + " to " + > initialOutputSnapshotDir); > boolean ret = FileUtil.copy(inputFs, snapshotDir, outputFs, > initialOutputSnapshotDir, false, > false, conf); > LOG.info("return val = " + ret); > } catch (IOException e) { > LOG.warn("Failed to copy the snapshot directory: from=" + > snapshotDir + " to=" + initialOutputSnapshotDir, e); > throw new ExportSnapshotException("Failed to copy the snapshot > directory: from=" + > snapshotDir + " to=" + initialOutputSnapshotDir, e); > } finally { > if (filesUser != null || filesGroup != null) { > LOG.warn((filesUser == null ? "" : "Change the owner of " + > needSetOwnerDir + " to " > + filesUser) > + (filesGroup == null ? "" : ", Change the group of " + > needSetOwnerDir + " to " > + filesGroup)); > setOwner(outputFs, needSetOwnerDir, filesUser, filesGroup, true); > } > {code} > "return val = " was not seen in rerun of the test. > This is what the additional log revealed: > {code:java} > 2018-06-01 09:22:54,247|INFO|MainThread|machine.py:167 - > run()||GUID=3961d249-9981-429d-81a8-39c7df53cf58|2018-06-01 09:22:54,241 WARN > [main] snapshot.ExportSnapshot: Failed to copy the snapshot directory: > from=hdfs://ns1/apps/hbase/data/.hbase-snapshot/snapshot_table_157842 > to=hdfs://ns3/apps/hbase/data/.hbase-snapshot/.tmp/snapshot_table_157842 > 2018-06-01 09:22:54,248|INFO|MainThread|machine.py:167 - > run()||GUID=3961d249-9981-429d-81a8-39c7df53cf58|org.apache.hadoop.security.AccessControlException: > Permission denied: user=hbase, access=WRITE, > inode="/apps/hbase/data/.hbase-snapshot/.tmp":hrt_qa:hadoop:dr----x-wT > 2018-06-01 09:22:54,248|INFO|MainThread|machine.py:167 - > run()||GUID=3961d249-9981-429d-81a8-39c7df53cf58|at > org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker. > check(FSPermissionChecker.java:399) > 2018-06-01 09:22:54,249|INFO|MainThread|machine.py:167 - > run()||GUID=3961d249-9981-429d-81a8-39c7df53cf58|at > org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker. > checkPermission(FSPermissionChecker.java:255) > {code} > It turned out that the exception from {{setOwner}} call in the finally block > eclipsed the real exception. -- This message was sent by Atlassian JIRA (v7.6.3#76005)