[
https://issues.apache.org/jira/browse/HBASE-16959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15619937#comment-15619937
]
Xiang Li commented on HBASE-16959:
----------------------------------
[~mbertozzi], thanks for the comment!
I am not sure whether mapreduce.jobtracker.address still takes effect in Yarn.
I tried it in two ways:
(1) passing it with -D on the HBase ExportSnapshot command line, like hbase
-Dmapreduce.jobtracker.address=local
org.apache.hadoop.hbase.snapshot.ExportSnapshot -snapshot snap1 -copy-to
file:///tmp/snap_export (I think -D might not take effect here)
(2) setting it in mapred-site.xml and then restarting Yarn
Neither works for me.
I also tried yarn.resourcemanager.address=local, but that is apparently not
correct, because the Yarn client then cannot connect to the ResourceManager.
I may have misunderstood you, or my usage may be incorrect. Could you please
elaborate?
> Export snapshot to local file system of a single node
> -----------------------------------------------------
>
> Key: HBASE-16959
> URL: https://issues.apache.org/jira/browse/HBASE-16959
> Project: HBase
> Issue Type: New Feature
> Components: snapshots
> Reporter: Xiang Li
> Priority: Critical
>
> ExportSnapshot allows users to specify "file://" in "copy-to".
> Based on the implementation (it uses map tasks), it works as follows:
> (1) The manifest of the snapshot (.hbase-snapshot) is exported to the local
> file system of the HBase client node where the command is issued
> (2) The data of the snapshot (archive) is exported to the local file systems
> of the nodes where the map tasks run, so it is spread across the cluster.
> *That causes 2 problems we have met so far:*
> (1) The last step, which verifies the snapshot integrity, fails because not
> all the data can be found on the HBase client node where the command is
> issued. "-no-target-verify" can help here by suppressing the verification,
> but it is not a good idea
> (2) When the HBase client (where the command is issued) is also a Yarn
> NodeManager and happens to have a map task (writing snapshot data) running
> on it, the "copy-to" directory is first created by user=hbase when the
> manifest is written, and then user=yarn (if it is not controlled) tries to
> write data into it. If the directory permission is not set properly, let's
> say umask = 022 and both hbase and yarn are in the hadoop group, the
> "copy-to" directory is created with no write permission for the group
> (777-022=755, so rwxr-xr-x), and user=yarn cannot write data into it because
> it was created by user=hbase. We get the following exception:
> {code}
> Error: java.io.IOException: Mkdirs failed to create file:/tmp/snap_export/archive/data/default/table_xxx/regionid_xxx/info (exists=false, cwd=file:/hadoop/yarn/local/usercache/hbase/appcache/application_1477577812726_0001/container_1477577812726_0001_01_000004)
> at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:449)
> at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:435)
> at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:909)
> at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:890)
> at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:787)
> at org.apache.hadoop.hbase.snapshot.ExportSnapshot$ExportMapper.copyFile(ExportSnapshot.java:275)
> at org.apache.hadoop.hbase.snapshot.ExportSnapshot$ExportMapper.map(ExportSnapshot.java:193)
> at org.apache.hadoop.hbase.snapshot.ExportSnapshot$ExportMapper.map(ExportSnapshot.java:119)
> at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146)
> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
> {code}
> We can control the permission to resolve that, but it is not a good idea
> either.
> *Proposal*
> If exporting to "file://", add a reduce step to aggregate all the
> "distributed" data of the snapshot onto the HBase client node where the
> command is issued, so that it sits together with the manifest of the
> snapshot. That can resolve the verification problem (1) above.
> For problem (2), I have no idea so far.
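
The permission arithmetic in problem (2) of the quoted description can be
checked with a small Python sketch. It is only an illustration and not part of
ExportSnapshot: the helper name effective_mode is made up here, and it assumes
the directory is requested with mode 777, as the description does. The umask
bits are cleared from the requested mode (mode & ~umask); plain subtraction
happens to give the same result for 777 and 022.

```python
import stat


def effective_mode(requested: int, umask: int) -> int:
    """Return the mode applied at creation: requested bits minus umask bits.

    Hypothetical helper for illustration only.
    """
    return requested & ~umask


# Assumption from the description: mkdir requests 777 and umask is 022.
mode = effective_mode(0o777, 0o022)

# Octal value of the resulting mode.
print(oct(mode))

# Symbolic form for a directory with that mode.
print(stat.filemode(stat.S_IFDIR | mode))

# Whether a member of the owning group (e.g. yarn in group hadoop) may write.
print(bool(mode & stat.S_IWGRP))
```

With a umask of 002 instead, the same helper gives 775 (rwxrwxr-x), so a user
in the same group could write into the directory. That corresponds to
"controlling the permission" as mentioned in the description.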
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)