[
https://issues.apache.org/jira/browse/HBASE-16959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Xiang Li updated HBASE-16959:
-----------------------------
Description:
ExportSnapshot allows users to specify "file://" in "copy-to".
Because the implementation uses map jobs, it works as follows:
(1) The manifest of the snapshot (.hbase-snapshot) is exported to the local file
system of the HBase client node where the command is issued.
(2) The data of the snapshot (archive) is exported to the local file systems of
the nodes where the map tasks run, so it ends up spread everywhere.
That causes two problems we have met so far:
(1) The last step, verifying the snapshot integrity, fails because not all of
the data can be found on the HBase client node where the command was issued.
"-no-target-verify" can be used to suppress the verification, but that is not a
good solution.
(2) When the HBase client node (where the command is issued) is also a Yarn
NodeManager and happens to run a map task (writing snapshot data), the
"copy-to" directory is first created by user=hbase when the manifest is
written, and then user=yarn (if not controlled otherwise) tries to write data
into it. If the directory permission is not set properly (say umask = 022, with
both hbase and yarn in the hadoop group), the "copy-to" directory is created
without group write permission (777 - 022 = 755, i.e. rwxr-xr-x), so user=yarn
cannot write data into it, since it was created by user=hbase. We get the
following exception:
{code}
Error: java.io.IOException: Mkdirs failed to create file:/tmp/snap_export/archive/data/default/table_xxx/regionid_xxx/info (exists=false, cwd=file:/hadoop/yarn/local/usercache/hbase/appcache/application_1477577812726_0001/container_1477577812726_0001_01_000004)
    at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:449)
    at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:435)
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:909)
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:890)
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:787)
    at org.apache.hadoop.hbase.snapshot.ExportSnapshot$ExportMapper.copyFile(ExportSnapshot.java:275)
    at org.apache.hadoop.hbase.snapshot.ExportSnapshot$ExportMapper.map(ExportSnapshot.java:193)
    at org.apache.hadoop.hbase.snapshot.ExportSnapshot$ExportMapper.map(ExportSnapshot.java:119)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
{code}
We could control the permissions to work around that, but that is not a good solution either.
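The permission arithmetic behind problem (2) can be checked with a short sketch (plain Python, purely illustrative):

```python
# Effective mode of a directory created with requested mode 0o777 under umask 0o022.
umask = 0o022
mode = 0o777 & ~umask
assert oct(mode) == '0o755'  # rwxr-xr-x: the group (e.g. hadoop) lacks the write bit

# A second user in the same group (user=yarn) needs the group-write bit (0o020)
# to create entries inside a directory owned by user=hbase.
group_can_write = bool(mode & 0o020)
print(group_can_write)  # False, hence the "Mkdirs failed to create" IOException
```
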
*Proposal*
Add a reduce step that moves all the "distributed" data of the snapshot to the
HBase client node where the command is issued, so that it sits together with
the manifest of the snapshot. That would resolve the verification problem (1)
above.
For problem (2), no solution has been found so far.
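The effect of the proposed reduce step can be illustrated with a toy sketch (plain Python; the function name `consolidate` and all paths are hypothetical, and a real implementation would be a MapReduce reduce phase, not local file moves):

```python
import shutil
from pathlib import Path

def consolidate(node_dirs, target):
    """Move every exported file from per-node scratch directories into one
    target directory, mirroring what a single reduce running on the client
    node would achieve: all data ends up next to the snapshot manifest."""
    target = Path(target)
    for node_dir in map(Path, node_dirs):
        # Materialize the listing first so moves do not disturb the walk.
        for f in list(node_dir.rglob('*')):
            if f.is_file():
                dest = target / f.relative_to(node_dir)
                dest.parent.mkdir(parents=True, exist_ok=True)
                shutil.move(str(f), str(dest))
```
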
was:
Current ExportSnapshot allows uses to specify "file://" in "copy-to", but it
works as follow
(1) The manifest of the snapshot(.hbase-snapshot) is exported to local file
system of the HBase client node where the command is issued
(2) The data
> Export snapshot to local file system
> ------------------------------------
>
> Key: HBASE-16959
> URL: https://issues.apache.org/jira/browse/HBASE-16959
> Project: HBase
> Issue Type: New Feature
> Components: snapshots
> Reporter: Xiang Li
> Priority: Critical
>
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)