[ https://issues.apache.org/jira/browse/HBASE-16959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xiang Li updated HBASE-16959:
-----------------------------
    Description: 
ExportSnapshot allows users to specify "file://" in "copy-to" (see the sample
invocation below).
Based on the implementation (which uses map jobs), it works as follows:
(1) The manifest of the snapshot (.hbase-snapshot) is exported to the local
file system of the HBase client node where the command is issued.
(2) The data of the snapshot (archive) is exported to the local file system of
the nodes where the map jobs run, so the data ends up spread across the cluster.
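
For reference, ExportSnapshot implements the Hadoop Tool interface, so the
usual command line can also be driven programmatically. A minimal sketch; the
snapshot name and target path are illustrative:
{code}
// Minimal sketch: launching ExportSnapshot with a file:// target.
// Equivalent to the usual command line:
//   hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot \
//     -snapshot snap1 -copy-to file:///tmp/snap_export
// The snapshot name and target path are illustrative.
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.snapshot.ExportSnapshot;
import org.apache.hadoop.util.ToolRunner;

public class ExportSnapshotToLocalFs {
  public static void main(String[] args) throws Exception {
    int exitCode = ToolRunner.run(HBaseConfiguration.create(),
        new ExportSnapshot(),
        new String[] { "-snapshot", "snap1",
                       "-copy-to", "file:///tmp/snap_export" });
    System.exit(exitCode);
  }
}
{code}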

*That causes two problems we have met so far:*
(1) The last step, which verifies the snapshot integrity, fails because not
all of the data can be found on the HBase client node where the command is
issued. "-no-target-verify" can be used to suppress the verification, but that
is not a good idea.
(2) When the HBase client (where the command is issued) is also a Yarn
NodeManager, and it happens to have a map job (writing snapshot data) running
on it, the "copy-to" directory is first created by user=hbase when the
manifest is written, and then user=yarn (if it is not controlled) tries to
write data into it. If the directory permission is not set properly, say
umask = 022 with both hbase and yarn in the hadoop group, the "copy-to"
directory is created without write permission for the group (777-022=755, so
rwxr-xr-x), and user=yarn cannot write data into it, as it was created by
user=hbase. We get the following exception:
{code}
Error: java.io.IOException: Mkdirs failed to create file:/tmp/snap_export/archive/data/default/table_xxx/regionid_xxx/info (exists=false, cwd=file:/hadoop/yarn/local/usercache/hbase/appcache/application_1477577812726_0001/container_1477577812726_0001_01_000004)
        at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:449)
        at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:435)
        at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:909)
        at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:890)
        at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:787)
        at org.apache.hadoop.hbase.snapshot.ExportSnapshot$ExportMapper.copyFile(ExportSnapshot.java:275)
        at org.apache.hadoop.hbase.snapshot.ExportSnapshot$ExportMapper.map(ExportSnapshot.java:193)
        at org.apache.hadoop.hbase.snapshot.ExportSnapshot$ExportMapper.map(ExportSnapshot.java:119)
        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
{code}
We could control the permissions to work around this (for example, by
pre-creating the target directory, as sketched below), but that is not a good
idea either.
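
A hypothetical mitigation, not a real fix: pre-create the "copy-to" directory
with group write permission before running the export, so both user=hbase and
user=yarn (sharing the hadoop group) can write into it. The path is
illustrative:
{code}
// Hypothetical workaround (not a real fix): pre-create the copy-to
// directory with group write permission so that both user=hbase and
// user=yarn, which share the hadoop group, can write into it.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.permission.FsPermission;

public class PrepareCopyToDir {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Path copyTo = new Path("file:///tmp/snap_export");   // illustrative path
    FileSystem fs = copyTo.getFileSystem(conf);
    fs.mkdirs(copyTo);
    // rwxrwxr-x: grant the group write access explicitly, regardless of umask
    fs.setPermission(copyTo, new FsPermission((short) 0775));
  }
}
{code}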

*Proposal*
If exporting to "file://", add a reduce phase to aggregate all of the
"distributed" data of the snapshot onto the HBase client node where the
command is issued, so that it ends up together with the manifest of the
snapshot. That would resolve the verification problem described in (1) above;
a rough sketch follows below.
For problem (2), we have no solution so far.
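
A minimal sketch of the reduce idea. It assumes the map phase is changed to
emit (relative path within the snapshot, source path) pairs instead of copying
the files itself, and that the job runs a single reduce task
(job.setNumReduceTasks(1)) so all data lands on one node; Yarn would still
have to schedule that task on the client node for the data to sit next to the
manifest. All class and property names below are hypothetical, not existing
ExportSnapshot code:
{code}
// Hypothetical sketch, not existing ExportSnapshot code: a single reduce
// task that pulls every exported file into the file:// target directory.
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.FileUtil;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class AggregateToLocalFsReducer
    extends Reducer<Text, Text, NullWritable, NullWritable> {

  private FileSystem srcFs;    // source file system, e.g. HDFS
  private FileSystem destFs;   // target file system, here file://
  private Path outputRoot;

  @Override
  protected void setup(Context context) throws IOException {
    Configuration conf = context.getConfiguration();
    srcFs = FileSystem.get(conf);
    // "snapshot.export.final.dir" is a made-up property name for this sketch
    outputRoot = new Path(conf.get("snapshot.export.final.dir"));
    destFs = outputRoot.getFileSystem(conf);
  }

  @Override
  protected void reduce(Text relativePath, Iterable<Text> sourcePaths,
      Context context) throws IOException, InterruptedException {
    Path dest = new Path(outputRoot, relativePath.toString());
    destFs.mkdirs(dest.getParent());
    for (Text source : sourcePaths) {
      // copy without deleting the source files
      FileUtil.copy(srcFs, new Path(source.toString()),
          destFs, dest, false, context.getConfiguration());
    }
  }
}
{code}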

> Export snapshot to local file system of a single node
> -----------------------------------------------------
>
>                 Key: HBASE-16959
>                 URL: https://issues.apache.org/jira/browse/HBASE-16959
>             Project: HBase
>          Issue Type: New Feature
>          Components: snapshots
>            Reporter: Xiang Li
>            Priority: Critical
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
