[ https://issues.apache.org/jira/browse/HBASE-21077?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ted Yu updated HBASE-21077:
---------------------------
Resolution: Fixed
Hadoop Flags: Reviewed
Fix Version/s: 3.0.0
Status: Resolved (was: Patch Available)
Thanks for the patch, Vlad.
Corrected the duration in the comment when committing.
> MR job launched by hbase incremental backup command failed with FileNotFoundException
> -------------------------------------------------------------------------------------
>
> Key: HBASE-21077
> URL: https://issues.apache.org/jira/browse/HBASE-21077
> Project: HBase
> Issue Type: Bug
> Reporter: Vladimir Rodionov
> Assignee: Vladimir Rodionov
> Priority: Major
> Fix For: 3.0.0
>
> Attachments: HBASE-21077-v1.patch, HBASE-21077-v2.patch, HBASE-21077-v3.patch
>
>
> Discovered during internal testing by Romil Choksi.
> The MR job launched by the hbase incremental backup command failed with a FileNotFoundException. From the test console log:
> {code}
> 2018-06-12 04:27:31,160|INFO|MainThread|machine.py:167 - run()||GUID=cb1d85c9-023c-4bc5-bf87-9c231c917eab|2018-06-12 04:27:31,160 INFO [main] mapreduce.JobSubmitter: Submitting tokens for job: job_1528766389356_0044
> 2018-06-12 04:27:31,186|INFO|MainThread|machine.py:167 - run()||GUID=cb1d85c9-023c-4bc5-bf87-9c231c917eab|2018-06-12 04:27:31,184 INFO [main] mapreduce.JobSubmitter: Executing with tokens: [Kind: HDFS_DELEGATION_TOKEN, Service: ha-hdfs:ns1, Ident: (token for hbase: HDFS_DELEGATION_TOKEN [email protected], renewer=yarn, realUser=, issueDate=1528777648605, maxDate=1529382448605, sequenceNumber=175, masterKeyId=2), Kind: kms-dt, Service: 172.27.68.203:9393, Ident: (kms-dt owner=hbase, renewer=yarn, realUser=, issueDate=1528777649149, maxDate=1529382449149, sequenceNumber=49, masterKeyId=2), Kind: HBASE_AUTH_TOKEN, Service: bc71e347-78ff-4f95-af44-006f9b549a84, Ident: (org.apache.hadoop.hbase.security.token.AuthenticationTokenIdentifier@5), Kind: kms-dt, Service: 172.27.52.14:9393, Ident: (kms-dt owner=hbase, renewer=yarn, realUser=, issueDate=1528777648918, maxDate=1529382448918, sequenceNumber=50, masterKeyId=2)]
> 2018-06-12 04:27:31,477|INFO|MainThread|machine.py:167 - run()||GUID=cb1d85c9-023c-4bc5-bf87-9c231c917eab|2018-06-12 04:27:31,477 INFO [main] conf.Configuration: found resource resource-types.xml at file:/etc/hadoop/3.0.0.0-1476/0/resource-types.xml
> 2018-06-12 04:27:31,527|INFO|MainThread|machine.py:167 - run()||GUID=cb1d85c9-023c-4bc5-bf87-9c231c917eab|2018-06-12 04:27:31,527 INFO [main] impl.TimelineClientImpl: Timeline service address: ctr-e138-1518143905142-359429-01-000004.hwx.site:8190
> 2018-06-12 04:27:32,563|INFO|MainThread|machine.py:167 - run()||GUID=cb1d85c9-023c-4bc5-bf87-9c231c917eab|2018-06-12 04:27:32,562 INFO [main] impl.YarnClientImpl: Submitted application application_1528766389356_0044
> 2018-06-12 04:27:32,634|INFO|MainThread|machine.py:167 - run()||GUID=cb1d85c9-023c-4bc5-bf87-9c231c917eab|2018-06-12 04:27:32,634 INFO [main] mapreduce.Job: The url to track the job: https://ctr-e138-1518143905142-359429-01-000003.hwx.site:8090/proxy/application_1528766389356_0044/
> 2018-06-12 04:27:32,635|INFO|MainThread|machine.py:167 - run()||GUID=cb1d85c9-023c-4bc5-bf87-9c231c917eab|2018-06-12 04:27:32,635 INFO [main] mapreduce.Job: Running job: job_1528766389356_0044
> 2018-06-12 04:27:44,807|INFO|MainThread|machine.py:167 - run()||GUID=cb1d85c9-023c-4bc5-bf87-9c231c917eab|2018-06-12 04:27:44,806 INFO [main] mapreduce.Job: Job job_1528766389356_0044 running in uber mode : false
> 2018-06-12 04:27:44,809|INFO|MainThread|machine.py:167 - run()||GUID=cb1d85c9-023c-4bc5-bf87-9c231c917eab|2018-06-12 04:27:44,809 INFO [main] mapreduce.Job: map 0% reduce 0%
> 2018-06-12 04:27:54,926|INFO|MainThread|machine.py:167 - run()||GUID=cb1d85c9-023c-4bc5-bf87-9c231c917eab|2018-06-12 04:27:54,925 INFO [main] mapreduce.Job: map 5% reduce 0%
> 2018-06-12 04:27:56,950|INFO|MainThread|machine.py:167 - run()||GUID=cb1d85c9-023c-4bc5-bf87-9c231c917eab|2018-06-12 04:27:56,950 INFO [main] mapreduce.Job: Task Id : attempt_1528766389356_0044_m_000002_0, Status : FAILED
> 2018-06-12 04:27:56,979|INFO|MainThread|machine.py:167 - run()||GUID=cb1d85c9-023c-4bc5-bf87-9c231c917eab|Error: java.io.FileNotFoundException: File does not exist: hdfs://ns1/apps/hbase/data/oldWALs/ctr-e138-1518143905142-359429-01-000004.hwx.site%2C16020%2C1528776085205.1528776160915
> 2018-06-12 04:27:56,979|INFO|MainThread|machine.py:167 - run()||GUID=cb1d85c9-023c-4bc5-bf87-9c231c917eab|at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1583)
> 2018-06-12 04:27:56,980|INFO|MainThread|machine.py:167 - run()||GUID=cb1d85c9-023c-4bc5-bf87-9c231c917eab|at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1576)
> 2018-06-12 04:27:56,980|INFO|MainThread|machine.py:167 - run()||GUID=cb1d85c9-023c-4bc5-bf87-9c231c917eab|at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
> 2018-06-12 04:27:56,980|INFO|MainThread|machine.py:167 - run()||GUID=cb1d85c9-023c-4bc5-bf87-9c231c917eab|at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1591)
> 2018-06-12 04:27:56,980|INFO|MainThread|machine.py:167 - run()||GUID=cb1d85c9-023c-4bc5-bf87-9c231c917eab|at org.apache.hadoop.hbase.regionserver.wal.ReaderBase.init(ReaderBase.java:64)
> 2018-06-12 04:27:56,981|INFO|MainThread|machine.py:167 - run()||GUID=cb1d85c9-023c-4bc5-bf87-9c231c917eab|at org.apache.hadoop.hbase.regionserver.wal.ProtobufLogReader.init(ProtobufLogReader.java:165)
> 2018-06-12 04:27:56,981|INFO|MainThread|machine.py:167 - run()||GUID=cb1d85c9-023c-4bc5-bf87-9c231c917eab|at org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:289)
> 2018-06-12 04:27:56,981|INFO|MainThread|machine.py:167 - run()||GUID=cb1d85c9-023c-4bc5-bf87-9c231c917eab|at org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:271)
> 2018-06-12 04:27:56,981|INFO|MainThread|machine.py:167 - run()||GUID=cb1d85c9-023c-4bc5-bf87-9c231c917eab|at org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:259)
> 2018-06-12 04:27:56,981|INFO|MainThread|machine.py:167 - run()||GUID=cb1d85c9-023c-4bc5-bf87-9c231c917eab|at org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:395)
> 2018-06-12 04:27:56,982|INFO|MainThread|machine.py:167 - run()||GUID=cb1d85c9-023c-4bc5-bf87-9c231c917eab|at org.apache.hadoop.hbase.wal.AbstractFSWALProvider.openReader(AbstractFSWALProvider.java:449)
> 2018-06-12 04:27:56,982|INFO|MainThread|machine.py:167 - run()||GUID=cb1d85c9-023c-4bc5-bf87-9c231c917eab|at org.apache.hadoop.hbase.mapreduce.WALInputFormat$WALRecordReader.openReader(WALInputFormat.java:166)
> 2018-06-12 04:27:56,982|INFO|MainThread|machine.py:167 - run()||GUID=cb1d85c9-023c-4bc5-bf87-9c231c917eab|at org.apache.hadoop.hbase.mapreduce.WALInputFormat$WALRecordReader.initialize(WALInputFormat.java:158)
> 2018-06-12 04:27:56,983|INFO|MainThread|machine.py:167 - run()||GUID=cb1d85c9-023c-4bc5-bf87-9c231c917eab|at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.initialize(MapTask.java:560)
> 2018-06-12 04:27:56,983|INFO|MainThread|machine.py:167 - run()||GUID=cb1d85c9-023c-4bc5-bf87-9c231c917eab|at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:798)
> 2018-06-12 04:27:56,983|INFO|MainThread|machine.py:167 - run()||GUID=cb1d85c9-023c-4bc5-bf87-9c231c917eab|at org.apache.hadoop.mapred.MapTask.run(MapTask.java:347)
> 2018-06-12 04:27:56,983|INFO|MainThread|machine.py:167 - run()||GUID=cb1d85c9-023c-4bc5-bf87-9c231c917eab|at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:174)
> 2018-06-12 04:27:56,983|INFO|MainThread|machine.py:167 - run()||GUID=cb1d85c9-023c-4bc5-bf87-9c231c917eab|at java.security.AccessController.doPrivileged(Native Method)
> 2018-06-12 04:27:56,984|INFO|MainThread|machine.py:167 - run()||GUID=cb1d85c9-023c-4bc5-bf87-9c231c917eab|at javax.security.auth.Subject.doAs(Subject.java:422)
> 2018-06-12 04:27:56,984|INFO|MainThread|machine.py:167 - run()||GUID=cb1d85c9-023c-4bc5-bf87-9c231c917eab|at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1686)
> 2018-06-12 04:27:56,984|INFO|MainThread|machine.py:167 - run()||GUID=cb1d85c9-023c-4bc5-bf87-9c231c917eab|at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:168)
> 2018-06-12 04:27:56,984|INFO|MainThread|machine.py:167 - run()||GUID=cb1d85c9-023c-4bc5-bf87-9c231c917eab|
> {code}
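> The trace shows the map task failing inside DistributedFileSystem.getFileStatus while WALInputFormat$WALRecordReader opens the WAL for its split. As a rough sketch of the underlying race (a WAL listed when the splits were computed may be moved or removed before a map task gets to open it), the snippet below uses only the plain Hadoop FileSystem API; the statusWithArchiveFallback helper and its retry under the archive directory are hypothetical illustrations, not the committed fix.
> {code}
> import java.io.FileNotFoundException;
> import java.io.IOException;
> import org.apache.hadoop.fs.FileStatus;
> import org.apache.hadoop.fs.FileSystem;
> import org.apache.hadoop.fs.Path;
>
> public class WalOpenSketch {
>   // Hypothetical fallback: if the WAL named in the split has disappeared,
>   // look for a file with the same name under the archive directory
>   // (oldWALs) before giving up. Illustration only, not the HBASE-21077 patch.
>   static FileStatus statusWithArchiveFallback(FileSystem fs, Path wal, Path oldWalsDir)
>       throws IOException {
>     try {
>       // This is the call that throws FileNotFoundException in the stack trace above.
>       return fs.getFileStatus(wal);
>     } catch (FileNotFoundException e) {
>       // Retry against the archive location; rethrows if the file is gone there too.
>       return fs.getFileStatus(new Path(oldWalsDir, wal.getName()));
>     }
>   }
> }
> {code}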
> But it looks like the scan did find the file on HDFS earlier; the test script runs the incremental backup command as the HBase user.
> {code}
> 2018-06-12 04:27:30,756|INFO|MainThread|machine.py:167 - run()||GUID=cb1d85c9-023c-4bc5-bf87-9c231c917eab|2018-06-12 04:27:30,755 DEBUG [main] mapreduce.WALInputFormat: Scanning hdfs://ns1/apps/hbase/data/oldWALs/ctr-e138-1518143905142-359429-01-000004.hwx.site%2C16020%2C1528776085205.1528776160915 for WAL files
> 2018-06-12 04:27:30,758|INFO|MainThread|machine.py:167 - run()||GUID=cb1d85c9-023c-4bc5-bf87-9c231c917eab|2018-06-12 04:27:30,758 INFO [main] mapreduce.WALInputFormat: Found: HdfsLocatedFileStatus{path=hdfs://ns1/apps/hbase/data/oldWALs/ctr-e138-1518143905142-359429-01-000004.hwx.site%2C16020%2C1528776085205.1528776160915; isDirectory=false; length=18031; replication=3; blocksize=268435456; modification_time=1528776689363; access_time=1528776160921; owner=hbase; group=hdfs; permission=rwx--x--x; isSymlink=false; hasAcl=false; isEncrypted=false; isErasureCoded=false}
> {code}
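> For comparison, the status check the scanner logged above can be reproduced by hand with a small standalone program run as the hbase user; the WAL path is copied verbatim from the log, while the CheckWal class itself is made up for this illustration.
> {code}
> import java.net.URI;
> import org.apache.hadoop.conf.Configuration;
> import org.apache.hadoop.fs.FileStatus;
> import org.apache.hadoop.fs.FileSystem;
> import org.apache.hadoop.fs.Path;
>
> public class CheckWal {
>   public static void main(String[] args) throws Exception {
>     // Archived WAL path taken from the console log above.
>     Path wal = new Path("hdfs://ns1/apps/hbase/data/oldWALs/"
>         + "ctr-e138-1518143905142-359429-01-000004.hwx.site%2C16020%2C1528776085205.1528776160915");
>     FileSystem fs = FileSystem.get(URI.create("hdfs://ns1"), new Configuration());
>     if (fs.exists(wal)) {
>       FileStatus st = fs.getFileStatus(wal);
>       System.out.println("Found: " + st.getPath() + " length=" + st.getLen()
>           + " owner=" + st.getOwner() + " permission=" + st.getPermission());
>     } else {
>       // Matches the FileNotFoundException the map task hit a second later.
>       System.out.println("Missing: " + wal);
>     }
>   }
> }
> {code}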
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)