Namit Maheshwari created HDDS-637: ------------------------------------- Summary: Not able to access the part-r-00000 file after the MR job succeeds Key: HDDS-637 URL: https://issues.apache.org/jira/browse/HDDS-637 Project: Hadoop Distributed Data Store Issue Type: Bug Affects Versions: 0.3.0 Reporter: Namit Maheshwari
Run a MR job {code:java} -bash-4.2$ /usr/hdp/current/hadoop-client/bin/hadoop jar /usr/hdp/current/hadoop-mapreduce-client/hadoop-mapreduce-examples.jar wordcount /tmp/mr_jobs/input/ o3://bucket2.volume2/mr_jobDD 18/10/12 01:00:23 INFO conf.Configuration: Removed undeclared tags: 18/10/12 01:00:24 INFO conf.Configuration: Removed undeclared tags: 18/10/12 01:00:24 INFO conf.Configuration: Removed undeclared tags: 18/10/12 01:00:25 INFO conf.Configuration: Removed undeclared tags: 18/10/12 01:00:25 INFO client.ConfiguredRMFailoverProxyProvider: Failing over to rm2 18/10/12 01:00:25 INFO mapreduce.JobResourceUploader: Disabling Erasure Coding for path: /user/hdfs/.staging/job_1539295307098_0003 18/10/12 01:00:27 INFO input.FileInputFormat: Total input files to process : 1 18/10/12 01:00:27 INFO lzo.GPLNativeCodeLoader: Loaded native gpl library 18/10/12 01:00:27 INFO lzo.LzoCodec: Successfully loaded & initialized native-lzo library [hadoop-lzo rev 5d6248d8d690f8456469979213ab2e9993bfa2e9] 18/10/12 01:00:27 INFO mapreduce.JobSubmitter: number of splits:1 18/10/12 01:00:28 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1539295307098_0003 18/10/12 01:00:28 INFO mapreduce.JobSubmitter: Executing with tokens: [] 18/10/12 01:00:28 INFO conf.Configuration: Removed undeclared tags: 18/10/12 01:00:28 INFO conf.Configuration: found resource resource-types.xml at file:/etc/hadoop/3.0.3.0-63/0/resource-types.xml 18/10/12 01:00:28 INFO conf.Configuration: Removed undeclared tags: 18/10/12 01:00:28 INFO impl.YarnClientImpl: Submitted application application_1539295307098_0003 18/10/12 01:00:29 INFO mapreduce.Job: The url to track the job: http://ctr-e138-1518143905142-510793-01-000005.hwx.site:8088/proxy/application_1539295307098_0003/ 18/10/12 01:00:29 INFO mapreduce.Job: Running job: job_1539295307098_0003 18/10/12 01:00:35 INFO mapreduce.Job: Job job_1539295307098_0003 running in uber mode : false 18/10/12 01:00:35 INFO mapreduce.Job: map 0% reduce 0% 18/10/12 01:00:44 INFO mapreduce.Job: map 100% reduce 0% 18/10/12 01:00:57 INFO mapreduce.Job: map 100% reduce 67% 18/10/12 01:00:59 INFO mapreduce.Job: map 100% reduce 100% 18/10/12 01:00:59 INFO mapreduce.Job: Job job_1539295307098_0003 completed successfully 18/10/12 01:00:59 INFO conf.Configuration: Removed undeclared tags: 18/10/12 01:00:59 INFO mapreduce.Job: Counters: 58 File System Counters FILE: Number of bytes read=6332 FILE: Number of bytes written=532585 FILE: Number of read operations=0 FILE: Number of large read operations=0 FILE: Number of write operations=0 HDFS: Number of bytes read=215876 HDFS: Number of bytes written=0 HDFS: Number of read operations=2 HDFS: Number of large read operations=0 HDFS: Number of write operations=0 O3: Number of bytes read=0 O3: Number of bytes written=0 O3: Number of read operations=0 O3: Number of large read operations=0 O3: Number of write operations=0 Job Counters Launched map tasks=1 Launched reduce tasks=1 Rack-local map tasks=1 Total time spent by all maps in occupied slots (ms)=25392 Total time spent by all reduces in occupied slots (ms)=103584 Total time spent by all map tasks (ms)=6348 Total time spent by all reduce tasks (ms)=12948 Total vcore-milliseconds taken by all map tasks=6348 Total vcore-milliseconds taken by all reduce tasks=12948 Total megabyte-milliseconds taken by all map tasks=26001408 Total megabyte-milliseconds taken by all reduce tasks=106070016 Map-Reduce Framework Map input records=716 Map output records=32019 Map output bytes=343475 Map output materialized bytes=6332 Input split bytes=121 Combine input records=32019 Combine output records=461 Reduce input groups=461 Reduce shuffle bytes=6332 Reduce input records=461 Reduce output records=461 Spilled Records=922 Shuffled Maps =1 Failed Shuffles=0 Merged Map outputs=1 GC time elapsed (ms)=359 CPU time spent (ms)=11800 Physical memory (bytes) snapshot=3018502144 Virtual memory (bytes) snapshot=14470242304 Total committed heap usage (bytes)=3521642496 Peak Map Physical memory (bytes)=2518896640 Peak Map Virtual memory (bytes)=5397549056 Peak Reduce Physical memory (bytes)=499605504 Peak Reduce Virtual memory (bytes)=9072693248 Shuffle Errors BAD_ID=0 CONNECTION=0 IO_ERROR=0 WRONG_LENGTH=0 WRONG_MAP=0 WRONG_REDUCE=0 File Input Format Counters Bytes Read=215755 File Output Format Counters Bytes Written=0 18/10/12 01:00:59 INFO conf.Configuration: Removed undeclared tags: -bash-4.2$ {code} Below exception is seen in SCM logs {code:java} 2018-10-12 01:00:51,142 INFO org.apache.hadoop.hdds.scm.pipelines.ratis.RatisManagerImpl: Allocating a new ratis pipeline of size: 3 id: pipelineId=85cfe1ff-a47c-4ee7-aadb-9c7a42d80192 2018-10-12 01:00:51,988 INFO org.apache.hadoop.hdds.scm.pipelines.PipelineSelector: closing pipeline pipelineId=85cfe1ff-a47c-4ee7-aadb-9c7a42d80192. 2018-10-12 01:00:51,988 INFO org.apache.hadoop.hdds.scm.pipelines.PipelineSelector: Finalizing pipeline. pipelineID: pipelineId=85cfe1ff-a47c-4ee7-aadb-9c7a42d80192 2018-10-12 01:00:51,989 ERROR org.apache.hadoop.hdds.scm.pipelines.PipelineStateManager: Failed to update pipeline state pipelineId=85cfe1ff-a47c-4ee7-aadb-9c7a42d80192, reason: invalid state transition from state: CREATING upon event: FINALIZE. 2018-10-12 01:00:51,991 INFO org.apache.hadoop.hdds.scm.pipelines.PipelineCloseHandler: failed to close pipeline:pipelineId=85cfe1ff-a47c-4ee7-aadb-9c7a42d80192 org.apache.hadoop.hdds.scm.exceptions.SCMException: Failed to update pipeline state pipelineId=85cfe1ff-a47c-4ee7-aadb-9c7a42d80192, reason: invalid state transition from state: CREATING upon event: FINALIZE. at org.apache.hadoop.hdds.scm.pipelines.PipelineStateManager.updatePipelineState(PipelineStateManager.java:128) at org.apache.hadoop.hdds.scm.pipelines.PipelineSelector.updatePipelineState(PipelineSelector.java:464) at org.apache.hadoop.hdds.scm.pipelines.PipelineSelector.finalizePipeline(PipelineSelector.java:310) at org.apache.hadoop.hdds.scm.pipelines.PipelineCloseHandler.onMessage(PipelineCloseHandler.java:44) at org.apache.hadoop.hdds.scm.pipelines.PipelineCloseHandler.onMessage(PipelineCloseHandler.java:30) at org.apache.hadoop.hdds.server.events.SingleThreadExecutor.lambda$onMessage$1(SingleThreadExecutor.java:85) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) 2018-10-12 01:00:53,297 INFO org.apache.hadoop.hdds.scm.pipelines.ratis.RatisManagerImpl: Allocating a new ratis pipeline of size: 3 id: pipelineId=68051c76-53c3-47e5-b9c6-d7dc2a3e44c1 2018-10-12 01:00:53,847 INFO org.apache.hadoop.hdds.scm.pipelines.ratis.RatisManagerImpl: Allocating a new ratis pipeline of size: 3 id: pipelineId=25f738e7-692e-4201-882e-2dcf08ef9340 2018-10-12 01:00:56,959 INFO org.apache.hadoop.hdds.scm.server.SCMClientProtocolServer: Object type container id 2 op create new stage complete 2018-10-12 01:01:09,990 INFO org.apache.hadoop.hdds.scm.pipelines.PipelineSelector: closing pipeline pipelineId=68051c76-53c3-47e5-b9c6-d7dc2a3e44c1. 2018-10-12 01:01:09,990 INFO org.apache.hadoop.hdds.scm.pipelines.PipelineSelector: Finalizing pipeline. pipelineID: pipelineId=68051c76-53c3-47e5-b9c6-d7dc2a3e44c1 2018-10-12 01:01:09,991 INFO org.apache.hadoop.hdds.scm.container.CloseContainerEventHandler: Close container Event triggered for container : 7 2018-10-12 01:01:09,992 INFO org.apache.hadoop.hdds.scm.container.CloseContainerEventHandler: Close container Event triggered for container : 10 2018-10-12 01:01:09,993 INFO org.apache.hadoop.hdds.scm.container.CloseContainerEventHandler: Close container Event triggered for container : 13 2018-10-12 01:01:09,993 INFO org.apache.hadoop.hdds.scm.container.CloseContainerEventHandler: Close container Event triggered for container : 16 2018-10-12 01:01:09,993 INFO org.apache.hadoop.hdds.scm.container.CloseContainerEventHandler: Close container Event triggered for container : 2 2018-10-12 01:01:09,997 INFO org.apache.hadoop.hdds.scm.container.CloseContainerEventHandler: Close container Event triggered for container : 19 2018-10-12 01:01:09,997 INFO org.apache.hadoop.hdds.scm.container.CloseContainerEventHandler: Close container Event triggered for container : 4 2018-10-12 01:02:10,114 INFO org.apache.hadoop.hdds.scm.pipelines.PipelineSelector: Closing pipeline. pipelineID: pipelineId=68051c76-53c3-47e5-b9c6-d7dc2a3e44c1{code} Not able to access the part-r-00000 file {code:java} -bash-4.2$ hdfs dfs -ls o3://bucket2.volume2/mr_jobDD 18/10/12 01:01:44 INFO conf.Configuration: Removed undeclared tags: 18/10/12 01:01:45 INFO conf.Configuration: Removed undeclared tags: Found 2 items -rw-rw-rw- 1 hdfs hdfs 0 2018-10-12 01:00 o3://bucket2.volume2/mr_jobDD/_SUCCESS -rw-rw-rw- 1 hdfs hdfs 5017 1970-07-19 09:05 o3://bucket2.volume2/mr_jobDD/part-r-00000 18/10/12 01:01:45 INFO conf.Configuration: Removed undeclared tags: -bash-4.2$ hdfs dfs -cat o3://bucket2.volume2/mr_jobDD/part-r-00000 18/10/12 01:02:09 INFO conf.Configuration: Removed undeclared tags: 18/10/12 01:02:10 INFO conf.Configuration: Removed undeclared tags: cat: ContainerID 2 does not exist 18/10/12 01:02:12 INFO conf.Configuration: Removed undeclared tags: -bash-4.2${code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) --------------------------------------------------------------------- To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org