[ 
https://issues.apache.org/jira/browse/HDDS-609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hanisha Koneru updated HDDS-609:
--------------------------------
    Summary: SCM does not exit chill mode as it expects DNs to report 
containers in ALLOCATED state  (was: Mapreduce example fails with Allocate 
block failed, error:INTERNAL_ERROR)

> SCM does not exit chill mode as it expects DNs to report containers in 
> ALLOCATED state
> --------------------------------------------------------------------------------------
>
>                 Key: HDDS-609
>                 URL: https://issues.apache.org/jira/browse/HDDS-609
>             Project: Hadoop Distributed Data Store
>          Issue Type: Bug
>            Reporter: Namit Maheshwari
>            Priority: Major
>
> {code:java}
> -bash-4.2$ /usr/hdp/current/hadoop-client/bin/hadoop jar 
> /usr/hdp/current/hadoop-mapreduce-client/hadoop-mapreduce-examples.jar 
> wordcount /tmp/mr_jobs/input/ o3://bucket2.volume2/mr_job5
> 18/10/09 23:37:07 INFO conf.Configuration: Removed undeclared tags:
> 18/10/09 23:37:08 INFO conf.Configuration: Removed undeclared tags:
> 18/10/09 23:37:08 INFO conf.Configuration: Removed undeclared tags:
> 18/10/09 23:37:08 INFO client.AHSProxy: Connecting to Application History 
> server at ctr-e138-1518143905142-510793-01-000004.hwx.site/172.27.79.197:10200
> 18/10/09 23:37:08 INFO client.ConfiguredRMFailoverProxyProvider: Failing over 
> to rm2
> 18/10/09 23:37:09 INFO mapreduce.JobResourceUploader: Disabling Erasure 
> Coding for path: /user/hdfs/.staging/job_1539125785626_0007
> 18/10/09 23:37:09 INFO input.FileInputFormat: Total input files to process : 1
> 18/10/09 23:37:09 INFO lzo.GPLNativeCodeLoader: Loaded native gpl library
> 18/10/09 23:37:09 INFO lzo.LzoCodec: Successfully loaded & initialized 
> native-lzo library [hadoop-lzo rev 5d6248d8d690f8456469979213ab2e9993bfa2e9]
> 18/10/09 23:37:09 INFO mapreduce.JobSubmitter: number of splits:1
> 18/10/09 23:37:09 INFO mapreduce.JobSubmitter: Submitting tokens for job: 
> job_1539125785626_0007
> 18/10/09 23:37:09 INFO mapreduce.JobSubmitter: Executing with tokens: []
> 18/10/09 23:37:10 INFO conf.Configuration: Removed undeclared tags:
> 18/10/09 23:37:10 INFO conf.Configuration: found resource resource-types.xml 
> at file:/etc/hadoop/3.0.3.0-63/0/resource-types.xml
> 18/10/09 23:37:10 INFO conf.Configuration: Removed undeclared tags:
> 18/10/09 23:37:10 INFO impl.YarnClientImpl: Submitted application 
> application_1539125785626_0007
> 18/10/09 23:37:10 INFO mapreduce.Job: The url to track the job: 
> http://ctr-e138-1518143905142-510793-01-000005.hwx.site:8088/proxy/application_1539125785626_0007/
> 18/10/09 23:37:10 INFO mapreduce.Job: Running job: job_1539125785626_0007
> 18/10/09 23:37:17 INFO mapreduce.Job: Job job_1539125785626_0007 running in 
> uber mode : false
> 18/10/09 23:37:17 INFO mapreduce.Job: map 0% reduce 0%
> 18/10/09 23:37:24 INFO mapreduce.Job: map 100% reduce 0%
> 18/10/09 23:37:29 INFO mapreduce.Job: Task Id : 
> attempt_1539125785626_0007_r_000000_0, Status : FAILED
> Error: java.io.IOException: Allocate block failed, error:INTERNAL_ERROR
> at 
> org.apache.hadoop.ozone.om.protocolPB.OzoneManagerProtocolClientSideTranslatorPB.allocateBlock(OzoneManagerProtocolClientSideTranslatorPB.java:576)
> at 
> org.apache.hadoop.ozone.client.io.ChunkGroupOutputStream.allocateNewBlock(ChunkGroupOutputStream.java:475)
> at 
> org.apache.hadoop.ozone.client.io.ChunkGroupOutputStream.handleWrite(ChunkGroupOutputStream.java:271)
> at 
> org.apache.hadoop.ozone.client.io.ChunkGroupOutputStream.write(ChunkGroupOutputStream.java:250)
> at 
> org.apache.hadoop.fs.ozone.OzoneFSOutputStream.write(OzoneFSOutputStream.java:47)
> at 
> org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:57)
> at java.io.DataOutputStream.write(DataOutputStream.java:107)
> at 
> org.apache.hadoop.mapreduce.lib.output.TextOutputFormat$LineRecordWriter.writeObject(TextOutputFormat.java:78)
> at 
> org.apache.hadoop.mapreduce.lib.output.TextOutputFormat$LineRecordWriter.write(TextOutputFormat.java:93)
> at 
> org.apache.hadoop.mapred.ReduceTask$NewTrackingRecordWriter.write(ReduceTask.java:559)
> at 
> org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:89)
> at 
> org.apache.hadoop.mapreduce.lib.reduce.WrappedReducer$Context.write(WrappedReducer.java:105)
> at 
> org.apache.hadoop.examples.WordCount$IntSumReducer.reduce(WordCount.java:64)
> at 
> org.apache.hadoop.examples.WordCount$IntSumReducer.reduce(WordCount.java:52)
> at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:171)
> at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:628)
> at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:390)
> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:174)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:168)
> 18/10/09 23:37:35 INFO mapreduce.Job: Task Id : 
> attempt_1539125785626_0007_r_000000_1, Status : FAILED
> Error: java.io.IOException: Allocate block failed, error:INTERNAL_ERROR
> at 
> org.apache.hadoop.ozone.om.protocolPB.OzoneManagerProtocolClientSideTranslatorPB.allocateBlock(OzoneManagerProtocolClientSideTranslatorPB.java:576)
> at 
> org.apache.hadoop.ozone.client.io.ChunkGroupOutputStream.allocateNewBlock(ChunkGroupOutputStream.java:475)
> at 
> org.apache.hadoop.ozone.client.io.ChunkGroupOutputStream.handleWrite(ChunkGroupOutputStream.java:271)
> at 
> org.apache.hadoop.ozone.client.io.ChunkGroupOutputStream.write(ChunkGroupOutputStream.java:250)
> at 
> org.apache.hadoop.fs.ozone.OzoneFSOutputStream.write(OzoneFSOutputStream.java:47)
> at 
> org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:57)
> at java.io.DataOutputStream.write(DataOutputStream.java:107)
> at 
> org.apache.hadoop.mapreduce.lib.output.TextOutputFormat$LineRecordWriter.writeObject(TextOutputFormat.java:78)
> at 
> org.apache.hadoop.mapreduce.lib.output.TextOutputFormat$LineRecordWriter.write(TextOutputFormat.java:93)
> at 
> org.apache.hadoop.mapred.ReduceTask$NewTrackingRecordWriter.write(ReduceTask.java:559)
> at 
> org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:89)
> at 
> org.apache.hadoop.mapreduce.lib.reduce.WrappedReducer$Context.write(WrappedReducer.java:105)
> at 
> org.apache.hadoop.examples.WordCount$IntSumReducer.reduce(WordCount.java:64)
> at 
> org.apache.hadoop.examples.WordCount$IntSumReducer.reduce(WordCount.java:52)
> at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:171)
> at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:628)
> at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:390)
> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:174)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:168)
> 18/10/09 23:37:42 INFO mapreduce.Job: Task Id : 
> attempt_1539125785626_0007_r_000000_2, Status : FAILED
> Error: java.io.IOException: Allocate block failed, error:INTERNAL_ERROR
> at 
> org.apache.hadoop.ozone.om.protocolPB.OzoneManagerProtocolClientSideTranslatorPB.allocateBlock(OzoneManagerProtocolClientSideTranslatorPB.java:576)
> at 
> org.apache.hadoop.ozone.client.io.ChunkGroupOutputStream.allocateNewBlock(ChunkGroupOutputStream.java:475)
> at 
> org.apache.hadoop.ozone.client.io.ChunkGroupOutputStream.handleWrite(ChunkGroupOutputStream.java:271)
> at 
> org.apache.hadoop.ozone.client.io.ChunkGroupOutputStream.write(ChunkGroupOutputStream.java:250)
> at 
> org.apache.hadoop.fs.ozone.OzoneFSOutputStream.write(OzoneFSOutputStream.java:47)
> at 
> org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:57)
> at java.io.DataOutputStream.write(DataOutputStream.java:107)
> at 
> org.apache.hadoop.mapreduce.lib.output.TextOutputFormat$LineRecordWriter.writeObject(TextOutputFormat.java:78)
> at 
> org.apache.hadoop.mapreduce.lib.output.TextOutputFormat$LineRecordWriter.write(TextOutputFormat.java:93)
> at 
> org.apache.hadoop.mapred.ReduceTask$NewTrackingRecordWriter.write(ReduceTask.java:559)
> at 
> org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:89)
> at 
> org.apache.hadoop.mapreduce.lib.reduce.WrappedReducer$Context.write(WrappedReducer.java:105)
> at 
> org.apache.hadoop.examples.WordCount$IntSumReducer.reduce(WordCount.java:64)
> at 
> org.apache.hadoop.examples.WordCount$IntSumReducer.reduce(WordCount.java:52)
> at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:171)
> at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:628)
> at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:390)
> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:174)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:168)
> 18/10/09 23:37:49 INFO mapreduce.Job: map 100% reduce 100%
> 18/10/09 23:37:50 INFO mapreduce.Job: Job job_1539125785626_0007 failed with 
> state FAILED due to: Task failed task_1539125785626_0007_r_000000
> Job failed as tasks failed. failedMaps:0 failedReduces:1 killedMaps:0 
> killedReduces: 0
> 18/10/09 23:37:50 INFO conf.Configuration: Removed undeclared tags:
> 18/10/09 23:37:51 INFO mapreduce.Job: Counters: 45
> File System Counters
> FILE: Number of bytes read=0
> FILE: Number of bytes written=266505
> FILE: Number of read operations=0
> FILE: Number of large read operations=0
> FILE: Number of write operations=0
> HDFS: Number of bytes read=215876
> HDFS: Number of bytes written=0
> HDFS: Number of read operations=2
> HDFS: Number of large read operations=0
> HDFS: Number of write operations=0
> O3: Number of bytes read=0
> O3: Number of bytes written=0
> O3: Number of read operations=0
> O3: Number of large read operations=0
> O3: Number of write operations=0
> Job Counters
> Failed reduce tasks=4
> Launched map tasks=1
> Launched reduce tasks=4
> Rack-local map tasks=1
> Total time spent by all maps in occupied slots (ms)=18816
> Total time spent by all reduces in occupied slots (ms)=128016
> Total time spent by all map tasks (ms)=4704
> Total time spent by all reduce tasks (ms)=16002
> Total vcore-milliseconds taken by all map tasks=4704
> Total vcore-milliseconds taken by all reduce tasks=16002
> Total megabyte-milliseconds taken by all map tasks=19267584
> Total megabyte-milliseconds taken by all reduce tasks=131088384
> Map-Reduce Framework
> Map input records=716
> Map output records=32019
> Map output bytes=343475
> Map output materialized bytes=6332
> Input split bytes=121
> Combine input records=32019
> Combine output records=461
> Spilled Records=461
> Failed Shuffles=0
> Merged Map outputs=0
> GC time elapsed (ms)=96
> CPU time spent (ms)=3300
> Physical memory (bytes) snapshot=2528358400
> Virtual memory (bytes) snapshot=5420421120
> Total committed heap usage (bytes)=2714238976
> Peak Map Physical memory (bytes)=2528358400
> Peak Map Virtual memory (bytes)=5420421120
> File Input Format Counters
> Bytes Read=215755
> 18/10/09 23:37:51 INFO conf.Configuration: Removed undeclared tags:
> {code}
> SCM logs
> {code:java}
> 2018-10-09 23:37:28,984 INFO org.apache.hadoop.ipc.Server: IPC Server handler 
> 4 on 9863, call Call#101 Retry#0 
> org.apache.hadoop.ozone.protocol.ScmBlockLocationProtocol.allocateScmBlock 
> from 172.27.56.9:33814
> org.apache.hadoop.hdds.scm.exceptions.SCMException: ChillModePrecheck failed 
> for allocateBlock
> at 
> org.apache.hadoop.hdds.scm.server.ChillModePrecheck.check(ChillModePrecheck.java:38)
> at 
> org.apache.hadoop.hdds.scm.server.ChillModePrecheck.check(ChillModePrecheck.java:30)
> at org.apache.hadoop.hdds.scm.ScmUtils.preCheck(ScmUtils.java:42)
> at 
> org.apache.hadoop.hdds.scm.block.BlockManagerImpl.allocateBlock(BlockManagerImpl.java:191)
> at 
> org.apache.hadoop.hdds.scm.server.SCMBlockProtocolServer.allocateBlock(SCMBlockProtocolServer.java:143)
> at 
> org.apache.hadoop.ozone.protocolPB.ScmBlockLocationProtocolServerSideTranslatorPB.allocateScmBlock(ScmBlockLocationProtocolServerSideTranslatorPB.java:74)
> at 
> org.apache.hadoop.hdds.protocol.proto.ScmBlockLocationProtocolProtos$ScmBlockLocationProtocolService$2.callBlockingMethod(ScmBlockLocationProtocolProtos.java:6255)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682)
> 2018-10-09 23:37:35,232 INFO org.apache.hadoop.ipc.Server: IPC Server handler 
> 4 on 9863, call Call#103 Retry#0 
> org.apache.hadoop.ozone.protocol.ScmBlockLocationProtocol.allocateScmBlock 
> from 172.27.56.9:33814
> org.apache.hadoop.hdds.scm.exceptions.SCMException: ChillModePrecheck failed 
> for allocateBlock
> at 
> org.apache.hadoop.hdds.scm.server.ChillModePrecheck.check(ChillModePrecheck.java:38)
> at 
> org.apache.hadoop.hdds.scm.server.ChillModePrecheck.check(ChillModePrecheck.java:30)
> at org.apache.hadoop.hdds.scm.ScmUtils.preCheck(ScmUtils.java:42)
> at 
> org.apache.hadoop.hdds.scm.block.BlockManagerImpl.allocateBlock(BlockManagerImpl.java:191)
> at 
> org.apache.hadoop.hdds.scm.server.SCMBlockProtocolServer.allocateBlock(SCMBlockProtocolServer.java:143)
> at 
> org.apache.hadoop.ozone.protocolPB.ScmBlockLocationProtocolServerSideTranslatorPB.allocateScmBlock(ScmBlockLocationProtocolServerSideTranslatorPB.java:74)
> at 
> org.apache.hadoop.hdds.protocol.proto.ScmBlockLocationProtocolProtos$ScmBlockLocationProtocolService$2.callBlockingMethod(ScmBlockLocationProtocolProtos.java:6255)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682)
> 2018-10-09 23:37:42,044 INFO org.apache.hadoop.ipc.Server: IPC Server handler 
> 3 on 9863, call Call#105 Retry#0 
> org.apache.hadoop.ozone.protocol.ScmBlockLocationProtocol.allocateScmBlock 
> from 172.27.56.9:33814
> org.apache.hadoop.hdds.scm.exceptions.SCMException: ChillModePrecheck failed 
> for allocateBlock
> at 
> org.apache.hadoop.hdds.scm.server.ChillModePrecheck.check(ChillModePrecheck.java:38)
> at 
> org.apache.hadoop.hdds.scm.server.ChillModePrecheck.check(ChillModePrecheck.java:30)
> at org.apache.hadoop.hdds.scm.ScmUtils.preCheck(ScmUtils.java:42)
> at 
> org.apache.hadoop.hdds.scm.block.BlockManagerImpl.allocateBlock(BlockManagerImpl.java:191)
> at 
> org.apache.hadoop.hdds.scm.server.SCMBlockProtocolServer.allocateBlock(SCMBlockProtocolServer.java:143)
> at 
> org.apache.hadoop.ozone.protocolPB.ScmBlockLocationProtocolServerSideTranslatorPB.allocateScmBlock(ScmBlockLocationProtocolServerSideTranslatorPB.java:74)
> at 
> org.apache.hadoop.hdds.protocol.proto.ScmBlockLocationProtocolProtos$ScmBlockLocationProtocolService$2.callBlockingMethod(ScmBlockLocationProtocolProtos.java:6255)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682)
> 2018-10-09 23:37:48,656 INFO org.apache.hadoop.ipc.Server: IPC Server handler 
> 3 on 9863, call Call#107 Retry#0 
> org.apache.hadoop.ozone.protocol.ScmBlockLocationProtocol.allocateScmBlock 
> from 172.27.56.9:33814
> org.apache.hadoop.hdds.scm.exceptions.SCMException: ChillModePrecheck failed 
> for allocateBlock
> at 
> org.apache.hadoop.hdds.scm.server.ChillModePrecheck.check(ChillModePrecheck.java:38)
> at 
> org.apache.hadoop.hdds.scm.server.ChillModePrecheck.check(ChillModePrecheck.java:30)
> at org.apache.hadoop.hdds.scm.ScmUtils.preCheck(ScmUtils.java:42)
> at 
> org.apache.hadoop.hdds.scm.block.BlockManagerImpl.allocateBlock(BlockManagerImpl.java:191)
> at 
> org.apache.hadoop.hdds.scm.server.SCMBlockProtocolServer.allocateBlock(SCMBlockProtocolServer.java:143)
> at 
> org.apache.hadoop.ozone.protocolPB.ScmBlockLocationProtocolServerSideTranslatorPB.allocateScmBlock(ScmBlockLocationProtocolServerSideTranslatorPB.java:74)
> at 
> org.apache.hadoop.hdds.protocol.proto.ScmBlockLocationProtocolProtos$ScmBlockLocationProtocolService$2.callBlockingMethod(ScmBlockLocationProtocolProtos.java:6255)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682)
> {code}
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

Reply via email to