[jira] [Assigned] (HDFS-13768) Adding replicas to volume map makes DataNode start slowly

2018-09-10 Thread Surendra Singh Lilhore (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Surendra Singh Lilhore reassigned HDFS-13768:
-

Assignee: Surendra Singh Lilhore  (was: Ranith Sardar)

>  Adding replicas to volume map makes DataNode start slowly 
> ---
>
> Key: HDFS-13768
> URL: https://issues.apache.org/jira/browse/HDFS-13768
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.1.0
>Reporter: Yiqun Lin
>Assignee: Surendra Singh Lilhore
>Priority: Major
> Attachments: HDFS-13768.patch
>
>
> We find DN starting so slowly when rolling upgrade our cluster. When we 
> restart DNs, the DNs start so slowly and not register to NN immediately. And 
> this cause a lots of following error:
> {noformat}
> DataXceiver error processing WRITE_BLOCK operation  src: /xx.xx.xx.xx:64360 
> dst: /xx.xx.xx.xx:50010
> java.io.IOException: Not ready to serve the block pool, 
> BP-1508644862-xx.xx.xx.xx-1493781183457.
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.checkAndWaitForBP(DataXceiver.java:1290)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.checkAccess(DataXceiver.java:1298)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:630)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:169)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:106)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:246)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
> Looking into the logic of DN startup, it will do the initial block pool 
> operation before the registration. And during initializing block pool 
> operation, we found the adding replicas to volume map is the most expensive 
> operation.  Related log:
> {noformat}
> 2018-07-26 10:46:23,771 INFO [Thread-105] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/1/dfs/dn/current: 242722ms
> 2018-07-26 10:46:26,231 INFO [Thread-109] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/5/dfs/dn/current: 245182ms
> 2018-07-26 10:46:32,146 INFO [Thread-112] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/8/dfs/dn/current: 251097ms
> 2018-07-26 10:47:08,283 INFO [Thread-106] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/2/dfs/dn/current: 287235ms
> {noformat}
> Currently DN uses independent thread to scan and add replica for each volume, 
> but we still need to wait the slowest thread to finish its work. So the main 
> problem here is that we could make the thread to run faster.
> The jstack we get when DN blocking in the adding replica:
> {noformat}
> "Thread-113" #419 daemon prio=5 os_prio=0 tid=0x7f40879ff000 nid=0x145da 
> runnable [0x7f4043a38000]
>java.lang.Thread.State: RUNNABLE
>   at java.io.UnixFileSystem.list(Native Method)
>   at java.io.File.list(File.java:1122)
>   at java.io.File.listFiles(File.java:1207)
>   at org.apache.hadoop.fs.FileUtil.listFiles(FileUtil.java:1165)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:445)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:448)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:448)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.getVolumeMap(BlockPoolSlice.java:342)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.getVolumeMap(FsVolumeImpl.java:864)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeList$1.run(FsVolumeList.java:191)
> {noformat}
> One improvement maybe we can use ForkJoinPool to do this recursive task, 
> rather than a sync way. This will be a great improvement because it can 
> greatly speed up recovery process.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: 

[jira] [Assigned] (HDFS-13768) Adding replicas to volume map makes DataNode start slowly

2018-08-22 Thread Ranith Sardar (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ranith Sardar reassigned HDFS-13768:


Assignee: Ranith Sardar

>  Adding replicas to volume map makes DataNode start slowly 
> ---
>
> Key: HDFS-13768
> URL: https://issues.apache.org/jira/browse/HDFS-13768
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.1.0
>Reporter: Yiqun Lin
>Assignee: Ranith Sardar
>Priority: Major
>
> We find DN starting so slowly when rolling upgrade our cluster. When we 
> restart DNs, the DNs start so slowly and not register to NN immediately. And 
> this cause a lots of following error:
> {noformat}
> DataXceiver error processing WRITE_BLOCK operation  src: /xx.xx.xx.xx:64360 
> dst: /xx.xx.xx.xx:50010
> java.io.IOException: Not ready to serve the block pool, 
> BP-1508644862-xx.xx.xx.xx-1493781183457.
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.checkAndWaitForBP(DataXceiver.java:1290)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.checkAccess(DataXceiver.java:1298)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:630)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:169)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:106)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:246)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
> Looking into the logic of DN startup, it will do the initial block pool 
> operation before the registration. And during initializing block pool 
> operation, we found the adding replicas to volume map is the most expensive 
> operation.  Related log:
> {noformat}
> 2018-07-26 10:46:23,771 INFO [Thread-105] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/1/dfs/dn/current: 242722ms
> 2018-07-26 10:46:26,231 INFO [Thread-109] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/5/dfs/dn/current: 245182ms
> 2018-07-26 10:46:32,146 INFO [Thread-112] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/8/dfs/dn/current: 251097ms
> 2018-07-26 10:47:08,283 INFO [Thread-106] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/2/dfs/dn/current: 287235ms
> {noformat}
> Currently DN uses independent thread to scan and add replica for each volume, 
> but we still need to wait the slowest thread to finish its work. So the main 
> problem here is that we could make the thread to run faster.
> The jstack we get when DN blocking in the adding replica:
> {noformat}
> "Thread-113" #419 daemon prio=5 os_prio=0 tid=0x7f40879ff000 nid=0x145da 
> runnable [0x7f4043a38000]
>java.lang.Thread.State: RUNNABLE
>   at java.io.UnixFileSystem.list(Native Method)
>   at java.io.File.list(File.java:1122)
>   at java.io.File.listFiles(File.java:1207)
>   at org.apache.hadoop.fs.FileUtil.listFiles(FileUtil.java:1165)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:445)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:448)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:448)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.getVolumeMap(BlockPoolSlice.java:342)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.getVolumeMap(FsVolumeImpl.java:864)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeList$1.run(FsVolumeList.java:191)
> {noformat}
> One improvement maybe we can use ForkJoinPool to do this recursive task, 
> rather than a sync way. This will be a great improvement because it can 
> greatly speed up recovery process.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org