[jira] [Commented] (HDFS-13768) Adding replicas to volume map makes DataNode start slowly

2018-10-08 Thread Yiqun Lin (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16642695#comment-16642695
 ] 

Yiqun Lin commented on HDFS-13768:
--

Thanks for sharing the results. +1 for the v03 patch for branch-2. Committing...

>  Adding replicas to volume map makes DataNode start slowly 
> ---
>
> Key: HDFS-13768
> URL: https://issues.apache.org/jira/browse/HDFS-13768
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.1.0
>Reporter: Yiqun Lin
>Assignee: Surendra Singh Lilhore
>Priority: Major
> Fix For: 3.2.0, 3.1.2
>
> Attachments: HDFS-13768-branch-2.01.patch, 
> HDFS-13768-branch-2.02.patch, HDFS-13768-branch-2.03.patch, 
> HDFS-13768.01-branch-2.patch, HDFS-13768.01.patch, HDFS-13768.02.patch, 
> HDFS-13768.03.patch, HDFS-13768.04.patch, HDFS-13768.05.patch, 
> HDFS-13768.06.patch, HDFS-13768.07.patch, HDFS-13768.patch, screenshot-1.png
>
>
> We find DN starting so slowly when rolling upgrade our cluster. When we 
> restart DNs, the DNs start so slowly and not register to NN immediately. And 
> this cause a lots of following error:
> {noformat}
> DataXceiver error processing WRITE_BLOCK operation  src: /xx.xx.xx.xx:64360 
> dst: /xx.xx.xx.xx:50010
> java.io.IOException: Not ready to serve the block pool, 
> BP-1508644862-xx.xx.xx.xx-1493781183457.
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.checkAndWaitForBP(DataXceiver.java:1290)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.checkAccess(DataXceiver.java:1298)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:630)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:169)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:106)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:246)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
> Looking into the logic of DN startup, it will do the initial block pool 
> operation before the registration. And during initializing block pool 
> operation, we found the adding replicas to volume map is the most expensive 
> operation.  Related log:
> {noformat}
> 2018-07-26 10:46:23,771 INFO [Thread-105] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/1/dfs/dn/current: 242722ms
> 2018-07-26 10:46:26,231 INFO [Thread-109] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/5/dfs/dn/current: 245182ms
> 2018-07-26 10:46:32,146 INFO [Thread-112] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/8/dfs/dn/current: 251097ms
> 2018-07-26 10:47:08,283 INFO [Thread-106] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/2/dfs/dn/current: 287235ms
> {noformat}
> Currently DN uses independent thread to scan and add replica for each volume, 
> but we still need to wait the slowest thread to finish its work. So the main 
> problem here is that we could make the thread to run faster.
> The jstack we get when DN blocking in the adding replica:
> {noformat}
> "Thread-113" #419 daemon prio=5 os_prio=0 tid=0x7f40879ff000 nid=0x145da 
> runnable [0x7f4043a38000]
>java.lang.Thread.State: RUNNABLE
>   at java.io.UnixFileSystem.list(Native Method)
>   at java.io.File.list(File.java:1122)
>   at java.io.File.listFiles(File.java:1207)
>   at org.apache.hadoop.fs.FileUtil.listFiles(FileUtil.java:1165)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:445)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:448)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:448)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.getVolumeMap(BlockPoolSlice.java:342)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.getVolumeMap(FsVolumeImpl.java:864)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeList$1.run(FsVolumeList.java:191)
> {noformat}
> One improvement maybe we can use ForkJoinPool to do this recursive task, 
> 

[jira] [Commented] (HDFS-13768) Adding replicas to volume map makes DataNode start slowly

2018-10-08 Thread Surendra Singh Lilhore (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16641595#comment-16641595
 ] 

Surendra Singh Lilhore commented on HDFS-13768:
---

Thanks [~linyiqun] for review.

I uploaded V3 patch for branch-2. I missed on change. Both the tests are 
passing with V3 patch
{noformat}
[INFO] ---
[INFO]  T E S T S
[INFO] ---
[INFO] Running 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.TestFsVolumeList
[INFO] Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 13.833 s 
- in org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.TestFsVolumeList
[INFO] Running org.apache.hadoop.hdfs.server.namenode.TestFileTruncate
[INFO] Tests run: 19, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 92.182 
s - in org.apache.hadoop.hdfs.server.namenode.TestFileTruncate
[INFO]
[INFO] Results:
[INFO]
[INFO] Tests run: 25, Failures: 0, Errors: 0, Skipped: 0
{noformat}

>  Adding replicas to volume map makes DataNode start slowly 
> ---
>
> Key: HDFS-13768
> URL: https://issues.apache.org/jira/browse/HDFS-13768
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.1.0
>Reporter: Yiqun Lin
>Assignee: Surendra Singh Lilhore
>Priority: Major
> Fix For: 3.2.0, 3.1.2
>
> Attachments: HDFS-13768-branch-2.01.patch, 
> HDFS-13768-branch-2.02.patch, HDFS-13768-branch-2.03.patch, 
> HDFS-13768.01-branch-2.patch, HDFS-13768.01.patch, HDFS-13768.02.patch, 
> HDFS-13768.03.patch, HDFS-13768.04.patch, HDFS-13768.05.patch, 
> HDFS-13768.06.patch, HDFS-13768.07.patch, HDFS-13768.patch, screenshot-1.png
>
>
> We find DN starting so slowly when rolling upgrade our cluster. When we 
> restart DNs, the DNs start so slowly and not register to NN immediately. And 
> this cause a lots of following error:
> {noformat}
> DataXceiver error processing WRITE_BLOCK operation  src: /xx.xx.xx.xx:64360 
> dst: /xx.xx.xx.xx:50010
> java.io.IOException: Not ready to serve the block pool, 
> BP-1508644862-xx.xx.xx.xx-1493781183457.
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.checkAndWaitForBP(DataXceiver.java:1290)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.checkAccess(DataXceiver.java:1298)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:630)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:169)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:106)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:246)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
> Looking into the logic of DN startup, it will do the initial block pool 
> operation before the registration. And during initializing block pool 
> operation, we found the adding replicas to volume map is the most expensive 
> operation.  Related log:
> {noformat}
> 2018-07-26 10:46:23,771 INFO [Thread-105] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/1/dfs/dn/current: 242722ms
> 2018-07-26 10:46:26,231 INFO [Thread-109] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/5/dfs/dn/current: 245182ms
> 2018-07-26 10:46:32,146 INFO [Thread-112] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/8/dfs/dn/current: 251097ms
> 2018-07-26 10:47:08,283 INFO [Thread-106] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/2/dfs/dn/current: 287235ms
> {noformat}
> Currently DN uses independent thread to scan and add replica for each volume, 
> but we still need to wait the slowest thread to finish its work. So the main 
> problem here is that we could make the thread to run faster.
> The jstack we get when DN blocking in the adding replica:
> {noformat}
> "Thread-113" #419 daemon prio=5 os_prio=0 tid=0x7f40879ff000 nid=0x145da 
> runnable [0x7f4043a38000]
>java.lang.Thread.State: RUNNABLE
>   at java.io.UnixFileSystem.list(Native Method)
>   at java.io.File.list(File.java:1122)
>   at java.io.File.listFiles(File.java:1207)
>   at org.apache.hadoop.fs.FileUtil.listFiles(FileUtil.java:1165)
>   at 
> 

[jira] [Commented] (HDFS-13768) Adding replicas to volume map makes DataNode start slowly

2018-10-07 Thread Yiqun Lin (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16641276#comment-16641276
 ] 

Yiqun Lin commented on HDFS-13768:
--

The change LGTM. [~surendrasingh], could you please verify following related 
UTs:
 * TestFsVolumeList#testAddRplicaProcessorForAddingReplicaInMap
 * TestFileTruncate(since we have some change of util class 
FsDatasetTestUtils#getStoredGenerationStamp that used under this test)

>  Adding replicas to volume map makes DataNode start slowly 
> ---
>
> Key: HDFS-13768
> URL: https://issues.apache.org/jira/browse/HDFS-13768
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.1.0
>Reporter: Yiqun Lin
>Assignee: Surendra Singh Lilhore
>Priority: Major
> Fix For: 3.2.0, 3.1.2
>
> Attachments: HDFS-13768-branch-2.01.patch, 
> HDFS-13768-branch-2.02.patch, HDFS-13768.01-branch-2.patch, 
> HDFS-13768.01.patch, HDFS-13768.02.patch, HDFS-13768.03.patch, 
> HDFS-13768.04.patch, HDFS-13768.05.patch, HDFS-13768.06.patch, 
> HDFS-13768.07.patch, HDFS-13768.patch, screenshot-1.png
>
>
> We find DN starting so slowly when rolling upgrade our cluster. When we 
> restart DNs, the DNs start so slowly and not register to NN immediately. And 
> this cause a lots of following error:
> {noformat}
> DataXceiver error processing WRITE_BLOCK operation  src: /xx.xx.xx.xx:64360 
> dst: /xx.xx.xx.xx:50010
> java.io.IOException: Not ready to serve the block pool, 
> BP-1508644862-xx.xx.xx.xx-1493781183457.
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.checkAndWaitForBP(DataXceiver.java:1290)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.checkAccess(DataXceiver.java:1298)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:630)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:169)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:106)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:246)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
> Looking into the logic of DN startup, it will do the initial block pool 
> operation before the registration. And during initializing block pool 
> operation, we found the adding replicas to volume map is the most expensive 
> operation.  Related log:
> {noformat}
> 2018-07-26 10:46:23,771 INFO [Thread-105] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/1/dfs/dn/current: 242722ms
> 2018-07-26 10:46:26,231 INFO [Thread-109] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/5/dfs/dn/current: 245182ms
> 2018-07-26 10:46:32,146 INFO [Thread-112] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/8/dfs/dn/current: 251097ms
> 2018-07-26 10:47:08,283 INFO [Thread-106] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/2/dfs/dn/current: 287235ms
> {noformat}
> Currently DN uses independent thread to scan and add replica for each volume, 
> but we still need to wait the slowest thread to finish its work. So the main 
> problem here is that we could make the thread to run faster.
> The jstack we get when DN blocking in the adding replica:
> {noformat}
> "Thread-113" #419 daemon prio=5 os_prio=0 tid=0x7f40879ff000 nid=0x145da 
> runnable [0x7f4043a38000]
>java.lang.Thread.State: RUNNABLE
>   at java.io.UnixFileSystem.list(Native Method)
>   at java.io.File.list(File.java:1122)
>   at java.io.File.listFiles(File.java:1207)
>   at org.apache.hadoop.fs.FileUtil.listFiles(FileUtil.java:1165)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:445)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:448)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:448)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.getVolumeMap(BlockPoolSlice.java:342)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.getVolumeMap(FsVolumeImpl.java:864)
>   at 
> 

[jira] [Commented] (HDFS-13768) Adding replicas to volume map makes DataNode start slowly

2018-10-07 Thread Surendra Singh Lilhore (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16641191#comment-16641191
 ] 

Surendra Singh Lilhore commented on HDFS-13768:
---

Attached V2 patch for branch-2

>  Adding replicas to volume map makes DataNode start slowly 
> ---
>
> Key: HDFS-13768
> URL: https://issues.apache.org/jira/browse/HDFS-13768
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.1.0
>Reporter: Yiqun Lin
>Assignee: Surendra Singh Lilhore
>Priority: Major
> Fix For: 3.2.0, 3.1.2
>
> Attachments: HDFS-13768-branch-2.01.patch, 
> HDFS-13768-branch-2.02.patch, HDFS-13768.01-branch-2.patch, 
> HDFS-13768.01.patch, HDFS-13768.02.patch, HDFS-13768.03.patch, 
> HDFS-13768.04.patch, HDFS-13768.05.patch, HDFS-13768.06.patch, 
> HDFS-13768.07.patch, HDFS-13768.patch, screenshot-1.png
>
>
> We find DN starting so slowly when rolling upgrade our cluster. When we 
> restart DNs, the DNs start so slowly and not register to NN immediately. And 
> this cause a lots of following error:
> {noformat}
> DataXceiver error processing WRITE_BLOCK operation  src: /xx.xx.xx.xx:64360 
> dst: /xx.xx.xx.xx:50010
> java.io.IOException: Not ready to serve the block pool, 
> BP-1508644862-xx.xx.xx.xx-1493781183457.
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.checkAndWaitForBP(DataXceiver.java:1290)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.checkAccess(DataXceiver.java:1298)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:630)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:169)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:106)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:246)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
> Looking into the logic of DN startup, it will do the initial block pool 
> operation before the registration. And during initializing block pool 
> operation, we found the adding replicas to volume map is the most expensive 
> operation.  Related log:
> {noformat}
> 2018-07-26 10:46:23,771 INFO [Thread-105] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/1/dfs/dn/current: 242722ms
> 2018-07-26 10:46:26,231 INFO [Thread-109] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/5/dfs/dn/current: 245182ms
> 2018-07-26 10:46:32,146 INFO [Thread-112] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/8/dfs/dn/current: 251097ms
> 2018-07-26 10:47:08,283 INFO [Thread-106] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/2/dfs/dn/current: 287235ms
> {noformat}
> Currently DN uses independent thread to scan and add replica for each volume, 
> but we still need to wait the slowest thread to finish its work. So the main 
> problem here is that we could make the thread to run faster.
> The jstack we get when DN blocking in the adding replica:
> {noformat}
> "Thread-113" #419 daemon prio=5 os_prio=0 tid=0x7f40879ff000 nid=0x145da 
> runnable [0x7f4043a38000]
>java.lang.Thread.State: RUNNABLE
>   at java.io.UnixFileSystem.list(Native Method)
>   at java.io.File.list(File.java:1122)
>   at java.io.File.listFiles(File.java:1207)
>   at org.apache.hadoop.fs.FileUtil.listFiles(FileUtil.java:1165)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:445)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:448)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:448)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.getVolumeMap(BlockPoolSlice.java:342)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.getVolumeMap(FsVolumeImpl.java:864)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeList$1.run(FsVolumeList.java:191)
> {noformat}
> One improvement maybe we can use ForkJoinPool to do this recursive task, 
> rather than a sync way. This will be a great 

[jira] [Commented] (HDFS-13768) Adding replicas to volume map makes DataNode start slowly

2018-10-05 Thread Yiqun Lin (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16639516#comment-16639516
 ] 

Yiqun Lin commented on HDFS-13768:
--

I have filed HDFS-13962 for track remaining work. Just some minor work, I 
assign to you, [~surendrasingh]. I will help take the review there also, :).

>  Adding replicas to volume map makes DataNode start slowly 
> ---
>
> Key: HDFS-13768
> URL: https://issues.apache.org/jira/browse/HDFS-13768
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.1.0
>Reporter: Yiqun Lin
>Assignee: Surendra Singh Lilhore
>Priority: Major
> Fix For: 3.2.0, 3.1.2
>
> Attachments: HDFS-13768-branch-2.01.patch, 
> HDFS-13768.01-branch-2.patch, HDFS-13768.01.patch, HDFS-13768.02.patch, 
> HDFS-13768.03.patch, HDFS-13768.04.patch, HDFS-13768.05.patch, 
> HDFS-13768.06.patch, HDFS-13768.07.patch, HDFS-13768.patch, screenshot-1.png
>
>
> We find DN starting so slowly when rolling upgrade our cluster. When we 
> restart DNs, the DNs start so slowly and not register to NN immediately. And 
> this cause a lots of following error:
> {noformat}
> DataXceiver error processing WRITE_BLOCK operation  src: /xx.xx.xx.xx:64360 
> dst: /xx.xx.xx.xx:50010
> java.io.IOException: Not ready to serve the block pool, 
> BP-1508644862-xx.xx.xx.xx-1493781183457.
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.checkAndWaitForBP(DataXceiver.java:1290)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.checkAccess(DataXceiver.java:1298)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:630)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:169)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:106)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:246)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
> Looking into the logic of DN startup, it will do the initial block pool 
> operation before the registration. And during initializing block pool 
> operation, we found the adding replicas to volume map is the most expensive 
> operation.  Related log:
> {noformat}
> 2018-07-26 10:46:23,771 INFO [Thread-105] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/1/dfs/dn/current: 242722ms
> 2018-07-26 10:46:26,231 INFO [Thread-109] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/5/dfs/dn/current: 245182ms
> 2018-07-26 10:46:32,146 INFO [Thread-112] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/8/dfs/dn/current: 251097ms
> 2018-07-26 10:47:08,283 INFO [Thread-106] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/2/dfs/dn/current: 287235ms
> {noformat}
> Currently DN uses independent thread to scan and add replica for each volume, 
> but we still need to wait the slowest thread to finish its work. So the main 
> problem here is that we could make the thread to run faster.
> The jstack we get when DN blocking in the adding replica:
> {noformat}
> "Thread-113" #419 daemon prio=5 os_prio=0 tid=0x7f40879ff000 nid=0x145da 
> runnable [0x7f4043a38000]
>java.lang.Thread.State: RUNNABLE
>   at java.io.UnixFileSystem.list(Native Method)
>   at java.io.File.list(File.java:1122)
>   at java.io.File.listFiles(File.java:1207)
>   at org.apache.hadoop.fs.FileUtil.listFiles(FileUtil.java:1165)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:445)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:448)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:448)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.getVolumeMap(BlockPoolSlice.java:342)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.getVolumeMap(FsVolumeImpl.java:864)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeList$1.run(FsVolumeList.java:191)
> {noformat}
> One improvement maybe we can use ForkJoinPool to do this recursive 

[jira] [Commented] (HDFS-13768) Adding replicas to volume map makes DataNode start slowly

2018-10-05 Thread Surendra Singh Lilhore (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16639504#comment-16639504
 ] 

Surendra Singh Lilhore commented on HDFS-13768:
---

Thansk [~linyiqun]

I will fix the review comment in next patch.
{quote} I plan to file another JIRA to fix some places still needed to improve 
for trunk.
{quote}
Sure, pls file new jira...

>  Adding replicas to volume map makes DataNode start slowly 
> ---
>
> Key: HDFS-13768
> URL: https://issues.apache.org/jira/browse/HDFS-13768
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.1.0
>Reporter: Yiqun Lin
>Assignee: Surendra Singh Lilhore
>Priority: Major
> Fix For: 3.2.0, 3.1.2
>
> Attachments: HDFS-13768-branch-2.01.patch, 
> HDFS-13768.01-branch-2.patch, HDFS-13768.01.patch, HDFS-13768.02.patch, 
> HDFS-13768.03.patch, HDFS-13768.04.patch, HDFS-13768.05.patch, 
> HDFS-13768.06.patch, HDFS-13768.07.patch, HDFS-13768.patch, screenshot-1.png
>
>
> We find DN starting so slowly when rolling upgrade our cluster. When we 
> restart DNs, the DNs start so slowly and not register to NN immediately. And 
> this cause a lots of following error:
> {noformat}
> DataXceiver error processing WRITE_BLOCK operation  src: /xx.xx.xx.xx:64360 
> dst: /xx.xx.xx.xx:50010
> java.io.IOException: Not ready to serve the block pool, 
> BP-1508644862-xx.xx.xx.xx-1493781183457.
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.checkAndWaitForBP(DataXceiver.java:1290)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.checkAccess(DataXceiver.java:1298)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:630)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:169)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:106)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:246)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
> Looking into the logic of DN startup, it will do the initial block pool 
> operation before the registration. And during initializing block pool 
> operation, we found the adding replicas to volume map is the most expensive 
> operation.  Related log:
> {noformat}
> 2018-07-26 10:46:23,771 INFO [Thread-105] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/1/dfs/dn/current: 242722ms
> 2018-07-26 10:46:26,231 INFO [Thread-109] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/5/dfs/dn/current: 245182ms
> 2018-07-26 10:46:32,146 INFO [Thread-112] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/8/dfs/dn/current: 251097ms
> 2018-07-26 10:47:08,283 INFO [Thread-106] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/2/dfs/dn/current: 287235ms
> {noformat}
> Currently DN uses independent thread to scan and add replica for each volume, 
> but we still need to wait the slowest thread to finish its work. So the main 
> problem here is that we could make the thread to run faster.
> The jstack we get when DN blocking in the adding replica:
> {noformat}
> "Thread-113" #419 daemon prio=5 os_prio=0 tid=0x7f40879ff000 nid=0x145da 
> runnable [0x7f4043a38000]
>java.lang.Thread.State: RUNNABLE
>   at java.io.UnixFileSystem.list(Native Method)
>   at java.io.File.list(File.java:1122)
>   at java.io.File.listFiles(File.java:1207)
>   at org.apache.hadoop.fs.FileUtil.listFiles(FileUtil.java:1165)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:445)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:448)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:448)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.getVolumeMap(BlockPoolSlice.java:342)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.getVolumeMap(FsVolumeImpl.java:864)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeList$1.run(FsVolumeList.java:191)
> {noformat}
> One 

[jira] [Commented] (HDFS-13768) Adding replicas to volume map makes DataNode start slowly

2018-10-05 Thread Yiqun Lin (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16639451#comment-16639451
 ] 

Yiqun Lin commented on HDFS-13768:
--

Thanks [~surendrasingh] for attaching the patch for branch-2. When I reviewed 
this patch, I found some differences between these two patches. Also caught 
some minor changes I was missing for trunk patch.

For the branch-2 patch, some minor comments:

*ReplicaMap#addAndGet*
 For the logic consistentency, I prefer to use following way to get old replica 
info instead of {{m.get(replicaInfo);}}
{code:java}
ReplicaInfo oldReplicaInfo = m.get(new Block(replicaInfo.getBlockId()));
{code}

*BlockPoolSlice*
 Can we add the additional null check of {{addReplicaThreadPool}} before 
invoking initializeAddReplicaPool ? That can avoid synchronized lock acquiring 
behaviour if pool is already initialized.

*FsDatasetImplTestUtils#getStoredGenerationStamp*
We should also use {{FILE_COMPARATOR}} to sort the list files.

*TestFsVolumeList.java*
Following one line looks unaligned, can you format this line?
{code}
+RamDiskReplicaTracker ramDiskReplicaMap = RamDiskReplicaTracker
+.getInstance(conf, fsDataset);
+FsVolumeImpl vol = (FsVolumeImpl) fsDataset.getFsVolumeReferences().get(0);
+   String bpid = cluster.getNamesystem().getBlockPoolId();
+// It will create BlockPoolSlice.AddReplicaProcessor task's and lunch in
+// ForkJoinPool recursively
{code}

I found the wrong patch name for branch-2, the right way is 
HDFS-13768-branch-2.xx.patch. Attach the same patch again. I plan to file 
another JIRA to fix some places still needed to improve for trunk.

>  Adding replicas to volume map makes DataNode start slowly 
> ---
>
> Key: HDFS-13768
> URL: https://issues.apache.org/jira/browse/HDFS-13768
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.1.0
>Reporter: Yiqun Lin
>Assignee: Surendra Singh Lilhore
>Priority: Major
> Fix For: 3.2.0, 3.1.2
>
> Attachments: HDFS-13768.01-branch-2.patch, HDFS-13768.01.patch, 
> HDFS-13768.02.patch, HDFS-13768.03.patch, HDFS-13768.04.patch, 
> HDFS-13768.05.patch, HDFS-13768.06.patch, HDFS-13768.07.patch, 
> HDFS-13768.patch, screenshot-1.png
>
>
> We find DN starting so slowly when rolling upgrade our cluster. When we 
> restart DNs, the DNs start so slowly and not register to NN immediately. And 
> this cause a lots of following error:
> {noformat}
> DataXceiver error processing WRITE_BLOCK operation  src: /xx.xx.xx.xx:64360 
> dst: /xx.xx.xx.xx:50010
> java.io.IOException: Not ready to serve the block pool, 
> BP-1508644862-xx.xx.xx.xx-1493781183457.
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.checkAndWaitForBP(DataXceiver.java:1290)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.checkAccess(DataXceiver.java:1298)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:630)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:169)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:106)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:246)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
> Looking into the logic of DN startup, it will do the initial block pool 
> operation before the registration. And during initializing block pool 
> operation, we found the adding replicas to volume map is the most expensive 
> operation.  Related log:
> {noformat}
> 2018-07-26 10:46:23,771 INFO [Thread-105] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/1/dfs/dn/current: 242722ms
> 2018-07-26 10:46:26,231 INFO [Thread-109] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/5/dfs/dn/current: 245182ms
> 2018-07-26 10:46:32,146 INFO [Thread-112] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/8/dfs/dn/current: 251097ms
> 2018-07-26 10:47:08,283 INFO [Thread-106] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/2/dfs/dn/current: 287235ms
> {noformat}
> Currently DN uses independent thread to scan and add replica for each volume, 
> but we still need to wait the slowest thread to finish its work. So the main 
> problem 

[jira] [Commented] (HDFS-13768) Adding replicas to volume map makes DataNode start slowly

2018-10-04 Thread Surendra Singh Lilhore (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16638050#comment-16638050
 ] 

Surendra Singh Lilhore commented on HDFS-13768:
---

Attached branch-2 patch..

>  Adding replicas to volume map makes DataNode start slowly 
> ---
>
> Key: HDFS-13768
> URL: https://issues.apache.org/jira/browse/HDFS-13768
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.1.0
>Reporter: Yiqun Lin
>Assignee: Surendra Singh Lilhore
>Priority: Major
> Fix For: 3.2.0, 3.1.2
>
> Attachments: HDFS-13768.01-branch-2.patch, HDFS-13768.01.patch, 
> HDFS-13768.02.patch, HDFS-13768.03.patch, HDFS-13768.04.patch, 
> HDFS-13768.05.patch, HDFS-13768.06.patch, HDFS-13768.07.patch, 
> HDFS-13768.patch, screenshot-1.png
>
>
> We find DN starting so slowly when rolling upgrade our cluster. When we 
> restart DNs, the DNs start so slowly and not register to NN immediately. And 
> this cause a lots of following error:
> {noformat}
> DataXceiver error processing WRITE_BLOCK operation  src: /xx.xx.xx.xx:64360 
> dst: /xx.xx.xx.xx:50010
> java.io.IOException: Not ready to serve the block pool, 
> BP-1508644862-xx.xx.xx.xx-1493781183457.
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.checkAndWaitForBP(DataXceiver.java:1290)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.checkAccess(DataXceiver.java:1298)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:630)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:169)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:106)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:246)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
> Looking into the logic of DN startup, it will do the initial block pool 
> operation before the registration. And during initializing block pool 
> operation, we found the adding replicas to volume map is the most expensive 
> operation.  Related log:
> {noformat}
> 2018-07-26 10:46:23,771 INFO [Thread-105] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/1/dfs/dn/current: 242722ms
> 2018-07-26 10:46:26,231 INFO [Thread-109] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/5/dfs/dn/current: 245182ms
> 2018-07-26 10:46:32,146 INFO [Thread-112] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/8/dfs/dn/current: 251097ms
> 2018-07-26 10:47:08,283 INFO [Thread-106] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/2/dfs/dn/current: 287235ms
> {noformat}
> Currently DN uses independent thread to scan and add replica for each volume, 
> but we still need to wait the slowest thread to finish its work. So the main 
> problem here is that we could make the thread to run faster.
> The jstack we get when DN blocking in the adding replica:
> {noformat}
> "Thread-113" #419 daemon prio=5 os_prio=0 tid=0x7f40879ff000 nid=0x145da 
> runnable [0x7f4043a38000]
>java.lang.Thread.State: RUNNABLE
>   at java.io.UnixFileSystem.list(Native Method)
>   at java.io.File.list(File.java:1122)
>   at java.io.File.listFiles(File.java:1207)
>   at org.apache.hadoop.fs.FileUtil.listFiles(FileUtil.java:1165)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:445)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:448)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:448)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.getVolumeMap(BlockPoolSlice.java:342)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.getVolumeMap(FsVolumeImpl.java:864)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeList$1.run(FsVolumeList.java:191)
> {noformat}
> One improvement maybe we can use ForkJoinPool to do this recursive task, 
> rather than a sync way. This will be a great improvement because it can 
> greatly speed up recovery process.



--
This 

[jira] [Commented] (HDFS-13768) Adding replicas to volume map makes DataNode start slowly

2018-10-02 Thread Surendra Singh Lilhore (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16635132#comment-16635132
 ] 

Surendra Singh Lilhore commented on HDFS-13768:
---

Thanks [~linyiqun].

 
{quote} would you mind attaching the patch for branch-2?
{quote}
Sure, I will attach soon...

>  Adding replicas to volume map makes DataNode start slowly 
> ---
>
> Key: HDFS-13768
> URL: https://issues.apache.org/jira/browse/HDFS-13768
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.1.0
>Reporter: Yiqun Lin
>Assignee: Surendra Singh Lilhore
>Priority: Major
> Fix For: 3.2.0, 3.1.2
>
> Attachments: HDFS-13768.01.patch, HDFS-13768.02.patch, 
> HDFS-13768.03.patch, HDFS-13768.04.patch, HDFS-13768.05.patch, 
> HDFS-13768.06.patch, HDFS-13768.07.patch, HDFS-13768.patch, screenshot-1.png
>
>
> We find DN starting so slowly when rolling upgrade our cluster. When we 
> restart DNs, the DNs start so slowly and not register to NN immediately. And 
> this cause a lots of following error:
> {noformat}
> DataXceiver error processing WRITE_BLOCK operation  src: /xx.xx.xx.xx:64360 
> dst: /xx.xx.xx.xx:50010
> java.io.IOException: Not ready to serve the block pool, 
> BP-1508644862-xx.xx.xx.xx-1493781183457.
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.checkAndWaitForBP(DataXceiver.java:1290)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.checkAccess(DataXceiver.java:1298)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:630)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:169)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:106)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:246)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
> Looking into the logic of DN startup, it will do the initial block pool 
> operation before the registration. And during initializing block pool 
> operation, we found the adding replicas to volume map is the most expensive 
> operation.  Related log:
> {noformat}
> 2018-07-26 10:46:23,771 INFO [Thread-105] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/1/dfs/dn/current: 242722ms
> 2018-07-26 10:46:26,231 INFO [Thread-109] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/5/dfs/dn/current: 245182ms
> 2018-07-26 10:46:32,146 INFO [Thread-112] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/8/dfs/dn/current: 251097ms
> 2018-07-26 10:47:08,283 INFO [Thread-106] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/2/dfs/dn/current: 287235ms
> {noformat}
> Currently DN uses independent thread to scan and add replica for each volume, 
> but we still need to wait the slowest thread to finish its work. So the main 
> problem here is that we could make the thread to run faster.
> The jstack we get when DN blocking in the adding replica:
> {noformat}
> "Thread-113" #419 daemon prio=5 os_prio=0 tid=0x7f40879ff000 nid=0x145da 
> runnable [0x7f4043a38000]
>java.lang.Thread.State: RUNNABLE
>   at java.io.UnixFileSystem.list(Native Method)
>   at java.io.File.list(File.java:1122)
>   at java.io.File.listFiles(File.java:1207)
>   at org.apache.hadoop.fs.FileUtil.listFiles(FileUtil.java:1165)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:445)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:448)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:448)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.getVolumeMap(BlockPoolSlice.java:342)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.getVolumeMap(FsVolumeImpl.java:864)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeList$1.run(FsVolumeList.java:191)
> {noformat}
> One improvement maybe we can use ForkJoinPool to do this recursive task, 
> rather than a sync way. This will be a great improvement because 

[jira] [Commented] (HDFS-13768) Adding replicas to volume map makes DataNode start slowly

2018-10-01 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16634848#comment-16634848
 ] 

Hudson commented on HDFS-13768:
---

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #15090 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/15090/])
HDFS-13768. Adding replicas to volume map makes DataNode start slowly. (yqlin: 
rev 5689355783de005ebc604f4403dc5129a286bfca)
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetImpl.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/ReplicaMap.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/TestFsVolumeList.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/BlockPoolSlice.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetUtil.java
* (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsVolumeList.java


>  Adding replicas to volume map makes DataNode start slowly 
> ---
>
> Key: HDFS-13768
> URL: https://issues.apache.org/jira/browse/HDFS-13768
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.1.0
>Reporter: Yiqun Lin
>Assignee: Surendra Singh Lilhore
>Priority: Major
> Fix For: 3.2.0, 3.1.2
>
> Attachments: HDFS-13768.01.patch, HDFS-13768.02.patch, 
> HDFS-13768.03.patch, HDFS-13768.04.patch, HDFS-13768.05.patch, 
> HDFS-13768.06.patch, HDFS-13768.07.patch, HDFS-13768.patch, screenshot-1.png
>
>
> We find DN starting so slowly when rolling upgrade our cluster. When we 
> restart DNs, the DNs start so slowly and not register to NN immediately. And 
> this cause a lots of following error:
> {noformat}
> DataXceiver error processing WRITE_BLOCK operation  src: /xx.xx.xx.xx:64360 
> dst: /xx.xx.xx.xx:50010
> java.io.IOException: Not ready to serve the block pool, 
> BP-1508644862-xx.xx.xx.xx-1493781183457.
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.checkAndWaitForBP(DataXceiver.java:1290)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.checkAccess(DataXceiver.java:1298)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:630)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:169)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:106)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:246)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
> Looking into the logic of DN startup, it will do the initial block pool 
> operation before the registration. And during initializing block pool 
> operation, we found the adding replicas to volume map is the most expensive 
> operation.  Related log:
> {noformat}
> 2018-07-26 10:46:23,771 INFO [Thread-105] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/1/dfs/dn/current: 242722ms
> 2018-07-26 10:46:26,231 INFO [Thread-109] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/5/dfs/dn/current: 245182ms
> 2018-07-26 10:46:32,146 INFO [Thread-112] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/8/dfs/dn/current: 251097ms
> 2018-07-26 10:47:08,283 INFO [Thread-106] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/2/dfs/dn/current: 287235ms
> {noformat}
> Currently DN uses independent thread to scan and add replica for each volume, 
> but we still need to wait the slowest thread to finish its work. So the main 
> problem here is that we could make the thread to run faster.
> The jstack we get when DN blocking in the adding replica:
> {noformat}
> "Thread-113" #419 daemon prio=5 os_prio=0 tid=0x7f40879ff000 nid=0x145da 
> 

[jira] [Commented] (HDFS-13768) Adding replicas to volume map makes DataNode start slowly

2018-10-01 Thread Yiqun Lin (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16634838#comment-16634838
 ] 

Yiqun Lin commented on HDFS-13768:
--

I have committed this to trunk and branch-3.1, but there are some conflicts 
when backporting to branch-2. [~surendrasingh], would you mind attaching the 
patch for branch-2? I think this will be nice to have in 2.x versions.

>  Adding replicas to volume map makes DataNode start slowly 
> ---
>
> Key: HDFS-13768
> URL: https://issues.apache.org/jira/browse/HDFS-13768
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.1.0
>Reporter: Yiqun Lin
>Assignee: Surendra Singh Lilhore
>Priority: Major
> Attachments: HDFS-13768.01.patch, HDFS-13768.02.patch, 
> HDFS-13768.03.patch, HDFS-13768.04.patch, HDFS-13768.05.patch, 
> HDFS-13768.06.patch, HDFS-13768.07.patch, HDFS-13768.patch, screenshot-1.png
>
>
> We find DN starting so slowly when rolling upgrade our cluster. When we 
> restart DNs, the DNs start so slowly and not register to NN immediately. And 
> this cause a lots of following error:
> {noformat}
> DataXceiver error processing WRITE_BLOCK operation  src: /xx.xx.xx.xx:64360 
> dst: /xx.xx.xx.xx:50010
> java.io.IOException: Not ready to serve the block pool, 
> BP-1508644862-xx.xx.xx.xx-1493781183457.
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.checkAndWaitForBP(DataXceiver.java:1290)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.checkAccess(DataXceiver.java:1298)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:630)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:169)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:106)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:246)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
> Looking into the logic of DN startup, it will do the initial block pool 
> operation before the registration. And during initializing block pool 
> operation, we found the adding replicas to volume map is the most expensive 
> operation.  Related log:
> {noformat}
> 2018-07-26 10:46:23,771 INFO [Thread-105] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/1/dfs/dn/current: 242722ms
> 2018-07-26 10:46:26,231 INFO [Thread-109] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/5/dfs/dn/current: 245182ms
> 2018-07-26 10:46:32,146 INFO [Thread-112] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/8/dfs/dn/current: 251097ms
> 2018-07-26 10:47:08,283 INFO [Thread-106] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/2/dfs/dn/current: 287235ms
> {noformat}
> Currently DN uses independent thread to scan and add replica for each volume, 
> but we still need to wait the slowest thread to finish its work. So the main 
> problem here is that we could make the thread to run faster.
> The jstack we get when DN blocking in the adding replica:
> {noformat}
> "Thread-113" #419 daemon prio=5 os_prio=0 tid=0x7f40879ff000 nid=0x145da 
> runnable [0x7f4043a38000]
>java.lang.Thread.State: RUNNABLE
>   at java.io.UnixFileSystem.list(Native Method)
>   at java.io.File.list(File.java:1122)
>   at java.io.File.listFiles(File.java:1207)
>   at org.apache.hadoop.fs.FileUtil.listFiles(FileUtil.java:1165)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:445)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:448)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:448)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.getVolumeMap(BlockPoolSlice.java:342)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.getVolumeMap(FsVolumeImpl.java:864)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeList$1.run(FsVolumeList.java:191)
> {noformat}
> One improvement maybe we can use ForkJoinPool to do this recursive task, 
> rather than a sync 

[jira] [Commented] (HDFS-13768) Adding replicas to volume map makes DataNode start slowly

2018-10-01 Thread Yiqun Lin (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16634835#comment-16634835
 ] 

Yiqun Lin commented on HDFS-13768:
--

The failed UTs are not related. +1. Committing this.

>  Adding replicas to volume map makes DataNode start slowly 
> ---
>
> Key: HDFS-13768
> URL: https://issues.apache.org/jira/browse/HDFS-13768
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.1.0
>Reporter: Yiqun Lin
>Assignee: Surendra Singh Lilhore
>Priority: Major
> Attachments: HDFS-13768.01.patch, HDFS-13768.02.patch, 
> HDFS-13768.03.patch, HDFS-13768.04.patch, HDFS-13768.05.patch, 
> HDFS-13768.06.patch, HDFS-13768.07.patch, HDFS-13768.patch, screenshot-1.png
>
>
> We find DN starting so slowly when rolling upgrade our cluster. When we 
> restart DNs, the DNs start so slowly and not register to NN immediately. And 
> this cause a lots of following error:
> {noformat}
> DataXceiver error processing WRITE_BLOCK operation  src: /xx.xx.xx.xx:64360 
> dst: /xx.xx.xx.xx:50010
> java.io.IOException: Not ready to serve the block pool, 
> BP-1508644862-xx.xx.xx.xx-1493781183457.
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.checkAndWaitForBP(DataXceiver.java:1290)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.checkAccess(DataXceiver.java:1298)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:630)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:169)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:106)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:246)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
> Looking into the logic of DN startup, it will do the initial block pool 
> operation before the registration. And during initializing block pool 
> operation, we found the adding replicas to volume map is the most expensive 
> operation.  Related log:
> {noformat}
> 2018-07-26 10:46:23,771 INFO [Thread-105] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/1/dfs/dn/current: 242722ms
> 2018-07-26 10:46:26,231 INFO [Thread-109] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/5/dfs/dn/current: 245182ms
> 2018-07-26 10:46:32,146 INFO [Thread-112] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/8/dfs/dn/current: 251097ms
> 2018-07-26 10:47:08,283 INFO [Thread-106] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/2/dfs/dn/current: 287235ms
> {noformat}
> Currently DN uses independent thread to scan and add replica for each volume, 
> but we still need to wait the slowest thread to finish its work. So the main 
> problem here is that we could make the thread to run faster.
> The jstack we get when DN blocking in the adding replica:
> {noformat}
> "Thread-113" #419 daemon prio=5 os_prio=0 tid=0x7f40879ff000 nid=0x145da 
> runnable [0x7f4043a38000]
>java.lang.Thread.State: RUNNABLE
>   at java.io.UnixFileSystem.list(Native Method)
>   at java.io.File.list(File.java:1122)
>   at java.io.File.listFiles(File.java:1207)
>   at org.apache.hadoop.fs.FileUtil.listFiles(FileUtil.java:1165)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:445)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:448)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:448)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.getVolumeMap(BlockPoolSlice.java:342)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.getVolumeMap(FsVolumeImpl.java:864)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeList$1.run(FsVolumeList.java:191)
> {noformat}
> One improvement maybe we can use ForkJoinPool to do this recursive task, 
> rather than a sync way. This will be a great improvement because it can 
> greatly speed up recovery process.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HDFS-13768) Adding replicas to volume map makes DataNode start slowly

2018-09-28 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16631396#comment-16631396
 ] 

Hadoop QA commented on HDFS-13768:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
16s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 
56s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
57s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 0s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
11s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 28s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
55s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
50s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
 1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
56s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
2s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 11s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
47s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 83m 48s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
30s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}141m 26s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.hdfs.server.balancer.TestBalancerWithMultipleNameNodes |
|   | hadoop.hdfs.client.impl.TestBlockReaderLocal |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:4b8c2b1 |
| JIRA Issue | HDFS-13768 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12941631/HDFS-13768.07.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  xml  |
| uname | Linux 08c78a2857ae 3.13.0-143-generic #192-Ubuntu SMP Tue Feb 27 
10:45:36 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 5c8d907 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.0-RC1 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/25159/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/25159/testReport/ |
| Max. process+thread count | 4043 (vs. ulimit of 1) |
| modules | C: 

[jira] [Commented] (HDFS-13768) Adding replicas to volume map makes DataNode start slowly

2018-09-27 Thread Yiqun Lin (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16631305#comment-16631305
 ] 

Yiqun Lin commented on HDFS-13768:
--

Thanks [~surendrasingh] for sharing the data. The improvement looks great!
Jenkins is fine now, I'd like to attach the same patch of v06 patch to 
re-trigger this.
+1 from me. I will fold off the commit a couple of days in case [~arpitagarwal] 
or other guys have comments for this.


>  Adding replicas to volume map makes DataNode start slowly 
> ---
>
> Key: HDFS-13768
> URL: https://issues.apache.org/jira/browse/HDFS-13768
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.1.0
>Reporter: Yiqun Lin
>Assignee: Surendra Singh Lilhore
>Priority: Major
> Attachments: HDFS-13768.01.patch, HDFS-13768.02.patch, 
> HDFS-13768.03.patch, HDFS-13768.04.patch, HDFS-13768.05.patch, 
> HDFS-13768.06.patch, HDFS-13768.07.patch, HDFS-13768.patch, screenshot-1.png
>
>
> We find DN starting so slowly when rolling upgrade our cluster. When we 
> restart DNs, the DNs start so slowly and not register to NN immediately. And 
> this cause a lots of following error:
> {noformat}
> DataXceiver error processing WRITE_BLOCK operation  src: /xx.xx.xx.xx:64360 
> dst: /xx.xx.xx.xx:50010
> java.io.IOException: Not ready to serve the block pool, 
> BP-1508644862-xx.xx.xx.xx-1493781183457.
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.checkAndWaitForBP(DataXceiver.java:1290)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.checkAccess(DataXceiver.java:1298)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:630)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:169)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:106)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:246)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
> Looking into the logic of DN startup, it will do the initial block pool 
> operation before the registration. And during initializing block pool 
> operation, we found the adding replicas to volume map is the most expensive 
> operation.  Related log:
> {noformat}
> 2018-07-26 10:46:23,771 INFO [Thread-105] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/1/dfs/dn/current: 242722ms
> 2018-07-26 10:46:26,231 INFO [Thread-109] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/5/dfs/dn/current: 245182ms
> 2018-07-26 10:46:32,146 INFO [Thread-112] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/8/dfs/dn/current: 251097ms
> 2018-07-26 10:47:08,283 INFO [Thread-106] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/2/dfs/dn/current: 287235ms
> {noformat}
> Currently DN uses independent thread to scan and add replica for each volume, 
> but we still need to wait the slowest thread to finish its work. So the main 
> problem here is that we could make the thread to run faster.
> The jstack we get when DN blocking in the adding replica:
> {noformat}
> "Thread-113" #419 daemon prio=5 os_prio=0 tid=0x7f40879ff000 nid=0x145da 
> runnable [0x7f4043a38000]
>java.lang.Thread.State: RUNNABLE
>   at java.io.UnixFileSystem.list(Native Method)
>   at java.io.File.list(File.java:1122)
>   at java.io.File.listFiles(File.java:1207)
>   at org.apache.hadoop.fs.FileUtil.listFiles(FileUtil.java:1165)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:445)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:448)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:448)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.getVolumeMap(BlockPoolSlice.java:342)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.getVolumeMap(FsVolumeImpl.java:864)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeList$1.run(FsVolumeList.java:191)
> {noformat}
> One improvement maybe we can use 

[jira] [Commented] (HDFS-13768) Adding replicas to volume map makes DataNode start slowly

2018-09-27 Thread Surendra Singh Lilhore (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16630576#comment-16630576
 ] 

Surendra Singh Lilhore commented on HDFS-13768:
---

Thanks [~linyiqun].  



Attached v6 patch and fixed whitespaces warnings.
{quote}BTW, [~surendrasingh], would you mind making a new test based on latest 
patch? I am curious about current rate compared with the data you gave before.
{quote}
After latest path it look 3465ms. its almost 80% faster compare to initial time 
(16772ms).

>  Adding replicas to volume map makes DataNode start slowly 
> ---
>
> Key: HDFS-13768
> URL: https://issues.apache.org/jira/browse/HDFS-13768
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.1.0
>Reporter: Yiqun Lin
>Assignee: Surendra Singh Lilhore
>Priority: Major
> Attachments: HDFS-13768.01.patch, HDFS-13768.02.patch, 
> HDFS-13768.03.patch, HDFS-13768.04.patch, HDFS-13768.05.patch, 
> HDFS-13768.06.patch, HDFS-13768.patch, screenshot-1.png
>
>
> We find DN starting so slowly when rolling upgrade our cluster. When we 
> restart DNs, the DNs start so slowly and not register to NN immediately. And 
> this cause a lots of following error:
> {noformat}
> DataXceiver error processing WRITE_BLOCK operation  src: /xx.xx.xx.xx:64360 
> dst: /xx.xx.xx.xx:50010
> java.io.IOException: Not ready to serve the block pool, 
> BP-1508644862-xx.xx.xx.xx-1493781183457.
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.checkAndWaitForBP(DataXceiver.java:1290)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.checkAccess(DataXceiver.java:1298)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:630)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:169)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:106)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:246)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
> Looking into the logic of DN startup, it will do the initial block pool 
> operation before the registration. And during initializing block pool 
> operation, we found the adding replicas to volume map is the most expensive 
> operation.  Related log:
> {noformat}
> 2018-07-26 10:46:23,771 INFO [Thread-105] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/1/dfs/dn/current: 242722ms
> 2018-07-26 10:46:26,231 INFO [Thread-109] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/5/dfs/dn/current: 245182ms
> 2018-07-26 10:46:32,146 INFO [Thread-112] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/8/dfs/dn/current: 251097ms
> 2018-07-26 10:47:08,283 INFO [Thread-106] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/2/dfs/dn/current: 287235ms
> {noformat}
> Currently DN uses independent thread to scan and add replica for each volume, 
> but we still need to wait the slowest thread to finish its work. So the main 
> problem here is that we could make the thread to run faster.
> The jstack we get when DN blocking in the adding replica:
> {noformat}
> "Thread-113" #419 daemon prio=5 os_prio=0 tid=0x7f40879ff000 nid=0x145da 
> runnable [0x7f4043a38000]
>java.lang.Thread.State: RUNNABLE
>   at java.io.UnixFileSystem.list(Native Method)
>   at java.io.File.list(File.java:1122)
>   at java.io.File.listFiles(File.java:1207)
>   at org.apache.hadoop.fs.FileUtil.listFiles(FileUtil.java:1165)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:445)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:448)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:448)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.getVolumeMap(BlockPoolSlice.java:342)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.getVolumeMap(FsVolumeImpl.java:864)
>   at 
> 

[jira] [Commented] (HDFS-13768) Adding replicas to volume map makes DataNode start slowly

2018-09-26 Thread Yiqun Lin (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16629686#comment-16629686
 ] 

Yiqun Lin commented on HDFS-13768:
--

Thanks [~surendrasingh] for addressing the comments. The latest patch almost 
looks good to me. Only following generated whitespaces warnings we need to fix. 
+1 for others. 
{noformat}
./hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml:1288: 
block in volume. Default value for this configuration is max of
./hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml:1289: 
(volume * number of bp_service, number of processor)
{noformat}
BTW, [~surendrasingh], would you mind making a new test based on latest patch? 
I am curious about current rate compared with the data you gave before.
{quote}Without this fix DN took 16772ms to load 101260 blocks
 With this patch and 8 processor(default) its took 9823ms
{quote}

>  Adding replicas to volume map makes DataNode start slowly 
> ---
>
> Key: HDFS-13768
> URL: https://issues.apache.org/jira/browse/HDFS-13768
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.1.0
>Reporter: Yiqun Lin
>Assignee: Surendra Singh Lilhore
>Priority: Major
> Attachments: HDFS-13768.01.patch, HDFS-13768.02.patch, 
> HDFS-13768.03.patch, HDFS-13768.04.patch, HDFS-13768.05.patch, 
> HDFS-13768.patch, screenshot-1.png
>
>
> We find DN starting so slowly when rolling upgrade our cluster. When we 
> restart DNs, the DNs start so slowly and not register to NN immediately. And 
> this cause a lots of following error:
> {noformat}
> DataXceiver error processing WRITE_BLOCK operation  src: /xx.xx.xx.xx:64360 
> dst: /xx.xx.xx.xx:50010
> java.io.IOException: Not ready to serve the block pool, 
> BP-1508644862-xx.xx.xx.xx-1493781183457.
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.checkAndWaitForBP(DataXceiver.java:1290)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.checkAccess(DataXceiver.java:1298)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:630)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:169)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:106)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:246)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
> Looking into the logic of DN startup, it will do the initial block pool 
> operation before the registration. And during initializing block pool 
> operation, we found the adding replicas to volume map is the most expensive 
> operation.  Related log:
> {noformat}
> 2018-07-26 10:46:23,771 INFO [Thread-105] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/1/dfs/dn/current: 242722ms
> 2018-07-26 10:46:26,231 INFO [Thread-109] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/5/dfs/dn/current: 245182ms
> 2018-07-26 10:46:32,146 INFO [Thread-112] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/8/dfs/dn/current: 251097ms
> 2018-07-26 10:47:08,283 INFO [Thread-106] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/2/dfs/dn/current: 287235ms
> {noformat}
> Currently DN uses independent thread to scan and add replica for each volume, 
> but we still need to wait the slowest thread to finish its work. So the main 
> problem here is that we could make the thread to run faster.
> The jstack we get when DN blocking in the adding replica:
> {noformat}
> "Thread-113" #419 daemon prio=5 os_prio=0 tid=0x7f40879ff000 nid=0x145da 
> runnable [0x7f4043a38000]
>java.lang.Thread.State: RUNNABLE
>   at java.io.UnixFileSystem.list(Native Method)
>   at java.io.File.list(File.java:1122)
>   at java.io.File.listFiles(File.java:1207)
>   at org.apache.hadoop.fs.FileUtil.listFiles(FileUtil.java:1165)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:445)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:448)
>   at 
> 

[jira] [Commented] (HDFS-13768) Adding replicas to volume map makes DataNode start slowly

2018-09-26 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16628977#comment-16628977
 ] 

Hadoop QA commented on HDFS-13768:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 15m 
18s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 
23s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
55s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
52s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
1s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 33s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
51s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
45s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
56s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch 2 line(s) with tabs. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m 45s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
9s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
46s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}126m 23s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
33s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}193m 33s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.namenode.ha.TestBootstrapStandby |
|   | hadoop.hdfs.server.namenode.sps.TestBlockStorageMovementAttemptedItems |
|   | hadoop.hdfs.TestLeaseRecovery2 |
|   | hadoop.hdfs.client.impl.TestBlockReaderLocal |
|   | hadoop.hdfs.web.TestWebHdfsTimeouts |
|   | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:4b8c2b1 |
| JIRA Issue | HDFS-13768 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12941381/HDFS-13768.05.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  xml  |
| uname | Linux eead017d7a2c 4.4.0-133-generic #159-Ubuntu SMP Fri Aug 10 
07:31:43 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / e5287a4 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.0-RC1 |
| whitespace | 

[jira] [Commented] (HDFS-13768) Adding replicas to volume map makes DataNode start slowly

2018-09-26 Thread Surendra Singh Lilhore (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16628681#comment-16628681
 ] 

Surendra Singh Lilhore commented on HDFS-13768:
---

Attached updated patch v5. Fixed all above comments.

Please review..

>  Adding replicas to volume map makes DataNode start slowly 
> ---
>
> Key: HDFS-13768
> URL: https://issues.apache.org/jira/browse/HDFS-13768
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.1.0
>Reporter: Yiqun Lin
>Assignee: Surendra Singh Lilhore
>Priority: Major
> Attachments: HDFS-13768.01.patch, HDFS-13768.02.patch, 
> HDFS-13768.03.patch, HDFS-13768.04.patch, HDFS-13768.05.patch, 
> HDFS-13768.patch, screenshot-1.png
>
>
> We find DN starting so slowly when rolling upgrade our cluster. When we 
> restart DNs, the DNs start so slowly and not register to NN immediately. And 
> this cause a lots of following error:
> {noformat}
> DataXceiver error processing WRITE_BLOCK operation  src: /xx.xx.xx.xx:64360 
> dst: /xx.xx.xx.xx:50010
> java.io.IOException: Not ready to serve the block pool, 
> BP-1508644862-xx.xx.xx.xx-1493781183457.
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.checkAndWaitForBP(DataXceiver.java:1290)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.checkAccess(DataXceiver.java:1298)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:630)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:169)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:106)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:246)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
> Looking into the logic of DN startup, it will do the initial block pool 
> operation before the registration. And during initializing block pool 
> operation, we found the adding replicas to volume map is the most expensive 
> operation.  Related log:
> {noformat}
> 2018-07-26 10:46:23,771 INFO [Thread-105] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/1/dfs/dn/current: 242722ms
> 2018-07-26 10:46:26,231 INFO [Thread-109] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/5/dfs/dn/current: 245182ms
> 2018-07-26 10:46:32,146 INFO [Thread-112] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/8/dfs/dn/current: 251097ms
> 2018-07-26 10:47:08,283 INFO [Thread-106] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/2/dfs/dn/current: 287235ms
> {noformat}
> Currently DN uses independent thread to scan and add replica for each volume, 
> but we still need to wait the slowest thread to finish its work. So the main 
> problem here is that we could make the thread to run faster.
> The jstack we get when DN blocking in the adding replica:
> {noformat}
> "Thread-113" #419 daemon prio=5 os_prio=0 tid=0x7f40879ff000 nid=0x145da 
> runnable [0x7f4043a38000]
>java.lang.Thread.State: RUNNABLE
>   at java.io.UnixFileSystem.list(Native Method)
>   at java.io.File.list(File.java:1122)
>   at java.io.File.listFiles(File.java:1207)
>   at org.apache.hadoop.fs.FileUtil.listFiles(FileUtil.java:1165)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:445)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:448)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:448)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.getVolumeMap(BlockPoolSlice.java:342)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.getVolumeMap(FsVolumeImpl.java:864)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeList$1.run(FsVolumeList.java:191)
> {noformat}
> One improvement maybe we can use ForkJoinPool to do this recursive task, 
> rather than a sync way. This will be a great improvement because it can 
> greatly speed up recovery process.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HDFS-13768) Adding replicas to volume map makes DataNode start slowly

2018-09-26 Thread Yiqun Lin (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16628302#comment-16628302
 ] 

Yiqun Lin commented on HDFS-13768:
--

{quote}
we need old replica to resolve duplicate replica, So we can't change this 
method.
{quote}
Agree, I am missing for this, [~surendrasingh].

>  Adding replicas to volume map makes DataNode start slowly 
> ---
>
> Key: HDFS-13768
> URL: https://issues.apache.org/jira/browse/HDFS-13768
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.1.0
>Reporter: Yiqun Lin
>Assignee: Surendra Singh Lilhore
>Priority: Major
> Attachments: HDFS-13768.01.patch, HDFS-13768.02.patch, 
> HDFS-13768.03.patch, HDFS-13768.04.patch, HDFS-13768.patch, screenshot-1.png
>
>
> We find DN starting so slowly when rolling upgrade our cluster. When we 
> restart DNs, the DNs start so slowly and not register to NN immediately. And 
> this cause a lots of following error:
> {noformat}
> DataXceiver error processing WRITE_BLOCK operation  src: /xx.xx.xx.xx:64360 
> dst: /xx.xx.xx.xx:50010
> java.io.IOException: Not ready to serve the block pool, 
> BP-1508644862-xx.xx.xx.xx-1493781183457.
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.checkAndWaitForBP(DataXceiver.java:1290)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.checkAccess(DataXceiver.java:1298)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:630)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:169)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:106)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:246)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
> Looking into the logic of DN startup, it will do the initial block pool 
> operation before the registration. And during initializing block pool 
> operation, we found the adding replicas to volume map is the most expensive 
> operation.  Related log:
> {noformat}
> 2018-07-26 10:46:23,771 INFO [Thread-105] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/1/dfs/dn/current: 242722ms
> 2018-07-26 10:46:26,231 INFO [Thread-109] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/5/dfs/dn/current: 245182ms
> 2018-07-26 10:46:32,146 INFO [Thread-112] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/8/dfs/dn/current: 251097ms
> 2018-07-26 10:47:08,283 INFO [Thread-106] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/2/dfs/dn/current: 287235ms
> {noformat}
> Currently DN uses independent thread to scan and add replica for each volume, 
> but we still need to wait the slowest thread to finish its work. So the main 
> problem here is that we could make the thread to run faster.
> The jstack we get when DN blocking in the adding replica:
> {noformat}
> "Thread-113" #419 daemon prio=5 os_prio=0 tid=0x7f40879ff000 nid=0x145da 
> runnable [0x7f4043a38000]
>java.lang.Thread.State: RUNNABLE
>   at java.io.UnixFileSystem.list(Native Method)
>   at java.io.File.list(File.java:1122)
>   at java.io.File.listFiles(File.java:1207)
>   at org.apache.hadoop.fs.FileUtil.listFiles(FileUtil.java:1165)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:445)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:448)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:448)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.getVolumeMap(BlockPoolSlice.java:342)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.getVolumeMap(FsVolumeImpl.java:864)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeList$1.run(FsVolumeList.java:191)
> {noformat}
> One improvement maybe we can use ForkJoinPool to do this recursive task, 
> rather than a sync way. This will be a great improvement because it can 
> greatly speed up recovery process.



--
This message was sent by Atlassian JIRA

[jira] [Commented] (HDFS-13768) Adding replicas to volume map makes DataNode start slowly

2018-09-25 Thread Arpit Agarwal (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16627630#comment-16627630
 ] 

Arpit Agarwal commented on HDFS-13768:
--

bq.  FsDatasetUtil.getGenrationStampFromFile(), This method loop over the all 
the file to find out the meta file. This is time consuming. We can sort the 
list of file's and in sorted list next to block file metafile will be there, so 
we no need to loop and find out the meta file.
That's a nice find [~surendrasingh].

>  Adding replicas to volume map makes DataNode start slowly 
> ---
>
> Key: HDFS-13768
> URL: https://issues.apache.org/jira/browse/HDFS-13768
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.1.0
>Reporter: Yiqun Lin
>Assignee: Surendra Singh Lilhore
>Priority: Major
> Attachments: HDFS-13768.01.patch, HDFS-13768.02.patch, 
> HDFS-13768.03.patch, HDFS-13768.04.patch, HDFS-13768.patch, screenshot-1.png
>
>
> We find DN starting so slowly when rolling upgrade our cluster. When we 
> restart DNs, the DNs start so slowly and not register to NN immediately. And 
> this cause a lots of following error:
> {noformat}
> DataXceiver error processing WRITE_BLOCK operation  src: /xx.xx.xx.xx:64360 
> dst: /xx.xx.xx.xx:50010
> java.io.IOException: Not ready to serve the block pool, 
> BP-1508644862-xx.xx.xx.xx-1493781183457.
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.checkAndWaitForBP(DataXceiver.java:1290)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.checkAccess(DataXceiver.java:1298)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:630)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:169)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:106)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:246)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
> Looking into the logic of DN startup, it will do the initial block pool 
> operation before the registration. And during initializing block pool 
> operation, we found the adding replicas to volume map is the most expensive 
> operation.  Related log:
> {noformat}
> 2018-07-26 10:46:23,771 INFO [Thread-105] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/1/dfs/dn/current: 242722ms
> 2018-07-26 10:46:26,231 INFO [Thread-109] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/5/dfs/dn/current: 245182ms
> 2018-07-26 10:46:32,146 INFO [Thread-112] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/8/dfs/dn/current: 251097ms
> 2018-07-26 10:47:08,283 INFO [Thread-106] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/2/dfs/dn/current: 287235ms
> {noformat}
> Currently DN uses independent thread to scan and add replica for each volume, 
> but we still need to wait the slowest thread to finish its work. So the main 
> problem here is that we could make the thread to run faster.
> The jstack we get when DN blocking in the adding replica:
> {noformat}
> "Thread-113" #419 daemon prio=5 os_prio=0 tid=0x7f40879ff000 nid=0x145da 
> runnable [0x7f4043a38000]
>java.lang.Thread.State: RUNNABLE
>   at java.io.UnixFileSystem.list(Native Method)
>   at java.io.File.list(File.java:1122)
>   at java.io.File.listFiles(File.java:1207)
>   at org.apache.hadoop.fs.FileUtil.listFiles(FileUtil.java:1165)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:445)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:448)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:448)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.getVolumeMap(BlockPoolSlice.java:342)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.getVolumeMap(FsVolumeImpl.java:864)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeList$1.run(FsVolumeList.java:191)
> {noformat}
> One improvement maybe we can use ForkJoinPool to do this 

[jira] [Commented] (HDFS-13768) Adding replicas to volume map makes DataNode start slowly

2018-09-25 Thread Surendra Singh Lilhore (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16627486#comment-16627486
 ] 

Surendra Singh Lilhore commented on HDFS-13768:
---

Thanks [~linyiqun] and [~knanasi] for review.
{quote}*ReplicaMap#addAndGet*
 For the new method {{addAndGet}}, I would prefer to return a boolean type 
value to see if new replicaInfo was added. For example, if old replicainfo 
exists, we return false. And then we don't need to check the reference of 
returned replicaInfo in {{BlockPoolSlice#addReplicaToReplicasMap}}, it looks a 
little confused.
{quote}
we need old replica to resolve duplicate replica, So we can't change this 
method.

>  Adding replicas to volume map makes DataNode start slowly 
> ---
>
> Key: HDFS-13768
> URL: https://issues.apache.org/jira/browse/HDFS-13768
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.1.0
>Reporter: Yiqun Lin
>Assignee: Surendra Singh Lilhore
>Priority: Major
> Attachments: HDFS-13768.01.patch, HDFS-13768.02.patch, 
> HDFS-13768.03.patch, HDFS-13768.04.patch, HDFS-13768.patch, screenshot-1.png
>
>
> We find DN starting so slowly when rolling upgrade our cluster. When we 
> restart DNs, the DNs start so slowly and not register to NN immediately. And 
> this cause a lots of following error:
> {noformat}
> DataXceiver error processing WRITE_BLOCK operation  src: /xx.xx.xx.xx:64360 
> dst: /xx.xx.xx.xx:50010
> java.io.IOException: Not ready to serve the block pool, 
> BP-1508644862-xx.xx.xx.xx-1493781183457.
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.checkAndWaitForBP(DataXceiver.java:1290)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.checkAccess(DataXceiver.java:1298)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:630)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:169)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:106)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:246)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
> Looking into the logic of DN startup, it will do the initial block pool 
> operation before the registration. And during initializing block pool 
> operation, we found the adding replicas to volume map is the most expensive 
> operation.  Related log:
> {noformat}
> 2018-07-26 10:46:23,771 INFO [Thread-105] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/1/dfs/dn/current: 242722ms
> 2018-07-26 10:46:26,231 INFO [Thread-109] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/5/dfs/dn/current: 245182ms
> 2018-07-26 10:46:32,146 INFO [Thread-112] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/8/dfs/dn/current: 251097ms
> 2018-07-26 10:47:08,283 INFO [Thread-106] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/2/dfs/dn/current: 287235ms
> {noformat}
> Currently DN uses independent thread to scan and add replica for each volume, 
> but we still need to wait the slowest thread to finish its work. So the main 
> problem here is that we could make the thread to run faster.
> The jstack we get when DN blocking in the adding replica:
> {noformat}
> "Thread-113" #419 daemon prio=5 os_prio=0 tid=0x7f40879ff000 nid=0x145da 
> runnable [0x7f4043a38000]
>java.lang.Thread.State: RUNNABLE
>   at java.io.UnixFileSystem.list(Native Method)
>   at java.io.File.list(File.java:1122)
>   at java.io.File.listFiles(File.java:1207)
>   at org.apache.hadoop.fs.FileUtil.listFiles(FileUtil.java:1165)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:445)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:448)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:448)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.getVolumeMap(BlockPoolSlice.java:342)
>   at 
> 

[jira] [Commented] (HDFS-13768) Adding replicas to volume map makes DataNode start slowly

2018-09-24 Thread Kitti Nanasi (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16626406#comment-16626406
 ] 

Kitti Nanasi commented on HDFS-13768:
-

Thanks [~surendrasingh] for the patch and [~linyiqun] for reviewing it!

I have some minor comments:
- TestDataNodeVolumeFailureReporting test failure does seem related to the patch
- I think it would be better to rename the forkJoinTasks from f1, f2 to 
something meaningful, like the comments before them and then the comments can 
be deleted
- TestFsVolumeList#testAddRplicaProcessorForAddingReplicaInMap could have a 
timeout

>  Adding replicas to volume map makes DataNode start slowly 
> ---
>
> Key: HDFS-13768
> URL: https://issues.apache.org/jira/browse/HDFS-13768
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.1.0
>Reporter: Yiqun Lin
>Assignee: Surendra Singh Lilhore
>Priority: Major
> Attachments: HDFS-13768.01.patch, HDFS-13768.02.patch, 
> HDFS-13768.03.patch, HDFS-13768.04.patch, HDFS-13768.patch, screenshot-1.png
>
>
> We find DN starting so slowly when rolling upgrade our cluster. When we 
> restart DNs, the DNs start so slowly and not register to NN immediately. And 
> this cause a lots of following error:
> {noformat}
> DataXceiver error processing WRITE_BLOCK operation  src: /xx.xx.xx.xx:64360 
> dst: /xx.xx.xx.xx:50010
> java.io.IOException: Not ready to serve the block pool, 
> BP-1508644862-xx.xx.xx.xx-1493781183457.
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.checkAndWaitForBP(DataXceiver.java:1290)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.checkAccess(DataXceiver.java:1298)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:630)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:169)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:106)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:246)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
> Looking into the logic of DN startup, it will do the initial block pool 
> operation before the registration. And during initializing block pool 
> operation, we found the adding replicas to volume map is the most expensive 
> operation.  Related log:
> {noformat}
> 2018-07-26 10:46:23,771 INFO [Thread-105] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/1/dfs/dn/current: 242722ms
> 2018-07-26 10:46:26,231 INFO [Thread-109] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/5/dfs/dn/current: 245182ms
> 2018-07-26 10:46:32,146 INFO [Thread-112] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/8/dfs/dn/current: 251097ms
> 2018-07-26 10:47:08,283 INFO [Thread-106] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/2/dfs/dn/current: 287235ms
> {noformat}
> Currently DN uses independent thread to scan and add replica for each volume, 
> but we still need to wait the slowest thread to finish its work. So the main 
> problem here is that we could make the thread to run faster.
> The jstack we get when DN blocking in the adding replica:
> {noformat}
> "Thread-113" #419 daemon prio=5 os_prio=0 tid=0x7f40879ff000 nid=0x145da 
> runnable [0x7f4043a38000]
>java.lang.Thread.State: RUNNABLE
>   at java.io.UnixFileSystem.list(Native Method)
>   at java.io.File.list(File.java:1122)
>   at java.io.File.listFiles(File.java:1207)
>   at org.apache.hadoop.fs.FileUtil.listFiles(FileUtil.java:1165)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:445)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:448)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:448)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.getVolumeMap(BlockPoolSlice.java:342)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.getVolumeMap(FsVolumeImpl.java:864)
>   at 
> 

[jira] [Commented] (HDFS-13768) Adding replicas to volume map makes DataNode start slowly

2018-09-24 Thread Yiqun Lin (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16625737#comment-16625737
 ] 

Yiqun Lin commented on HDFS-13768:
--

Thanks [~surendrasingh] for updating the patch! Some minor comments from me:

*ReplicaMap#addAndGet*
 For the new method {{addAndGet}}, I would prefer to return a boolean type 
value to see if new replicaInfo was added. For example, if old replicainfo 
exists, we return false. And then we don't need to check the reference of 
returned replicaInfo in {{BlockPoolSlice#addReplicaToReplicasMap}}, it looks a 
little confused.

*FsDatasetUtil#getGenerationStampFromFile*
 * Following loop seems can be deleted if I am understanding correctly. Since 
list files were ordered, the meta file only will be the next file if it really 
exists.
{code:java}
for (int j = 0; j < listdir.length; j++) {
  String path = listdir[j].getName();
  if (!path.startsWith(blockName)) {
continue;
  }
  if (blockFile.getCanonicalPath().equals(listdir[j].getCanonicalPath())) {
continue;
  }
  return Block.getGenerationStamp(listdir[j].getName());
}
{code}

 * Please complete the javadoc for this method.

Other minor comments
* FsVolumeList.java: There need a white space after 'pool'.
* hdfs-default.xml: The description of the this config should be updated since 
we use max of (volume * number of bp_service,  number of processor) actually. 
Not always the value we configured.
* TestFsVolumeList.java
1. We can use {{Configuration#setInt}} to set {{DFS_REPLICATION_KEY}}.
2.I prefer to explicitly reset the thread pool size (e.g. 5 or 10) for ensure 
making adding replica behaviour concurrently running in ForkJoinPool.
* Please fix findbugs error and white space warning happened in line 
{{DFSConfigKeys.java:368}}.

>  Adding replicas to volume map makes DataNode start slowly 
> ---
>
> Key: HDFS-13768
> URL: https://issues.apache.org/jira/browse/HDFS-13768
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.1.0
>Reporter: Yiqun Lin
>Assignee: Surendra Singh Lilhore
>Priority: Major
> Attachments: HDFS-13768.01.patch, HDFS-13768.02.patch, 
> HDFS-13768.03.patch, HDFS-13768.04.patch, HDFS-13768.patch, screenshot-1.png
>
>
> We find DN starting so slowly when rolling upgrade our cluster. When we 
> restart DNs, the DNs start so slowly and not register to NN immediately. And 
> this cause a lots of following error:
> {noformat}
> DataXceiver error processing WRITE_BLOCK operation  src: /xx.xx.xx.xx:64360 
> dst: /xx.xx.xx.xx:50010
> java.io.IOException: Not ready to serve the block pool, 
> BP-1508644862-xx.xx.xx.xx-1493781183457.
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.checkAndWaitForBP(DataXceiver.java:1290)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.checkAccess(DataXceiver.java:1298)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:630)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:169)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:106)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:246)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
> Looking into the logic of DN startup, it will do the initial block pool 
> operation before the registration. And during initializing block pool 
> operation, we found the adding replicas to volume map is the most expensive 
> operation.  Related log:
> {noformat}
> 2018-07-26 10:46:23,771 INFO [Thread-105] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/1/dfs/dn/current: 242722ms
> 2018-07-26 10:46:26,231 INFO [Thread-109] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/5/dfs/dn/current: 245182ms
> 2018-07-26 10:46:32,146 INFO [Thread-112] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/8/dfs/dn/current: 251097ms
> 2018-07-26 10:47:08,283 INFO [Thread-106] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/2/dfs/dn/current: 287235ms
> {noformat}
> Currently DN uses independent thread to scan and add replica for each volume, 
> but we still need to wait the slowest thread to finish 

[jira] [Commented] (HDFS-13768) Adding replicas to volume map makes DataNode start slowly

2018-09-22 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16624974#comment-16624974
 ] 

Hadoop QA commented on HDFS-13768:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
18s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 
23s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
56s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 0s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
1s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m  7s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
53s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
47s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
58s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 1 line(s) that end in whitespace. Use git 
apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply 
{color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m  6s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  2m  
3s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs generated 1 new + 0 
unchanged - 0 fixed = 1 total (was 0) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
45s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 93m  7s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
30s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}149m 20s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-hdfs-project/hadoop-hdfs |
|  |  Possible doublecheck on 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addReplicaThreadPool
 in new 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice(String, 
FsVolumeImpl, File, Configuration, Timer)  At BlockPoolSlice.java:new 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice(String, 
FsVolumeImpl, File, Configuration, Timer)  At BlockPoolSlice.java:[lines 
186-188] |
| Failed junit tests | hadoop.hdfs.TestDFSInotifyEventInputStreamKerberized |
|   | hadoop.hdfs.TestLeaseRecovery2 |
|   | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:4b8c2b1 |
| JIRA Issue | HDFS-13768 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12940946/HDFS-13768.04.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  

[jira] [Commented] (HDFS-13768) Adding replicas to volume map makes DataNode start slowly

2018-09-22 Thread Surendra Singh Lilhore (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16624923#comment-16624923
 ] 

Surendra Singh Lilhore commented on HDFS-13768:
---

fixed checkstyle..

>  Adding replicas to volume map makes DataNode start slowly 
> ---
>
> Key: HDFS-13768
> URL: https://issues.apache.org/jira/browse/HDFS-13768
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.1.0
>Reporter: Yiqun Lin
>Assignee: Surendra Singh Lilhore
>Priority: Major
> Attachments: HDFS-13768.01.patch, HDFS-13768.02.patch, 
> HDFS-13768.03.patch, HDFS-13768.04.patch, HDFS-13768.patch, screenshot-1.png
>
>
> We find DN starting so slowly when rolling upgrade our cluster. When we 
> restart DNs, the DNs start so slowly and not register to NN immediately. And 
> this cause a lots of following error:
> {noformat}
> DataXceiver error processing WRITE_BLOCK operation  src: /xx.xx.xx.xx:64360 
> dst: /xx.xx.xx.xx:50010
> java.io.IOException: Not ready to serve the block pool, 
> BP-1508644862-xx.xx.xx.xx-1493781183457.
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.checkAndWaitForBP(DataXceiver.java:1290)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.checkAccess(DataXceiver.java:1298)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:630)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:169)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:106)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:246)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
> Looking into the logic of DN startup, it will do the initial block pool 
> operation before the registration. And during initializing block pool 
> operation, we found the adding replicas to volume map is the most expensive 
> operation.  Related log:
> {noformat}
> 2018-07-26 10:46:23,771 INFO [Thread-105] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/1/dfs/dn/current: 242722ms
> 2018-07-26 10:46:26,231 INFO [Thread-109] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/5/dfs/dn/current: 245182ms
> 2018-07-26 10:46:32,146 INFO [Thread-112] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/8/dfs/dn/current: 251097ms
> 2018-07-26 10:47:08,283 INFO [Thread-106] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/2/dfs/dn/current: 287235ms
> {noformat}
> Currently DN uses independent thread to scan and add replica for each volume, 
> but we still need to wait the slowest thread to finish its work. So the main 
> problem here is that we could make the thread to run faster.
> The jstack we get when DN blocking in the adding replica:
> {noformat}
> "Thread-113" #419 daemon prio=5 os_prio=0 tid=0x7f40879ff000 nid=0x145da 
> runnable [0x7f4043a38000]
>java.lang.Thread.State: RUNNABLE
>   at java.io.UnixFileSystem.list(Native Method)
>   at java.io.File.list(File.java:1122)
>   at java.io.File.listFiles(File.java:1207)
>   at org.apache.hadoop.fs.FileUtil.listFiles(FileUtil.java:1165)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:445)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:448)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:448)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.getVolumeMap(BlockPoolSlice.java:342)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.getVolumeMap(FsVolumeImpl.java:864)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeList$1.run(FsVolumeList.java:191)
> {noformat}
> One improvement maybe we can use ForkJoinPool to do this recursive task, 
> rather than a sync way. This will be a great improvement because it can 
> greatly speed up recovery process.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: 

[jira] [Commented] (HDFS-13768) Adding replicas to volume map makes DataNode start slowly

2018-09-22 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16624795#comment-16624795
 ] 

Hadoop QA commented on HDFS-13768:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
21s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
1s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 
37s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
55s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 1s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
1s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m  8s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
54s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
46s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
53s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 56s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch 
generated 4 new + 715 unchanged - 0 fixed = 719 total (was 715) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
58s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 2 line(s) that end in whitespace. Use git 
apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply 
{color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 20s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m 
59s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs generated 1 new + 0 
unchanged - 0 fixed = 1 total (was 0) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
43s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 92m 23s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
29s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}149m 10s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-hdfs-project/hadoop-hdfs |
|  |  Possible doublecheck on 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addReplicaThreadPool
 in new 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice(String, 
FsVolumeImpl, File, Configuration, Timer)  At BlockPoolSlice.java:new 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice(String, 
FsVolumeImpl, File, Configuration, Timer)  At BlockPoolSlice.java:[lines 
186-188] |
| Failed junit tests | hadoop.hdfs.TestLeaseRecovery2 |
|   | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:4b8c2b1 |
| JIRA Issue | HDFS-13768 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12940917/HDFS-13768.03.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc 

[jira] [Commented] (HDFS-13768) Adding replicas to volume map makes DataNode start slowly

2018-09-22 Thread Surendra Singh Lilhore (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16624745#comment-16624745
 ] 

Surendra Singh Lilhore commented on HDFS-13768:
---

Thanks [~linyiqun].

Attached updated patch. Please review...

>  Adding replicas to volume map makes DataNode start slowly 
> ---
>
> Key: HDFS-13768
> URL: https://issues.apache.org/jira/browse/HDFS-13768
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.1.0
>Reporter: Yiqun Lin
>Assignee: Surendra Singh Lilhore
>Priority: Major
> Attachments: HDFS-13768.01.patch, HDFS-13768.02.patch, 
> HDFS-13768.03.patch, HDFS-13768.patch, screenshot-1.png
>
>
> We find DN starting so slowly when rolling upgrade our cluster. When we 
> restart DNs, the DNs start so slowly and not register to NN immediately. And 
> this cause a lots of following error:
> {noformat}
> DataXceiver error processing WRITE_BLOCK operation  src: /xx.xx.xx.xx:64360 
> dst: /xx.xx.xx.xx:50010
> java.io.IOException: Not ready to serve the block pool, 
> BP-1508644862-xx.xx.xx.xx-1493781183457.
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.checkAndWaitForBP(DataXceiver.java:1290)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.checkAccess(DataXceiver.java:1298)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:630)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:169)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:106)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:246)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
> Looking into the logic of DN startup, it will do the initial block pool 
> operation before the registration. And during initializing block pool 
> operation, we found the adding replicas to volume map is the most expensive 
> operation.  Related log:
> {noformat}
> 2018-07-26 10:46:23,771 INFO [Thread-105] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/1/dfs/dn/current: 242722ms
> 2018-07-26 10:46:26,231 INFO [Thread-109] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/5/dfs/dn/current: 245182ms
> 2018-07-26 10:46:32,146 INFO [Thread-112] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/8/dfs/dn/current: 251097ms
> 2018-07-26 10:47:08,283 INFO [Thread-106] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/2/dfs/dn/current: 287235ms
> {noformat}
> Currently DN uses independent thread to scan and add replica for each volume, 
> but we still need to wait the slowest thread to finish its work. So the main 
> problem here is that we could make the thread to run faster.
> The jstack we get when DN blocking in the adding replica:
> {noformat}
> "Thread-113" #419 daemon prio=5 os_prio=0 tid=0x7f40879ff000 nid=0x145da 
> runnable [0x7f4043a38000]
>java.lang.Thread.State: RUNNABLE
>   at java.io.UnixFileSystem.list(Native Method)
>   at java.io.File.list(File.java:1122)
>   at java.io.File.listFiles(File.java:1207)
>   at org.apache.hadoop.fs.FileUtil.listFiles(FileUtil.java:1165)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:445)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:448)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:448)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.getVolumeMap(BlockPoolSlice.java:342)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.getVolumeMap(FsVolumeImpl.java:864)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeList$1.run(FsVolumeList.java:191)
> {noformat}
> One improvement maybe we can use ForkJoinPool to do this recursive task, 
> rather than a sync way. This will be a great improvement because it can 
> greatly speed up recovery process.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To 

[jira] [Commented] (HDFS-13768) Adding replicas to volume map makes DataNode start slowly

2018-09-18 Thread Yiqun Lin (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16618683#comment-16618683
 ] 

Yiqun Lin commented on HDFS-13768:
--

Makes sense to me. Thanks for looking into this, [~surendrasingh].

>  Adding replicas to volume map makes DataNode start slowly 
> ---
>
> Key: HDFS-13768
> URL: https://issues.apache.org/jira/browse/HDFS-13768
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.1.0
>Reporter: Yiqun Lin
>Assignee: Surendra Singh Lilhore
>Priority: Major
> Attachments: HDFS-13768.01.patch, HDFS-13768.02.patch, 
> HDFS-13768.patch, screenshot-1.png
>
>
> We find DN starting so slowly when rolling upgrade our cluster. When we 
> restart DNs, the DNs start so slowly and not register to NN immediately. And 
> this cause a lots of following error:
> {noformat}
> DataXceiver error processing WRITE_BLOCK operation  src: /xx.xx.xx.xx:64360 
> dst: /xx.xx.xx.xx:50010
> java.io.IOException: Not ready to serve the block pool, 
> BP-1508644862-xx.xx.xx.xx-1493781183457.
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.checkAndWaitForBP(DataXceiver.java:1290)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.checkAccess(DataXceiver.java:1298)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:630)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:169)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:106)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:246)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
> Looking into the logic of DN startup, it will do the initial block pool 
> operation before the registration. And during initializing block pool 
> operation, we found the adding replicas to volume map is the most expensive 
> operation.  Related log:
> {noformat}
> 2018-07-26 10:46:23,771 INFO [Thread-105] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/1/dfs/dn/current: 242722ms
> 2018-07-26 10:46:26,231 INFO [Thread-109] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/5/dfs/dn/current: 245182ms
> 2018-07-26 10:46:32,146 INFO [Thread-112] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/8/dfs/dn/current: 251097ms
> 2018-07-26 10:47:08,283 INFO [Thread-106] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/2/dfs/dn/current: 287235ms
> {noformat}
> Currently DN uses independent thread to scan and add replica for each volume, 
> but we still need to wait the slowest thread to finish its work. So the main 
> problem here is that we could make the thread to run faster.
> The jstack we get when DN blocking in the adding replica:
> {noformat}
> "Thread-113" #419 daemon prio=5 os_prio=0 tid=0x7f40879ff000 nid=0x145da 
> runnable [0x7f4043a38000]
>java.lang.Thread.State: RUNNABLE
>   at java.io.UnixFileSystem.list(Native Method)
>   at java.io.File.list(File.java:1122)
>   at java.io.File.listFiles(File.java:1207)
>   at org.apache.hadoop.fs.FileUtil.listFiles(FileUtil.java:1165)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:445)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:448)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:448)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.getVolumeMap(BlockPoolSlice.java:342)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.getVolumeMap(FsVolumeImpl.java:864)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeList$1.run(FsVolumeList.java:191)
> {noformat}
> One improvement maybe we can use ForkJoinPool to do this recursive task, 
> rather than a sync way. This will be a great improvement because it can 
> greatly speed up recovery process.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: 

[jira] [Commented] (HDFS-13768) Adding replicas to volume map makes DataNode start slowly

2018-09-18 Thread Surendra Singh Lilhore (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16618651#comment-16618651
 ] 

Surendra Singh Lilhore commented on HDFS-13768:
---

[~linyiqun] and [~arpitagarwal], I found some more improvment point.


 !screenshot-1.png!

If you see this screenshot {{FsDatasetUtil.getGenrationStampFromFile()}} and 
{{lock()}} is taking so much time. This can be improved.

1. {{FsDatasetUtil.getGenrationStampFromFile()}}, This method loop over the all 
the file to find out the meta file. This is time consuming.
 We can sort the list of file's and in sorted list next to block file metafile 
will be there, so we no need to loop and find out the meta file.
{code:java}
  /**
   * Find the meta-file for the specified block file
   * and then return the generation stamp from the name of the meta-file.
   */
  static long getGenerationStampFromFile(List files, File blockFile, int 
index)
  throws IOException {
String blockName = blockFile.getName();
if ((index + 1) < files.size()) {
  // Check if next index file is meta file
  String metaFile = files.get(index + 1).getName();
  if (metaFile.startsWith(blockName)) {
return Block.getGenerationStamp(metaFile);
  }
}
//Search meta file in list 
for (int j = 0; j < files.size(); j++) {
  ..
  ..
  return Block.getGenerationStamp(files.get(j).getName());
}
FsDatasetImpl.LOG.warn("Block " + blockFile + " does not have a metafile!");
return HdfsConstants.GRANDFATHER_GENERATION_STAMP;
  }
{code}

After this change *60% datanode statup time* is redused in my cluster.

2. {{BlockPoolSlice.addReplicaToReplicasMap()}}, in this method first its find 
the oldReplica from {{ReplicaMap}} and if it is null then it will add the 
ReplicaInfo in {{ReplicaMap}}. For this its need to get the lock two time. Its 
better to add one method in ReplicaMap for this work, so this work can be done 
in one lock. 

*Example:*
{code:java}
  /**
   * Add a replica's meta information into the map, if already exist return the
   * old replicaInfo
   * 
   * @param bpid
   * @param replicaInfo
   * @return
   */
  ReplicaInfo addAndGet(String bpid, ReplicaInfo replicaInfo) {
checkBlockPool(bpid);
checkBlock(replicaInfo);
try (AutoCloseableLock l = lock.acquire()) {
  FoldedTreeSet set = map.get(bpid);
  if (set == null) {
// Add an entry for block pool if it does not exist already
set = new FoldedTreeSet<>();
map.put(bpid, set);
  }
  ReplicaInfo oldReplicaInfo = set.get(replicaInfo.getBlockId(),
  LONG_AND_BLOCK_COMPARATOR);
  if (oldReplicaInfo != null) {
return oldReplicaInfo;
  } else {
set.add(replicaInfo);
  }
  return replicaInfo;
}
  }
{code}

If you both are agree for this change then I will add this in next patch.

>  Adding replicas to volume map makes DataNode start slowly 
> ---
>
> Key: HDFS-13768
> URL: https://issues.apache.org/jira/browse/HDFS-13768
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.1.0
>Reporter: Yiqun Lin
>Assignee: Surendra Singh Lilhore
>Priority: Major
> Attachments: HDFS-13768.01.patch, HDFS-13768.02.patch, 
> HDFS-13768.patch, screenshot-1.png
>
>
> We find DN starting so slowly when rolling upgrade our cluster. When we 
> restart DNs, the DNs start so slowly and not register to NN immediately. And 
> this cause a lots of following error:
> {noformat}
> DataXceiver error processing WRITE_BLOCK operation  src: /xx.xx.xx.xx:64360 
> dst: /xx.xx.xx.xx:50010
> java.io.IOException: Not ready to serve the block pool, 
> BP-1508644862-xx.xx.xx.xx-1493781183457.
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.checkAndWaitForBP(DataXceiver.java:1290)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.checkAccess(DataXceiver.java:1298)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:630)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:169)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:106)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:246)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
> Looking into the logic of DN startup, it will do the initial block pool 
> operation before the registration. And during initializing block pool 
> operation, we found the adding replicas to volume map is the most expensive 
> operation.  Related log:
> {noformat}
> 2018-07-26 10:46:23,771 INFO [Thread-105] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map 

[jira] [Commented] (HDFS-13768) Adding replicas to volume map makes DataNode start slowly

2018-09-18 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16618628#comment-16618628
 ] 

Hadoop QA commented on HDFS-13768:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m  5s{color} 
| {color:red} HDFS-13768 does not apply to trunk. Rebase required? Wrong 
Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | HDFS-13768 |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/25090/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



>  Adding replicas to volume map makes DataNode start slowly 
> ---
>
> Key: HDFS-13768
> URL: https://issues.apache.org/jira/browse/HDFS-13768
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.1.0
>Reporter: Yiqun Lin
>Assignee: Surendra Singh Lilhore
>Priority: Major
> Attachments: HDFS-13768.01.patch, HDFS-13768.02.patch, 
> HDFS-13768.patch, screenshot-1.png
>
>
> We find DN starting so slowly when rolling upgrade our cluster. When we 
> restart DNs, the DNs start so slowly and not register to NN immediately. And 
> this cause a lots of following error:
> {noformat}
> DataXceiver error processing WRITE_BLOCK operation  src: /xx.xx.xx.xx:64360 
> dst: /xx.xx.xx.xx:50010
> java.io.IOException: Not ready to serve the block pool, 
> BP-1508644862-xx.xx.xx.xx-1493781183457.
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.checkAndWaitForBP(DataXceiver.java:1290)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.checkAccess(DataXceiver.java:1298)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:630)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:169)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:106)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:246)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
> Looking into the logic of DN startup, it will do the initial block pool 
> operation before the registration. And during initializing block pool 
> operation, we found the adding replicas to volume map is the most expensive 
> operation.  Related log:
> {noformat}
> 2018-07-26 10:46:23,771 INFO [Thread-105] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/1/dfs/dn/current: 242722ms
> 2018-07-26 10:46:26,231 INFO [Thread-109] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/5/dfs/dn/current: 245182ms
> 2018-07-26 10:46:32,146 INFO [Thread-112] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/8/dfs/dn/current: 251097ms
> 2018-07-26 10:47:08,283 INFO [Thread-106] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/2/dfs/dn/current: 287235ms
> {noformat}
> Currently DN uses independent thread to scan and add replica for each volume, 
> but we still need to wait the slowest thread to finish its work. So the main 
> problem here is that we could make the thread to run faster.
> The jstack we get when DN blocking in the adding replica:
> {noformat}
> "Thread-113" #419 daemon prio=5 os_prio=0 tid=0x7f40879ff000 nid=0x145da 
> runnable [0x7f4043a38000]
>java.lang.Thread.State: RUNNABLE
>   at java.io.UnixFileSystem.list(Native Method)
>   at java.io.File.list(File.java:1122)
>   at java.io.File.listFiles(File.java:1207)
>   at org.apache.hadoop.fs.FileUtil.listFiles(FileUtil.java:1165)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:445)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:448)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:448)
>   at 
> 

[jira] [Commented] (HDFS-13768) Adding replicas to volume map makes DataNode start slowly

2018-09-18 Thread Surendra Singh Lilhore (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16618618#comment-16618618
 ] 

Surendra Singh Lilhore commented on HDFS-13768:
---

Hi [~arpitagarwal]

bq. the default should probably be the same as today i.e. number of volumes.

To keep it same as today we need to give default value = *valume's X 
blockPoolSlice's*

>  Adding replicas to volume map makes DataNode start slowly 
> ---
>
> Key: HDFS-13768
> URL: https://issues.apache.org/jira/browse/HDFS-13768
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.1.0
>Reporter: Yiqun Lin
>Assignee: Surendra Singh Lilhore
>Priority: Major
> Attachments: HDFS-13768.01.patch, HDFS-13768.02.patch, 
> HDFS-13768.patch
>
>
> We find DN starting so slowly when rolling upgrade our cluster. When we 
> restart DNs, the DNs start so slowly and not register to NN immediately. And 
> this cause a lots of following error:
> {noformat}
> DataXceiver error processing WRITE_BLOCK operation  src: /xx.xx.xx.xx:64360 
> dst: /xx.xx.xx.xx:50010
> java.io.IOException: Not ready to serve the block pool, 
> BP-1508644862-xx.xx.xx.xx-1493781183457.
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.checkAndWaitForBP(DataXceiver.java:1290)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.checkAccess(DataXceiver.java:1298)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:630)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:169)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:106)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:246)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
> Looking into the logic of DN startup, it will do the initial block pool 
> operation before the registration. And during initializing block pool 
> operation, we found the adding replicas to volume map is the most expensive 
> operation.  Related log:
> {noformat}
> 2018-07-26 10:46:23,771 INFO [Thread-105] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/1/dfs/dn/current: 242722ms
> 2018-07-26 10:46:26,231 INFO [Thread-109] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/5/dfs/dn/current: 245182ms
> 2018-07-26 10:46:32,146 INFO [Thread-112] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/8/dfs/dn/current: 251097ms
> 2018-07-26 10:47:08,283 INFO [Thread-106] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/2/dfs/dn/current: 287235ms
> {noformat}
> Currently DN uses independent thread to scan and add replica for each volume, 
> but we still need to wait the slowest thread to finish its work. So the main 
> problem here is that we could make the thread to run faster.
> The jstack we get when DN blocking in the adding replica:
> {noformat}
> "Thread-113" #419 daemon prio=5 os_prio=0 tid=0x7f40879ff000 nid=0x145da 
> runnable [0x7f4043a38000]
>java.lang.Thread.State: RUNNABLE
>   at java.io.UnixFileSystem.list(Native Method)
>   at java.io.File.list(File.java:1122)
>   at java.io.File.listFiles(File.java:1207)
>   at org.apache.hadoop.fs.FileUtil.listFiles(FileUtil.java:1165)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:445)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:448)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:448)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.getVolumeMap(BlockPoolSlice.java:342)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.getVolumeMap(FsVolumeImpl.java:864)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeList$1.run(FsVolumeList.java:191)
> {noformat}
> One improvement maybe we can use ForkJoinPool to do this recursive task, 
> rather than a sync way. This will be a great improvement because it can 
> greatly speed up recovery process.



--
This message was sent by Atlassian JIRA

[jira] [Commented] (HDFS-13768) Adding replicas to volume map makes DataNode start slowly

2018-09-17 Thread Arpit Agarwal (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16617817#comment-16617817
 ] 

Arpit Agarwal commented on HDFS-13768:
--

Hi [~surendrasingh], the default should probably be the same as today i.e. 
number of volumes.

Else we risk degrading performance when number of disks > number of processors.

 

>  Adding replicas to volume map makes DataNode start slowly 
> ---
>
> Key: HDFS-13768
> URL: https://issues.apache.org/jira/browse/HDFS-13768
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.1.0
>Reporter: Yiqun Lin
>Assignee: Surendra Singh Lilhore
>Priority: Major
> Attachments: HDFS-13768.01.patch, HDFS-13768.02.patch, 
> HDFS-13768.patch
>
>
> We find DN starting so slowly when rolling upgrade our cluster. When we 
> restart DNs, the DNs start so slowly and not register to NN immediately. And 
> this cause a lots of following error:
> {noformat}
> DataXceiver error processing WRITE_BLOCK operation  src: /xx.xx.xx.xx:64360 
> dst: /xx.xx.xx.xx:50010
> java.io.IOException: Not ready to serve the block pool, 
> BP-1508644862-xx.xx.xx.xx-1493781183457.
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.checkAndWaitForBP(DataXceiver.java:1290)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.checkAccess(DataXceiver.java:1298)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:630)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:169)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:106)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:246)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
> Looking into the logic of DN startup, it will do the initial block pool 
> operation before the registration. And during initializing block pool 
> operation, we found the adding replicas to volume map is the most expensive 
> operation.  Related log:
> {noformat}
> 2018-07-26 10:46:23,771 INFO [Thread-105] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/1/dfs/dn/current: 242722ms
> 2018-07-26 10:46:26,231 INFO [Thread-109] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/5/dfs/dn/current: 245182ms
> 2018-07-26 10:46:32,146 INFO [Thread-112] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/8/dfs/dn/current: 251097ms
> 2018-07-26 10:47:08,283 INFO [Thread-106] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/2/dfs/dn/current: 287235ms
> {noformat}
> Currently DN uses independent thread to scan and add replica for each volume, 
> but we still need to wait the slowest thread to finish its work. So the main 
> problem here is that we could make the thread to run faster.
> The jstack we get when DN blocking in the adding replica:
> {noformat}
> "Thread-113" #419 daemon prio=5 os_prio=0 tid=0x7f40879ff000 nid=0x145da 
> runnable [0x7f4043a38000]
>java.lang.Thread.State: RUNNABLE
>   at java.io.UnixFileSystem.list(Native Method)
>   at java.io.File.list(File.java:1122)
>   at java.io.File.listFiles(File.java:1207)
>   at org.apache.hadoop.fs.FileUtil.listFiles(FileUtil.java:1165)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:445)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:448)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:448)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.getVolumeMap(BlockPoolSlice.java:342)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.getVolumeMap(FsVolumeImpl.java:864)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeList$1.run(FsVolumeList.java:191)
> {noformat}
> One improvement maybe we can use ForkJoinPool to do this recursive task, 
> rather than a sync way. This will be a great improvement because it can 
> greatly speed up recovery process.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HDFS-13768) Adding replicas to volume map makes DataNode start slowly

2018-09-17 Thread Surendra Singh Lilhore (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16617143#comment-16617143
 ] 

Surendra Singh Lilhore commented on HDFS-13768:
---

Thanks [~linyiqun] for review...

bq. This comment seems not fully addressed . I mean we can also make 
AddReplicaProcessor in an Asynchronous mode.
Sorry I missed it. 

bq. Could you please add the UT for this improvement?
yes, in next patch I will fix both comments.
 

 

>  Adding replicas to volume map makes DataNode start slowly 
> ---
>
> Key: HDFS-13768
> URL: https://issues.apache.org/jira/browse/HDFS-13768
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.1.0
>Reporter: Yiqun Lin
>Assignee: Surendra Singh Lilhore
>Priority: Major
> Attachments: HDFS-13768.01.patch, HDFS-13768.02.patch, 
> HDFS-13768.patch
>
>
> We find DN starting so slowly when rolling upgrade our cluster. When we 
> restart DNs, the DNs start so slowly and not register to NN immediately. And 
> this cause a lots of following error:
> {noformat}
> DataXceiver error processing WRITE_BLOCK operation  src: /xx.xx.xx.xx:64360 
> dst: /xx.xx.xx.xx:50010
> java.io.IOException: Not ready to serve the block pool, 
> BP-1508644862-xx.xx.xx.xx-1493781183457.
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.checkAndWaitForBP(DataXceiver.java:1290)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.checkAccess(DataXceiver.java:1298)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:630)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:169)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:106)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:246)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
> Looking into the logic of DN startup, it will do the initial block pool 
> operation before the registration. And during initializing block pool 
> operation, we found the adding replicas to volume map is the most expensive 
> operation.  Related log:
> {noformat}
> 2018-07-26 10:46:23,771 INFO [Thread-105] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/1/dfs/dn/current: 242722ms
> 2018-07-26 10:46:26,231 INFO [Thread-109] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/5/dfs/dn/current: 245182ms
> 2018-07-26 10:46:32,146 INFO [Thread-112] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/8/dfs/dn/current: 251097ms
> 2018-07-26 10:47:08,283 INFO [Thread-106] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/2/dfs/dn/current: 287235ms
> {noformat}
> Currently DN uses independent thread to scan and add replica for each volume, 
> but we still need to wait the slowest thread to finish its work. So the main 
> problem here is that we could make the thread to run faster.
> The jstack we get when DN blocking in the adding replica:
> {noformat}
> "Thread-113" #419 daemon prio=5 os_prio=0 tid=0x7f40879ff000 nid=0x145da 
> runnable [0x7f4043a38000]
>java.lang.Thread.State: RUNNABLE
>   at java.io.UnixFileSystem.list(Native Method)
>   at java.io.File.list(File.java:1122)
>   at java.io.File.listFiles(File.java:1207)
>   at org.apache.hadoop.fs.FileUtil.listFiles(FileUtil.java:1165)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:445)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:448)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:448)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.getVolumeMap(BlockPoolSlice.java:342)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.getVolumeMap(FsVolumeImpl.java:864)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeList$1.run(FsVolumeList.java:191)
> {noformat}
> One improvement maybe we can use ForkJoinPool to do this recursive task, 
> rather than a sync way. This will be a great improvement because it can 

[jira] [Commented] (HDFS-13768) Adding replicas to volume map makes DataNode start slowly

2018-09-17 Thread Yiqun Lin (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16617137#comment-16617137
 ] 

Yiqun Lin commented on HDFS-13768:
--

Thanks [~surendrasingh] for updating the patch. Almost looks good to me. Two 
comments from me:
{quote}The adding replica operation can still speed up. Since the ReplicaMap is 
thread-safe to add replicas, we can also make AddReplicaProcessor of dir 
finalizedDir and rbwDir in an async runnable way. And getting their Future 
objects and wait for completion.
{quote}
This comment seems not fully addressed . I mean we can also make 
AddReplicaProcessor in an Asynchronous mode. The call ForkJoinPool#invoke is a 
sync way. Based on the v02, patch, we can improve this like following:
{code:java}
  // add finalized replicas
  AddReplicaProcessor task = new AddReplicaProcessor(volumeMap,
  finalizedDir, lazyWriteReplicaMap, true, exceptions, subTaskQueue);
  ForkJoinTask f1 = addReplicaThreadPool.submit(task);
  
  // add rbw replicas
  task = new AddReplicaProcessor(volumeMap, rbwDir, lazyWriteReplicaMap,
  false, exceptions, subTaskQueue);
  ForkJoinTask f2 = addReplicaThreadPool.submit(task);

  try {
f1.get();
f2.get();
  } catch (InterruptedException | ExecutionException e) {
// exception handle
  }
  
  //wait for all the tasks to finish.
  waitForSubTaskToFinish(subTaskQueue, exceptions);
{code}
Could you please add the UT for this improvement?

>  Adding replicas to volume map makes DataNode start slowly 
> ---
>
> Key: HDFS-13768
> URL: https://issues.apache.org/jira/browse/HDFS-13768
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.1.0
>Reporter: Yiqun Lin
>Assignee: Surendra Singh Lilhore
>Priority: Major
> Attachments: HDFS-13768.01.patch, HDFS-13768.02.patch, 
> HDFS-13768.patch
>
>
> We find DN starting so slowly when rolling upgrade our cluster. When we 
> restart DNs, the DNs start so slowly and not register to NN immediately. And 
> this cause a lots of following error:
> {noformat}
> DataXceiver error processing WRITE_BLOCK operation  src: /xx.xx.xx.xx:64360 
> dst: /xx.xx.xx.xx:50010
> java.io.IOException: Not ready to serve the block pool, 
> BP-1508644862-xx.xx.xx.xx-1493781183457.
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.checkAndWaitForBP(DataXceiver.java:1290)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.checkAccess(DataXceiver.java:1298)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:630)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:169)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:106)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:246)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
> Looking into the logic of DN startup, it will do the initial block pool 
> operation before the registration. And during initializing block pool 
> operation, we found the adding replicas to volume map is the most expensive 
> operation.  Related log:
> {noformat}
> 2018-07-26 10:46:23,771 INFO [Thread-105] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/1/dfs/dn/current: 242722ms
> 2018-07-26 10:46:26,231 INFO [Thread-109] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/5/dfs/dn/current: 245182ms
> 2018-07-26 10:46:32,146 INFO [Thread-112] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/8/dfs/dn/current: 251097ms
> 2018-07-26 10:47:08,283 INFO [Thread-106] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/2/dfs/dn/current: 287235ms
> {noformat}
> Currently DN uses independent thread to scan and add replica for each volume, 
> but we still need to wait the slowest thread to finish its work. So the main 
> problem here is that we could make the thread to run faster.
> The jstack we get when DN blocking in the adding replica:
> {noformat}
> "Thread-113" #419 daemon prio=5 os_prio=0 tid=0x7f40879ff000 nid=0x145da 
> runnable [0x7f4043a38000]
>java.lang.Thread.State: RUNNABLE
>   at java.io.UnixFileSystem.list(Native 

[jira] [Commented] (HDFS-13768) Adding replicas to volume map makes DataNode start slowly

2018-09-12 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16612134#comment-16612134
 ] 

Hadoop QA commented on HDFS-13768:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
11s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 
59s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
57s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
57s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
3s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 18s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
7s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
51s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
 6s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
0s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 57s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch 
generated 2 new + 466 unchanged - 0 fixed = 468 total (was 466) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 14s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  2m  
3s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs generated 1 new + 0 
unchanged - 0 fixed = 1 total (was 0) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
45s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 81m  
4s{color} | {color:green} hadoop-hdfs in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
30s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}138m 47s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-hdfs-project/hadoop-hdfs |
|  |  Possible doublecheck on 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addReplicaThreadPool
 in new 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice(String, 
FsVolumeImpl, File, Configuration, Timer)  At BlockPoolSlice.java:new 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice(String, 
FsVolumeImpl, File, Configuration, Timer)  At BlockPoolSlice.java:[lines 
176-178] |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:4b8c2b1 |
| JIRA Issue | HDFS-13768 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12939392/HDFS-13768.02.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  xml  |

[jira] [Commented] (HDFS-13768) Adding replicas to volume map makes DataNode start slowly

2018-09-12 Thread Surendra Singh Lilhore (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611948#comment-16611948
 ] 

Surendra Singh Lilhore commented on HDFS-13768:
---

Thanks [~linyiqun] and [~arpitagarwal] for review..

Attached updated patch. Fixed all the above comments given by [~linyiqun].
{quote}Looks like this is going to use number processors x num disks threads by 
default.
{quote}
No, pool object is static. So it will use only number processors threads in 
pool by default.
{quote}Any idea what kind of speedup you get with lower number of threads. e.g. 
2?
{quote} * Without this fix DN took *16772ms* to load 101260 blocks
 * With this patch and 8 processor(default) its took *9823ms*
 * Configured pool size to 2 then its took *8766ms*

>  Adding replicas to volume map makes DataNode start slowly 
> ---
>
> Key: HDFS-13768
> URL: https://issues.apache.org/jira/browse/HDFS-13768
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.1.0
>Reporter: Yiqun Lin
>Assignee: Surendra Singh Lilhore
>Priority: Major
> Attachments: HDFS-13768.01.patch, HDFS-13768.02.patch, 
> HDFS-13768.patch
>
>
> We find DN starting so slowly when rolling upgrade our cluster. When we 
> restart DNs, the DNs start so slowly and not register to NN immediately. And 
> this cause a lots of following error:
> {noformat}
> DataXceiver error processing WRITE_BLOCK operation  src: /xx.xx.xx.xx:64360 
> dst: /xx.xx.xx.xx:50010
> java.io.IOException: Not ready to serve the block pool, 
> BP-1508644862-xx.xx.xx.xx-1493781183457.
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.checkAndWaitForBP(DataXceiver.java:1290)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.checkAccess(DataXceiver.java:1298)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:630)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:169)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:106)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:246)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
> Looking into the logic of DN startup, it will do the initial block pool 
> operation before the registration. And during initializing block pool 
> operation, we found the adding replicas to volume map is the most expensive 
> operation.  Related log:
> {noformat}
> 2018-07-26 10:46:23,771 INFO [Thread-105] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/1/dfs/dn/current: 242722ms
> 2018-07-26 10:46:26,231 INFO [Thread-109] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/5/dfs/dn/current: 245182ms
> 2018-07-26 10:46:32,146 INFO [Thread-112] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/8/dfs/dn/current: 251097ms
> 2018-07-26 10:47:08,283 INFO [Thread-106] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/2/dfs/dn/current: 287235ms
> {noformat}
> Currently DN uses independent thread to scan and add replica for each volume, 
> but we still need to wait the slowest thread to finish its work. So the main 
> problem here is that we could make the thread to run faster.
> The jstack we get when DN blocking in the adding replica:
> {noformat}
> "Thread-113" #419 daemon prio=5 os_prio=0 tid=0x7f40879ff000 nid=0x145da 
> runnable [0x7f4043a38000]
>java.lang.Thread.State: RUNNABLE
>   at java.io.UnixFileSystem.list(Native Method)
>   at java.io.File.list(File.java:1122)
>   at java.io.File.listFiles(File.java:1207)
>   at org.apache.hadoop.fs.FileUtil.listFiles(FileUtil.java:1165)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:445)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:448)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:448)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.getVolumeMap(BlockPoolSlice.java:342)
>   at 
> 

[jira] [Commented] (HDFS-13768) Adding replicas to volume map makes DataNode start slowly

2018-09-11 Thread Arpit Agarwal (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611051#comment-16611051
 ] 

Arpit Agarwal commented on HDFS-13768:
--

The patch did not apply cleanly for me. Can you please rebase it?

Looks like this is going to use number processors x num disks threads by 
default. Any idea what kind of speedup you get with lower number of threads. 
e.g. 2?

>  Adding replicas to volume map makes DataNode start slowly 
> ---
>
> Key: HDFS-13768
> URL: https://issues.apache.org/jira/browse/HDFS-13768
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.1.0
>Reporter: Yiqun Lin
>Assignee: Surendra Singh Lilhore
>Priority: Major
> Attachments: HDFS-13768.01.patch, HDFS-13768.patch
>
>
> We find DN starting so slowly when rolling upgrade our cluster. When we 
> restart DNs, the DNs start so slowly and not register to NN immediately. And 
> this cause a lots of following error:
> {noformat}
> DataXceiver error processing WRITE_BLOCK operation  src: /xx.xx.xx.xx:64360 
> dst: /xx.xx.xx.xx:50010
> java.io.IOException: Not ready to serve the block pool, 
> BP-1508644862-xx.xx.xx.xx-1493781183457.
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.checkAndWaitForBP(DataXceiver.java:1290)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.checkAccess(DataXceiver.java:1298)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:630)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:169)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:106)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:246)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
> Looking into the logic of DN startup, it will do the initial block pool 
> operation before the registration. And during initializing block pool 
> operation, we found the adding replicas to volume map is the most expensive 
> operation.  Related log:
> {noformat}
> 2018-07-26 10:46:23,771 INFO [Thread-105] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/1/dfs/dn/current: 242722ms
> 2018-07-26 10:46:26,231 INFO [Thread-109] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/5/dfs/dn/current: 245182ms
> 2018-07-26 10:46:32,146 INFO [Thread-112] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/8/dfs/dn/current: 251097ms
> 2018-07-26 10:47:08,283 INFO [Thread-106] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/2/dfs/dn/current: 287235ms
> {noformat}
> Currently DN uses independent thread to scan and add replica for each volume, 
> but we still need to wait the slowest thread to finish its work. So the main 
> problem here is that we could make the thread to run faster.
> The jstack we get when DN blocking in the adding replica:
> {noformat}
> "Thread-113" #419 daemon prio=5 os_prio=0 tid=0x7f40879ff000 nid=0x145da 
> runnable [0x7f4043a38000]
>java.lang.Thread.State: RUNNABLE
>   at java.io.UnixFileSystem.list(Native Method)
>   at java.io.File.list(File.java:1122)
>   at java.io.File.listFiles(File.java:1207)
>   at org.apache.hadoop.fs.FileUtil.listFiles(FileUtil.java:1165)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:445)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:448)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:448)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.getVolumeMap(BlockPoolSlice.java:342)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.getVolumeMap(FsVolumeImpl.java:864)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeList$1.run(FsVolumeList.java:191)
> {noformat}
> One improvement maybe we can use ForkJoinPool to do this recursive task, 
> rather than a sync way. This will be a great improvement because it can 
> greatly speed up recovery process.



--
This message was sent by Atlassian JIRA

[jira] [Commented] (HDFS-13768) Adding replicas to volume map makes DataNode start slowly

2018-09-11 Thread Yiqun Lin (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16610178#comment-16610178
 ] 

Yiqun Lin commented on HDFS-13768:
--

The patch almost looks good to me. Some initial comments:

*DFSConfigKeys.java*
* I prefer to rename {{dfs.datanode.volumes.replica.add.threads}} to 
{{dfs.datanode.volumes.replica-add.threadpool.size}}
* The default value of pool size 
{{Runtime.getRuntime().availableProcessors();}} will be better placed in class 
{{BlockPoolSlice}} not in config class.

*BlockPoolSlice.java*
* The adding replica operation can still speed up. Since the ReplicaMap is 
thread-safe to add replicas, we can also make {{AddReplicaProcessor}} of dir 
finalizedDir and rbwDir in an async runnable way. And getting their Future 
objects and wait for completion.
* We can use {{MultipleIOException}} to return list IOExceptions, like 
following way
{code}
if (!exceptions.isEmpty()) {
   throw exceptions.get(0);
}
{code}
to 
{code}
if (!exceptions.isEmpty()) {
   throw MultipleIOException.createIOException(exceptions);
}
{code}
* Please complete the javadoc of the method {{waitForSubTaskToFinish}}.

Please fix other checkstyle issues and rebase the patch.

>  Adding replicas to volume map makes DataNode start slowly 
> ---
>
> Key: HDFS-13768
> URL: https://issues.apache.org/jira/browse/HDFS-13768
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.1.0
>Reporter: Yiqun Lin
>Assignee: Surendra Singh Lilhore
>Priority: Major
> Attachments: HDFS-13768.01.patch, HDFS-13768.patch
>
>
> We find DN starting so slowly when rolling upgrade our cluster. When we 
> restart DNs, the DNs start so slowly and not register to NN immediately. And 
> this cause a lots of following error:
> {noformat}
> DataXceiver error processing WRITE_BLOCK operation  src: /xx.xx.xx.xx:64360 
> dst: /xx.xx.xx.xx:50010
> java.io.IOException: Not ready to serve the block pool, 
> BP-1508644862-xx.xx.xx.xx-1493781183457.
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.checkAndWaitForBP(DataXceiver.java:1290)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.checkAccess(DataXceiver.java:1298)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:630)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:169)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:106)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:246)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
> Looking into the logic of DN startup, it will do the initial block pool 
> operation before the registration. And during initializing block pool 
> operation, we found the adding replicas to volume map is the most expensive 
> operation.  Related log:
> {noformat}
> 2018-07-26 10:46:23,771 INFO [Thread-105] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/1/dfs/dn/current: 242722ms
> 2018-07-26 10:46:26,231 INFO [Thread-109] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/5/dfs/dn/current: 245182ms
> 2018-07-26 10:46:32,146 INFO [Thread-112] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/8/dfs/dn/current: 251097ms
> 2018-07-26 10:47:08,283 INFO [Thread-106] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/2/dfs/dn/current: 287235ms
> {noformat}
> Currently DN uses independent thread to scan and add replica for each volume, 
> but we still need to wait the slowest thread to finish its work. So the main 
> problem here is that we could make the thread to run faster.
> The jstack we get when DN blocking in the adding replica:
> {noformat}
> "Thread-113" #419 daemon prio=5 os_prio=0 tid=0x7f40879ff000 nid=0x145da 
> runnable [0x7f4043a38000]
>java.lang.Thread.State: RUNNABLE
>   at java.io.UnixFileSystem.list(Native Method)
>   at java.io.File.list(File.java:1122)
>   at java.io.File.listFiles(File.java:1207)
>   at org.apache.hadoop.fs.FileUtil.listFiles(FileUtil.java:1165)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:445)
>   at 
> 

[jira] [Commented] (HDFS-13768) Adding replicas to volume map makes DataNode start slowly

2018-09-11 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16610138#comment-16610138
 ] 

Hadoop QA commented on HDFS-13768:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
19s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 
32s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
1s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
58s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
7s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 12s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
4s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
50s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
3s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
3s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 57s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch 
generated 11 new + 466 unchanged - 0 fixed = 477 total (was 466) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
15s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch 6 line(s) with tabs. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
2s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:red}-1{color} | {color:red} shadedclient {color} | {color:red} 13m 
33s{color} | {color:red} patch has errors when building and testing our client 
artifacts. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  2m 
30s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs generated 1 new + 0 
unchanged - 0 fixed = 1 total (was 0) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
49s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}108m 32s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
30s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}169m 17s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-hdfs-project/hadoop-hdfs |
|  |  Possible doublecheck on 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.forkJoinPool
 in new 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice(String, 
FsVolumeImpl, File, Configuration, Timer)  At BlockPoolSlice.java:new 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice(String, 
FsVolumeImpl, File, Configuration, Timer)  At BlockPoolSlice.java:[lines 
173-175] |
| Failed junit tests | hadoop.hdfs.TestLeaseRecovery2 |
|   | hadoop.hdfs.server.datanode.TestDataNodeMultipleRegistrations |
|   | hadoop.hdfs.server.namenode.ha.TestHAAppend |
|   | hadoop.hdfs.TestMaintenanceState |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:ba1ab08 |
| JIRA Issue | HDFS-13768 |
| JIRA Patch URL | 

[jira] [Commented] (HDFS-13768) Adding replicas to volume map makes DataNode start slowly

2018-09-10 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16610125#comment-16610125
 ] 

Hadoop QA commented on HDFS-13768:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
16s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 22m 
46s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
55s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
54s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
14s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 25s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
58s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
49s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
 1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
53s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 47s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch 
generated 12 new + 16 unchanged - 0 fixed = 28 total (was 16) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 17s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
6s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
46s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}109m 44s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
41s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}171m 11s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.TestDFSUpgradeFromImage |
|   | hadoop.hdfs.server.namenode.TestNameNodeMetadataConsistency |
|   | hadoop.hdfs.TestPersistBlocks |
|   | hadoop.hdfs.TestReconstructStripedFileWithRandomECPolicy |
|   | hadoop.hdfs.TestDatanodeStartupFixesLegacyStorageIDs |
|   | hadoop.hdfs.server.datanode.TestDataNodeRollingUpgrade |
|   | hadoop.hdfs.TestDatanodeLayoutUpgrade |
|   | hadoop.hdfs.server.datanode.TestDataNodeHotSwapVolumes |
|   | hadoop.hdfs.server.datanode.TestDeleteBlockPool |
|   | hadoop.hdfs.server.datanode.TestDirectoryScanner |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:ba1ab08 |
| JIRA Issue | HDFS-13768 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12937996/HDFS-13768.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 69ec7d898a3c 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 
08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| 

[jira] [Commented] (HDFS-13768) Adding replicas to volume map makes DataNode start slowly

2018-09-10 Thread Yiqun Lin (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16610057#comment-16610057
 ] 

Yiqun Lin commented on HDFS-13768:
--

Thanks for working this, [~RANith] and [~surendrasingh].
I will get a chance to have a review recently, :).

>  Adding replicas to volume map makes DataNode start slowly 
> ---
>
> Key: HDFS-13768
> URL: https://issues.apache.org/jira/browse/HDFS-13768
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.1.0
>Reporter: Yiqun Lin
>Assignee: Surendra Singh Lilhore
>Priority: Major
> Attachments: HDFS-13768.01.patch, HDFS-13768.patch
>
>
> We find DN starting so slowly when rolling upgrade our cluster. When we 
> restart DNs, the DNs start so slowly and not register to NN immediately. And 
> this cause a lots of following error:
> {noformat}
> DataXceiver error processing WRITE_BLOCK operation  src: /xx.xx.xx.xx:64360 
> dst: /xx.xx.xx.xx:50010
> java.io.IOException: Not ready to serve the block pool, 
> BP-1508644862-xx.xx.xx.xx-1493781183457.
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.checkAndWaitForBP(DataXceiver.java:1290)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.checkAccess(DataXceiver.java:1298)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:630)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:169)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:106)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:246)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
> Looking into the logic of DN startup, it will do the initial block pool 
> operation before the registration. And during initializing block pool 
> operation, we found the adding replicas to volume map is the most expensive 
> operation.  Related log:
> {noformat}
> 2018-07-26 10:46:23,771 INFO [Thread-105] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/1/dfs/dn/current: 242722ms
> 2018-07-26 10:46:26,231 INFO [Thread-109] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/5/dfs/dn/current: 245182ms
> 2018-07-26 10:46:32,146 INFO [Thread-112] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/8/dfs/dn/current: 251097ms
> 2018-07-26 10:47:08,283 INFO [Thread-106] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/2/dfs/dn/current: 287235ms
> {noformat}
> Currently DN uses independent thread to scan and add replica for each volume, 
> but we still need to wait the slowest thread to finish its work. So the main 
> problem here is that we could make the thread to run faster.
> The jstack we get when DN blocking in the adding replica:
> {noformat}
> "Thread-113" #419 daemon prio=5 os_prio=0 tid=0x7f40879ff000 nid=0x145da 
> runnable [0x7f4043a38000]
>java.lang.Thread.State: RUNNABLE
>   at java.io.UnixFileSystem.list(Native Method)
>   at java.io.File.list(File.java:1122)
>   at java.io.File.listFiles(File.java:1207)
>   at org.apache.hadoop.fs.FileUtil.listFiles(FileUtil.java:1165)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:445)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:448)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:448)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.getVolumeMap(BlockPoolSlice.java:342)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.getVolumeMap(FsVolumeImpl.java:864)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeList$1.run(FsVolumeList.java:191)
> {noformat}
> One improvement maybe we can use ForkJoinPool to do this recursive task, 
> rather than a sync way. This will be a great improvement because it can 
> greatly speed up recovery process.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: 

[jira] [Commented] (HDFS-13768) Adding replicas to volume map makes DataNode start slowly

2018-09-10 Thread Surendra Singh Lilhore (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16610055#comment-16610055
 ] 

Surendra Singh Lilhore commented on HDFS-13768:
---

Added initial patch, I will add test case in next patch. 

*+_Initial test report_+* 
 # Before fix : Restarted datanode with 101260 block and it's took *16203ms*
 # After fix :  Restarted datanode with 101260 block and it's took *9693ms*

>  Adding replicas to volume map makes DataNode start slowly 
> ---
>
> Key: HDFS-13768
> URL: https://issues.apache.org/jira/browse/HDFS-13768
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.1.0
>Reporter: Yiqun Lin
>Assignee: Surendra Singh Lilhore
>Priority: Major
> Attachments: HDFS-13768.01.patch, HDFS-13768.patch
>
>
> We find DN starting so slowly when rolling upgrade our cluster. When we 
> restart DNs, the DNs start so slowly and not register to NN immediately. And 
> this cause a lots of following error:
> {noformat}
> DataXceiver error processing WRITE_BLOCK operation  src: /xx.xx.xx.xx:64360 
> dst: /xx.xx.xx.xx:50010
> java.io.IOException: Not ready to serve the block pool, 
> BP-1508644862-xx.xx.xx.xx-1493781183457.
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.checkAndWaitForBP(DataXceiver.java:1290)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.checkAccess(DataXceiver.java:1298)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:630)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:169)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:106)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:246)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
> Looking into the logic of DN startup, it will do the initial block pool 
> operation before the registration. And during initializing block pool 
> operation, we found the adding replicas to volume map is the most expensive 
> operation.  Related log:
> {noformat}
> 2018-07-26 10:46:23,771 INFO [Thread-105] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/1/dfs/dn/current: 242722ms
> 2018-07-26 10:46:26,231 INFO [Thread-109] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/5/dfs/dn/current: 245182ms
> 2018-07-26 10:46:32,146 INFO [Thread-112] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/8/dfs/dn/current: 251097ms
> 2018-07-26 10:47:08,283 INFO [Thread-106] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/2/dfs/dn/current: 287235ms
> {noformat}
> Currently DN uses independent thread to scan and add replica for each volume, 
> but we still need to wait the slowest thread to finish its work. So the main 
> problem here is that we could make the thread to run faster.
> The jstack we get when DN blocking in the adding replica:
> {noformat}
> "Thread-113" #419 daemon prio=5 os_prio=0 tid=0x7f40879ff000 nid=0x145da 
> runnable [0x7f4043a38000]
>java.lang.Thread.State: RUNNABLE
>   at java.io.UnixFileSystem.list(Native Method)
>   at java.io.File.list(File.java:1122)
>   at java.io.File.listFiles(File.java:1207)
>   at org.apache.hadoop.fs.FileUtil.listFiles(FileUtil.java:1165)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:445)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:448)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:448)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.getVolumeMap(BlockPoolSlice.java:342)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.getVolumeMap(FsVolumeImpl.java:864)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeList$1.run(FsVolumeList.java:191)
> {noformat}
> One improvement maybe we can use ForkJoinPool to do this recursive task, 
> rather than a sync way. This will be a great improvement because it can 
> greatly speed up recovery process.



--
This message 

[jira] [Commented] (HDFS-13768) Adding replicas to volume map makes DataNode start slowly

2018-09-10 Thread Surendra Singh Lilhore (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16610040#comment-16610040
 ] 

Surendra Singh Lilhore commented on HDFS-13768:
---

Discussed with [~RANith] offline, assigning jira to my self, I will upload 
updated patch..

>  Adding replicas to volume map makes DataNode start slowly 
> ---
>
> Key: HDFS-13768
> URL: https://issues.apache.org/jira/browse/HDFS-13768
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.1.0
>Reporter: Yiqun Lin
>Assignee: Ranith Sardar
>Priority: Major
> Attachments: HDFS-13768.patch
>
>
> We find DN starting so slowly when rolling upgrade our cluster. When we 
> restart DNs, the DNs start so slowly and not register to NN immediately. And 
> this cause a lots of following error:
> {noformat}
> DataXceiver error processing WRITE_BLOCK operation  src: /xx.xx.xx.xx:64360 
> dst: /xx.xx.xx.xx:50010
> java.io.IOException: Not ready to serve the block pool, 
> BP-1508644862-xx.xx.xx.xx-1493781183457.
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.checkAndWaitForBP(DataXceiver.java:1290)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.checkAccess(DataXceiver.java:1298)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:630)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:169)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:106)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:246)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
> Looking into the logic of DN startup, it will do the initial block pool 
> operation before the registration. And during initializing block pool 
> operation, we found the adding replicas to volume map is the most expensive 
> operation.  Related log:
> {noformat}
> 2018-07-26 10:46:23,771 INFO [Thread-105] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/1/dfs/dn/current: 242722ms
> 2018-07-26 10:46:26,231 INFO [Thread-109] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/5/dfs/dn/current: 245182ms
> 2018-07-26 10:46:32,146 INFO [Thread-112] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/8/dfs/dn/current: 251097ms
> 2018-07-26 10:47:08,283 INFO [Thread-106] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/2/dfs/dn/current: 287235ms
> {noformat}
> Currently DN uses independent thread to scan and add replica for each volume, 
> but we still need to wait the slowest thread to finish its work. So the main 
> problem here is that we could make the thread to run faster.
> The jstack we get when DN blocking in the adding replica:
> {noformat}
> "Thread-113" #419 daemon prio=5 os_prio=0 tid=0x7f40879ff000 nid=0x145da 
> runnable [0x7f4043a38000]
>java.lang.Thread.State: RUNNABLE
>   at java.io.UnixFileSystem.list(Native Method)
>   at java.io.File.list(File.java:1122)
>   at java.io.File.listFiles(File.java:1207)
>   at org.apache.hadoop.fs.FileUtil.listFiles(FileUtil.java:1165)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:445)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:448)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:448)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.getVolumeMap(BlockPoolSlice.java:342)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.getVolumeMap(FsVolumeImpl.java:864)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeList$1.run(FsVolumeList.java:191)
> {noformat}
> One improvement maybe we can use ForkJoinPool to do this recursive task, 
> rather than a sync way. This will be a great improvement because it can 
> greatly speed up recovery process.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org

[jira] [Commented] (HDFS-13768) Adding replicas to volume map makes DataNode start slowly

2018-09-04 Thread Surendra Singh Lilhore (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16602728#comment-16602728
 ] 

Surendra Singh Lilhore commented on HDFS-13768:
---

Thanks [~RANith] for patch.

Some review comment 

1. You can start you first task in BlockPoolSlice#getVolumeMap() and then 
BlockPoolSlice#addToReplicasMap() can submit the sub task when item is 
directory.
{code}
if (!success) {
  // add finalized replicas
  addToReplicasMap(volumeMap, finalizedDir, lazyWriteReplicaMap, true);
  // add rbw replicas
  addToReplicasMap(volumeMap, rbwDir, lazyWriteReplicaMap, false);
}
{code}

This can be changed to 
{code}
 if (!success) {
   // add finalized replicas
  AddReplicaProcessor task = new AddReplicaProcessor(volumeMap,
  finalizedDir, lazyWriteReplicaMap, true);
  forkJoinPool.invoke(task);
   // add rbw replicas
  task = new AddReplicaProcessor(volumeMap, rbwDir, lazyWriteReplicaMap,
  false);
  forkJoinPool.invoke(task);
 }
{code}

2. You used {{forkJoinPool.invoke()}} in BlockPoolSlice#addToReplicasMap(). It 
will wait till the task is finished. You need to use here {{task.fork()}}.

{code}
+forkJoinPool.invoke(new AddReplicaProcessor(volumeMap, file,
+lazyWriteReplicaMap, isFinalized));
{code}

instead of this use

{code}
+AddReplicaProcessor task = new AddReplicaProcessor(volumeMap, file,
+lazyWriteReplicaMap, isFinalized);
+task.fork();
{code}

2. line 78, you shutdown the {{forkJoinPool}}. This may be by mistake. Pls 
check once. 

{code}
+  forkJoinPool.shutdown();
{code}

3. What if the exception occur's in {{AddReplicaProcessor}} ?, you need to 
report it back to the next level.

>  Adding replicas to volume map makes DataNode start slowly 
> ---
>
> Key: HDFS-13768
> URL: https://issues.apache.org/jira/browse/HDFS-13768
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.1.0
>Reporter: Yiqun Lin
>Assignee: Ranith Sardar
>Priority: Major
> Attachments: HDFS-13768.patch
>
>
> We find DN starting so slowly when rolling upgrade our cluster. When we 
> restart DNs, the DNs start so slowly and not register to NN immediately. And 
> this cause a lots of following error:
> {noformat}
> DataXceiver error processing WRITE_BLOCK operation  src: /xx.xx.xx.xx:64360 
> dst: /xx.xx.xx.xx:50010
> java.io.IOException: Not ready to serve the block pool, 
> BP-1508644862-xx.xx.xx.xx-1493781183457.
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.checkAndWaitForBP(DataXceiver.java:1290)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.checkAccess(DataXceiver.java:1298)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:630)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:169)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:106)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:246)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
> Looking into the logic of DN startup, it will do the initial block pool 
> operation before the registration. And during initializing block pool 
> operation, we found the adding replicas to volume map is the most expensive 
> operation.  Related log:
> {noformat}
> 2018-07-26 10:46:23,771 INFO [Thread-105] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/1/dfs/dn/current: 242722ms
> 2018-07-26 10:46:26,231 INFO [Thread-109] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/5/dfs/dn/current: 245182ms
> 2018-07-26 10:46:32,146 INFO [Thread-112] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/8/dfs/dn/current: 251097ms
> 2018-07-26 10:47:08,283 INFO [Thread-106] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/2/dfs/dn/current: 287235ms
> {noformat}
> Currently DN uses independent thread to scan and add replica for each volume, 
> but we still need to wait the slowest thread to finish its work. So the main 
> problem here is that we could make the thread to run faster.
> The jstack we get when DN blocking in the adding replica:
> 

[jira] [Commented] (HDFS-13768) Adding replicas to volume map makes DataNode start slowly

2018-09-03 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16601796#comment-16601796
 ] 

Hadoop QA commented on HDFS-13768:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
24s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 
46s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
57s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
51s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
5s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m  5s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
57s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
49s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
 1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
54s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 47s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch 
generated 12 new + 16 unchanged - 0 fixed = 28 total (was 16) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 15s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
45s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 96m 53s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
31s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}154m 40s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.TestDFSUpgradeFromImage |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailureWithRandomECPolicy |
|   | hadoop.hdfs.TestPersistBlocks |
|   | hadoop.hdfs.TestDatanodeStartupFixesLegacyStorageIDs |
|   | hadoop.hdfs.server.datanode.TestDataNodeRollingUpgrade |
|   | hadoop.hdfs.TestDatanodeLayoutUpgrade |
|   | hadoop.hdfs.server.datanode.TestDataNodeHotSwapVolumes |
|   | hadoop.hdfs.server.datanode.TestDeleteBlockPool |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:ba1ab08 |
| JIRA Issue | HDFS-13768 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12937996/HDFS-13768.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 439cf7342730 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 
08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / ff036e4 |
| 

[jira] [Commented] (HDFS-13768) Adding replicas to volume map makes DataNode start slowly

2018-09-01 Thread Ranith Sardar (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16599573#comment-16599573
 ] 

Ranith Sardar commented on HDFS-13768:
--

According to code structure, each thread of volumes will call getVolumeMap and 
add replicas under the given directory to the volume map is recursive call. As 
we can see, thread result will be in hold until all thread completes their 
performances. For recursive call in addToReplicasMap() method, will take more 
time for more sub-dirs (depended).

*Solutions:*
 * Declare a common ForkJoinPool in BlockPoolSlice.
 * Considering all call for addToReplicasMap() as a single task. Submitting 
them to common ForkJoinPool.
 * forkpoolSize is considered as half of core size, because processor will busy 
with other works.

>  Adding replicas to volume map makes DataNode start slowly 
> ---
>
> Key: HDFS-13768
> URL: https://issues.apache.org/jira/browse/HDFS-13768
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.1.0
>Reporter: Yiqun Lin
>Assignee: Ranith Sardar
>Priority: Major
> Attachments: HDFS-13768.patch
>
>
> We find DN starting so slowly when rolling upgrade our cluster. When we 
> restart DNs, the DNs start so slowly and not register to NN immediately. And 
> this cause a lots of following error:
> {noformat}
> DataXceiver error processing WRITE_BLOCK operation  src: /xx.xx.xx.xx:64360 
> dst: /xx.xx.xx.xx:50010
> java.io.IOException: Not ready to serve the block pool, 
> BP-1508644862-xx.xx.xx.xx-1493781183457.
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.checkAndWaitForBP(DataXceiver.java:1290)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.checkAccess(DataXceiver.java:1298)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:630)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:169)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:106)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:246)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
> Looking into the logic of DN startup, it will do the initial block pool 
> operation before the registration. And during initializing block pool 
> operation, we found the adding replicas to volume map is the most expensive 
> operation.  Related log:
> {noformat}
> 2018-07-26 10:46:23,771 INFO [Thread-105] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/1/dfs/dn/current: 242722ms
> 2018-07-26 10:46:26,231 INFO [Thread-109] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/5/dfs/dn/current: 245182ms
> 2018-07-26 10:46:32,146 INFO [Thread-112] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/8/dfs/dn/current: 251097ms
> 2018-07-26 10:47:08,283 INFO [Thread-106] 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to 
> add replicas to map for block pool BP-1508644862-xx.xx.xx.xx-1493781183457 on 
> volume /home/hard_disk/2/dfs/dn/current: 287235ms
> {noformat}
> Currently DN uses independent thread to scan and add replica for each volume, 
> but we still need to wait the slowest thread to finish its work. So the main 
> problem here is that we could make the thread to run faster.
> The jstack we get when DN blocking in the adding replica:
> {noformat}
> "Thread-113" #419 daemon prio=5 os_prio=0 tid=0x7f40879ff000 nid=0x145da 
> runnable [0x7f4043a38000]
>java.lang.Thread.State: RUNNABLE
>   at java.io.UnixFileSystem.list(Native Method)
>   at java.io.File.list(File.java:1122)
>   at java.io.File.listFiles(File.java:1207)
>   at org.apache.hadoop.fs.FileUtil.listFiles(FileUtil.java:1165)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:445)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:448)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:448)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.getVolumeMap(BlockPoolSlice.java:342)
>   at 
>