[jira] [Updated] (HDFS-16751) WebUI FileSystem explorer could delete wrong file by mistake

2022-08-29 Thread Walter Su (Jira)
[ https://issues.apache.org/jira/browse/HDFS-16751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Walter Su updated HDFS-16751: - Summary: WebUI FileSystem explorer could delete wrong file by mistake (was: WebUI FileSystem explorer

[jira] [Created] (HDFS-16751) WebUI FileSystem explorer file Deletion could delete wrong file by mistake

2022-08-29 Thread Walter Su (Jira)
Walter Su created HDFS-16751: Summary: WebUI FileSystem explorer file Deletion could delete wrong file by mistake Key: HDFS-16751 URL: https://issues.apache.org/jira/browse/HDFS-16751 Project: Hadoop

[jira] [Created] (HDFS-16644) java.io.IOException Invalid token in javax.security.sasl.qop

2022-06-29 Thread Walter Su (Jira)
Walter Su created HDFS-16644: Summary: java.io.IOException Invalid token in javax.security.sasl.qop Key: HDFS-16644 URL: https://issues.apache.org/jira/browse/HDFS-16644 Project: Hadoop HDFS

[jira] [Commented] (HDFS-10383) Safely close resources in DFSTestUtil

2016-05-15 Thread Walter Su (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-10383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15284143#comment-15284143 ] Walter Su commented on HDFS-10383: -- bq. IOUtils#cleanup swallows it in the finally block. Great work! And

[jira] [Commented] (HDFS-10220) Namenode failover due to too long loking in LeaseManager.Monitor

2016-05-08 Thread Walter Su (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-10220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15275875#comment-15275875 ] Walter Su commented on HDFS-10220: -- The last patch looks pretty good. +1 once the test nits get

[jira] [Commented] (HDFS-10340) data node sudden killed

2016-04-28 Thread Walter Su (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-10340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15261938#comment-15261938 ] Walter Su commented on HDFS-10340: -- I don't think it's an issue. SIGTERM comes from the outside. The

[jira] [Commented] (HDFS-9958) BlockManager#createLocatedBlocks can throw NPE for corruptBlocks on failed storages.

2016-04-27 Thread Walter Su (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-9958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15261388#comment-15261388 ] Walter Su commented on HDFS-9958: - bq. I think only DFSClient currently reports storageID. No, it doesn't.

[jira] [Commented] (HDFS-9958) BlockManager#createLocatedBlocks can throw NPE for corruptBlocks on failed storages.

2016-04-27 Thread Walter Su (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-9958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15259918#comment-15259918 ] Walter Su commented on HDFS-9958: - Failed tests are not related. Will commit shortly if there's no further

[jira] [Commented] (HDFS-5280) Corrupted meta files on data nodes prevents DFClient from connecting to data nodes and updating corruption status to name node.

2016-04-26 Thread Walter Su (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-5280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15259490#comment-15259490 ] Walter Su commented on HDFS-5280: - There are other IOExceptions that will cause the readBlock RPC call to fail, then

[jira] [Commented] (HDFS-9958) BlockManager#createLocatedBlocks can throw NPE for corruptBlocks on failed storages.

2016-04-26 Thread Walter Su (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-9958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15259457#comment-15259457 ] Walter Su commented on HDFS-9958: - {code} @@ -1320,11 +1320,22 @@ public void

[jira] [Commented] (HDFS-10220) Namenode failover due to too long loking in LeaseManager.Monitor

2016-04-26 Thread Walter Su (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-10220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15259384#comment-15259384 ] Walter Su commented on HDFS-10220: -- bq. I think it add some readability and also because it is used

[jira] [Commented] (HDFS-10220) Namenode failover due to too long loking in LeaseManager.Monitor

2016-04-26 Thread Walter Su (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-10220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15257898#comment-15257898 ] Walter Su commented on HDFS-10220: -- Thanks [~ashangit] for the update. repeat one of my previous

[jira] [Commented] (HDFS-10301) BlockReport retransmissions may lead to storages falsely being declared zombie if storage report processing happens out of order

2016-04-24 Thread Walter Su (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-10301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15255787#comment-15255787 ] Walter Su commented on HDFS-10301: -- bq. BR ids are monotonically increasing. The id values are random

[jira] [Commented] (HDFS-10301) BlockReport retransmissions may lead to storages falsely being declared zombie if storage report processing happens out of order

2016-04-24 Thread Walter Su (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-10301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15255772#comment-15255772 ] Walter Su commented on HDFS-10301: -- Thank you for your explanation. I learned a lot. > BlockReport

[jira] [Commented] (HDFS-10220) Namenode failover due to too long loking in LeaseManager.Monitor

2016-04-22 Thread Walter Su (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-10220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15253523#comment-15253523 ] Walter Su commented on HDFS-10220: -- I mean, saving administrators the trouble to tune this. > Namenode

[jira] [Commented] (HDFS-10220) Namenode failover due to too long loking in LeaseManager.Monitor

2016-04-21 Thread Walter Su (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-10220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15253354#comment-15253354 ] Walter Su commented on HDFS-10220: -- You are right. The only question I have is I have no idea if the

[jira] [Commented] (HDFS-10301) BlockReport retransmissions may lead to storages falsely being declared zombie if storage report processing happens out of order

2016-04-21 Thread Walter Su (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-10301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15253236#comment-15253236 ] Walter Su commented on HDFS-10301: -- I like your idea of counting storages with same reportId, and no

[jira] [Commented] (HDFS-10301) BlockReport retransmissions may lead to storages falsely being declared zombie if storage report processing happens out of order

2016-04-21 Thread Walter Su (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-10301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15253204#comment-15253204 ] Walter Su commented on HDFS-10301: -- The handler threads will wait anyway, either waiting the queue

[jira] [Commented] (HDFS-10301) BlockReport retransmissions may lead to storages falsely being declared zombie if storage report processing happens out of order

2016-04-21 Thread Walter Su (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-10301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15253181#comment-15253181 ] Walter Su commented on HDFS-10301: -- bq. Enabling HDFS-9198 will fifo process BRs. It doesn't solve this

[jira] [Updated] (HDFS-10301) Blocks removed by thousands due to falsely detected zombie storages

2016-04-21 Thread Walter Su (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-10301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Walter Su updated HDFS-10301: - Assignee: (was: Walter Su) > Blocks removed by thousands due to falsely detected zombie storages >

[jira] [Commented] (HDFS-10220) Namenode failover due to too long loking in LeaseManager.Monitor

2016-04-20 Thread Walter Su (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-10220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15251150#comment-15251150 ] Walter Su commented on HDFS-10220: -- 1. isMaxFilesCheckedToReleaseLease is not required to be a function.

[jira] [Commented] (HDFS-5280) Corrupted meta files on data nodes prevents DFClient from connecting to data nodes and updating corruption status to name node.

2016-04-20 Thread Walter Su (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-5280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15249649#comment-15249649 ] Walter Su commented on HDFS-5280: - +1 for catching the exception. The same exception will cause

[jira] [Commented] (HDFS-9958) BlockManager#createLocatedBlocks can throw NPE for corruptBlocks on failed storages.

2016-04-19 Thread Walter Su (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-9958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15249312#comment-15249312 ] Walter Su commented on HDFS-9958: - bq. we fix countNodes().corruptReplicas() to return the number after

[jira] [Created] (HDFS-10316) revisit corrupt replicas count

2016-04-19 Thread Walter Su (JIRA)
Walter Su created HDFS-10316: Summary: revisit corrupt replicas count Key: HDFS-10316 URL: https://issues.apache.org/jira/browse/HDFS-10316 Project: Hadoop HDFS Issue Type: Bug

[jira] [Updated] (HDFS-9744) TestDirectoryScanner#testThrottling occasionally time out after 300 seconds

2016-04-19 Thread Walter Su (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-9744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Walter Su updated HDFS-9744: Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 2.8.0 Status: Resolved (was:

[jira] [Commented] (HDFS-9744) TestDirectoryScanner#testThrottling occasionally time out after 300 seconds

2016-04-19 Thread Walter Su (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-9744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15247547#comment-15247547 ] Walter Su commented on HDFS-9744: - +1. will commit shortly. > TestDirectoryScanner#testThrottling

[jira] [Updated] (HDFS-10284) o.a.h.hdfs.server.blockmanagement.TestBlockManagerSafeMode.testCheckSafeMode fails intermittently

2016-04-19 Thread Walter Su (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-10284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Walter Su updated HDFS-10284: - Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 2.9.0 Status: Resolved

[jira] [Commented] (HDFS-10284) o.a.h.hdfs.server.blockmanagement.TestBlockManagerSafeMode.testCheckSafeMode fails intermittently

2016-04-19 Thread Walter Su (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-10284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15247468#comment-15247468 ] Walter Su commented on HDFS-10284: -- +1. will commit shortly. >

[jira] [Commented] (HDFS-10291) TestShortCircuitLocalRead failing

2016-04-19 Thread Walter Su (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-10291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15247445#comment-15247445 ] Walter Su commented on HDFS-10291: -- cherry-picked to trunk. > TestShortCircuitLocalRead failing >

[jira] [Commented] (HDFS-9958) BlockManager#createLocatedBlocks can throw NPE for corruptBlocks on failed storages.

2016-04-19 Thread Walter Su (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-9958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15247432#comment-15247432 ] Walter Su commented on HDFS-9958: - Thanks [~kshukla] for the update. I've noticed

[jira] [Updated] (HDFS-10301) Blocks removed by thousands due to falsely detected zombie storages

2016-04-19 Thread Walter Su (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-10301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Walter Su updated HDFS-10301: - Assignee: Walter Su Status: Patch Available (was: Open) Upload a patch. Kindly review. > Blocks

[jira] [Updated] (HDFS-10301) Blocks removed by thousands due to falsely detected zombie storages

2016-04-19 Thread Walter Su (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-10301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Walter Su updated HDFS-10301: - Attachment: HDFS-10301.01.patch > Blocks removed by thousands due to falsely detected zombie storages >

[jira] [Commented] (HDFS-10301) Blocks removed by thousands due to falsely detected zombie storages

2016-04-18 Thread Walter Su (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-10301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15247129#comment-15247129 ] Walter Su commented on HDFS-10301: -- Oh, I see. In this case, the reports are not split. And because

[jira] [Commented] (HDFS-9684) DataNode stopped sending heartbeat after getting OutOfMemoryError form DataTransfer thread.

2016-04-18 Thread Walter Su (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-9684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15247037#comment-15247037 ] Walter Su commented on HDFS-9684: - My previous comment is incorrect. It turns out that the MR tasks

[jira] [Commented] (HDFS-10301) Blocks removed by thousands due to falsely detected zombie storages

2016-04-18 Thread Walter Su (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-10301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15246996#comment-15246996 ] Walter Su commented on HDFS-10301: -- 1. IPC reader is single-thread by default. If it's multi-threaded,

[jira] [Updated] (HDFS-10275) TestDataNodeMetrics failing intermittently due to TotalWriteTime counted incorrectly

2016-04-18 Thread Walter Su (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-10275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Walter Su updated HDFS-10275: - Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 2.7.3 Status: Resolved

[jira] [Commented] (HDFS-10275) TestDataNodeMetrics failing intermittently due to TotalWriteTime counted incorrectly

2016-04-18 Thread Walter Su (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-10275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15245569#comment-15245569 ] Walter Su commented on HDFS-10275: -- sorry I didn't see that. The patch LGTM. +1. > TestDataNodeMetrics

[jira] [Commented] (HDFS-10275) TestDataNodeMetrics failing intermittently due to TotalWriteTime counted incorrectly

2016-04-18 Thread Walter Su (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-10275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15245394#comment-15245394 ] Walter Su commented on HDFS-10275: -- Good analysis! I think a better way to do this is to use a real

[jira] [Commented] (HDFS-10284) o.a.h.hdfs.server.blockmanagement.TestBlockManagerSafeMode.testCheckSafeMode fails intermittently

2016-04-18 Thread Walter Su (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-10284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15245332#comment-15245332 ] Walter Su commented on HDFS-10284: -- bq. I think it's due to mocking fsn while being concurrently accessed

[jira] [Commented] (HDFS-10291) TestShortCircuitLocalRead failing

2016-04-17 Thread Walter Su (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-10291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15245154#comment-15245154 ] Walter Su commented on HDFS-10291: -- +1. > TestShortCircuitLocalRead failing >

[jira] [Updated] (HDFS-9412) getBlocks occupies FSLock and takes too long to complete

2016-04-17 Thread Walter Su (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-9412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Walter Su updated HDFS-9412: Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 2.8.0 Status: Resolved (was:

[jira] [Commented] (HDFS-9412) getBlocks occupies FSLock and takes too long to complete

2016-04-14 Thread Walter Su (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-9412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15240996#comment-15240996 ] Walter Su commented on HDFS-9412: - {{TestBalancer}} passes locally. +1 for the last patch. > getBlocks

[jira] [Commented] (HDFS-9412) getBlocks occupies FSLock and takes too long to complete

2016-04-13 Thread Walter Su (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-9412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15240455#comment-15240455 ] Walter Su commented on HDFS-9412: - Thank you for updating. The test {{TestGetBlocks}} failed. Do you mind

[jira] [Commented] (HDFS-9412) getBlocks occupies FSLock and takes too long to complete

2016-04-13 Thread Walter Su (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-9412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15239103#comment-15239103 ] Walter Su commented on HDFS-9412: - One thread holding a readLock too long is very like holding a writeLock.

[jira] [Updated] (HDFS-9772) TestBlockReplacement#testThrottler doesn't work as expected

2016-04-13 Thread Walter Su (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-9772?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Walter Su updated HDFS-9772: Labels: test (was: ) Priority: Minor (was: Major) Issue Type: Test (was: Bug) >

[jira] [Updated] (HDFS-9772) TestBlockReplacement#testThrottler doesn't work as expected

2016-04-13 Thread Walter Su (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-9772?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Walter Su updated HDFS-9772: Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 2.7.3 Status: Resolved (was:

[jira] [Updated] (HDFS-9772) TestBlockReplacement#testThrottler doesn't work as expected

2016-04-13 Thread Walter Su (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-9772?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Walter Su updated HDFS-9772: Summary: TestBlockReplacement#testThrottler doesn't work as expected (was:

[jira] [Commented] (HDFS-9772) TestBlockReplacement#testThrottler use falut variable to calculate bandwidth

2016-04-13 Thread Walter Su (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-9772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15238984#comment-15238984 ] Walter Su commented on HDFS-9772: - +1. > TestBlockReplacement#testThrottler use falut variable to

[jira] [Commented] (HDFS-9825) Balancer should not terminate if only one of the namenodes has error

2016-04-13 Thread Walter Su (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-9825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15238891#comment-15238891 ] Walter Su commented on HDFS-9825: - The patch looks pretty good. Could you rebase it? And one question:

[jira] [Commented] (HDFS-9476) TestDFSUpgradeFromImage#testUpgradeFromRel1BBWImage occasionally fail

2016-04-13 Thread Walter Su (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-9476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15238658#comment-15238658 ] Walter Su commented on HDFS-9476: - +1. > TestDFSUpgradeFromImage#testUpgradeFromRel1BBWImage occasionally

[jira] [Commented] (HDFS-9826) Erasure Coding: Postpone the recovery work for a configurable time period

2016-04-12 Thread Walter Su (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-9826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15238589#comment-15238589 ] Walter Su commented on HDFS-9826: - Good thought. And I think current implementation {{LowRedundancyBlocks}}

[jira] [Commented] (HDFS-9918) Erasure Coding: Sort located striped blocks based on decommissioned states

2016-04-09 Thread Walter Su (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-9918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15233561#comment-15233561 ] Walter Su commented on HDFS-9918: - +1. Thanks, [~rakesh_r]. > Erasure Coding: Sort located striped blocks

[jira] [Commented] (HDFS-7661) [umbrella] support hflush and hsync for erasure coded files

2016-04-08 Thread Walter Su (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-7661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15232103#comment-15232103 ] Walter Su commented on HDFS-7661: - Great design/discussion. Since we come back to discuss the use cases,

[jira] [Commented] (HDFS-9918) Erasure Coding: Sort located striped blocks based on decommissioned states

2016-04-07 Thread Walter Su (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-9918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15231512#comment-15231512 ] Walter Su commented on HDFS-9918: - The patch looks pretty good. Thanks [~rakesh_r]. tiny suggestions: 1.

[jira] [Commented] (HDFS-9918) Erasure Coding: Sort located striped blocks based on decommissioned states

2016-03-30 Thread Walter Su (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-9918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15219241#comment-15219241 ] Walter Su commented on HDFS-9918: - The optimization works for {{BlockInfoStriped}}. A missing block

[jira] [Commented] (HDFS-9918) Erasure Coding: Sort located striped blocks based on decommissioned states

2016-03-29 Thread Walter Su (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-9918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15217456#comment-15217456 ] Walter Su commented on HDFS-9918: - I see the difference. To achieve your goal, we need a new comparator,

[jira] [Commented] (HDFS-9918) Erasure Coding: Sort located striped blocks based on decommissioned states

2016-03-29 Thread Walter Su (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-9918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15217217#comment-15217217 ] Walter Su commented on HDFS-9918: - bq. 1. Index in the logical block group. 2.Decomm status 3.Distance to

[jira] [Commented] (HDFS-9918) Erasure Coding: Sort located striped blocks based on decommissioned states

2016-03-29 Thread Walter Su (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-9918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15215796#comment-15215796 ] Walter Su commented on HDFS-9918: - We also need to sort locations by distance. It's unlikely but from 2

[jira] [Updated] (HDFS-10182) Hedged read might overwrite user's buf

2016-03-29 Thread Walter Su (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-10182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Walter Su updated HDFS-10182: - Fix Version/s: 2.6.5 committed to branch-2.6 > Hedged read might overwrite user's buf >

[jira] [Updated] (HDFS-10182) Hedged read might overwrite user's buf

2016-03-28 Thread Walter Su (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-10182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Walter Su updated HDFS-10182: - Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 2.7.3 Status: Resolved

[jira] [Commented] (HDFS-10182) Hedged read might overwrite user's buf

2016-03-28 Thread Walter Su (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-10182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15213926#comment-15213926 ] Walter Su commented on HDFS-10182: -- +1. I'll commit it shortly. > Hedged read might overwrite user's buf

[jira] [Commented] (HDFS-9952) Expose FSNamesystem lock wait time as metrics

2016-03-28 Thread Walter Su (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-9952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15213915#comment-15213915 ] Walter Su commented on HDFS-9952: - Thanks [~vinayrpet] for updating. Just one minor suggestion: We should

[jira] [Commented] (HDFS-10182) Hedged read might overwrite user's buf

2016-03-19 Thread Walter Su (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-10182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15201350#comment-15201350 ] Walter Su commented on HDFS-10182: -- And because {{cancelAll(futures);}} doesn't interrupt the first

[jira] [Commented] (HDFS-9952) Expose FSNamesystem lock wait time as metrics

2016-03-18 Thread Walter Su (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-9952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15198770#comment-15198770 ] Walter Su commented on HDFS-9952: - bq. MutableRate#add is synchronized in an extremely critical code path

[jira] [Commented] (HDFS-9684) DataNode stopped sending heartbeat after getting OutOfMemoryError form DataTransfer thread.

2016-03-15 Thread Walter Su (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-9684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194914#comment-15194914 ] Walter Su commented on HDFS-9684: - I have seen a case DN got command from NN to transfer huge numbers of

[jira] [Updated] (HDFS-8211) DataNode UUID is always null in the JMX counter

2016-03-14 Thread Walter Su (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-8211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Walter Su updated HDFS-8211: Priority: Major (was: Minor) Changes Priority to Major. As [~qwertymaniac] pointed out, this patch

[jira] [Commented] (HDFS-9822) Erasure Coding: Avoids scheduling multiple reconstruction tasks for a striped block at the same time

2016-03-09 Thread Walter Su (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-9822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15186974#comment-15186974 ] Walter Su commented on HDFS-9822: - bq. I am still a little confused how this error happens. Me too. I don't

[jira] [Commented] (HDFS-7866) Erasure coding: NameNode manages multiple erasure coding policies

2016-03-08 Thread Walter Su (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-7866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15184856#comment-15184856 ] Walter Su commented on HDFS-7866: - Sorry for the confusion, now the javadoc looks verbose. But thanks for

[jira] [Commented] (HDFS-7866) Erasure coding: NameNode manages multiple erasure coding policies

2016-03-07 Thread Walter Su (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-7866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15184267#comment-15184267 ] Walter Su commented on HDFS-7866: - 1. Not only javadoc, what I mean was separating the logic of

[jira] [Commented] (HDFS-7866) Erasure coding: NameNode manages multiple erasure coding policies

2016-03-07 Thread Walter Su (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-7866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15183118#comment-15183118 ] Walter Su commented on HDFS-7866: - What do you think about letting it diverge instead of forcing unification? It

[jira] [Commented] (HDFS-9803) Proactively refresh ShortCircuitCache entries to avoid latency spikes

2016-02-19 Thread Walter Su (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-9803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15155386#comment-15155386 ] Walter Su commented on HDFS-9803: - Is it related to HDFS-5637? Which version of the hdfs-client module do you use?

[jira] [Updated] (HDFS-9716) o.a.h.hdfs.TestRecoverStripedFile fails intermittently in trunk

2016-02-19 Thread Walter Su (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-9716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Walter Su updated HDFS-9716: Resolution: Cannot Reproduce Status: Resolved (was: Patch Available) HDFS-9755 covers the same fix.

[jira] [Commented] (HDFS-9816) Erasure Coding: allow to use multiple EC policies in striping related tests [Part 3]

2016-02-17 Thread Walter Su (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-9816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15151796#comment-15151796 ] Walter Su commented on HDFS-9816: - bq. Then we can move the current hard coded suite to TestBlockRecovery.

[jira] [Commented] (HDFS-9816) Erasure Coding: allow to use multiple EC policies in striping related tests [Part 3]

2016-02-17 Thread Walter Su (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-9816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15150432#comment-15150432 ] Walter Su commented on HDFS-9816: - The safe length may change if steps 3 and 4 are included (see HDFS-9173). I

[jira] [Updated] (HDFS-9347) Invariant assumption in TestQuorumJournalManager.shutdown() is wrong

2016-02-08 Thread Walter Su (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-9347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Walter Su updated HDFS-9347: Fix Version/s: 2.6.5 2.7.3 > Invariant assumption in TestQuorumJournalManager.shutdown()

[jira] [Updated] (HDFS-9752) Permanent write failures may happen to slow writers during datanode rolling upgrades

2016-02-08 Thread Walter Su (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-9752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Walter Su updated HDFS-9752: Attachment: HDFS-9752-branch-2.7.03.patch HDFS-9752-branch-2.6.03.patch > Permanent write

[jira] [Commented] (HDFS-9347) Invariant assumption in TestQuorumJournalManager.shutdown() is wrong

2016-02-08 Thread Walter Su (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-9347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15138264#comment-15138264 ] Walter Su commented on HDFS-9347: - Thanks [~jojochuang] for the work. I just cherry-picked it to branch-2.7

[jira] [Commented] (HDFS-9752) Permanent write failures may happen to slow writers during datanode rolling upgrades

2016-02-08 Thread Walter Su (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-9752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15138306#comment-15138306 ] Walter Su commented on HDFS-9752: - Thanks all for reviewing the patch. The patch depends on HDFS-9347. I

[jira] [Updated] (HDFS-9752) Permanent write failures may happen to slow writers during datanode rolling upgrades

2016-02-06 Thread Walter Su (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-9752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Walter Su updated HDFS-9752: Attachment: HDFS-9752.03.patch bq. The test can just verify that pipelineRecoveryCount is not incremented

[jira] [Updated] (HDFS-9752) Permanent write failures may happen to slow writers during datanode rolling upgrades

2016-02-05 Thread Walter Su (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-9752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Walter Su updated HDFS-9752: Attachment: HDFS-9752.02.patch Thanks for the advice. Uploaded 02 patch. The test now takes ~30s. But it's

[jira] [Commented] (HDFS-9752) Permanent write failures may happen to slow writers during datanode rolling upgrades

2016-02-05 Thread Walter Su (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-9752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15134131#comment-15134131 ] Walter Su commented on HDFS-9752: - Hi, [~xiaobingo]. I think the 'write failure' means the outputstream

[jira] [Created] (HDFS-9748) When addExpectedReplicasToPending is called twice, pendingReplications should avoid duplication

2016-02-03 Thread Walter Su (JIRA)
Walter Su created HDFS-9748: --- Summary: When addExpectedReplicasToPending is called twice, pendingReplications should avoid duplication Key: HDFS-9748 URL: https://issues.apache.org/jira/browse/HDFS-9748

[jira] [Updated] (HDFS-9748) When addExpectedReplicasToPending is called twice, pendingReplications should avoid duplication

2016-02-03 Thread Walter Su (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-9748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Walter Su updated HDFS-9748: Affects Version/s: 2.8.0 Status: Patch Available (was: Open) > When

[jira] [Updated] (HDFS-9748) When addExpectedReplicasToPending is called twice, pendingReplications should avoid duplication

2016-02-03 Thread Walter Su (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-9748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Walter Su updated HDFS-9748: Attachment: HDFS-9748.01.patch > When addExpectedReplicasToPending is called twice, pendingReplications

[jira] [Updated] (HDFS-9748) When addExpectedReplicasToPending is called twice, pendingReplications should avoid duplication

2016-02-03 Thread Walter Su (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-9748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Walter Su updated HDFS-9748: Description: 1. When completeFile() is called, addExpectedReplicasToPending() will be called (HDFS-8999).

[jira] [Updated] (HDFS-9752) Permanent write failures may happen to slow writers during datanode rolling upgrades

2016-02-03 Thread Walter Su (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-9752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Walter Su updated HDFS-9752: Attachment: HDFS-9752.01.patch Thanks [~kihwal] for reporting this. Uploaded 01 patch, kindly review. The

[jira] [Updated] (HDFS-9752) Permanent write failures may happen to slow writers during datanode rolling upgrades

2016-02-03 Thread Walter Su (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-9752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Walter Su updated HDFS-9752: Assignee: Walter Su Status: Patch Available (was: Open) > Permanent write failures may happen to slow

[jira] [Commented] (HDFS-9716) o.a.h.hdfs.TestRecoverStripedFile fails intermittently in trunk

2016-02-02 Thread Walter Su (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-9716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15127932#comment-15127932 ] Walter Su commented on HDFS-9716: - There are 2 ways to make BlockGroup under-replicated: 1. shutdown DN, or

[jira] [Assigned] (HDFS-9716) o.a.h.hdfs.TestRecoverStripedFile fails intermittently in trunk

2016-02-02 Thread Walter Su (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-9716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Walter Su reassigned HDFS-9716: --- Assignee: Walter Su > o.a.h.hdfs.TestRecoverStripedFile fails intermittently in trunk >

[jira] [Updated] (HDFS-9716) o.a.h.hdfs.TestRecoverStripedFile fails intermittently in trunk

2016-02-02 Thread Walter Su (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-9716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Walter Su updated HDFS-9716: Labels: test (was: ) Affects Version/s: (was: 2.8.0) Status: Patch

[jira] [Commented] (HDFS-9716) o.a.h.hdfs.TestRecoverStripedFile fails intermittently in trunk

2016-02-02 Thread Walter Su (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-9716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15128008#comment-15128008 ] Walter Su commented on HDFS-9716: - You can easily reproduce the failure by delaying line 345 with a breakpoint

[jira] [Updated] (HDFS-9716) o.a.h.hdfs.TestRecoverStripedFile fails intermittently in trunk

2016-02-02 Thread Walter Su (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-9716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Walter Su updated HDFS-9716: Attachment: HDFS-9716.01.patch > o.a.h.hdfs.TestRecoverStripedFile fails intermittently in trunk >

[jira] [Commented] (HDFS-9716) o.a.h.hdfs.TestRecoverStripedFile fails intermittently in trunk

2016-02-02 Thread Walter Su (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-9716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15128003#comment-15128003 ] Walter Su commented on HDFS-9716: - {{MiniDFSCluster.getBlockFile}} on _dead DN_ is called before replica

[jira] [Updated] (HDFS-9716) o.a.h.hdfs.TestRecoverStripedFile fails intermittently in trunk

2016-02-02 Thread Walter Su (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-9716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Walter Su updated HDFS-9716: Attachment: HDFS-9716.02.patch Thanks [~drankye]. Uploaded 02 patch to address that. >

[jira] [Commented] (HDFS-9646) ErasureCodingWorker may fail when recovering data blocks with length less than the first internal block

2016-01-14 Thread Walter Su (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-9646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15098152#comment-15098152 ] Walter Su commented on HDFS-9646: - {code} + bufferSize, (int)(maxTargetLength -

[jira] [Commented] (HDFS-9534) Add CLI command to clear storage policy from a path.

2016-01-14 Thread Walter Su (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-9534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15101365#comment-15101365 ] Walter Su commented on HDFS-9534: - Thanks [~xiaobingo]. 1. I think the original design doesn't mean to make

[jira] [Commented] (HDFS-7661) Erasure coding: support hflush and hsync

2016-01-06 Thread Walter Su (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-7661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15085202#comment-15085202 ] Walter Su commented on HDFS-7661: - You totally miss my point. A successful flush is a guarantee that the

[jira] [Commented] (HDFS-7661) Erasure coding: support hflush and hsync

2016-01-05 Thread Walter Su (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-7661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15084527#comment-15084527 ] Walter Su commented on HDFS-7661: - According to the description, 1. 3 parity blocks should be updated in

[jira] [Commented] (HDFS-7661) Erasure coding: support hflush and hsync

2016-01-04 Thread Walter Su (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-7661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15082272#comment-15082272 ] Walter Su commented on HDFS-7661: - bq. But, the older parity internal block comes back later, then we have

[jira] [Commented] (HDFS-8430) Erasure coding: update DFSClient.getFileChecksum() logic for stripe files

2016-01-04 Thread Walter Su (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-8430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15082228#comment-15082228 ] Walter Su commented on HDFS-8430: - Thanks [~szetszwo] for clarifying. {{New Algorithm 2}} looks good. And
