[jira] [Commented] (HDFS-15410) Add separated config file fedbalance-default.xml for fedbalance tool

2020-06-19 Thread Yiqun Lin (Jira)
[ https://issues.apache.org/jira/browse/HDFS-15410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17140960#comment-17140960 ] Yiqun Lin commented on HDFS-15410: -- Besides [~elgoiri]'s review comment, some more review comments from

[jira] [Updated] (HDFS-15423) RBF: WebHDFS create shouldn't choose DN from all sub-clusters

2020-06-19 Thread Chao Sun (Jira)
[ https://issues.apache.org/jira/browse/HDFS-15423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Sun updated HDFS-15423: Component/s: webhdfs > RBF: WebHDFS create shouldn't choose DN from all sub-clusters >

[jira] [Created] (HDFS-15423) RBF: WebHDFS create shouldn't choose DN from all sub-clusters

2020-06-19 Thread Chao Sun (Jira)
Chao Sun created HDFS-15423: --- Summary: RBF: WebHDFS create shouldn't choose DN from all sub-clusters Key: HDFS-15423 URL: https://issues.apache.org/jira/browse/HDFS-15423 Project: Hadoop HDFS

[jira] [Commented] (HDFS-13082) cookieverf mismatch error over NFS gateway on Linux

2020-06-19 Thread Daniel Howard (Jira)
[ https://issues.apache.org/jira/browse/HDFS-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17140885#comment-17140885 ] Daniel Howard commented on HDFS-13082: -- I am running into this as well, but the AIX compatibility

[jira] [Updated] (HDFS-15422) Reported IBR is partially replaced with stored info when queuing.

2020-06-19 Thread Kihwal Lee (Jira)
[ https://issues.apache.org/jira/browse/HDFS-15422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HDFS-15422: -- Description: When queueing an IBR (incremental block report) on a standby namenode, some of the

[jira] [Commented] (HDFS-15415) Reduce locking in Datanode DirectoryScanner

2020-06-19 Thread Hadoop QA (Jira)
[ https://issues.apache.org/jira/browse/HDFS-15415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17140812#comment-17140812 ] Hadoop QA commented on HDFS-15415: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem ||

[jira] [Commented] (HDFS-15415) Reduce locking in Datanode DirectoryScanner

2020-06-19 Thread Stephen O'Donnell (Jira)
[ https://issues.apache.org/jira/browse/HDFS-15415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17140779#comment-17140779 ] Stephen O'Donnell commented on HDFS-15415: -- If a block is RBW or RUR before the snapshot of

[jira] [Updated] (HDFS-15417) RBF: Get the datanode report from cache for federation WebHDFS operations

2020-06-19 Thread Ye Ni (Jira)
[ https://issues.apache.org/jira/browse/HDFS-15417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ye Ni updated HDFS-15417: - Description: *Why* For WebHDFS CREATE, OPEN, APPEND and GETFILECHECKSUM operations, router or namenode needs

[jira] [Updated] (HDFS-15417) RBF: Get the datanode report from cache for federation WebHDFS operations

2020-06-19 Thread Ye Ni (Jira)
[ https://issues.apache.org/jira/browse/HDFS-15417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ye Ni updated HDFS-15417: - Summary: RBF: Get the datanode report from cache for federation WebHDFS operations (was: RBF: Lazy get the

[jira] [Updated] (HDFS-15417) RBF: Lazy get the datanode report for federation WebHDFS operations

2020-06-19 Thread Ye Ni (Jira)
[ https://issues.apache.org/jira/browse/HDFS-15417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ye Ni updated HDFS-15417: - Priority: Major (was: Minor) > RBF: Lazy get the datanode report for federation WebHDFS operations >

[jira] [Commented] (HDFS-15416) DataStorage#addStorageLocations() should add more reasonable information verification.

2020-06-19 Thread Íñigo Goiri (Jira)
[ https://issues.apache.org/jira/browse/HDFS-15416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17140720#comment-17140720 ] Íñigo Goiri commented on HDFS-15416: Let's go with the patch here instead of the PR. Can we add a

[jira] [Commented] (HDFS-15410) Add separated config file fedbalance-default.xml for fedbalance tool

2020-06-19 Thread Íñigo Goiri (Jira)
[ https://issues.apache.org/jira/browse/HDFS-15410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17140718#comment-17140718 ] Íñigo Goiri commented on HDFS-15410: We can fix the checkstyle. What about adding hdfs as a prefix

[jira] [Commented] (HDFS-15415) Reduce locking in Datanode DirectoryScanner

2020-06-19 Thread hemanthboyina (Jira)
[ https://issues.apache.org/jira/browse/HDFS-15415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17140699#comment-17140699 ] hemanthboyina commented on HDFS-15415: -- thanks [~sodonnell] for your analysis after taking the

[jira] [Updated] (HDFS-15415) Reduce locking in Datanode DirectoryScanner

2020-06-19 Thread Stephen O'Donnell (Jira)
[ https://issues.apache.org/jira/browse/HDFS-15415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephen O'Donnell updated HDFS-15415: - Status: Patch Available (was: Open) > Reduce locking in Datanode DirectoryScanner >

[jira] [Commented] (HDFS-15422) Reported IBR is partially replaced with stored info when queuing.

2020-06-19 Thread Kihwal Lee (Jira)
[ https://issues.apache.org/jira/browse/HDFS-15422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17140618#comment-17140618 ] Kihwal Lee commented on HDFS-15422: --- The fix is simple. {code} @@ -2578,10 +2578,7 @@ private

[jira] [Created] (HDFS-15422) Reported IBR is partially replaced with stored info when queuing.

2020-06-19 Thread Kihwal Lee (Jira)
Kihwal Lee created HDFS-15422: - Summary: Reported IBR is partially replaced with stored info when queuing. Key: HDFS-15422 URL: https://issues.apache.org/jira/browse/HDFS-15422 Project: Hadoop HDFS

[jira] [Comment Edited] (HDFS-15421) IBR leak causes standby NN to be stuck in safe mode

2020-06-19 Thread Kihwal Lee (Jira)
[ https://issues.apache.org/jira/browse/HDFS-15421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17140607#comment-17140607 ] Kihwal Lee edited comment on HDFS-15421 at 6/19/20, 2:56 PM: - Example of a

[jira] [Updated] (HDFS-15421) IBR leak causes standby NN to be stuck in safe mode

2020-06-19 Thread Kihwal Lee (Jira)
[ https://issues.apache.org/jira/browse/HDFS-15421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HDFS-15421: -- Priority: Blocker (was: Critical) > IBR leak causes standby NN to be stuck in safe mode >

[jira] [Commented] (HDFS-14941) Potential editlog race condition can cause corrupted file

2020-06-19 Thread Kihwal Lee (Jira)
[ https://issues.apache.org/jira/browse/HDFS-14941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17140608#comment-17140608 ] Kihwal Lee commented on HDFS-14941: --- Filed HDFS-1542 with more details. > Potential editlog race

[jira] [Commented] (HDFS-15421) IBR leak causes standby NN to be stuck in safe mode

2020-06-19 Thread Kihwal Lee (Jira)
[ https://issues.apache.org/jira/browse/HDFS-15421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17140607#comment-17140607 ] Kihwal Lee commented on HDFS-15421: --- Example of a leak itself: (single replica shown for simplicity)

[jira] [Commented] (HDFS-15421) IBR leak causes standby NN to be stuck in safe mode

2020-06-19 Thread Kihwal Lee (Jira)
[ https://issues.apache.org/jira/browse/HDFS-15421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17140597#comment-17140597 ] Kihwal Lee commented on HDFS-15421: --- This is an example of "stuck safe mode" from one of our small test

[jira] [Created] (HDFS-15421) IBR leak causes standby NN to be stuck in safe mode

2020-06-19 Thread Kihwal Lee (Jira)
Kihwal Lee created HDFS-15421: - Summary: IBR leak causes standby NN to be stuck in safe mode Key: HDFS-15421 URL: https://issues.apache.org/jira/browse/HDFS-15421 Project: Hadoop HDFS Issue

[jira] [Commented] (HDFS-15415) Reduce locking in Datanode DirectoryScanner

2020-06-19 Thread Stephen O'Donnell (Jira)
[ https://issues.apache.org/jira/browse/HDFS-15415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17140477#comment-17140477 ] Stephen O'Donnell commented on HDFS-15415: -- I have annotated the main loop the DirectoryScanner

[jira] [Commented] (HDFS-15415) Reduce locking in Datanode DirectoryScanner

2020-06-19 Thread Stephen O'Donnell (Jira)
[ https://issues.apache.org/jira/browse/HDFS-15415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17140397#comment-17140397 ] Stephen O'Donnell commented on HDFS-15415: -- Uploaded initial patch to remove the unnecessary

[jira] [Updated] (HDFS-15415) Reduce locking in Datanode DirectoryScanner

2020-06-19 Thread Stephen O'Donnell (Jira)
[ https://issues.apache.org/jira/browse/HDFS-15415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephen O'Donnell updated HDFS-15415: - Attachment: HDFS-15415.001.patch > Reduce locking in Datanode DirectoryScanner >

[jira] [Comment Edited] (HDFS-15419) RBF: Router should retry communicate with NN when cluster is unavailable using configurable time interval

2020-06-19 Thread bhji123 (Jira)
[ https://issues.apache.org/jira/browse/HDFS-15419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17140300#comment-17140300 ] bhji123 edited comment on HDFS-15419 at 6/19/20, 7:33 AM: -- Yes, router is just a

[jira] [Commented] (HDFS-15419) RBF: Router should retry communicate with NN when cluster is unavailable using configurable time interval

2020-06-19 Thread bhji123 (Jira)
[ https://issues.apache.org/jira/browse/HDFS-15419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17140300#comment-17140300 ] bhji123 commented on HDFS-15419: Yes, router is just a proxy, and it's also a server. Clients can decide 

[jira] [Commented] (HDFS-15410) Add separated config file fedbalance-default.xml for fedbalance tool

2020-06-19 Thread Hadoop QA (Jira)
[ https://issues.apache.org/jira/browse/HDFS-15410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17140299#comment-17140299 ] Hadoop QA commented on HDFS-15410: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem ||

[jira] [Issue Comment Deleted] (HDFS-15419) RBF: Router should retry communicate with NN when cluster is unavailable using configurable time interval

2020-06-19 Thread bhji123 (Jira)
[ https://issues.apache.org/jira/browse/HDFS-15419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] bhji123 updated HDFS-15419: --- Comment: was deleted (was: Yes, but clients may not be configured appropriately. But if router can retry too,

[jira] [Commented] (HDFS-15419) RBF: Router should retry communicate with NN when cluster is unavailable using configurable time interval

2020-06-19 Thread bhji123 (Jira)
[ https://issues.apache.org/jira/browse/HDFS-15419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17140283#comment-17140283 ] bhji123 commented on HDFS-15419: --- Yes, but clients may not be configured appropriately. But if router can

[jira] [Commented] (HDFS-15419) RBF: Router should retry communicate with NN when cluster is unavailable using configurable time interval

2020-06-19 Thread Yuxuan Wang (Jira)
[ https://issues.apache.org/jira/browse/HDFS-15419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17140277#comment-17140277 ] Yuxuan Wang commented on HDFS-15419: [~ayushtkn] Thanks for your reply. IIRC, now router will retry

[jira] [Comment Edited] (HDFS-15419) RBF: Router should retry communicate with NN when cluster is unavailable using configurable time interval

2020-06-19 Thread Ayush Saxena (Jira)
[ https://issues.apache.org/jira/browse/HDFS-15419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17140258#comment-17140258 ] Ayush Saxena edited comment on HDFS-15419 at 6/19/20, 6:48 AM: --- The present

[jira] [Commented] (HDFS-15419) RBF: Router should retry communicate with NN when cluster is unavailable using configurable time interval

2020-06-19 Thread Ayush Saxena (Jira)
[ https://issues.apache.org/jira/browse/HDFS-15419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17140258#comment-17140258 ] Ayush Saxena commented on HDFS-15419: - The present code is to have failover is because the router

[jira] [Commented] (HDFS-15419) RBF: Router should retry communicate with NN when cluster is unavailable using configurable time interval

2020-06-19 Thread Yuxuan Wang (Jira)
[ https://issues.apache.org/jira/browse/HDFS-15419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17140249#comment-17140249 ] Yuxuan Wang commented on HDFS-15419: [~bhji123] Well, I more agree with [~ayushtkn]. And I think we

[jira] [Commented] (HDFS-15419) RBF: Router should retry communicate with NN when cluster is unavailable using configurable time interval

2020-06-19 Thread bhji123 (Jira)
[ https://issues.apache.org/jira/browse/HDFS-15419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17140242#comment-17140242 ] bhji123 commented on HDFS-15419: hi, Yuxuan. In this case, if clients timeout and nn is still

[jira] [Commented] (HDFS-15410) Add separated config file fedbalance-default.xml for fedbalance tool

2020-06-19 Thread Jinglun (Jira)
[ https://issues.apache.org/jira/browse/HDFS-15410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17140240#comment-17140240 ] Jinglun commented on HDFS-15410: --- Hi [~elgoiri], thanks for your nice comments! Refer the fedbalance-site.xml

[jira] [Updated] (HDFS-15410) Add separated config file fedbalance-default.xml for fedbalance tool

2020-06-19 Thread Jinglun (Jira)
[ https://issues.apache.org/jira/browse/HDFS-15410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jinglun updated HDFS-15410: --- Attachment: HDFS-15410.002.patch > Add separated config file fedbalance-default.xml for fedbalance tool >