[jira] [Updated] (HDFS-15151) Use TransmitFile for file to socket data transfer

2020-01-30 Thread Lukas Majercak (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lukas Majercak updated HDFS-15151:
--
Status: Patch Available  (was: In Progress)

> Use TransmitFile for file to socket data transfer
> -
>
> Key: HDFS-15151
> URL: https://issues.apache.org/jira/browse/HDFS-15151
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: datanode
>Affects Versions: 3.3.0, 3.1.4, 3.2.2, 3.3.1, 3.4.0
>Reporter: Lukas Majercak
>Assignee: Lukas Majercak
>Priority: Major
>  Labels: NIO, Windows, datanode
>
> Proposing to give an option to use TransmitFile Windows function for file to 
> socket data transfer. 
> https://docs.microsoft.com/en-us/windows/win32/api/mswsock/nf-mswsock-transmitfile
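
For readers unfamiliar with this kind of transfer, the portable Java NIO counterpart to a file-to-socket call is FileChannel.transferTo. The sketch below is only an illustration of that pattern and is not the patch attached to this issue. The class and method names are hypothetical, an in-memory channel stands in for the socket, and whether a given JDK maps transferTo onto TransmitFile on Windows is not covered in this thread.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.Channels;
import java.nio.channels.FileChannel;
import java.nio.channels.WritableByteChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper, not part of Hadoop.
final class FileToChannelTransfer {

    /** Sends up to count bytes of the file, starting at position, to target; returns bytes sent. */
    static long transferFully(FileChannel file, long position, long count,
                              WritableByteChannel target) throws IOException {
        long sent = 0;
        while (sent < count) {
            // transferTo may move fewer bytes than requested, so loop until done.
            long n = file.transferTo(position + sent, count - sent, target);
            if (n <= 0) {
                break; // nothing moved: past end of file, or target cannot accept more right now
            }
            sent += n;
        }
        return sent;
    }

    /** Demo helper: writes content to a temp file and returns the requested range via transferFully. */
    static byte[] copyRange(byte[] content, long position, long count) {
        try {
            Path tmp = Files.createTempFile("block", ".dat");
            try {
                Files.write(tmp, content);
                ByteArrayOutputStream sink = new ByteArrayOutputStream();
                // In a datanode the target would be a SocketChannel; a byte sink keeps the demo local.
                try (FileChannel in = FileChannel.open(tmp, StandardOpenOption.READ)) {
                    transferFully(in, position, count, Channels.newChannel(sink));
                }
                return sink.toByteArray();
            } finally {
                Files.delete(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] range = copyRange("hello, datanode".getBytes(), 7, 8);
        System.out.println(new String(range));
    }
}
```

The loop matters because a single transferTo call is allowed to send only part of the requested range.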



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15151) Use TransmitFile for file to socket data transfer

2020-01-30 Thread Lukas Majercak (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lukas Majercak updated HDFS-15151:
--
Component/s: datanode

> Use TransmitFile for file to socket data transfer
> -
>
> Key: HDFS-15151
> URL: https://issues.apache.org/jira/browse/HDFS-15151
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: datanode
>Affects Versions: 3.3.0, 3.1.4, 3.2.2, 3.3.1, 3.4.0
>Reporter: Lukas Majercak
>Assignee: Lukas Majercak
>Priority: Major
>  Labels: NIO, Windows, datanode
>
> Proposing to give an option to use TransmitFile Windows function for file to 
> socket data transfer. 
> https://docs.microsoft.com/en-us/windows/win32/api/mswsock/nf-mswsock-transmitfile






[jira] [Updated] (HDFS-15151) Use TransmitFile for file to socket data transfer

2020-01-30 Thread Lukas Majercak (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lukas Majercak updated HDFS-15151:
--
Labels: NIO Windows datanode  (was: )

> Use TransmitFile for file to socket data transfer
> -
>
> Key: HDFS-15151
> URL: https://issues.apache.org/jira/browse/HDFS-15151
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Affects Versions: 3.3.0, 3.1.4, 3.2.2, 3.3.1, 3.4.0
>Reporter: Lukas Majercak
>Assignee: Lukas Majercak
>Priority: Major
>  Labels: NIO, Windows, datanode
>
> Proposing to give an option to use TransmitFile Windows function for file to 
> socket data transfer. 
> https://docs.microsoft.com/en-us/windows/win32/api/mswsock/nf-mswsock-transmitfile






[jira] [Work started] (HDFS-15151) Use TransmitFile for file to socket data transfer

2020-01-30 Thread Lukas Majercak (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HDFS-15151 started by Lukas Majercak.
-
> Use TransmitFile for file to socket data transfer
> -
>
> Key: HDFS-15151
> URL: https://issues.apache.org/jira/browse/HDFS-15151
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Affects Versions: 3.3.0, 3.1.4, 3.2.2, 3.3.1, 3.4.0
>Reporter: Lukas Majercak
>Assignee: Lukas Majercak
>Priority: Major
>
> Proposing to give an option to use TransmitFile Windows function for file to 
> socket data transfer. 
> https://docs.microsoft.com/en-us/windows/win32/api/mswsock/nf-mswsock-transmitfile






[jira] [Updated] (HDFS-15151) Use TransmitFile for file to socket data transfer

2020-01-30 Thread Lukas Majercak (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lukas Majercak updated HDFS-15151:
--
Affects Version/s: 3.4.0
   3.3.1
   3.2.2
   3.1.4
   3.3.0

> Use TransmitFile for file to socket data transfer
> -
>
> Key: HDFS-15151
> URL: https://issues.apache.org/jira/browse/HDFS-15151
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Affects Versions: 3.3.0, 3.1.4, 3.2.2, 3.3.1, 3.4.0
>Reporter: Lukas Majercak
>Assignee: Lukas Majercak
>Priority: Major
>
> Proposing to give an option to use TransmitFile Windows function for file to 
> socket data transfer. 
> https://docs.microsoft.com/en-us/windows/win32/api/mswsock/nf-mswsock-transmitfile






[jira] [Updated] (HDFS-15151) Use TransmitFile for file to socket data transfer

2020-01-30 Thread Lukas Majercak (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lukas Majercak updated HDFS-15151:
--
Description: 
Proposing to give an option to use TransmitFile Windows function for file to 
socket data transfer. 
https://docs.microsoft.com/en-us/windows/win32/api/mswsock/nf-mswsock-transmitfile

  was:
Proposing to give an option to use TransmitFile Windows function for file to 
socket data transfer. 



> Use TransmitFile for file to socket data transfer
> -
>
> Key: HDFS-15151
> URL: https://issues.apache.org/jira/browse/HDFS-15151
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Lukas Majercak
>Assignee: Lukas Majercak
>Priority: Major
>
> Proposing to give an option to use TransmitFile Windows function for file to 
> socket data transfer. 
> https://docs.microsoft.com/en-us/windows/win32/api/mswsock/nf-mswsock-transmitfile






[jira] [Updated] (HDFS-15151) Use TransmitFile for file to socket data transfer

2020-01-30 Thread Lukas Majercak (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lukas Majercak updated HDFS-15151:
--
Description: Proposing to give an option to use TransmitFile 

> Use TransmitFile for file to socket data transfer
> -
>
> Key: HDFS-15151
> URL: https://issues.apache.org/jira/browse/HDFS-15151
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Lukas Majercak
>Assignee: Lukas Majercak
>Priority: Major
>
> Proposing to give an option to use TransmitFile 






[jira] [Updated] (HDFS-15151) Use TransmitFile for file to socket data transfer

2020-01-30 Thread Lukas Majercak (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lukas Majercak updated HDFS-15151:
--
Description: 
Proposing to give an option to use TransmitFile Windows function for file to 
socket data transfer. 


  was:Proposing to give an option to use TransmitFile 


> Use TransmitFile for file to socket data transfer
> -
>
> Key: HDFS-15151
> URL: https://issues.apache.org/jira/browse/HDFS-15151
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Lukas Majercak
>Assignee: Lukas Majercak
>Priority: Major
>
> Proposing to give an option to use TransmitFile Windows function for file to 
> socket data transfer. 






[jira] [Created] (HDFS-15151) Use TransmitFile for file to socket data transfer

2020-01-30 Thread Lukas Majercak (Jira)
Lukas Majercak created HDFS-15151:
-

 Summary: Use TransmitFile for file to socket data transfer
 Key: HDFS-15151
 URL: https://issues.apache.org/jira/browse/HDFS-15151
 Project: Hadoop HDFS
  Issue Type: New Feature
Reporter: Lukas Majercak
Assignee: Lukas Majercak









[jira] [Updated] (HDFS-15055) Hedging clones client's buffer

2019-12-12 Thread Lukas Majercak (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lukas Majercak updated HDFS-15055:
--
Priority: Minor  (was: Major)

> Hedging clones client's buffer
> --
>
> Key: HDFS-15055
> URL: https://issues.apache.org/jira/browse/HDFS-15055
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 2.9.2, 3.3.0, 3.2.1, 2.9.3, 3.2.2
>Reporter: Lukas Majercak
>Priority: Minor
>
> Currently, DFSInputStream clones the buffer passed from the caller for every 
> request, this can have severe impact on the performance.






[jira] [Reopened] (HDFS-15055) Hedging clones client's buffer

2019-12-12 Thread Lukas Majercak (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lukas Majercak reopened HDFS-15055:
---

> Hedging clones client's buffer
> --
>
> Key: HDFS-15055
> URL: https://issues.apache.org/jira/browse/HDFS-15055
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 2.9.2, 3.3.0, 3.2.1, 2.9.3, 3.2.2
>Reporter: Lukas Majercak
>Priority: Major
>
> Currently, DFSInputStream clones the buffer passed from the caller for every 
> request, this can have severe impact on the performance.






[jira] [Commented] (HDFS-15055) Hedging clones client's buffer

2019-12-12 Thread Lukas Majercak (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16995256#comment-16995256
 ] 

Lukas Majercak commented on HDFS-15055:
---

That said, I feel this could still be an issue: potentially we'll create a 
buffer of up to block size for every single hedged request.
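
To make the memory concern above concrete, here is a minimal sketch of a hedged read in which every attempt gets its own buffer sized to the requested length. It is not the DFSInputStream code: the Replica interface, the class name, and the use of invokeAny are all assumptions for illustration, and unlike a real hedged read (which presumably starts the second attempt only after a delay) it fires all attempts at once.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical sketch, not part of Hadoop.
final class HedgedReadSketch {

    /** Hypothetical replica: fills dst with bytes starting at offset. */
    interface Replica {
        void read(long offset, ByteBuffer dst);
    }

    /** Races all replicas and copies the first successful result into callerBuf. */
    static void hedgedRead(List<Replica> replicas, long offset, int len, ByteBuffer callerBuf) {
        ExecutorService pool = Executors.newFixedThreadPool(replicas.size());
        try {
            List<Callable<ByteBuffer>> attempts = new ArrayList<>();
            for (Replica replica : replicas) {
                attempts.add(() -> {
                    // One buffer per attempt, sized to the request: memory cost is attempts * len.
                    ByteBuffer own = ByteBuffer.allocate(len);
                    replica.read(offset, own);
                    own.flip();
                    return own;
                });
            }
            ByteBuffer winner = pool.invokeAny(attempts);
            callerBuf.put(winner); // only the winning attempt touches the caller's buffer
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        byte[] block = {10, 11, 12, 13, 14, 15};
        Replica replica = (offset, dst) -> dst.put(block, (int) offset, dst.remaining());
        ByteBuffer out = ByteBuffer.allocate(3);
        hedgedRead(Arrays.asList(replica, replica), 2, 3, out);
        System.out.println(Arrays.toString(out.array()));
    }
}
```

Separate buffers keep a losing attempt from writing into memory the caller is already using, at the cost of one allocation of the requested length per attempt.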

> Hedging clones client's buffer
> --
>
> Key: HDFS-15055
> URL: https://issues.apache.org/jira/browse/HDFS-15055
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 2.9.2, 3.3.0, 3.2.1, 2.9.3, 3.2.2
>Reporter: Lukas Majercak
>Priority: Major
>
> Currently, DFSInputStream clones the buffer passed from the caller for every 
> request, this can have severe impact on the performance.






[jira] [Commented] (HDFS-15055) Hedging clones client's buffer

2019-12-12 Thread Lukas Majercak (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16995254#comment-16995254
 ] 

Lukas Majercak commented on HDFS-15055:
---

Closing as we actually create a separate buffer for the length requested.

> Hedging clones client's buffer
> --
>
> Key: HDFS-15055
> URL: https://issues.apache.org/jira/browse/HDFS-15055
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 2.9.2, 3.3.0, 3.2.1, 2.9.3, 3.2.2
>Reporter: Lukas Majercak
>Priority: Major
>
> Currently, DFSInputStream clones the buffer passed from the caller for every 
> request, this can have severe impact on the performance.






[jira] [Resolved] (HDFS-15055) Hedging clones client's buffer

2019-12-12 Thread Lukas Majercak (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lukas Majercak resolved HDFS-15055.
---
Resolution: Not A Problem

> Hedging clones client's buffer
> --
>
> Key: HDFS-15055
> URL: https://issues.apache.org/jira/browse/HDFS-15055
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 2.9.2, 3.3.0, 3.2.1, 2.9.3, 3.2.2
>Reporter: Lukas Majercak
>Priority: Major
>
> Currently, DFSInputStream clones the buffer passed from the caller for every 
> request, this can have severe impact on the performance.






[jira] [Updated] (HDFS-15055) Hedging clones client's buffer

2019-12-12 Thread Lukas Majercak (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lukas Majercak updated HDFS-15055:
--
Description: Currently, DFSInputStream clones the buffer passed from the 
caller for every request, this can have severe impact on the performance.  
(was: Currently, DFSInputStream clones the buffer passed from the caller for 
every request, this can have severe impact on the performance (imagine cloning 
a 1GB buffer).)

> Hedging clones client's buffer
> --
>
> Key: HDFS-15055
> URL: https://issues.apache.org/jira/browse/HDFS-15055
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 2.9.2, 3.3.0, 3.2.1, 2.9.3, 3.2.2
>Reporter: Lukas Majercak
>Priority: Major
>
> Currently, DFSInputStream clones the buffer passed from the caller for every 
> request, this can have severe impact on the performance.






[jira] [Updated] (HDFS-15055) Hedging clones client's buffer

2019-12-12 Thread Lukas Majercak (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lukas Majercak updated HDFS-15055:
--
Priority: Major  (was: Critical)

> Hedging clones client's buffer
> --
>
> Key: HDFS-15055
> URL: https://issues.apache.org/jira/browse/HDFS-15055
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 2.9.2, 3.3.0, 3.2.1, 2.9.3, 3.2.2
>Reporter: Lukas Majercak
>Priority: Major
>
> Currently, DFSInputStream clones the buffer passed from the caller for every 
> request, this can have severe impact on the performance (imagine cloning a 
> 1GB buffer).






[jira] [Updated] (HDFS-15055) Hedging clones client's buffer

2019-12-12 Thread Lukas Majercak (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lukas Majercak updated HDFS-15055:
--
Description: Currently, DFSInputStream clones the buffer passed from the 
caller for every request, this can have severe impact on the performance 
(imagine cloning a 1GB buffer).  (was: _emphasized text_)

> Hedging clones client's buffer
> --
>
> Key: HDFS-15055
> URL: https://issues.apache.org/jira/browse/HDFS-15055
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 2.9.2, 3.3.0, 3.2.1, 2.9.3, 3.2.2
>Reporter: Lukas Majercak
>Priority: Critical
>
> Currently, DFSInputStream clones the buffer passed from the caller for every 
> request, this can have severe impact on the performance (imagine cloning a 
> 1GB buffer).






[jira] [Updated] (HDFS-15055) Hedging clones client's buffer

2019-12-12 Thread Lukas Majercak (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lukas Majercak updated HDFS-15055:
--
Description: _emphasized text_

> Hedging clones client's buffer
> --
>
> Key: HDFS-15055
> URL: https://issues.apache.org/jira/browse/HDFS-15055
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 2.9.2, 3.3.0, 3.2.1, 2.9.3, 3.2.2
>Reporter: Lukas Majercak
>Priority: Critical
>
> _emphasized text_






[jira] [Created] (HDFS-15055) Hedging clones client's buffer

2019-12-12 Thread Lukas Majercak (Jira)
Lukas Majercak created HDFS-15055:
-

 Summary: Hedging clones client's buffer
 Key: HDFS-15055
 URL: https://issues.apache.org/jira/browse/HDFS-15055
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: hdfs-client
Affects Versions: 3.2.1, 2.9.2, 3.3.0, 2.9.3, 3.2.2
Reporter: Lukas Majercak









[jira] [Commented] (HDFS-14882) Consider DataNode load when #getBlockLocation

2019-09-30 Thread Lukas Majercak (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941143#comment-16941143
 ] 

Lukas Majercak commented on HDFS-14882:
---

Overall this looks okay to me; it seems like an improvement over 
dfs.namenode.avoid.read.highload.datanode (+ .threshold). I only wish we could 
also use some sort of estimate of the load that we've already scheduled on each 
DN, not just the xceivers they report. 

> Consider DataNode load when #getBlockLocation
> -
>
> Key: HDFS-14882
> URL: https://issues.apache.org/jira/browse/HDFS-14882
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Xiaoqiao He
>Assignee: Xiaoqiao He
>Priority: Major
> Attachments: HDFS-14882.001.patch
>
>
> Currently, we consider the load of a datanode in #chooseTarget for writers, 
> but not for readers. Thus, the process slots of a datanode can be occupied 
> by #BlockSender for readers, its disk/network becomes busy, and clients then 
> hit slow-node exceptions. IIRC the same case has been reported several 
> times. Based on this, I propose to consider load for readers the same way 
> #chooseTarget does for writers.
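
One way to picture the proposal is to reorder the replica list returned to a reader so that heavily loaded datanodes come last. The sketch below is an assumption-laden illustration, not the attached patch: the Replica type, the factor parameter, and the "xceivers above factor times the average" rule are all hypothetical, and the average here is taken over the block's replicas only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch, not part of Hadoop.
final class LoadAwareReplicaOrder {

    /** Hypothetical view of a replica location and the xceiver count its datanode reported. */
    static final class Replica {
        final String host;
        final int xceivers;

        Replica(String host, int xceivers) {
            this.host = host;
            this.xceivers = xceivers;
        }
    }

    /** Keeps the original order but moves replicas with load above factor * average to the end. */
    static List<Replica> order(List<Replica> replicas, double factor) {
        double average = replicas.stream().mapToInt(r -> r.xceivers).average().orElse(0.0);
        double maxLoad = factor * average;
        List<Replica> ordered = new ArrayList<>();
        List<Replica> overloaded = new ArrayList<>();
        for (Replica r : replicas) {
            if (r.xceivers > maxLoad) {
                overloaded.add(r);
            } else {
                ordered.add(r);
            }
        }
        ordered.addAll(overloaded); // demoted, not dropped: the reader can still fall back to them
        return ordered;
    }

    public static void main(String[] args) {
        List<Replica> replicas = Arrays.asList(
                new Replica("dn1", 100), new Replica("dn2", 10), new Replica("dn3", 12));
        for (Replica r : order(replicas, 1.5)) {
            System.out.println(r.host + " xceivers=" + r.xceivers);
        }
    }
}
```

Demoting rather than excluding busy nodes keeps every replica reachable, which matters for reads because a block may have only a few locations.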






[jira] [Commented] (HDFS-12288) Fix DataNode's xceiver count calculation

2019-09-09 Thread Lukas Majercak (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-12288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16925976#comment-16925976
 ] 

Lukas Majercak commented on HDFS-12288:
---

[~zhangchen] not working on this right now, feel free to pick it up.

> Fix DataNode's xceiver count calculation
> 
>
> Key: HDFS-12288
> URL: https://issues.apache.org/jira/browse/HDFS-12288
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, hdfs
>Reporter: Lukas Majercak
>Assignee: Lukas Majercak
>Priority: Major
> Attachments: HDFS-12288.001.patch, HDFS-12288.002.patch
>
>
> The problem with ThreadGroup.activeCount() is that it is only a very rough 
> estimate: in reality it returns the total number of threads in the thread 
> group, as opposed to the threads actually running.
> On some DNs, we saw it return ~50 for a long time, even though the actual 
> number of DataXceiver threads was next to none.
> This is a big issue, as we use the xceiverCount to make decisions on the NN 
> when choosing a replication source DN or returning DNs to clients for R/W.
> The plan is to reuse the DataNodeMetrics.dataNodeActiveXceiversCount value, 
> which only accounts for the DataXceiver threads currently running and thus 
> represents the load on the DN much better.
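
The general idea of counting only running workers, rather than asking the ThreadGroup, can be sketched with an explicit counter that is raised when a worker starts executing and lowered in a finally block. This is a minimal illustration under assumed names (ActiveWorkerCounter, track, demo); it does not claim to mirror how DataNodeMetrics maintains its value.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch, not part of Hadoop.
final class ActiveWorkerCounter {
    private final AtomicInteger active = new AtomicInteger();

    /** Wraps work so the counter is raised only while the work is executing. */
    Runnable track(Runnable work) {
        return () -> {
            active.incrementAndGet();
            try {
                work.run();
            } finally {
                active.decrementAndGet(); // lowered even if the work throws
            }
        };
    }

    int activeCount() {
        return active.get();
    }

    /** Demo helper: returns {count while two workers run, count after both finish}. */
    static int[] demo() {
        ActiveWorkerCounter counter = new ActiveWorkerCounter();
        ThreadGroup group = new ThreadGroup("xceivers-demo");
        CountDownLatch started = new CountDownLatch(2);
        CountDownLatch release = new CountDownLatch(1);
        Runnable work = counter.track(() -> {
            started.countDown();
            try {
                release.await(); // stay "busy" until the demo releases the workers
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        Thread a = new Thread(group, work);
        Thread b = new Thread(group, work);
        try {
            a.start();
            b.start();
            started.await();
            int whileRunning = counter.activeCount();
            // ThreadGroup.activeCount() is documented as an estimate; printed for comparison only.
            System.out.println("ThreadGroup estimate: " + group.activeCount());
            release.countDown();
            a.join();
            b.join();
            return new int[] {whileRunning, counter.activeCount()};
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        int[] counts = demo();
        System.out.println("explicit counter while running: " + counts[0]);
        System.out.println("explicit counter after completion: " + counts[1]);
    }
}
```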



--
This message was sent by Atlassian Jira
(v8.3.2#803003)




[jira] [Commented] (HDFS-14545) RBF: Router should support GetUserMappingsProtocol

2019-06-14 Thread Lukas Majercak (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16863822#comment-16863822
 ] 

Lukas Majercak commented on HDFS-14545:
---

Thanks [~ayushtkn], LGTM

> RBF: Router should support GetUserMappingsProtocol
> --
>
> Key: HDFS-14545
> URL: https://issues.apache.org/jira/browse/HDFS-14545
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Íñigo Goiri
>Assignee: Ayush Saxena
>Priority: Major
> Attachments: HDFS-14545-HDFS-13891-01.patch, 
> HDFS-14545-HDFS-13891-02.patch, HDFS-14545-HDFS-13891-03.patch, 
> HDFS-14545-HDFS-13891-04.patch, HDFS-14545-HDFS-13891-05.patch, 
> HDFS-14545-HDFS-13891-06.patch, HDFS-14545-HDFS-13891-07.patch, 
> HDFS-14545-HDFS-13891-08.patch, HDFS-14545-HDFS-13891-09.patch, 
> HDFS-14545-HDFS-13891-10.patch, HDFS-14545-HDFS-13891.000.patch
>
>
> We should be able to check the groups for a user from a Router.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)




[jira] [Commented] (HDFS-14545) RBF: Router should support GetUserMappingsProtocol

2019-06-07 Thread Lukas Majercak (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16858912#comment-16858912
 ] 

Lukas Majercak commented on HDFS-14545:
---

ConnectionPool lines 410, 411. Would be nice to either change "clazz0" to 
something like "clazzProtoPb" or remove these variables altogether. 

Nitpicks: 
- RouterRpcServer line 361 missing space before "=" 
- RouterUserProtocol line 45: you don't need .getName() there, do you? Also, 
maybe separate static and non-static members visually.
- TestRouterUserMappings line 295: this assert length == 2 seems kinda vague; 
can we pass in the actual groups?


> RBF: Router should support GetUserMappingsProtocol
> --
>
> Key: HDFS-14545
> URL: https://issues.apache.org/jira/browse/HDFS-14545
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Íñigo Goiri
>Assignee: Ayush Saxena
>Priority: Major
> Attachments: HDFS-14545-HDFS-13891-01.patch, 
> HDFS-14545-HDFS-13891-02.patch, HDFS-14545-HDFS-13891-03.patch, 
> HDFS-14545-HDFS-13891-04.patch, HDFS-14545-HDFS-13891-05.patch, 
> HDFS-14545-HDFS-13891-06.patch, HDFS-14545-HDFS-13891-07.patch, 
> HDFS-14545-HDFS-13891-08.patch, HDFS-14545-HDFS-13891-09.patch, 
> HDFS-14545-HDFS-13891.000.patch
>
>
> We should be able to check the groups for a user from a Router.






[jira] [Commented] (HDFS-14447) RBF: Router should support RefreshUserMappingsProtocol

2019-05-16 Thread Lukas Majercak (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16841554#comment-16841554
 ] 

Lukas Majercak commented on HDFS-14447:
---

patch09 lgtm

> RBF: Router should support RefreshUserMappingsProtocol
> --
>
> Key: HDFS-14447
> URL: https://issues.apache.org/jira/browse/HDFS-14447
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: rbf
>Affects Versions: 3.1.0
>Reporter: Shen Yinjie
>Assignee: Shen Yinjie
>Priority: Major
> Fix For: HDFS-13891
>
> Attachments: HDFS-14447-HDFS-13891.01.patch, 
> HDFS-14447-HDFS-13891.02.patch, HDFS-14447-HDFS-13891.03.patch, 
> HDFS-14447-HDFS-13891.04.patch, HDFS-14447-HDFS-13891.05.patch, 
> HDFS-14447-HDFS-13891.06.patch, HDFS-14447-HDFS-13891.07.patch, 
> HDFS-14447-HDFS-13891.08.patch, HDFS-14447-HDFS-13891.09.patch, error.png
>
>
> HDFS with RBF
> We configure hadoop.proxyuser.xx.yy, then execute hdfs dfsadmin 
> -Dfs.defaultFS=hdfs://router-fed -refreshSuperUserGroupsConfiguration;
> it throws "Unknown protocol: ...RefreshUserMappingProtocol".
> RouterAdminServer should support RefreshUserMappingsProtocol; otherwise a 
> proxyuser client is refused impersonation, as shown in the screenshot.






[jira] [Commented] (HDFS-14447) RBF: Router should support RefreshUserMappingsProtocol

2019-05-14 Thread Lukas Majercak (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16839720#comment-16839720
 ] 

Lukas Majercak commented on HDFS-14447:
---

There are a couple of syntax inconsistencies in 06.patch, such as lines: 303, 
317, 369, 370, 375, 380 in TestRefreshUserMappingsWithRouters. But other than 
that the patch lgtm

> RBF: Router should support RefreshUserMappingsProtocol
> --
>
> Key: HDFS-14447
> URL: https://issues.apache.org/jira/browse/HDFS-14447
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: rbf
>Affects Versions: 3.1.0
>Reporter: Shen Yinjie
>Assignee: Shen Yinjie
>Priority: Major
> Fix For: HDFS-13891
>
> Attachments: HDFS-14447-HDFS-13891.01.patch, 
> HDFS-14447-HDFS-13891.02.patch, HDFS-14447-HDFS-13891.03.patch, 
> HDFS-14447-HDFS-13891.04.patch, HDFS-14447-HDFS-13891.05.patch, 
> HDFS-14447-HDFS-13891.06.patch, error.png
>
>
> HDFS with RBF
> We configure hadoop.proxyuser.xx.yy, then execute hdfs dfsadmin 
> -Dfs.defaultFS=hdfs://router-fed -refreshSuperUserGroupsConfiguration;
> it throws "Unknown protocol: ...RefreshUserMappingProtocol".
> RouterAdminServer should support RefreshUserMappingsProtocol; otherwise a 
> proxyuser client is refused impersonation, as shown in the screenshot.






[jira] [Commented] (HDFS-14134) Idempotent operations throwing RemoteException should not be retried by the client

2019-05-07 Thread Lukas Majercak (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16834936#comment-16834936
 ] 

Lukas Majercak commented on HDFS-14134:
---

Hi [~John Smith]. I'm not sure if it's that simple.

 Say you have 2 NNs, where nn1 is active and nn2 is standby. If your current 
target is nn2, but we send a request to both, you can get responses like:
nn1 - RETRY
nn2 - FAILOVER_AND_RETRY

In this case, the ideal scenario would be to fail over to nn1, but what you're 
proposing would not do that. 

I think this obviously has room for improvement, as in some cases RETRY > 
FAILOVER is reasonable, but I'd like to refrain from increasing the scope of 
this JIRA. Maybe we can create a new one and have the discussion there?

> Idempotent operations throwing RemoteException should not be retried by the 
> client
> --
>
> Key: HDFS-14134
> URL: https://issues.apache.org/jira/browse/HDFS-14134
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs, hdfs-client, ipc
>Reporter: Lukas Majercak
>Assignee: Lukas Majercak
>Priority: Critical
> Attachments: HDFS-14134.001.patch, HDFS-14134.002.patch, 
> HDFS-14134.003.patch, HDFS-14134.004.patch, HDFS-14134.005.patch, 
> HDFS-14134.006.patch, HDFS-14134.007.patch, 
> HDFS-14134_retrypolicy_change_proposal.pdf, 
> HDFS-14134_retrypolicy_change_proposal_1.pdf
>
>
> Currently, some operations that throw IOException on the NameNode are 
> evaluated by RetryPolicy as FAILOVER_AND_RETRY, but they should just fail 
> fast.
> For example, when calling getXAttr("user.some_attr", file) where the file 
> does not have the attribute, NN throws an IOException with message "could not 
> find attr". The current client retry policy determines the action for that to 
> be FAILOVER_AND_RETRY. The client then fails over and retries until it 
> reaches the maximum number of retries. Supposedly, the client should be able 
> to tell that this exception is normal and fail fast. 
> Moreover, even if the action was FAIL, the RetryInvocationHandler looks at 
> all the retry actions from all requests, and FAILOVER_AND_RETRY takes 
> precedence over FAIL action.
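
The precedence rule described above can be sketched as taking the highest-ranked decision across all per-request decisions. The enum and class below are hypothetical stand-ins written for this illustration, not Hadoop's RetryPolicy or RetryInvocationHandler types; the only behaviour they encode is the ordering stated in the description, with FAILOVER_AND_RETRY outranking the others.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch, not part of Hadoop.
final class RetryDecisionSketch {

    /** Declared from lowest to highest precedence, so natural enum order is the ranking. */
    enum Decision { FAIL, RETRY, FAILOVER_AND_RETRY }

    /** Combines the decisions from several requests by picking the highest-precedence one. */
    static Decision combine(List<Decision> perRequest) {
        if (perRequest.isEmpty()) {
            throw new IllegalArgumentException("no decisions to combine");
        }
        return Collections.max(perRequest);
    }

    public static void main(String[] args) {
        // A FAIL from one request is overridden by FAILOVER_AND_RETRY from another.
        System.out.println(combine(Arrays.asList(Decision.FAIL, Decision.FAILOVER_AND_RETRY)));
        // Two-NameNode case: nn1 answers RETRY, nn2 answers FAILOVER_AND_RETRY.
        System.out.println(combine(Arrays.asList(Decision.RETRY, Decision.FAILOVER_AND_RETRY)));
    }
}
```

Under such a rule a single FAILOVER_AND_RETRY is enough to keep the client retrying, which is why a lone FAIL decision does not make the call fail fast.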






[jira] [Commented] (HDFS-14134) Idempotent operations throwing RemoteException should not be retried by the client

2019-05-03 Thread Lukas Majercak (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16832730#comment-16832730
 ] 

Lukas Majercak commented on HDFS-14134:
---

Hi [~John Smith]. I'm not sure I'm following. What's the concern with 
StandbyException triggering FAILOVER_AND_RETRY with the logic in patch 007?

> Idempotent operations throwing RemoteException should not be retried by the 
> client
> --
>
> Key: HDFS-14134
> URL: https://issues.apache.org/jira/browse/HDFS-14134
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs, hdfs-client, ipc
>Reporter: Lukas Majercak
>Assignee: Lukas Majercak
>Priority: Critical
> Attachments: HDFS-14134.001.patch, HDFS-14134.002.patch, 
> HDFS-14134.003.patch, HDFS-14134.004.patch, HDFS-14134.005.patch, 
> HDFS-14134.006.patch, HDFS-14134.007.patch, 
> HDFS-14134_retrypolicy_change_proposal.pdf, 
> HDFS-14134_retrypolicy_change_proposal_1.pdf
>
>
> Currently, some operations that throw IOException on the NameNode are 
> evaluated by RetryPolicy as FAILOVER_AND_RETRY, but they should just fail 
> fast.
> For example, when calling getXAttr("user.some_attr", file) where the file 
> does not have the attribute, NN throws an IOException with message "could not 
> find attr". The current client retry policy determines the action for that to 
> be FAILOVER_AND_RETRY. The client then fails over and retries until it 
> reaches the maximum number of retries. Supposedly, the client should be able 
> to tell that this exception is normal and fail fast. 
> Moreover, even if the action was FAIL, the RetryInvocationHandler looks at 
> all the retry actions from all requests, and FAILOVER_AND_RETRY takes 
> precedence over FAIL action.






[jira] [Commented] (HDFS-14326) Add CorruptFilesCount to JMX

2019-02-28 Thread Lukas Majercak (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16780939#comment-16780939
 ] 

Lukas Majercak commented on HDFS-14326:
---

Can we add more test coverage? We could just assert the expected length 
throughout the tests in TestListCorruptFileBlocks.

> Add CorruptFilesCount to JMX
> 
>
> Key: HDFS-14326
> URL: https://issues.apache.org/jira/browse/HDFS-14326
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: fs, metrics, namenode
>Reporter: Danny Becker
>Assignee: Danny Becker
>Priority: Minor
> Attachments: HDFS-14326.000.patch
>
>
> Add CorruptFilesCount to JMX






[jira] [Commented] (HDFS-14134) Idempotent operations throwing RemoteException should not be retried by the client

2019-01-11 Thread Lukas Majercak (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16740810#comment-16740810
 ] 

Lukas Majercak commented on HDFS-14134:
---

+ more people [~szetszwo], [~jingzhao].

> Idempotent operations throwing RemoteException should not be retried by the 
> client
> --
>
> Key: HDFS-14134
> URL: https://issues.apache.org/jira/browse/HDFS-14134
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs, hdfs-client, ipc
>Reporter: Lukas Majercak
>Assignee: Lukas Majercak
>Priority: Critical
> Attachments: HDFS-14134.001.patch, HDFS-14134.002.patch, 
> HDFS-14134.003.patch, HDFS-14134.004.patch, HDFS-14134.005.patch, 
> HDFS-14134.006.patch, HDFS-14134.007.patch, 
> HDFS-14134_retrypolicy_change_proposal.pdf, 
> HDFS-14134_retrypolicy_change_proposal_1.pdf
>
>






[jira] [Commented] (HDFS-14134) Idempotent operations throwing RemoteException should not be retried by the client

2019-01-11 Thread Lukas Majercak (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16740802#comment-16740802
 ] 

Lukas Majercak commented on HDFS-14134:
---

Thanks [~knanasi]. 

[~atm], [~eli], [~sureshms], [~sanjay.radia], [~xgong], [~jianhe]: I see you 
guys worked on this part of the codebase before. Is anyone available to review 
this? Thanks!

> Idempotent operations throwing RemoteException should not be retried by the 
> client
> --
>
> Key: HDFS-14134
> URL: https://issues.apache.org/jira/browse/HDFS-14134
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs, hdfs-client, ipc
>Reporter: Lukas Majercak
>Assignee: Lukas Majercak
>Priority: Critical
> Attachments: HDFS-14134.001.patch, HDFS-14134.002.patch, 
> HDFS-14134.003.patch, HDFS-14134.004.patch, HDFS-14134.005.patch, 
> HDFS-14134.006.patch, HDFS-14134.007.patch, 
> HDFS-14134_retrypolicy_change_proposal.pdf, 
> HDFS-14134_retrypolicy_change_proposal_1.pdf
>
>






[jira] [Commented] (HDFS-14134) Idempotent operations throwing RemoteException should not be retried by the client

2019-01-10 Thread Lukas Majercak (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16739821#comment-16739821
 ] 

Lukas Majercak commented on HDFS-14134:
---

Patch 007 to fix checkstyle + whitespace warnings.

> Idempotent operations throwing RemoteException should not be retried by the 
> client
> --
>
> Key: HDFS-14134
> URL: https://issues.apache.org/jira/browse/HDFS-14134
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs, hdfs-client, ipc
>Reporter: Lukas Majercak
>Assignee: Lukas Majercak
>Priority: Critical
> Attachments: HDFS-14134.001.patch, HDFS-14134.002.patch, 
> HDFS-14134.003.patch, HDFS-14134.004.patch, HDFS-14134.005.patch, 
> HDFS-14134.006.patch, HDFS-14134.007.patch, 
> HDFS-14134_retrypolicy_change_proposal.pdf, 
> HDFS-14134_retrypolicy_change_proposal_1.pdf
>
>






[jira] [Updated] (HDFS-14134) Idempotent operations throwing RemoteException should not be retried by the client

2019-01-10 Thread Lukas Majercak (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lukas Majercak updated HDFS-14134:
--
Attachment: HDFS-14134.007.patch

> Idempotent operations throwing RemoteException should not be retried by the 
> client
> --
>
> Key: HDFS-14134
> URL: https://issues.apache.org/jira/browse/HDFS-14134
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs, hdfs-client, ipc
>Reporter: Lukas Majercak
>Assignee: Lukas Majercak
>Priority: Critical
> Attachments: HDFS-14134.001.patch, HDFS-14134.002.patch, 
> HDFS-14134.003.patch, HDFS-14134.004.patch, HDFS-14134.005.patch, 
> HDFS-14134.006.patch, HDFS-14134.007.patch, 
> HDFS-14134_retrypolicy_change_proposal.pdf, 
> HDFS-14134_retrypolicy_change_proposal_1.pdf
>
>






[jira] [Updated] (HDFS-14134) Idempotent operations throwing RemoteException should not be retried by the client

2019-01-10 Thread Lukas Majercak (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lukas Majercak updated HDFS-14134:
--
Attachment: HDFS-14134.006.patch

> Idempotent operations throwing RemoteException should not be retried by the 
> client
> --
>
> Key: HDFS-14134
> URL: https://issues.apache.org/jira/browse/HDFS-14134
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs, hdfs-client, ipc
>Reporter: Lukas Majercak
>Assignee: Lukas Majercak
>Priority: Critical
> Attachments: HDFS-14134.001.patch, HDFS-14134.002.patch, 
> HDFS-14134.003.patch, HDFS-14134.004.patch, HDFS-14134.005.patch, 
> HDFS-14134.006.patch, HDFS-14134_retrypolicy_change_proposal.pdf, 
> HDFS-14134_retrypolicy_change_proposal_1.pdf
>
>






[jira] [Commented] (HDFS-14134) Idempotent operations throwing RemoteException should not be retried by the client

2019-01-10 Thread Lukas Majercak (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16739719#comment-16739719
 ] 

Lukas Majercak commented on HDFS-14134:
---

Added patch 006 together with HDFS-14134_retrypolicy_change_proposal_1.pdf to 
explain the changes. After the discussion, it seems that changing the 
logic for remote IOExceptions, together with the priority of retry actions, will 
be enough. Could you review this, [~knanasi]? Thanks!

> Idempotent operations throwing RemoteException should not be retried by the 
> client
> --
>
> Key: HDFS-14134
> URL: https://issues.apache.org/jira/browse/HDFS-14134
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs, hdfs-client, ipc
>Reporter: Lukas Majercak
>Assignee: Lukas Majercak
>Priority: Critical
> Attachments: HDFS-14134.001.patch, HDFS-14134.002.patch, 
> HDFS-14134.003.patch, HDFS-14134.004.patch, HDFS-14134.005.patch, 
> HDFS-14134.006.patch, HDFS-14134_retrypolicy_change_proposal.pdf, 
> HDFS-14134_retrypolicy_change_proposal_1.pdf
>
>






[jira] [Updated] (HDFS-14134) Idempotent operations throwing RemoteException should not be retried by the client

2019-01-10 Thread Lukas Majercak (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lukas Majercak updated HDFS-14134:
--
Attachment: (was: HDFS-14134.006.patch)

> Idempotent operations throwing RemoteException should not be retried by the 
> client
> --
>
> Key: HDFS-14134
> URL: https://issues.apache.org/jira/browse/HDFS-14134
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs, hdfs-client, ipc
>Reporter: Lukas Majercak
>Assignee: Lukas Majercak
>Priority: Critical
> Attachments: HDFS-14134.001.patch, HDFS-14134.002.patch, 
> HDFS-14134.003.patch, HDFS-14134.004.patch, HDFS-14134.005.patch, 
> HDFS-14134.006.patch, HDFS-14134_retrypolicy_change_proposal.pdf, 
> HDFS-14134_retrypolicy_change_proposal_1.pdf
>
>






[jira] [Updated] (HDFS-14134) Idempotent operations throwing RemoteException should not be retried by the client

2019-01-10 Thread Lukas Majercak (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lukas Majercak updated HDFS-14134:
--
Attachment: HDFS-14134_retrypolicy_change_proposal_1.pdf

> Idempotent operations throwing RemoteException should not be retried by the 
> client
> --
>
> Key: HDFS-14134
> URL: https://issues.apache.org/jira/browse/HDFS-14134
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs, hdfs-client, ipc
>Reporter: Lukas Majercak
>Assignee: Lukas Majercak
>Priority: Critical
> Attachments: HDFS-14134.001.patch, HDFS-14134.002.patch, 
> HDFS-14134.003.patch, HDFS-14134.004.patch, HDFS-14134.005.patch, 
> HDFS-14134.006.patch, HDFS-14134_retrypolicy_change_proposal.pdf, 
> HDFS-14134_retrypolicy_change_proposal_1.pdf
>
>






[jira] [Updated] (HDFS-14134) Idempotent operations throwing RemoteException should not be retried by the client

2019-01-10 Thread Lukas Majercak (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lukas Majercak updated HDFS-14134:
--
Attachment: HDFS-14134.006.patch

> Idempotent operations throwing RemoteException should not be retried by the 
> client
> --
>
> Key: HDFS-14134
> URL: https://issues.apache.org/jira/browse/HDFS-14134
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs, hdfs-client, ipc
>Reporter: Lukas Majercak
>Assignee: Lukas Majercak
>Priority: Critical
> Attachments: HDFS-14134.001.patch, HDFS-14134.002.patch, 
> HDFS-14134.003.patch, HDFS-14134.004.patch, HDFS-14134.005.patch, 
> HDFS-14134.006.patch, HDFS-14134_retrypolicy_change_proposal.pdf
>
>






[jira] [Commented] (HDFS-14134) Idempotent operations throwing RemoteException should not be retried by the client

2019-01-09 Thread Lukas Majercak (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16738702#comment-16738702
 ] 

Lukas Majercak commented on HDFS-14134:
---

Also note that previously, if one hedging request got FAILOVER_AND_RETRY and 
another request got a SocketException on a non-idempotent operation (i.e. FAIL), 
the client would still pick FAILOVER_AND_RETRY over FAIL, so I think we are 
fixing an issue here as well.
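
A minimal Java sketch of the precedence rule described above, for illustration only. The names `Action` and `CombineActionsSketch` are hypothetical and the code is not taken from the patch or from RetryInvocationHandler; it only assumes that the highest-precedence action reported by any request wins, with FAILOVER_AND_RETRY ranked above FAIL.

```java
// Hypothetical, simplified retry actions, listed from lowest to highest
// precedence. The ordering mirrors the behaviour described in the comment
// above; it is an assumption, not a copy of Hadoop's enum.
enum Action { FAIL, FAILOVER_AND_RETRY }

class CombineActionsSketch {
  // Return the highest-precedence action among those reported by all
  // hedged requests. With no actions at all, the result defaults to FAIL.
  static Action combine(Action... actions) {
    Action result = Action.FAIL;
    for (Action a : actions) {
      if (a.compareTo(result) > 0) {
        result = a;
      }
    }
    return result;
  }
}
```

Under this rule a single FAILOVER_AND_RETRY hides every FAIL reported by the other requests, which is the situation the comment calls out.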

> Idempotent operations throwing RemoteException should not be retried by the 
> client
> --
>
> Key: HDFS-14134
> URL: https://issues.apache.org/jira/browse/HDFS-14134
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs, hdfs-client, ipc
>Reporter: Lukas Majercak
>Assignee: Lukas Majercak
>Priority: Critical
> Attachments: HDFS-14134.001.patch, HDFS-14134.002.patch, 
> HDFS-14134.003.patch, HDFS-14134.004.patch, HDFS-14134.005.patch, 
> HDFS-14134_retrypolicy_change_proposal.pdf
>
>






[jira] [Commented] (HDFS-14134) Idempotent operations throwing RemoteException should not be retried by the client

2019-01-09 Thread Lukas Majercak (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16738700#comment-16738700
 ] 

Lukas Majercak commented on HDFS-14134:
---

I see, that makes sense. I'm happy to change
SocketException (non-idempotent)
IOException (non-idempotent)
back to being FAIL.
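
A rough Java sketch of the two cases listed above, under stated assumptions. `ServerSideError`, `Outcome` and `LocalErrorPolicySketch` are hypothetical names (the first stands in for RemoteException), and the idempotent branch is assumed to fail over and retry. The usual reasoning is that after a client-side error on a non-idempotent call, the client cannot tell whether the server already executed it.

```java
import java.io.IOException;
import java.net.SocketException;

// Hypothetical stand-in for Hadoop's RemoteException: an error sent back by
// the server rather than raised on the client side.
class ServerSideError extends IOException {
  ServerSideError(String message) { super(message); }
}

// Hypothetical, simplified retry decisions.
enum Outcome { FAIL, FAILOVER_AND_RETRY }

class LocalErrorPolicySketch {
  // Classify an error raised on the client side, such as a SocketException
  // (which is itself a subclass of IOException).
  static Outcome decide(IOException e, boolean idempotent) {
    if (e instanceof ServerSideError) {
      throw new IllegalArgumentException("server-side errors are out of scope here");
    }
    // A non-idempotent call may already have been executed by the server,
    // so repeating it is unsafe: fail. An idempotent call can be repeated
    // (assumed here to fail over and retry).
    return idempotent ? Outcome.FAILOVER_AND_RETRY : Outcome.FAIL;
  }
}
```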

> Idempotent operations throwing RemoteException should not be retried by the 
> client
> --
>
> Key: HDFS-14134
> URL: https://issues.apache.org/jira/browse/HDFS-14134
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs, hdfs-client, ipc
>Reporter: Lukas Majercak
>Assignee: Lukas Majercak
>Priority: Critical
> Attachments: HDFS-14134.001.patch, HDFS-14134.002.patch, 
> HDFS-14134.003.patch, HDFS-14134.004.patch, HDFS-14134.005.patch, 
> HDFS-14134_retrypolicy_change_proposal.pdf
>
>






[jira] [Commented] (HDFS-14134) Idempotent operations throwing RemoteException should not be retried by the client

2018-12-13 Thread Lukas Majercak (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16720564#comment-16720564
 ] 

Lukas Majercak commented on HDFS-14134:
---

I'll go through that discussion.

> Idempotent operations throwing RemoteException should not be retried by the 
> client
> --
>
> Key: HDFS-14134
> URL: https://issues.apache.org/jira/browse/HDFS-14134
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs, hdfs-client, ipc
>Reporter: Lukas Majercak
>Assignee: Lukas Majercak
>Priority: Critical
> Attachments: HDFS-14134.001.patch, HDFS-14134.002.patch, 
> HDFS-14134.003.patch, HDFS-14134.004.patch, HDFS-14134.005.patch, 
> HDFS-14134_retrypolicy_change_proposal.pdf
>
>






[jira] [Commented] (HDFS-14134) Idempotent operations throwing RemoteException should not be retried by the client

2018-12-13 Thread Lukas Majercak (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16720506#comment-16720506
 ] 

Lukas Majercak commented on HDFS-14134:
---

I agree non-remote IOExceptions could be network-related, but this is covered, 
right? Non-remote IOExceptions are retried with this change, regardless of 
whether the operation is idempotent.

> Idempotent operations throwing RemoteException should not be retried by the 
> client
> --
>
> Key: HDFS-14134
> URL: https://issues.apache.org/jira/browse/HDFS-14134
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs, hdfs-client, ipc
>Reporter: Lukas Majercak
>Assignee: Lukas Majercak
>Priority: Critical
> Attachments: HDFS-14134.001.patch, HDFS-14134.002.patch, 
> HDFS-14134.003.patch, HDFS-14134.004.patch, HDFS-14134.005.patch, 
> HDFS-14134_retrypolicy_change_proposal.pdf
>
>






[jira] [Commented] (HDFS-14134) Idempotent operations throwing RemoteException should not be retried by the client

2018-12-13 Thread Lukas Majercak (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16720507#comment-16720507
 ] 

Lukas Majercak commented on HDFS-14134:
---

I'd argue that this change is even safer, because previously the retry action 
would be FAIL for:
SocketExceptions (non-idempotent)
Non-remote IOExceptions (non-idempotent)

> Idempotent operations throwing RemoteException should not be retried by the 
> client
> --
>
> Key: HDFS-14134
> URL: https://issues.apache.org/jira/browse/HDFS-14134
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs, hdfs-client, ipc
>Reporter: Lukas Majercak
>Assignee: Lukas Majercak
>Priority: Critical
> Attachments: HDFS-14134.001.patch, HDFS-14134.002.patch, 
> HDFS-14134.003.patch, HDFS-14134.004.patch, HDFS-14134.005.patch, 
> HDFS-14134_retrypolicy_change_proposal.pdf
>
>






[jira] [Commented] (HDFS-14134) Idempotent operations throwing RemoteException should not be retried by the client

2018-12-12 Thread Lukas Majercak (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16719444#comment-16719444
 ] 

Lukas Majercak commented on HDFS-14134:
---

Retrying failed idempotent operations might be safe, but it is surely wasteful. 
Check the test I wrote for getXAttr: the client just retries the same exception 
over and over for no reason. We need a concept of non-retriable 
exceptions in HDFS, and I feel like a RemoteException from an idempotent 
operation is a very good start. 

The previous design was very strange: it chose to FAIL if the operation was 
not idempotent and the exception was not remote, which does not make a lot of 
sense to me.

> Idempotent operations throwing RemoteException should not be retried by the 
> client
> --
>
> Key: HDFS-14134
> URL: https://issues.apache.org/jira/browse/HDFS-14134
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs, hdfs-client, ipc
>Reporter: Lukas Majercak
>Assignee: Lukas Majercak
>Priority: Critical
> Attachments: HDFS-14134.001.patch, HDFS-14134.002.patch, 
> HDFS-14134.003.patch, HDFS-14134.004.patch, HDFS-14134.005.patch, 
> HDFS-14134_retrypolicy_change_proposal.pdf
>
>






[jira] [Commented] (HDFS-14134) Idempotent operations throwing RemoteException should not be retried by the client

2018-12-11 Thread Lukas Majercak (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16717754#comment-16717754
 ] 

Lukas Majercak commented on HDFS-14134:
---

Why should we retry if the operation is idempotent and the exception is a 
RemoteException? By definition, an idempotent operation will have the same 
outcome the next time as well, right? In that case, we should just fail 
fast. See 
TestRequestHedgingProxyProvider.testIdempotentOperationShouldNotGetStuckInRetries.
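
To make the argument concrete, here is a minimal Java sketch. It is not Hadoop code: the class IdempotentRetrySketch, the AttrNotFound exception, and the lookup and failedAttempts helpers are all hypothetical. It only shows that an idempotent read against unchanged state fails the same way on every attempt, so retrying adds latency without changing the result.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; none of these names come from Hadoop.
class IdempotentRetrySketch {
    // Stand-in for a server-side "attribute not found" error.
    static class AttrNotFound extends Exception {
        AttrNotFound(String msg) { super(msg); }
    }

    static final Map<String, String> ATTRS = new HashMap<>();

    // Idempotent read: it never modifies ATTRS, so repeating it cannot change the outcome.
    static String lookup(String name) throws AttrNotFound {
        String value = ATTRS.get(name);
        if (value == null) {
            throw new AttrNotFound("could not find attr");
        }
        return value;
    }

    // Calls lookup up to maxAttempts times and returns how many attempts failed.
    static int failedAttempts(String name, int maxAttempts) {
        int failures = 0;
        for (int i = 0; i < maxAttempts; i++) {
            try {
                lookup(name);
                break; // success: stop retrying
            } catch (AttrNotFound e) {
                failures++;
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        // The attribute is missing, so every one of the 5 attempts fails.
        System.out.println(failedAttempts("user.some_attr", 5));
    }
}
```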

> Idempotent operations throwing RemoteException should not be retried by the 
> client
> --
>
> Key: HDFS-14134
> URL: https://issues.apache.org/jira/browse/HDFS-14134
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs, hdfs-client, ipc
>Reporter: Lukas Majercak
>Assignee: Lukas Majercak
>Priority: Critical
> Attachments: HDFS-14134.001.patch, HDFS-14134.002.patch, 
> HDFS-14134.003.patch, HDFS-14134.004.patch, HDFS-14134.005.patch, 
> HDFS-14134_retrypolicy_change_proposal.pdf






[jira] [Updated] (HDFS-14134) Idempotent operations throwing RemoteException should not be retried by the client

2018-12-10 Thread Lukas Majercak (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lukas Majercak updated HDFS-14134:
--
Attachment: HDFS-14134.005.patch

> Idempotent operations throwing RemoteException should not be retried by the 
> client
> --
>
> Key: HDFS-14134
> URL: https://issues.apache.org/jira/browse/HDFS-14134
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs, hdfs-client, ipc
>Reporter: Lukas Majercak
>Assignee: Lukas Majercak
>Priority: Critical
> Attachments: HDFS-14134.001.patch, HDFS-14134.002.patch, 
> HDFS-14134.003.patch, HDFS-14134.004.patch, HDFS-14134.005.patch, 
> HDFS-14134_retrypolicy_change_proposal.pdf






[jira] [Commented] (HDFS-14134) Idempotent operations throwing RemoteException should not be retried by the client

2018-12-10 Thread Lukas Majercak (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16715863#comment-16715863
 ] 

Lukas Majercak commented on HDFS-14134:
---

Patch 005 fixes a minor checkstyle issue in UnreliableImplementation.

> Idempotent operations throwing RemoteException should not be retried by the 
> client
> --
>
> Key: HDFS-14134
> URL: https://issues.apache.org/jira/browse/HDFS-14134
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs, hdfs-client, ipc
>Reporter: Lukas Majercak
>Assignee: Lukas Majercak
>Priority: Critical
> Attachments: HDFS-14134.001.patch, HDFS-14134.002.patch, 
> HDFS-14134.003.patch, HDFS-14134.004.patch, HDFS-14134.005.patch, 
> HDFS-14134_retrypolicy_change_proposal.pdf






[jira] [Updated] (HDFS-14134) Idempotent operations throwing RemoteException should not be retried by the client

2018-12-10 Thread Lukas Majercak (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lukas Majercak updated HDFS-14134:
--
Attachment: HDFS-14134.004.patch

> Idempotent operations throwing RemoteException should not be retried by the 
> client
> --
>
> Key: HDFS-14134
> URL: https://issues.apache.org/jira/browse/HDFS-14134
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs, hdfs-client, ipc
>Reporter: Lukas Majercak
>Assignee: Lukas Majercak
>Priority: Critical
> Attachments: HDFS-14134.001.patch, HDFS-14134.002.patch, 
> HDFS-14134.003.patch, HDFS-14134.004.patch, 
> HDFS-14134_retrypolicy_change_proposal.pdf






[jira] [Commented] (HDFS-14134) Idempotent operations throwing RemoteException should not be retried by the client

2018-12-10 Thread Lukas Majercak (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16715705#comment-16715705
 ] 

Lukas Majercak commented on HDFS-14134:
---

Added tests to cover all cases (SocketException, IOException, RemoteException, 
idempotent and non-idempotent) in patch004. Is anyone available to review?

> Idempotent operations throwing RemoteException should not be retried by the 
> client
> --
>
> Key: HDFS-14134
> URL: https://issues.apache.org/jira/browse/HDFS-14134
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs, hdfs-client, ipc
>Reporter: Lukas Majercak
>Assignee: Lukas Majercak
>Priority: Critical
> Attachments: HDFS-14134.001.patch, HDFS-14134.002.patch, 
> HDFS-14134.003.patch, HDFS-14134.004.patch, 
> HDFS-14134_retrypolicy_change_proposal.pdf






[jira] [Updated] (HDFS-14134) Idempotent operations throwing RemoteException should not be retried by the client

2018-12-10 Thread Lukas Majercak (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lukas Majercak updated HDFS-14134:
--
Attachment: (was: HDFS-14134.003.patch)

> Idempotent operations throwing RemoteException should not be retried by the 
> client
> --
>
> Key: HDFS-14134
> URL: https://issues.apache.org/jira/browse/HDFS-14134
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs, hdfs-client, ipc
>Reporter: Lukas Majercak
>Assignee: Lukas Majercak
>Priority: Critical
> Attachments: HDFS-14134.001.patch, HDFS-14134.002.patch, 
> HDFS-14134_retrypolicy_change_proposal.pdf






[jira] [Commented] (HDFS-14134) Idempotent operations throwing RemoteException should not be retried by the client

2018-12-10 Thread Lukas Majercak (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16715682#comment-16715682
 ] 

Lukas Majercak commented on HDFS-14134:
---

Fixed TestFailoverProxy as well; we might still need to add more tests to 
cover all the exceptions.

> Idempotent operations throwing RemoteException should not be retried by the 
> client
> --
>
> Key: HDFS-14134
> URL: https://issues.apache.org/jira/browse/HDFS-14134
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs, hdfs-client, ipc
>Reporter: Lukas Majercak
>Assignee: Lukas Majercak
>Priority: Critical
> Attachments: HDFS-14134.001.patch, HDFS-14134.002.patch, 
> HDFS-14134.003.patch, HDFS-14134_retrypolicy_change_proposal.pdf






[jira] [Updated] (HDFS-14134) Idempotent operations throwing RemoteException should not be retried by the client

2018-12-10 Thread Lukas Majercak (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lukas Majercak updated HDFS-14134:
--
Attachment: HDFS-14134.003.patch

> Idempotent operations throwing RemoteException should not be retried by the 
> client
> --
>
> Key: HDFS-14134
> URL: https://issues.apache.org/jira/browse/HDFS-14134
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs, hdfs-client, ipc
>Reporter: Lukas Majercak
>Assignee: Lukas Majercak
>Priority: Critical
> Attachments: HDFS-14134.001.patch, HDFS-14134.002.patch, 
> HDFS-14134.003.patch, HDFS-14134_retrypolicy_change_proposal.pdf






[jira] [Updated] (HDFS-14134) Idempotent operations throwing RemoteException should not be retried by the client

2018-12-10 Thread Lukas Majercak (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lukas Majercak updated HDFS-14134:
--
Attachment: HDFS-14134.003.patch

> Idempotent operations throwing RemoteException should not be retried by the 
> client
> --
>
> Key: HDFS-14134
> URL: https://issues.apache.org/jira/browse/HDFS-14134
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs, hdfs-client, ipc
>Reporter: Lukas Majercak
>Assignee: Lukas Majercak
>Priority: Critical
> Attachments: HDFS-14134.001.patch, HDFS-14134.002.patch, 
> HDFS-14134.003.patch, HDFS-14134_retrypolicy_change_proposal.pdf






[jira] [Commented] (HDFS-14134) Idempotent operations throwing RemoteException should not be retried by the client

2018-12-10 Thread Lukas Majercak (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16715579#comment-16715579
 ] 

Lukas Majercak commented on HDFS-14134:
---

I realized I needed to change the mock expectations to fix 
TestLoadBalancingKMSClientProvider. Added patch003 to fix that.

> Idempotent operations throwing RemoteException should not be retried by the 
> client
> --
>
> Key: HDFS-14134
> URL: https://issues.apache.org/jira/browse/HDFS-14134
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs, hdfs-client, ipc
>Reporter: Lukas Majercak
>Assignee: Lukas Majercak
>Priority: Critical
> Attachments: HDFS-14134.001.patch, HDFS-14134.002.patch, 
> HDFS-14134.003.patch, HDFS-14134_retrypolicy_change_proposal.pdf






[jira] [Commented] (HDFS-14134) Idempotent operations throwing RemoteException should not be retried by the client

2018-12-10 Thread Lukas Majercak (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16715545#comment-16715545
 ] 

Lukas Majercak commented on HDFS-14134:
---

Thanks for the review [~knanasi]. I've uploaded patch002 to include non-remote 
IOException handling and to fix TestDefaultRetryPolicy. It seems like this 
should also fix TestLoadBalancingKMSClientProvider. I'll then fix and add more 
tests in TestFailoverProxy.

> Idempotent operations throwing RemoteException should not be retried by the 
> client
> --
>
> Key: HDFS-14134
> URL: https://issues.apache.org/jira/browse/HDFS-14134
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs, hdfs-client, ipc
>Reporter: Lukas Majercak
>Assignee: Lukas Majercak
>Priority: Critical
> Attachments: HDFS-14134.001.patch, HDFS-14134.002.patch, 
> HDFS-14134_retrypolicy_change_proposal.pdf






[jira] [Updated] (HDFS-14134) Idempotent operations throwing RemoteException should not be retried by the client

2018-12-10 Thread Lukas Majercak (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lukas Majercak updated HDFS-14134:
--
Attachment: HDFS-14134.002.patch

> Idempotent operations throwing RemoteException should not be retried by the 
> client
> --
>
> Key: HDFS-14134
> URL: https://issues.apache.org/jira/browse/HDFS-14134
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs, hdfs-client, ipc
>Reporter: Lukas Majercak
>Assignee: Lukas Majercak
>Priority: Critical
> Attachments: HDFS-14134.001.patch, HDFS-14134.002.patch, 
> HDFS-14134_retrypolicy_change_proposal.pdf






[jira] [Commented] (HDFS-14134) Idempotent operations throwing RemoteException should not be retried by the client

2018-12-07 Thread Lukas Majercak (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16713440#comment-16713440
 ] 

Lukas Majercak commented on HDFS-14134:
---

The unit tests are expected to fail; I can fix them once we agree on how the 
retry policy should behave.

> Idempotent operations throwing RemoteException should not be retried by the 
> client
> --
>
> Key: HDFS-14134
> URL: https://issues.apache.org/jira/browse/HDFS-14134
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs, hdfs-client, ipc
>Reporter: Lukas Majercak
>Assignee: Lukas Majercak
>Priority: Critical
> Attachments: HDFS-14134.001.patch, 
> HDFS-14134_retrypolicy_change_proposal.pdf






[jira] [Commented] (HDFS-14134) Idempotent operations throwing RemoteException should not be retried by the client

2018-12-07 Thread Lukas Majercak (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16713358#comment-16713358
 ] 

Lukas Majercak commented on HDFS-14134:
---

For the retry policy changes, maybe it would make sense to just RETRY when the 
exception is RemoteException and the operation is not idempotent/atmostonce.
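
As a rough Java sketch of the decision being discussed (not the actual FailoverOnNetworkExceptionRetry code): the class RetryDecisionSketch, the RemoteError exception standing in for RemoteException, and the decide method are hypothetical. The handling of non-remote IOExceptions in the second branch is also an assumption added to make the example complete.

```java
import java.io.IOException;
import java.net.SocketException;

// Hypothetical sketch; names and structure are assumptions, not Hadoop's API.
class RetryDecisionSketch {
    enum Action { FAIL, RETRY, FAILOVER_AND_RETRY }

    // Stand-in for an exception raised on the server and relayed to the client.
    static class RemoteError extends IOException {
        RemoteError(String msg) { super(msg); }
    }

    static Action decide(Exception e, boolean idempotentOrAtMostOnce) {
        if (e instanceof RemoteError) {
            // The server answered with an error. An idempotent call would get the
            // same answer again, so fail fast; otherwise RETRY, as proposed in the comment.
            return idempotentOrAtMostOnce ? Action.FAIL : Action.RETRY;
        }
        if (e instanceof IOException) {
            // Non-remote I/O failure (includes SocketException). Assumption: only
            // resend to another server when repeating the call is safe.
            return idempotentOrAtMostOnce ? Action.FAILOVER_AND_RETRY : Action.FAIL;
        }
        return Action.FAIL;
    }

    public static void main(String[] args) {
        System.out.println(decide(new RemoteError("could not find attr"), true));
        System.out.println(decide(new SocketException("connection reset"), true));
    }
}
```

Here the getXAttr case from the issue description maps to the first branch with an idempotent call, so the sketch returns FAIL instead of failing over.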

> Idempotent operations throwing RemoteException should not be retried by the 
> client
> --
>
> Key: HDFS-14134
> URL: https://issues.apache.org/jira/browse/HDFS-14134
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs, hdfs-client, ipc
>Reporter: Lukas Majercak
>Assignee: Lukas Majercak
>Priority: Critical
> Attachments: HDFS-14134.001.patch, 
> HDFS-14134_retrypolicy_change_proposal.pdf






[jira] [Updated] (HDFS-14134) Idempotent operations throwing RemoteException should not be retried by the client

2018-12-07 Thread Lukas Majercak (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lukas Majercak updated HDFS-14134:
--
Attachment: HDFS-14134.001.patch

> Idempotent operations throwing RemoteException should not be retried by the 
> client
> --
>
> Key: HDFS-14134
> URL: https://issues.apache.org/jira/browse/HDFS-14134
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs, hdfs-client, ipc
>Reporter: Lukas Majercak
>Assignee: Lukas Majercak
>Priority: Critical
> Attachments: HDFS-14134.001.patch, 
> HDFS-14134_retrypolicy_change_proposal.pdf






[jira] [Updated] (HDFS-14134) Idempotent operations throwing RemoteException should not be retried by the client

2018-12-07 Thread Lukas Majercak (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lukas Majercak updated HDFS-14134:
--
Attachment: (was: HDFS-14134.001.patch)

> Idempotent operations throwing RemoteException should not be retried by the 
> client
> --
>
> Key: HDFS-14134
> URL: https://issues.apache.org/jira/browse/HDFS-14134
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs, hdfs-client, ipc
>Reporter: Lukas Majercak
>Assignee: Lukas Majercak
>Priority: Critical
> Attachments: HDFS-14134.001.patch, 
> HDFS-14134_retrypolicy_change_proposal.pdf






[jira] [Commented] (HDFS-14134) Idempotent operations throwing RemoteException should not be retried by the client

2018-12-07 Thread Lukas Majercak (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16713326#comment-16713326
 ] 

Lukas Majercak commented on HDFS-14134:
---

Re-uploaded the patch because Yetus took my PDF file as the patch.

> Idempotent operations throwing RemoteException should not be retried by the 
> client
> --
>
> Key: HDFS-14134
> URL: https://issues.apache.org/jira/browse/HDFS-14134
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs, hdfs-client, ipc
>Reporter: Lukas Majercak
>Assignee: Lukas Majercak
>Priority: Critical
> Attachments: HDFS-14134.001.patch, 
> HDFS-14134_retrypolicy_change_proposal.pdf






[jira] [Commented] (HDFS-14134) Idempotent operations throwing RemoteException should not be retried by the client

2018-12-07 Thread Lukas Majercak (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16713314#comment-16713314
 ] 

Lukas Majercak commented on HDFS-14134:
---

Added HDFS-14134_retrypolicy_change_proposal.pdf to illustrate the proposed 
changes in the FailoverOnNetworkExceptionRetry retry policy.

> Idempotent operations throwing RemoteException should not be retried by the 
> client
> --
>
> Key: HDFS-14134
> URL: https://issues.apache.org/jira/browse/HDFS-14134
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs, hdfs-client, ipc
>Reporter: Lukas Majercak
>Assignee: Lukas Majercak
>Priority: Critical
> Attachments: HDFS-14134.001.patch, 
> HDFS-14134_retrypolicy_change_proposal.pdf
>
>
> Currently, some operations that throw IOException on the NameNode are 
> evaluated by RetryPolicy as FAILOVER_AND_RETRY, but they should just fail 
> fast.
> For example, when calling getXAttr("user.some_attr", file) where the file 
> does not have the attribute, NN throws an IOException with message "could not 
> find attr". The current client retry policy determines the action for that to 
> be FAILOVER_AND_RETRY. The client then fails over and retries until it 
> reaches the maximum number of retries. Supposedly, the client should be able 
> to tell that this exception is normal and fail fast. 
> Moreover, even if the action was FAIL, the RetryInvocationHandler looks at 
> all the retry actions from all requests, and FAILOVER_AND_RETRY takes 
> precedence over FAIL action.



[jira] [Updated] (HDFS-14134) Idempotent operations throwing RemoteException should not be retried by the client

2018-12-07 Thread Lukas Majercak (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lukas Majercak updated HDFS-14134:
--
Attachment: HDFS-14134_retrypolicy_change_proposal.pdf

> Idempotent operations throwing RemoteException should not be retried by the 
> client
> --
>
> Key: HDFS-14134
> URL: https://issues.apache.org/jira/browse/HDFS-14134
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs, hdfs-client, ipc
>Reporter: Lukas Majercak
>Assignee: Lukas Majercak
>Priority: Critical
> Attachments: HDFS-14134.001.patch, 
> HDFS-14134_retrypolicy_change_proposal.pdf
>
>
> Currently, some operations that throw IOException on the NameNode are 
> evaluated by RetryPolicy as FAILOVER_AND_RETRY, but they should just fail 
> fast.
> For example, when calling getXAttr("user.some_attr", file) where the file 
> does not have the attribute, NN throws an IOException with message "could not 
> find attr". The current client retry policy determines the action for that to 
> be FAILOVER_AND_RETRY. The client then fails over and retries until it 
> reaches the maximum number of retries. Supposedly, the client should be able 
> to tell that this exception is normal and fail fast. 
> Moreover, even if the action was FAIL, the RetryInvocationHandler looks at 
> all the retry actions from all requests, and FAILOVER_AND_RETRY takes 
> precedence over FAIL action.



[jira] [Updated] (HDFS-14134) Idempotent operations throwing RemoteException should not be retried by the client

2018-12-07 Thread Lukas Majercak (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lukas Majercak updated HDFS-14134:
--
Attachment: HDFS-14134.001.patch

> Idempotent operations throwing RemoteException should not be retried by the 
> client
> --
>
> Key: HDFS-14134
> URL: https://issues.apache.org/jira/browse/HDFS-14134
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs, hdfs-client, ipc
>Reporter: Lukas Majercak
>Assignee: Lukas Majercak
>Priority: Critical
> Attachments: HDFS-14134.001.patch
>
>
> Currently, some operations that throw IOException on the NameNode are 
> evaluated by RetryPolicy as FAILOVER_AND_RETRY, but they should just fail 
> fast.
> For example, when calling getXAttr("user.some_attr", file) where file does 
> not have the attribute, NN throws an IOException with message "could not find 
> attr". The current client retry policy determines the action for that to be 
> FAILOVER_AND_RETRY. The client then fails over and retries until it reaches 
> the maximum number of retries. Supposedly, the client should be able to tell 
> that this exception is normal and fail fast. 
> Moreover, even if the action was FAIL, the RetryInvocationHandler looks at 
> all the retry actions from all requests, and FAILOVER_AND_RETRY takes 
> precedence over FAIL action.



[jira] [Updated] (HDFS-14134) Idempotent operations throwing RemoteException should not be retried by the client

2018-12-07 Thread Lukas Majercak (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lukas Majercak updated HDFS-14134:
--
Attachment: HDFS-14134.001.patch

> Idempotent operations throwing RemoteException should not be retried by the 
> client
> --
>
> Key: HDFS-14134
> URL: https://issues.apache.org/jira/browse/HDFS-14134
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs, hdfs-client, ipc
>Reporter: Lukas Majercak
>Assignee: Lukas Majercak
>Priority: Critical
> Attachments: HDFS-14134.001.patch
>
>
> Currently, some operations that throw IOException on the NameNode are 
> evaluated by RetryPolicy as FAILOVER_AND_RETRY, but they should just fail 
> fast.
> For example, when calling getXAttr("user.some_attr", file) where the file 
> does not have the attribute, NN throws an IOException with message "could not 
> find attr". The current client retry policy determines the action for that to 
> be FAILOVER_AND_RETRY. The client then fails over and retries until it 
> reaches the maximum number of retries. Supposedly, the client should be able 
> to tell that this exception is normal and fail fast. 
> Moreover, even if the action was FAIL, the RetryInvocationHandler looks at 
> all the retry actions from all requests, and FAILOVER_AND_RETRY takes 
> precedence over FAIL action.



[jira] [Updated] (HDFS-14134) Idempotent operations throwing RemoteException should not be retried by the client

2018-12-07 Thread Lukas Majercak (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lukas Majercak updated HDFS-14134:
--
Attachment: (was: HDFS-14134.001.patch)

> Idempotent operations throwing RemoteException should not be retried by the 
> client
> --
>
> Key: HDFS-14134
> URL: https://issues.apache.org/jira/browse/HDFS-14134
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs, hdfs-client, ipc
>Reporter: Lukas Majercak
>Assignee: Lukas Majercak
>Priority: Critical
> Attachments: HDFS-14134.001.patch
>
>
> Currently, some operations that throw IOException on the NameNode are 
> evaluated by RetryPolicy as FAILOVER_AND_RETRY, but they should just fail 
> fast.
> For example, when calling getXAttr("user.some_attr", file) where the file 
> does not have the attribute, NN throws an IOException with message "could not 
> find attr". The current client retry policy determines the action for that to 
> be FAILOVER_AND_RETRY. The client then fails over and retries until it 
> reaches the maximum number of retries. Supposedly, the client should be able 
> to tell that this exception is normal and fail fast. 
> Moreover, even if the action was FAIL, the RetryInvocationHandler looks at 
> all the retry actions from all requests, and FAILOVER_AND_RETRY takes 
> precedence over FAIL action.



[jira] [Updated] (HDFS-14134) Idempotent operations throwing RemoteException should not be retried by the client

2018-12-07 Thread Lukas Majercak (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lukas Majercak updated HDFS-14134:
--
Description: 
Currently, some operations that throw IOException on the NameNode are evaluated 
by RetryPolicy as FAILOVER_AND_RETRY, but they should just fail fast.

For example, when calling getXAttr("user.some_attr", file) where the file does 
not have the attribute, NN throws an IOException with message "could not find 
attr". The current client retry policy determines the action for that to be 
FAILOVER_AND_RETRY. The client then fails over and retries until it reaches the 
maximum number of retries. Supposedly, the client should be able to tell that 
this exception is normal and fail fast. 

Moreover, even if the action was FAIL, the RetryInvocationHandler looks at all 
the retry actions from all requests, and FAILOVER_AND_RETRY takes precedence 
over FAIL action.

  was:
Currently, some operations that throw IOException on the NameNode are evaluated 
by RetryPolicy as FAILOVER_AND_RETRY, but they should just fail fast.

For example, when calling getXAttr("user.some_attr", file) where file does not 
have the attribute, NN throws an IOException with message "could not find 
attr". The current client retry policy determines the action for that to be 
FAILOVER_AND_RETRY. The client then fails over and retries until it reaches the 
maximum number of retries. Supposedly, the client should be able to tell that 
this exception is normal and fail fast. 

Moreover, even if the action was FAIL, the RetryInvocationHandler looks at all 
the retry actions from all requests, and FAILOVER_AND_RETRY takes precedence 
over FAIL action.


> Idempotent operations throwing RemoteException should not be retried by the 
> client
> --
>
> Key: HDFS-14134
> URL: https://issues.apache.org/jira/browse/HDFS-14134
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs, hdfs-client, ipc
>Reporter: Lukas Majercak
>Assignee: Lukas Majercak
>Priority: Critical
> Attachments: HDFS-14134.001.patch
>
>
> Currently, some operations that throw IOException on the NameNode are 
> evaluated by RetryPolicy as FAILOVER_AND_RETRY, but they should just fail 
> fast.
> For example, when calling getXAttr("user.some_attr", file) where the file 
> does not have the attribute, NN throws an IOException with message "could not 
> find attr". The current client retry policy determines the action for that to 
> be FAILOVER_AND_RETRY. The client then fails over and retries until it 
> reaches the maximum number of retries. Supposedly, the client should be able 
> to tell that this exception is normal and fail fast. 
> Moreover, even if the action was FAIL, the RetryInvocationHandler looks at 
> all the retry actions from all requests, and FAILOVER_AND_RETRY takes 
> precedence over FAIL action.



[jira] [Commented] (HDFS-14134) Idempotent operations throwing RemoteException should not be retried by the client

2018-12-07 Thread Lukas Majercak (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16713233#comment-16713233
 ] 

Lukas Majercak commented on HDFS-14134:
---

Added a patch to demonstrate the issue.

> Idempotent operations throwing RemoteException should not be retried by the 
> client
> --
>
> Key: HDFS-14134
> URL: https://issues.apache.org/jira/browse/HDFS-14134
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs, hdfs-client, ipc
>Reporter: Lukas Majercak
>Assignee: Lukas Majercak
>Priority: Critical
> Attachments: HDFS-14134.001.patch
>
>
> Currently, some operations that throw IOException on the NameNode are 
> evaluated by RetryPolicy as FAILOVER_AND_RETRY, but they should just fail 
> fast.
> For example, when calling getXAttr("user.some_attr", file) where file does 
> not have the attribute, NN throws an IOException with message "could not find 
> attr". The current client retry policy determines the action for that to be 
> FAILOVER_AND_RETRY. The client then fails over and retries until it reaches 
> the maximum number of retries. Supposedly, the client should be able to tell 
> that this exception is normal and fail fast. 
> Moreover, even if the action was FAIL, the RetryInvocationHandler looks at 
> all the retry actions from all requests, and FAILOVER_AND_RETRY takes 
> precedence over FAIL action.



[jira] [Updated] (HDFS-14134) Idempotent operations throwing RemoteException should not be retried by the client

2018-12-07 Thread Lukas Majercak (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lukas Majercak updated HDFS-14134:
--
Description: 
Currently, some operations that throw IOException on the NameNode are evaluated 
by RetryPolicy as FAILOVER_AND_RETRY, but they should just fail fast.

For example, when calling getXAttr("user.some_attr", file) where file does not 
have the attribute, NN throws an IOException with message "could not find 
attr". The current client retry policy determines the action for that to be 
FAILOVER_AND_RETRY. The client then fails over and retries until it reaches the 
maximum number of retries. Supposedly, the client should be able to tell that 
this exception is normal and fail fast. 

Moreover, even if the action was FAIL, the RetryInvocationHandler looks at all 
the retry actions from all requests, and FAILOVER_AND_RETRY takes precedence 
over FAIL action.
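
The two behaviours described above can be modelled roughly as follows. This is 
only a minimal sketch under stated assumptions, not the attached patch: 
RetryDecisionSketch, RemoteError, reported, failFast and combine are 
hypothetical names, and none of the types are the Hadoop classes. The 
precedence of RETRY over FAIL in combine is also an assumption. A real policy 
would still have to fail over when the remote error only says that the 
NameNode is in standby.

```java
import java.io.IOException;
import java.net.ConnectException;
import java.util.Arrays;
import java.util.List;

// Hypothetical, self-contained model; none of these types are the Hadoop classes.
class RetryDecisionSketch {
  enum Decision { FAIL, RETRY, FAILOVER_AND_RETRY }

  // Stand-in for an exception that the NameNode raised and sent to the client.
  static class RemoteError extends IOException {
    RemoteError(String message) { super(message); }
  }

  // Simplified model of the reported behaviour: any IOException triggers a
  // failover until the failover budget is used up.
  static Decision reported(Exception e, int failovers, int maxFailovers) {
    if (failovers >= maxFailovers) {
      return Decision.FAIL;
    }
    return (e instanceof IOException) ? Decision.FAILOVER_AND_RETRY : Decision.FAIL;
  }

  // One possible fail-fast rule (an assumption): the server did answer, so
  // the error is handed straight to the caller instead of being retried.
  static Decision failFast(Exception e, int failovers, int maxFailovers) {
    if (e instanceof RemoteError) {
      return Decision.FAIL;
    }
    return reported(e, failovers, maxFailovers);
  }

  // Models the precedence problem: a single FAILOVER_AND_RETRY outweighs
  // any number of FAIL decisions collected from other requests.
  static Decision combine(List<Decision> decisions) {
    Decision result = Decision.FAIL;
    for (Decision d : decisions) {
      if (d == Decision.FAILOVER_AND_RETRY) {
        return d;
      }
      if (d == Decision.RETRY) {
        result = d;
      }
    }
    return result;
  }

  public static void main(String[] args) {
    Exception missingAttr = new RemoteError("could not find attr");
    System.out.println(reported(missingAttr, 0, 15));
    System.out.println(failFast(missingAttr, 0, 15));
    System.out.println(failFast(new ConnectException("refused"), 0, 15));
    System.out.println(combine(Arrays.asList(Decision.FAIL, Decision.FAILOVER_AND_RETRY)));
  }
}
```

In this model a connection failure still leads to a failover, while the 
missing-attribute error is returned to the caller on the first attempt.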

> Idempotent operations throwing RemoteException should not be retried by the 
> client
> --
>
> Key: HDFS-14134
> URL: https://issues.apache.org/jira/browse/HDFS-14134
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs, hdfs-client, ipc
>Reporter: Lukas Majercak
>Assignee: Lukas Majercak
>Priority: Critical
>
> Currently, some operations that throw IOException on the NameNode are 
> evaluated by RetryPolicy as FAILOVER_AND_RETRY, but they should just fail 
> fast.
> For example, when calling getXAttr("user.some_attr", file) where file does 
> not have the attribute, NN throws an IOException with message "could not find 
> attr". The current client retry policy determines the action for that to be 
> FAILOVER_AND_RETRY. The client then fails over and retries until it reaches 
> the maximum number of retries. Supposedly, the client should be able to tell 
> that this exception is normal and fail fast. 
> Moreover, even if the action was FAIL, the RetryInvocationHandler looks at 
> all the retry actions from all requests, and FAILOVER_AND_RETRY takes 
> precedence over FAIL action.



[jira] [Created] (HDFS-14134) Idempotent operations throwing RemoteException should not be retried by the client

2018-12-07 Thread Lukas Majercak (JIRA)
Lukas Majercak created HDFS-14134:
-

 Summary: Idempotent operations throwing RemoteException should not 
be retried by the client
 Key: HDFS-14134
 URL: https://issues.apache.org/jira/browse/HDFS-14134
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: hdfs, hdfs-client, ipc
Reporter: Lukas Majercak
Assignee: Lukas Majercak






[jira] [Commented] (HDFS-14043) Tolerate corrupted seen_txid file

2018-11-05 Thread Lukas Majercak (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16675934#comment-16675934
 ] 

Lukas Majercak commented on HDFS-14043:
---

Yes, I ran TestSaveNamespace on my local trunk, and we ran all the HDFS tests 
with this patch applied on top of our internal 2.9 version.

> Tolerate corrupted seen_txid file
> -
>
> Key: HDFS-14043
> URL: https://issues.apache.org/jira/browse/HDFS-14043
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs, namenode
>Affects Versions: 2.9.2, 3.1.2, 2.9.3
>Reporter: Lukas Majercak
>Assignee: Lukas Majercak
>Priority: Major
> Attachments: HDFS-14043.001.patch, HDFS-14043.002.patch, 
> HDFS-14043.003.patch
>
>
> We already tolerate IOExceptions when reading seen_txid file from namenode's 
> dirs. So we take the maximum txid of all the *readable* namenode dirs. We 
> should extend this to when the file is corrupted. Currently, 
> PersistentLongFile.readFile throws NumberFormatException in this case and the 
> whole NN crashes.



[jira] [Commented] (HDFS-14043) Tolerate corrupted seen_txid file

2018-11-05 Thread Lukas Majercak (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16675766#comment-16675766
 ] 

Lukas Majercak commented on HDFS-14043:
---

Added patch003 to fix checkstyle errors.

> Tolerate corrupted seen_txid file
> -
>
> Key: HDFS-14043
> URL: https://issues.apache.org/jira/browse/HDFS-14043
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs, namenode
>Affects Versions: 2.9.2, 3.1.2, 2.9.3
>Reporter: Lukas Majercak
>Assignee: Lukas Majercak
>Priority: Major
> Attachments: HDFS-14043.001.patch, HDFS-14043.002.patch, 
> HDFS-14043.003.patch
>
>
> We already tolerate IOExceptions when reading seen_txid file from namenode's 
> dirs. So we take the maximum txid of all the *readable* namenode dirs. We 
> should extend this to when the file is corrupted. Currently, 
> PersistentLongFile.readFile throws NumberFormatException in this case and the 
> whole NN crashes.



[jira] [Updated] (HDFS-14043) Tolerate corrupted seen_txid file

2018-11-05 Thread Lukas Majercak (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lukas Majercak updated HDFS-14043:
--
Attachment: HDFS-14043.003.patch

> Tolerate corrupted seen_txid file
> -
>
> Key: HDFS-14043
> URL: https://issues.apache.org/jira/browse/HDFS-14043
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs, namenode
>Affects Versions: 2.9.2, 3.1.2, 2.9.3
>Reporter: Lukas Majercak
>Assignee: Lukas Majercak
>Priority: Major
> Attachments: HDFS-14043.001.patch, HDFS-14043.002.patch, 
> HDFS-14043.003.patch
>
>
> We already tolerate IOExceptions when reading seen_txid file from namenode's 
> dirs. So we take the maximum txid of all the *readable* namenode dirs. We 
> should extend this to when the file is corrupted. Currently, 
> PersistentLongFile.readFile throws NumberFormatException in this case and the 
> whole NN crashes.



[jira] [Updated] (HDFS-14043) Tolerate corrupted seen_txid file

2018-11-05 Thread Lukas Majercak (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lukas Majercak updated HDFS-14043:
--
Attachment: HDFS-14043.002.patch

> Tolerate corrupted seen_txid file
> -
>
> Key: HDFS-14043
> URL: https://issues.apache.org/jira/browse/HDFS-14043
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs, namenode
>Affects Versions: 2.9.2, 3.1.2, 2.9.3
>Reporter: Lukas Majercak
>Assignee: Lukas Majercak
>Priority: Major
> Attachments: HDFS-14043.001.patch, HDFS-14043.002.patch
>
>
> We already tolerate IOExceptions when reading seen_txid file from namenode's 
> dirs. So we take the maximum txid of all the *readable* namenode dirs. We 
> should extend this to when the file is corrupted. Currently, 
> PersistentLongFile.readFile throws NumberFormatException in this case and the 
> whole NN crashes.



[jira] [Commented] (HDFS-14043) Tolerate corrupted seen_txid file

2018-11-05 Thread Lukas Majercak (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16675677#comment-16675677
 ] 

Lukas Majercak commented on HDFS-14043:
---

Added patch002 that should apply to trunk.

> Tolerate corrupted seen_txid file
> -
>
> Key: HDFS-14043
> URL: https://issues.apache.org/jira/browse/HDFS-14043
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs, namenode
>Affects Versions: 2.9.2, 3.1.2, 2.9.3
>Reporter: Lukas Majercak
>Assignee: Lukas Majercak
>Priority: Major
> Attachments: HDFS-14043.001.patch, HDFS-14043.002.patch
>
>
> We already tolerate IOExceptions when reading seen_txid file from namenode's 
> dirs. So we take the maximum txid of all the *readable* namenode dirs. We 
> should extend this to when the file is corrupted. Currently, 
> PersistentLongFile.readFile throws NumberFormatException in this case and the 
> whole NN crashes.



[jira] [Comment Edited] (HDFS-14043) Tolerate corrupted seen_txid file

2018-11-01 Thread Lukas Majercak (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16672077#comment-16672077
 ] 

Lukas Majercak edited comment on HDFS-14043 at 11/1/18 7:38 PM:


[~cmccabe] could you review this, as this seems to be related to HDFS-3004.


was (Author: lukmajercak):
[~cmccabe] could you review this, as this seems to be related to 
https://issues.apache.org/jira/browse/HDFS-3004

> Tolerate corrupted seen_txid file
> -
>
> Key: HDFS-14043
> URL: https://issues.apache.org/jira/browse/HDFS-14043
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs, namenode
>Affects Versions: 2.9.2, 3.1.2, 2.9.3
>Reporter: Lukas Majercak
>Assignee: Lukas Majercak
>Priority: Major
> Attachments: HDFS-14043.001.patch
>
>
> We already tolerate IOExceptions when reading seen_txid file from namenode's 
> dirs. So we take the maximum txid of all the *readable* namenode dirs. We 
> should extend this to when the file is corrupted. Currently, 
> PersistentLongFile.readFile throws NumberFormatException in this case and the 
> whole NN crashes.



[jira] [Commented] (HDFS-14043) Tolerate corrupted seen_txid file

2018-11-01 Thread Lukas Majercak (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16672077#comment-16672077
 ] 

Lukas Majercak commented on HDFS-14043:
---

[~cmccabe] could you review this, as this seems to be related to 
https://issues.apache.org/jira/browse/HDFS-3004

> Tolerate corrupted seen_txid file
> -
>
> Key: HDFS-14043
> URL: https://issues.apache.org/jira/browse/HDFS-14043
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs, namenode
>Affects Versions: 2.9.2, 3.1.2, 2.9.3
>Reporter: Lukas Majercak
>Assignee: Lukas Majercak
>Priority: Major
> Attachments: HDFS-14043.001.patch
>
>
> We already tolerate IOExceptions when reading seen_txid file from namenode's 
> dirs. So we take the maximum txid of all the *readable* namenode dirs. We 
> should extend this to when the file is corrupted. Currently, 
> PersistentLongFile.readFile throws NumberFormatException in this case and the 
> whole NN crashes.



[jira] [Updated] (HDFS-14043) Tolerate corrupted seen_txid file

2018-11-01 Thread Lukas Majercak (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lukas Majercak updated HDFS-14043:
--
Description: We already tolerate IOExceptions when reading seen_txid file 
from namenode's dirs. So we take the maximum txid of all the *readable* 
namenode dirs. We should extend this to when the file is corrupted. Currently, 
PersistentLongFile.readFile throws NumberFormatException in this case and the 
whole NN crashes.
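
The intended tolerance could look roughly like the sketch below. It is only an 
illustration under stated assumptions, not the attached patch: SeenTxidSketch, 
readLongFile, maxSeenTxid and writeTemp are hypothetical names, and throwing 
IllegalStateException when no directory yields a usable value is an assumed 
choice.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; not the PersistentLongFile or NameNode storage code.
class SeenTxidSketch {
  // Reads a file that is expected to hold a single decimal long.
  // Throws NumberFormatException when the content is corrupted.
  static long readLongFile(Path file) throws IOException {
    String text = new String(Files.readAllBytes(file), StandardCharsets.UTF_8).trim();
    return Long.parseLong(text);
  }

  // Takes the maximum over every file that can be read AND parsed,
  // skipping both I/O failures and corrupted content.
  static long maxSeenTxid(List<Path> files) {
    boolean found = false;
    long max = Long.MIN_VALUE;
    for (Path file : files) {
      try {
        max = Math.max(max, readLongFile(file));
        found = true;
      } catch (IOException | NumberFormatException e) {
        System.err.println("Skipping " + file + ": " + e);
      }
    }
    if (!found) {
      // Assumed behaviour when nothing usable is left.
      throw new IllegalStateException("no usable seen_txid file");
    }
    return max;
  }

  // Helper for the demo: writes the content to a temporary file.
  static Path writeTemp(String content) {
    try {
      Path file = Files.createTempFile("seen_txid", ".tmp");
      Files.write(file, content.getBytes(StandardCharsets.UTF_8));
      file.toFile().deleteOnExit();
      return file;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    List<Path> files = Arrays.asList(
        writeTemp("41\n"),
        writeTemp("not-a-number"),                // corrupted
        Paths.get("does-not-exist-seen_txid"),    // unreadable
        writeTemp("42\n"));
    System.out.println(maxSeenTxid(files));
  }
}
```

The only change relative to tolerating IOException alone is the extra 
NumberFormatException in the catch clause, so a corrupted copy is skipped in 
the same way as an unreadable one.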

> Tolerate corrupted seen_txid file
> -
>
> Key: HDFS-14043
> URL: https://issues.apache.org/jira/browse/HDFS-14043
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs, namenode
>Affects Versions: 2.9.2, 3.1.2, 2.9.3
>Reporter: Lukas Majercak
>Assignee: Lukas Majercak
>Priority: Major
> Attachments: HDFS-14043.001.patch
>
>
> We already tolerate IOExceptions when reading seen_txid file from namenode's 
> dirs. So we take the maximum txid of all the *readable* namenode dirs. We 
> should extend this to when the file is corrupted. Currently, 
> PersistentLongFile.readFile throws NumberFormatException in this case and the 
> whole NN crashes.



[jira] [Updated] (HDFS-14043) Tolerate corrupted seen_txid file

2018-11-01 Thread Lukas Majercak (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lukas Majercak updated HDFS-14043:
--
Attachment: HDFS-14043.001.patch

> Tolerate corrupted seen_txid file
> -
>
> Key: HDFS-14043
> URL: https://issues.apache.org/jira/browse/HDFS-14043
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs, namenode
>Affects Versions: 2.9.2, 3.1.2, 2.9.3
>Reporter: Lukas Majercak
>Assignee: Lukas Majercak
>Priority: Major
> Attachments: HDFS-14043.001.patch
>
>




[jira] [Created] (HDFS-14043) Tolerate corrupted seen_txid file

2018-11-01 Thread Lukas Majercak (JIRA)
Lukas Majercak created HDFS-14043:
-

 Summary: Tolerate corrupted seen_txid file
 Key: HDFS-14043
 URL: https://issues.apache.org/jira/browse/HDFS-14043
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: hdfs, namenode
Affects Versions: 2.9.2, 3.1.2, 2.9.3
Reporter: Lukas Majercak
Assignee: Lukas Majercak






[jira] [Commented] (HDFS-12284) RBF: Support for Kerberos authentication

2018-10-23 Thread Lukas Majercak (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16661362#comment-16661362
 ] 

Lukas Majercak commented on HDFS-12284:
---

[~daryn], I feel like we should distinguish between ServicePrincipalNames and 
UserPrincipalNames for all services in HDFS, or at least give the admin an 
option to override the user principal. The _HOST solution is okay, but it 
relies on DNS giving consistent results. Inconsistent results are fine for 
SPNs, as you can have as many as you want in your keytab, but they are not 
okay for client principals.

Say you have a NN running on HOSTNAME, and set it up using hdfs/_HOST@DOMAIN 
as the principal name. Now, one day, when your NN starts up and tries to 
resolve itself using _HOST, your DNS server decides to return HOSTNAME.domain 
instead of the usual HOSTNAME. Your NN then uses that as the client principal 
to log in, and the login fails.

Maybe something like {{dfs.federation.router.kerberos.user.principal}} would be 
better than {{dfs.federation.router.hostname}}.
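
To make the failure mode concrete, here is a minimal sketch of the 
substitution. HostPatternSketch, substituteHost and loginPrincipal are 
hypothetical names, this is not the Hadoop security code, and the explicit 
user principal argument only stands in for a setting such as the one proposed 
above.

```java
// Hypothetical sketch of _HOST substitution; not the Hadoop implementation.
class HostPatternSketch {
  static final String HOST_PATTERN = "_HOST";

  // Replaces the instance part of primary/_HOST@REALM with the hostname
  // that the local resolver returned.
  static String substituteHost(String pattern, String resolvedHostname) {
    int slash = pattern.indexOf('/');
    int at = pattern.indexOf('@');
    if (slash < 0 || at < slash) {
      return pattern; // not of the form primary/instance@REALM
    }
    String instance = pattern.substring(slash + 1, at);
    if (!HOST_PATTERN.equals(instance)) {
      return pattern; // nothing to substitute
    }
    return pattern.substring(0, slash + 1) + resolvedHostname + pattern.substring(at);
  }

  // An explicitly configured user principal, when present, wins over the
  // pattern, so the login name no longer depends on what DNS returns.
  static String loginPrincipal(String explicitUserPrincipal, String pattern,
      String resolvedHostname) {
    if (explicitUserPrincipal != null && !explicitUserPrincipal.isEmpty()) {
      return explicitUserPrincipal;
    }
    return substituteHost(pattern, resolvedHostname);
  }

  public static void main(String[] args) {
    String pattern = "hdfs/_HOST@DOMAIN";
    // The same pattern yields two different principals for two DNS answers.
    System.out.println(substituteHost(pattern, "HOSTNAME"));
    System.out.println(substituteHost(pattern, "HOSTNAME.domain"));
    // With an explicit user principal, the DNS answer does not matter.
    System.out.println(loginPrincipal("hdfs/HOSTNAME@DOMAIN", pattern, "HOSTNAME.domain"));
  }
}
```

If the keytab only holds hdfs/HOSTNAME@DOMAIN, the second substitution yields a 
principal that is not in it, which is the login failure described above.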

> RBF: Support for Kerberos authentication
> 
>
> Key: HDFS-12284
> URL: https://issues.apache.org/jira/browse/HDFS-12284
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: security
>Reporter: Zhe Zhang
>Assignee: Sherwood Zheng
>Priority: Major
> Attachments: HDFS-12284-HDFS-13532.004.patch, 
> HDFS-12284-HDFS-13532.005.patch, HDFS-12284-HDFS-13532.006.patch, 
> HDFS-12284-HDFS-13532.007.patch, HDFS-12284-HDFS-13532.008.patch, 
> HDFS-12284-HDFS-13532.009.patch, HDFS-12284-HDFS-13532.010.patch, 
> HDFS-12284-HDFS-13532.011.patch, HDFS-12284.000.patch, HDFS-12284.001.patch, 
> HDFS-12284.002.patch, HDFS-12284.003.patch
>
>
> HDFS Router should support Kerberos authentication and issuing / managing 
> HDFS delegation tokens.



[jira] [Commented] (HDFS-14010) Pass correct DF usage to ReservedSpaceCalculator builder

2018-10-19 Thread Lukas Majercak (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16657347#comment-16657347
 ] 

Lukas Majercak commented on HDFS-14010:
---

002.patch LGTM. Maybe add a warn log for when usage == null, and a comment for 
the unit test.

> Pass correct DF usage to ReservedSpaceCalculator builder
> 
>
> Key: HDFS-14010
> URL: https://issues.apache.org/jira/browse/HDFS-14010
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.9.2
>Reporter: Lukas Majercak
>Assignee: Lukas Majercak
>Priority: Minor
> Attachments: HDFS-14010.001.patch, HDFS-14010.002.patch
>
>
> In FsVolumeImpl's constructor, we currently pass the DF usage that was passed 
> to the constructor to ReservedSpaceCalculator.Builder. This can cause issues 
> if the usage is changed in the constructor.



[jira] [Updated] (HDFS-14010) Pass correct DF usage to ReservedSpaceCalculator builder

2018-10-18 Thread Lukas Majercak (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lukas Majercak updated HDFS-14010:
--
Affects Version/s: 2.9.2

> Pass correct DF usage to ReservedSpaceCalculator builder
> 
>
> Key: HDFS-14010
> URL: https://issues.apache.org/jira/browse/HDFS-14010
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.9.2
>Reporter: Lukas Majercak
>Assignee: Lukas Majercak
>Priority: Minor
> Attachments: HDFS-14010.001.patch
>
>
> In FsVolumeImpl's constructor, we currently pass the DF usage that was passed 
> to the constructor to ReservedSpaceCalculator.Builder. This can cause issues 
> if the usage is changed in the constructor.






[jira] [Updated] (HDFS-14010) Pass correct DF usage to ReservedSpaceCalculator builder

2018-10-18 Thread Lukas Majercak (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lukas Majercak updated HDFS-14010:
--
Attachment: HDFS-14010.001.patch

> Pass correct DF usage to ReservedSpaceCalculator builder
> 
>
> Key: HDFS-14010
> URL: https://issues.apache.org/jira/browse/HDFS-14010
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Lukas Majercak
>Assignee: Lukas Majercak
>Priority: Minor
> Attachments: HDFS-14010.001.patch
>
>
> In FsVolumeImpl's constructor, we currently pass the DF usage that was passed 
> to the constructor to ReservedSpaceCalculator.Builder. This can cause issues 
> if the usage is changed in the constructor.






[jira] [Updated] (HDFS-14010) Pass correct DF usage to ReservedSpaceCalculator builder

2018-10-18 Thread Lukas Majercak (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lukas Majercak updated HDFS-14010:
--
Description: In FsVolumeImpl's constructor, we currently pass the DF usage 
that was passed to the constructor to ReservedSpaceCalculator.Builder. This can 
cause issues if the usage is changed in the constructor.

> Pass correct DF usage to ReservedSpaceCalculator builder
> 
>
> Key: HDFS-14010
> URL: https://issues.apache.org/jira/browse/HDFS-14010
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Lukas Majercak
>Assignee: Lukas Majercak
>Priority: Minor
>
> In FsVolumeImpl's constructor, we currently pass the DF usage that was passed 
> to the constructor to ReservedSpaceCalculator.Builder. This can cause issues 
> if the usage is changed in the constructor.






[jira] [Created] (HDFS-14010) Pass correct DF usage to ReservedSpaceCalculator builder

2018-10-18 Thread Lukas Majercak (JIRA)
Lukas Majercak created HDFS-14010:
-

 Summary: Pass correct DF usage to ReservedSpaceCalculator builder
 Key: HDFS-14010
 URL: https://issues.apache.org/jira/browse/HDFS-14010
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Lukas Majercak
Assignee: Lukas Majercak









[jira] [Commented] (HDFS-12284) RBF: Support for Kerberos authentication

2018-10-18 Thread Lukas Majercak (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16655772#comment-16655772
 ] 

Lukas Majercak commented on HDFS-12284:
---

There are star imports in every test class:
import static org.apache.hadoop.fs.contract.router.SecurityConfUtil.*;



> RBF: Support for Kerberos authentication
> 
>
> Key: HDFS-12284
> URL: https://issues.apache.org/jira/browse/HDFS-12284
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: security
>Reporter: Zhe Zhang
>Assignee: Sherwood Zheng
>Priority: Major
> Attachments: HDFS-12284-HDFS-13532.004.patch, 
> HDFS-12284-HDFS-13532.005.patch, HDFS-12284-HDFS-13532.006.patch, 
> HDFS-12284-HDFS-13532.007.patch, HDFS-12284-HDFS-13532.008.patch, 
> HDFS-12284.000.patch, HDFS-12284.001.patch, HDFS-12284.002.patch, 
> HDFS-12284.003.patch
>
>
> HDFS Router should support Kerberos authentication and issuing / managing 
> HDFS delegation tokens.






[jira] [Updated] (HDFS-13976) Backport HDFS-12813 to branch-2.9

2018-10-10 Thread Lukas Majercak (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lukas Majercak updated HDFS-13976:
--
Attachment: TestRequestHedgingProxyProvider.png

> Backport HDFS-12813 to branch-2.9
> -
>
> Key: HDFS-13976
> URL: https://issues.apache.org/jira/browse/HDFS-13976
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs, hdfs-client
>Reporter: Lukas Majercak
>Assignee: Lukas Majercak
>Priority: Major
> Fix For: 2.9.2
>
> Attachments: HDFS-12813.branch-2.001.patch, 
> HDFS-12813.branch-2.9.001.patch, TestRequestHedgingProxyProvider.png
>
>
> 2.9 also shows the issue from HDFS-12813:
> HDFS-11395 fixed the problem where the MultiException thrown by 
> RequestHedgingProxyProvider was hidden. However, when the target proxy size is 
> 1, unwrapping is not done for the InvocationTargetException. For a target 
> proxy size of 1, the unwrapping should be done to the first level, whereas for 
> multiple proxies it should be done at 2 levels.
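
The level-based unwrapping described above can be illustrated with a minimal
sketch. This is not the code from the HDFS-12813 patch: the class name
UnwrapSketch, the method unwrap() and the use of ExecutionException as the
outer wrapper are assumptions made only for illustration. The idea is that a
caller peels one wrapper when there is a single target proxy and two wrappers
when there are several.

```java
import java.lang.reflect.InvocationTargetException;
import java.util.concurrent.ExecutionException;

// Hypothetical helper, not part of Hadoop.
final class UnwrapSketch {

  /**
   * Peels up to `levels` wrapper exceptions off `t` and returns what is left.
   * Stops early if an exception has no cause, so it never returns null.
   */
  static Throwable unwrap(Throwable t, int levels) {
    Throwable cur = t;
    for (int i = 0; i < levels && cur.getCause() != null; i++) {
      cur = cur.getCause();
    }
    return cur;
  }

  public static void main(String[] args) {
    Exception root = new java.io.IOException("boom");
    // Single proxy: one wrapper around the real exception.
    Throwable single = new InvocationTargetException(root);
    // Multiple proxies (assumed shape): two wrappers around it.
    Throwable multi = new ExecutionException(single);
    System.out.println(unwrap(single, 1) == root);
    System.out.println(unwrap(multi, 2) == root);
  }
}
```

Unwrapping two levels in the single-proxy case would go one step too far
whenever the real exception itself carries a cause, which is why the number of
levels has to depend on the proxy count.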






[jira] [Commented] (HDFS-13976) Backport HDFS-12813 to branch-2.9

2018-10-10 Thread Lukas Majercak (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16645401#comment-16645401
 ] 

Lukas Majercak commented on HDFS-13976:
---

Ran the tests on branch-2.9 and it seems fine, TestRequestHedgingProxyProvider 
passes and everything else seems intact:
 !TestRequestHedgingProxyProvider.png! 

> Backport HDFS-12813 to branch-2.9
> -
>
> Key: HDFS-13976
> URL: https://issues.apache.org/jira/browse/HDFS-13976
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs, hdfs-client
>Reporter: Lukas Majercak
>Assignee: Lukas Majercak
>Priority: Major
> Fix For: 2.9.2
>
> Attachments: HDFS-12813.branch-2.001.patch, 
> HDFS-12813.branch-2.9.001.patch, TestRequestHedgingProxyProvider.png
>
>
> 2.9 also shows the issue from HDFS-12813:
> HDFS-11395 fixed the problem where the MultiException thrown by 
> RequestHedgingProxyProvider was hidden. However, when the target proxy size is 
> 1, unwrapping is not done for the InvocationTargetException. For a target 
> proxy size of 1, the unwrapping should be done to the first level, whereas for 
> multiple proxies it should be done at 2 levels.






[jira] [Updated] (HDFS-13976) Backport HDFS-12813 to branch-2.9

2018-10-08 Thread Lukas Majercak (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lukas Majercak updated HDFS-13976:
--
Attachment: HDFS-12813.branch-2.9.001.patch

> Backport HDFS-12813 to branch-2.9
> -
>
> Key: HDFS-13976
> URL: https://issues.apache.org/jira/browse/HDFS-13976
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs, hdfs-client
>Reporter: Lukas Majercak
>Assignee: Lukas Majercak
>Priority: Major
> Fix For: 2.9.2
>
> Attachments: HDFS-12813.branch-2.001.patch, 
> HDFS-12813.branch-2.9.001.patch
>
>
> 2.9 also shows the issue from HDFS-12813:
> HDFS-11395 fixed the problem where the MultiException thrown by 
> RequestHedgingProxyProvider was hidden. However, when the target proxy size is 
> 1, unwrapping is not done for the InvocationTargetException. For a target 
> proxy size of 1, the unwrapping should be done to the first level, whereas for 
> multiple proxies it should be done at 2 levels.






[jira] [Updated] (HDFS-13976) Backport HDFS-12813 to branch-2.9

2018-10-08 Thread Lukas Majercak (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lukas Majercak updated HDFS-13976:
--
Attachment: HDFS-12813.branch-2.001.patch

> Backport HDFS-12813 to branch-2.9
> -
>
> Key: HDFS-13976
> URL: https://issues.apache.org/jira/browse/HDFS-13976
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs, hdfs-client
>Reporter: Lukas Majercak
>Assignee: Lukas Majercak
>Priority: Major
> Fix For: 2.9.2
>
> Attachments: HDFS-12813.branch-2.001.patch
>
>
> 2.9 also shows the issue from HDFS-12813:
> HDFS-11395 fixed the problem where the MultiException thrown by 
> RequestHedgingProxyProvider was hidden. However, when the target proxy size is 
> 1, unwrapping is not done for the InvocationTargetException. For a target 
> proxy size of 1, the unwrapping should be done to the first level, whereas for 
> multiple proxies it should be done at 2 levels.






[jira] [Updated] (HDFS-13976) Backport HDFS-12813 to branch-2.9

2018-10-08 Thread Lukas Majercak (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lukas Majercak updated HDFS-13976:
--
Fix Version/s: 2.9.2

> Backport HDFS-12813 to branch-2.9
> -
>
> Key: HDFS-13976
> URL: https://issues.apache.org/jira/browse/HDFS-13976
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs, hdfs-client
>Reporter: Lukas Majercak
>Priority: Major
> Fix For: 2.9.2
>
>







[jira] [Created] (HDFS-13976) Backport HDFS-12813 to branch-2.9

2018-10-08 Thread Lukas Majercak (JIRA)
Lukas Majercak created HDFS-13976:
-

 Summary: Backport HDFS-12813 to branch-2.9
 Key: HDFS-13976
 URL: https://issues.apache.org/jira/browse/HDFS-13976
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: hdfs, hdfs-client
Reporter: Lukas Majercak









[jira] [Commented] (HDFS-13792) Fix FSN read/write lock metrics name

2018-08-06 Thread Lukas Majercak (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16570761#comment-16570761
 ] 

Lukas Majercak commented on HDFS-13792:
---

Thanks for this [~csun], patch001 LGTM.

> Fix FSN read/write lock metrics name
> 
>
> Key: HDFS-13792
> URL: https://issues.apache.org/jira/browse/HDFS-13792
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: documentation, metrics
>Reporter: Chao Sun
>Assignee: Chao Sun
>Priority: Trivial
> Attachments: HDFS-13792.000.patch, HDFS-13792.001.patch
>
>
> The metrics name for FSN read/write lock should be in the format:
> {code}
> FSN(Read|Write)Lock`*OperationName*`NanosNumOps
> {code}
> not
> {code}
> FSN(Read|Write)Lock`*OperationName*`NumOps
> {code}






[jira] [Commented] (HDFS-13757) After HDFS-12886, close() can throw AssertionError "Negative replicas!"

2018-07-20 Thread Lukas Majercak (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16551420#comment-16551420
 ] 

Lukas Majercak commented on HDFS-13757:
---

Surely the test should fail if you disable IBR?

> After HDFS-12886, close() can throw AssertionError "Negative replicas!"
> ---
>
> Key: HDFS-13757
> URL: https://issues.apache.org/jira/browse/HDFS-13757
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.1.0, 2.10.0, 2.9.1, 3.2.0, 3.0.3
>Reporter: Wei-Chiu Chuang
>Priority: Major
> Attachments: HDFS-13757.test.02.patch, HDFS-13757.test.patch
>
>
> While investigating a data corruption bug caused by concurrent recoverLease() 
> and close(), I found HDFS-12886 may cause close() to throw AssertionError 
> under a corner case, because the block has zero live replicas, and the client 
> calls recoverLease() immediately followed by close().
> {noformat}
> org.apache.hadoop.ipc.RemoteException(java.lang.AssertionError): Negative 
> replicas!
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.LowRedundancyBlocks.getPriority(LowRedundancyBlocks.java:197)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.LowRedundancyBlocks.update(LowRedundancyBlocks.java:422)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.updateNeededReconstructions(BlockManager.java:4274)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.commitOrCompleteLastBlock(BlockManager.java:1001)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.commitOrCompleteLastBlock(FSNamesystem.java:3471)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.completeFileInternal(FSDirWriteFileOp.java:713)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.completeFile(FSDirWriteFileOp.java:671)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.completeFile(FSNamesystem.java:2854)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.complete(NameNodeRpcServer.java:928)
> at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.complete(ClientNamenodeProtocolServerSideTranslatorPB.java:607)
> at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1689)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682)
> {noformat}
> I have a test case to reproduce it.
> [~lukmajercak] [~elgoiri] would you please take a look at it? I think we 
> should add a check to reject completeFile() if the block is under recovery, 
> similar to what's proposed in HDFS-10240.






[jira] [Commented] (HDFS-13757) After HDFS-12886, close() can throw AssertionError "Negative replicas!"

2018-07-20 Thread Lukas Majercak (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16551418#comment-16551418
 ] 

Lukas Majercak commented on HDFS-13757:
---

Hi [~jojochuang]. Unfortunately, I haven't been able to reproduce this on trunk 
(ran the test ~150 times). I could reproduce something on 2.9, but the 
exception was not the same and the failed run did not even touch the code added 
in HDFS-12886.

> After HDFS-12886, close() can throw AssertionError "Negative replicas!"
> ---
>
> Key: HDFS-13757
> URL: https://issues.apache.org/jira/browse/HDFS-13757
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.1.0, 2.10.0, 2.9.1, 3.2.0, 3.0.3
>Reporter: Wei-Chiu Chuang
>Priority: Major
> Attachments: HDFS-13757.test.02.patch, HDFS-13757.test.patch
>
>
> While investigating a data corruption bug caused by concurrent recoverLease() 
> and close(), I found HDFS-12886 may cause close() to throw AssertionError 
> under a corner case, because the block has zero live replicas, and the client 
> calls recoverLease() immediately followed by close().
> {noformat}
> org.apache.hadoop.ipc.RemoteException(java.lang.AssertionError): Negative 
> replicas!
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.LowRedundancyBlocks.getPriority(LowRedundancyBlocks.java:197)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.LowRedundancyBlocks.update(LowRedundancyBlocks.java:422)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.updateNeededReconstructions(BlockManager.java:4274)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.commitOrCompleteLastBlock(BlockManager.java:1001)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.commitOrCompleteLastBlock(FSNamesystem.java:3471)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.completeFileInternal(FSDirWriteFileOp.java:713)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.completeFile(FSDirWriteFileOp.java:671)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.completeFile(FSNamesystem.java:2854)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.complete(NameNodeRpcServer.java:928)
> at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.complete(ClientNamenodeProtocolServerSideTranslatorPB.java:607)
> at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1689)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682)
> {noformat}
> I have a test case to reproduce it.
> [~lukmajercak] [~elgoiri] would you please take a look at it? I think we 
> should add a check to reject completeFile() if the block is under recovery, 
> similar to what's proposed in HDFS-10240.






[jira] [Comment Edited] (HDFS-13714) Fix TestNameNodePrunesMissingStorages test failures on Windows

2018-07-02 Thread Lukas Majercak (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16530423#comment-16530423
 ] 

Lukas Majercak edited comment on HDFS-13714 at 7/2/18 8:44 PM:
---

The rename used is the standard Java File.renameTo(). From the docs:
{code:java}
Renames the file denoted by this abstract pathname.

Many aspects of the behavior of this method are inherently
platform-dependent: The rename operation might not be able to move a
file from one filesystem to another, it might not be atomic, and it
might not succeed if a file with the destination abstract pathname
already exists. The return value should always be checked to make sure
that the rename operation was successful.{code}
Seems like this fails on Windows. Proposing to change to delete() followed by 
renameTo().


> Fix TestNameNodePrunesMissingStorages test failures on Windows
> --
>
> Key: HDFS-13714
> URL: https://issues.apache.org/jira/browse/HDFS-13714
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs, namenode, test
>Affects Versions: 3.1.0, 2.9.1, 3.2.0
>Reporter: Lukas Majercak
>Assignee: Lukas Majercak
>Priority: Major
>  Labels: windows
> Attachments: HDFS-13714.000.patch
>
>
> Failed here:
> https://builds.apache.org/job/hadoop-trunk-win/508/testReport/org.apache.hadoop.hdfs.server.blockmanagement/TestNameNodePrunesMissingStorages/testRenamingStorageIds/






[jira] [Commented] (HDFS-13714) Fix TestNameNodePrunesMissingStorages test failures on Windows

2018-07-02 Thread Lukas Majercak (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16530423#comment-16530423
 ] 

Lukas Majercak commented on HDFS-13714:
---

The rename used is the standard Java File.renameTo(). From the docs:
{code:java}
Renames the file denoted by this abstract pathname.

Many aspects of the behavior of this method are inherently
platform-dependent: The rename operation might not be able to move a
file from one filesystem to another, it might not be atomic, and it
might not succeed if a file with the destination abstract pathname
already exists. The return value should always be checked to make sure
that the rename operation was successful.{code}
Seems like this fails on Windows. Proposing to change to delete() followed by 
renameTo().
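
A minimal sketch of the proposed delete() followed by renameTo() pattern,
assuming plain java.io.File handles. The class name ReplaceByRename and the
method replace() are hypothetical and are not taken from HDFS-13714.000.patch.
Both return values are checked, as the Javadoc quoted above recommends.

```java
import java.io.File;
import java.io.IOException;

// Hypothetical helper, not part of Hadoop.
final class ReplaceByRename {

  /**
   * Moves src over dst. Any existing dst is deleted first, so renameTo()
   * never has to overwrite a file. The two steps are not atomic.
   */
  static void replace(File src, File dst) throws IOException {
    if (dst.exists() && !dst.delete()) {
      throw new IOException("Could not delete " + dst);
    }
    if (!src.renameTo(dst)) {
      throw new IOException("Could not rename " + src + " to " + dst);
    }
  }
}
```

Where the NIO API is an option, java.nio.file.Files.move() with
StandardCopyOption.REPLACE_EXISTING is an alternative that reports failures
as exceptions rather than as a boolean return value.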


> Fix TestNameNodePrunesMissingStorages test failures on Windows
> --
>
> Key: HDFS-13714
> URL: https://issues.apache.org/jira/browse/HDFS-13714
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs, namenode, test
>Affects Versions: 3.1.0, 2.9.1, 3.2.0
>Reporter: Lukas Majercak
>Assignee: Lukas Majercak
>Priority: Major
>  Labels: windows
> Attachments: HDFS-13714.000.patch
>
>
> Failed here:
> https://builds.apache.org/job/hadoop-trunk-win/508/testReport/org.apache.hadoop.hdfs.server.blockmanagement/TestNameNodePrunesMissingStorages/testRenamingStorageIds/






[jira] [Updated] (HDFS-13714) Fix TestNameNodePrunesMissingStorages test failures on Windows

2018-07-02 Thread Lukas Majercak (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lukas Majercak updated HDFS-13714:
--
Affects Version/s: 3.2.0

> Fix TestNameNodePrunesMissingStorages test failures on Windows
> --
>
> Key: HDFS-13714
> URL: https://issues.apache.org/jira/browse/HDFS-13714
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs, namenode, test
>Affects Versions: 3.1.0, 2.9.1, 3.2.0
>Reporter: Lukas Majercak
>Assignee: Lukas Majercak
>Priority: Major
>  Labels: windows
> Attachments: HDFS-13714.000.patch
>
>
> Failed here:
> https://builds.apache.org/job/hadoop-trunk-win/508/testReport/org.apache.hadoop.hdfs.server.blockmanagement/TestNameNodePrunesMissingStorages/testRenamingStorageIds/






[jira] [Work started] (HDFS-13714) Fix TestNameNodePrunesMissingStorages test failures on Windows

2018-07-02 Thread Lukas Majercak (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HDFS-13714 started by Lukas Majercak.
-
> Fix TestNameNodePrunesMissingStorages test failures on Windows
> --
>
> Key: HDFS-13714
> URL: https://issues.apache.org/jira/browse/HDFS-13714
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs, namenode, test
>Affects Versions: 3.1.0, 2.9.1
>Reporter: Lukas Majercak
>Assignee: Lukas Majercak
>Priority: Major
>  Labels: windows
> Attachments: HDFS-13714.000.patch
>
>
> Failed here:
> https://builds.apache.org/job/hadoop-trunk-win/508/testReport/org.apache.hadoop.hdfs.server.blockmanagement/TestNameNodePrunesMissingStorages/testRenamingStorageIds/






[jira] [Updated] (HDFS-13714) Fix TestNameNodePrunesMissingStorages test failures on Windows

2018-07-02 Thread Lukas Majercak (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lukas Majercak updated HDFS-13714:
--
Attachment: HDFS-13714.000.patch

> Fix TestNameNodePrunesMissingStorages test failures on Windows
> --
>
> Key: HDFS-13714
> URL: https://issues.apache.org/jira/browse/HDFS-13714
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs, namenode, test
>Affects Versions: 3.1.0, 2.9.1
>Reporter: Lukas Majercak
>Assignee: Lukas Majercak
>Priority: Major
>  Labels: windows
> Attachments: HDFS-13714.000.patch
>
>
> Failed here:
> https://builds.apache.org/job/hadoop-trunk-win/508/testReport/org.apache.hadoop.hdfs.server.blockmanagement/TestNameNodePrunesMissingStorages/testRenamingStorageIds/





