[jira] [Commented] (HADOOP-17905) Modify Text.ensureCapacity() to efficiently max out the backing array size

2021-09-11, Peter Bacsko (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-17905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17413527#comment-17413527
 ] 

Peter Bacsko commented on HADOOP-17905:
---

[~elgoiri] PR is available for review.

cc [~snemeth].

> Modify Text.ensureCapacity() to efficiently max out the backing array size
> --
>
> Key: HADOOP-17905
> URL: https://issues.apache.org/jira/browse/HADOOP-17905
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> This is a continuation of HADOOP-17901.
> Right now we use a factor of 1.5x to increase the byte array if it's full. 
> However, if the size reaches a certain point, the increment is only (current 
> size + length). This can cause performance issues if the textual data which 
> we intend to store is beyond this point.
> Instead, let's max out the array to the maximum. Based on different sources, 
> a safe choice seems to be Integer.MAX_VALUE - 8 (see ArrayList, 
> AbstractCollection, Hashtable, etc.).






[jira] [Updated] (HADOOP-17905) Modify Text.ensureCapacity() to efficiently max out the backing array size

2021-09-09, Peter Bacsko (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Bacsko updated HADOOP-17905:
--
Description: 
This is a continuation of HADOOP-17901.

Right now we use a factor of 1.5x to increase the byte array if it's full. 
However, if the size reaches a certain point, the increment is only (current 
size + length). This can cause performance issues if the textual data which we 
intend to store is beyond this point.

Instead, let's max out the array to the maximum. Based on different sources, a 
safe choice seems to be Integer.MAX_VALUE - 8 (see ArrayList, 
AbstractCollection, Hashtable, etc.).

  was:
This is a continuation of HADOOP-17901.

Right now we use a factor of 1.5x to increase the byte array if it's full. 
However, if the size reaches a certain point, the increment is only (current 
size + length). This can cause performance issues if the textual data which we 
intend to store is beyond this point.

Instead, let's max out the array to the maximum. Based on different sources, 
this is usually determined to be Integer.MAX_VALUE - 8 (see ArrayList, 
AbstractCollection, Hashtable, etc.).


> Modify Text.ensureCapacity() to efficiently max out the backing array size
> --
>
> Key: HADOOP-17905
> URL: https://issues.apache.org/jira/browse/HADOOP-17905
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
>Priority: Major
>
> This is a continuation of HADOOP-17901.
> Right now we use a factor of 1.5x to increase the byte array if it's full. 
> However, if the size reaches a certain point, the increment is only (current 
> size + length). This can cause performance issues if the textual data which 
> we intend to store is beyond this point.
> Instead, let's max out the array to the maximum. Based on different sources, 
> a safe choice seems to be Integer.MAX_VALUE - 8 (see ArrayList, 
> AbstractCollection, Hashtable, etc.).






[jira] [Created] (HADOOP-17905) Modify Text.ensureCapacity() to efficiently max out the backing array size

2021-09-09, Peter Bacsko (Jira)
Peter Bacsko created HADOOP-17905:
-

 Summary: Modify Text.ensureCapacity() to efficiently max out the 
backing array size
 Key: HADOOP-17905
 URL: https://issues.apache.org/jira/browse/HADOOP-17905
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Peter Bacsko
Assignee: Peter Bacsko


This is a continuation of HADOOP-17901.

Right now we use a factor of 1.5x to increase the byte array if it's full. 
However, if the size reaches a certain point, the increment is only (current 
size + length). This can cause performance issues if the textual data which we 
intend to store is beyond this point.

Instead, let's max out the array to the maximum. Based on different sources, 
this is usually determined to be Integer.MAX_VALUE - 8 (see ArrayList, 
AbstractCollection, Hashtable, etc.).
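
As a rough illustration of the intended policy, here is a minimal sketch (the 
class and field names are invented for the example; this is not the actual 
Text.java code):

{code}
import java.util.Arrays;

// Sketch of the proposed growth policy; names are invented for illustration.
class GrowthSketch {
  // Largest array size that is safe on common JVMs (see ArrayList et al.).
  private static final int MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8;
  private byte[] bytes = new byte[16];

  void ensureCapacity(int minCapacity) {
    if (bytes.length >= minCapacity) {
      return;
    }
    // Grow by 1.5x, computed as long so it cannot overflow near the limit.
    long target = (long) bytes.length + (bytes.length >> 1);
    if (target < minCapacity) {
      target = minCapacity;
    }
    // Instead of creeping towards the limit in small steps, jump straight
    // to the maximum safe size once 1.5x would overshoot it. A real
    // implementation should fail fast if minCapacity itself exceeds it.
    if (target > MAX_ARRAY_SIZE) {
      target = MAX_ARRAY_SIZE;
    }
    bytes = Arrays.copyOf(bytes, (int) target);
  }
}
{code}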






[jira] [Commented] (HADOOP-17901) Performance degradation in Text.append() after HADOOP-16951

2021-09-09, Peter Bacsko (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-17901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17412483#comment-17412483
 ] 

Peter Bacsko commented on HADOOP-17901:
---

Thanks [~elgoiri]. I was also thinking about the possible expansion of the 
array to the max size (which is often assumed to be Integer.MAX_VALUE - 8), but 
I think I'll do that in a different JIRA.

> Performance degradation in Text.append() after HADOOP-16951
> ---
>
> Key: HADOOP-17901
> URL: https://issues.apache.org/jira/browse/HADOOP-17901
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: common
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
>Priority: Critical
>  Labels: pull-request-available
> Attachments: HADOOP-17901-001.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> We discovered a serious performance degradation in {{Text.append()}}.
> The problem is that the logic which intends to increase the size of the 
> backing array does not work as intended.
> It's very difficult to spot, so I added extra logs to see what happens.
> Let's add 4096 bytes of textual data in a loop:
> {noformat}
>   public static void main(String[] args) {
> Text text = new Text();
> String toAppend = RandomStringUtils.randomAscii(4096);
> for(int i = 0; i < 100; i++) {
>   text.append(toAppend.getBytes(), 0, 4096);
> }
>   }
> {noformat}
> With some debug printouts, we can observe:
> {noformat}
> 2021-09-08 13:35:29,528 INFO  [main] io.Text (Text.java:append(251)) - 
> length: 24576,  len: 4096, utf8ArraySize: 4096, bytes.length: 30720
> 2021-09-08 13:35:29,528 INFO  [main] io.Text (Text.java:append(253)) - length 
> + (length >> 1): 36864
> 2021-09-08 13:35:29,528 INFO  [main] io.Text (Text.java:append(254)) - length 
> + len: 28672
> 2021-09-08 13:35:29,528 INFO  [main] io.Text (Text.java:ensureCapacity(287)) 
> - >>> enhancing capacity from 30720 to 36864
> 2021-09-08 13:35:29,528 INFO  [main] io.Text (Text.java:append(251)) - 
> length: 28672,  len: 4096, utf8ArraySize: 4096, bytes.length: 36864
> 2021-09-08 13:35:29,528 INFO  [main] io.Text (Text.java:append(253)) - length 
> + (length >> 1): 43008
> 2021-09-08 13:35:29,529 INFO  [main] io.Text (Text.java:append(254)) - length 
> + len: 32768
> 2021-09-08 13:35:29,529 INFO  [main] io.Text (Text.java:ensureCapacity(287)) 
> - >>> enhancing capacity from 36864 to 43008
> 2021-09-08 13:35:29,529 INFO  [main] io.Text (Text.java:append(251)) - 
> length: 32768,  len: 4096, utf8ArraySize: 4096, bytes.length: 43008
> 2021-09-08 13:35:29,529 INFO  [main] io.Text (Text.java:append(253)) - length 
> + (length >> 1): 49152
> 2021-09-08 13:35:29,529 INFO  [main] io.Text (Text.java:append(254)) - length 
> + len: 36864
> 2021-09-08 13:35:29,529 INFO  [main] io.Text (Text.java:ensureCapacity(287)) 
> - >>> enhancing capacity from 43008 to 49152
> ...
> {noformat}
> After a certain number of {{append()}} calls, subsequent capacity increments 
> are small.
> It's because the difference between two {{length + (length >> 1)}} values is 
> always 6144 bytes. Because the size of the backing array is trailing behind 
> the calculated value, the increment will also be 6144 bytes. This means that 
> new arrays are constantly created.
> Suggested solution: don't calculate the capacity in advance based on length. 
> Instead, pass the required minimum to {{ensureCapacity()}}. Then the 
> increment should depend on the actual size of the byte array if the desired 
> capacity is larger.






[jira] [Commented] (HADOOP-17901) Performance degradation in Text.append() after HADOOP-16951

2021-09-08, Peter Bacsko (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-17901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17411883#comment-17411883
 ] 

Peter Bacsko commented on HADOOP-17901:
---

cc [~belugabehr] [~elgoiri] you guys worked on the related ticket, could you 
review this?

> Performance degradation in Text.append() after HADOOP-16951
> ---
>
> Key: HADOOP-17901
> URL: https://issues.apache.org/jira/browse/HADOOP-17901
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: common
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
>Priority: Critical
> Attachments: HADOOP-17901-001.patch
>
>
> We discovered a serious performance degradation in {{Text.append()}}.
> The problem is that the logic which intends to increase the size of the 
> backing array does not work as intended.
> It's very difficult to spot, so I added extra logs to see what happens.
> Let's add 4096 bytes of textual data in a loop:
> {noformat}
>   public static void main(String[] args) {
> Text text = new Text();
> String toAppend = RandomStringUtils.randomAscii(4096);
> for(int i = 0; i < 100; i++) {
>   text.append(toAppend.getBytes(), 0, 4096);
> }
>   }
> {noformat}
> With some debug printouts, we can observe:
> {noformat}
> 2021-09-08 13:35:29,528 INFO  [main] io.Text (Text.java:append(251)) - 
> length: 24576,  len: 4096, utf8ArraySize: 4096, bytes.length: 30720
> 2021-09-08 13:35:29,528 INFO  [main] io.Text (Text.java:append(253)) - length 
> + (length >> 1): 36864
> 2021-09-08 13:35:29,528 INFO  [main] io.Text (Text.java:append(254)) - length 
> + len: 28672
> 2021-09-08 13:35:29,528 INFO  [main] io.Text (Text.java:ensureCapacity(287)) 
> - >>> enhancing capacity from 30720 to 36864
> 2021-09-08 13:35:29,528 INFO  [main] io.Text (Text.java:append(251)) - 
> length: 28672,  len: 4096, utf8ArraySize: 4096, bytes.length: 36864
> 2021-09-08 13:35:29,528 INFO  [main] io.Text (Text.java:append(253)) - length 
> + (length >> 1): 43008
> 2021-09-08 13:35:29,529 INFO  [main] io.Text (Text.java:append(254)) - length 
> + len: 32768
> 2021-09-08 13:35:29,529 INFO  [main] io.Text (Text.java:ensureCapacity(287)) 
> - >>> enhancing capacity from 36864 to 43008
> 2021-09-08 13:35:29,529 INFO  [main] io.Text (Text.java:append(251)) - 
> length: 32768,  len: 4096, utf8ArraySize: 4096, bytes.length: 43008
> 2021-09-08 13:35:29,529 INFO  [main] io.Text (Text.java:append(253)) - length 
> + (length >> 1): 49152
> 2021-09-08 13:35:29,529 INFO  [main] io.Text (Text.java:append(254)) - length 
> + len: 36864
> 2021-09-08 13:35:29,529 INFO  [main] io.Text (Text.java:ensureCapacity(287)) 
> - >>> enhancing capacity from 43008 to 49152
> ...
> {noformat}
> After a certain number of {{append()}} calls, subsequent capacity increments 
> are small.
> It's because the difference between two {{length + (length >> 1)}} values is 
> always 6144 bytes. Because the size of the backing array is trailing behind 
> the calculated value, the increment will also be 6144 bytes. This means that 
> new arrays are constantly created.
> Suggested solution: don't calculate the capacity in advance based on length. 
> Instead, pass the required minimum to {{ensureCapacity()}}. Then the 
> increment should depend on the actual size of the byte array if the desired 
> capacity is larger.






[jira] [Updated] (HADOOP-17901) Performance degradation in Text.append() after HADOOP-16951

2021-09-08, Peter Bacsko (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Bacsko updated HADOOP-17901:
--
Status: Patch Available  (was: Open)

> Performance degradation in Text.append() after HADOOP-16951
> ---
>
> Key: HADOOP-17901
> URL: https://issues.apache.org/jira/browse/HADOOP-17901
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: common
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
>Priority: Critical
> Attachments: HADOOP-17901-001.patch
>
>
> We discovered a serious performance degradation in {{Text.append()}}.
> The problem is that the logic which intends to increase the size of the 
> backing array does not work as intended.
> It's very difficult to spot, so I added extra logs to see what happens.
> Let's add 4096 bytes of textual data in a loop:
> {noformat}
>   public static void main(String[] args) {
> Text text = new Text();
> String toAppend = RandomStringUtils.randomAscii(4096);
> for(int i = 0; i < 100; i++) {
>   text.append(toAppend.getBytes(), 0, 4096);
> }
>   }
> {noformat}
> With some debug printouts, we can observe:
> {noformat}
> 2021-09-08 13:35:29,528 INFO  [main] io.Text (Text.java:append(251)) - 
> length: 24576,  len: 4096, utf8ArraySize: 4096, bytes.length: 30720
> 2021-09-08 13:35:29,528 INFO  [main] io.Text (Text.java:append(253)) - length 
> + (length >> 1): 36864
> 2021-09-08 13:35:29,528 INFO  [main] io.Text (Text.java:append(254)) - length 
> + len: 28672
> 2021-09-08 13:35:29,528 INFO  [main] io.Text (Text.java:ensureCapacity(287)) 
> - >>> enhancing capacity from 30720 to 36864
> 2021-09-08 13:35:29,528 INFO  [main] io.Text (Text.java:append(251)) - 
> length: 28672,  len: 4096, utf8ArraySize: 4096, bytes.length: 36864
> 2021-09-08 13:35:29,528 INFO  [main] io.Text (Text.java:append(253)) - length 
> + (length >> 1): 43008
> 2021-09-08 13:35:29,529 INFO  [main] io.Text (Text.java:append(254)) - length 
> + len: 32768
> 2021-09-08 13:35:29,529 INFO  [main] io.Text (Text.java:ensureCapacity(287)) 
> - >>> enhancing capacity from 36864 to 43008
> 2021-09-08 13:35:29,529 INFO  [main] io.Text (Text.java:append(251)) - 
> length: 32768,  len: 4096, utf8ArraySize: 4096, bytes.length: 43008
> 2021-09-08 13:35:29,529 INFO  [main] io.Text (Text.java:append(253)) - length 
> + (length >> 1): 49152
> 2021-09-08 13:35:29,529 INFO  [main] io.Text (Text.java:append(254)) - length 
> + len: 36864
> 2021-09-08 13:35:29,529 INFO  [main] io.Text (Text.java:ensureCapacity(287)) 
> - >>> enhancing capacity from 43008 to 49152
> ...
> {noformat}
> After a certain number of {{append()}} calls, subsequent capacity increments 
> are small.
> It's because the difference between two {{length + (length >> 1)}} values is 
> always 6144 bytes. Because the size of the backing array is trailing behind 
> the calculated value, the increment will also be 6144 bytes. This means that 
> new arrays are constantly created.
> Suggested solution: don't calculate the capacity in advance based on length. 
> Instead, pass the required minimum to {{ensureCapacity()}}. Then the 
> increment should depend on the actual size of the byte array if the desired 
> capacity is larger.






[jira] [Updated] (HADOOP-17901) Performance degradation in Text.append() after HADOOP-16951

2021-09-08, Peter Bacsko (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Bacsko updated HADOOP-17901:
--
Attachment: HADOOP-17901-001.patch

> Performance degradation in Text.append() after HADOOP-16951
> ---
>
> Key: HADOOP-17901
> URL: https://issues.apache.org/jira/browse/HADOOP-17901
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: common
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
>Priority: Critical
> Attachments: HADOOP-17901-001.patch
>
>
> We discovered a serious performance degradation in {{Text.append()}}.
> The problem is that the logic which intends to increase the size of the 
> backing array does not work as intended.
> It's very difficult to spot, so I added extra logs to see what happens.
> Let's add 4096 bytes of textual data in a loop:
> {noformat}
>   public static void main(String[] args) {
> Text text = new Text();
> String toAppend = RandomStringUtils.randomAscii(4096);
> for(int i = 0; i < 100; i++) {
>   text.append(toAppend.getBytes(), 0, 4096);
> }
>   }
> {noformat}
> With some debug printouts, we can observe:
> {noformat}
> 2021-09-08 13:35:29,528 INFO  [main] io.Text (Text.java:append(251)) - 
> length: 24576,  len: 4096, utf8ArraySize: 4096, bytes.length: 30720
> 2021-09-08 13:35:29,528 INFO  [main] io.Text (Text.java:append(253)) - length 
> + (length >> 1): 36864
> 2021-09-08 13:35:29,528 INFO  [main] io.Text (Text.java:append(254)) - length 
> + len: 28672
> 2021-09-08 13:35:29,528 INFO  [main] io.Text (Text.java:ensureCapacity(287)) 
> - >>> enhancing capacity from 30720 to 36864
> 2021-09-08 13:35:29,528 INFO  [main] io.Text (Text.java:append(251)) - 
> length: 28672,  len: 4096, utf8ArraySize: 4096, bytes.length: 36864
> 2021-09-08 13:35:29,528 INFO  [main] io.Text (Text.java:append(253)) - length 
> + (length >> 1): 43008
> 2021-09-08 13:35:29,529 INFO  [main] io.Text (Text.java:append(254)) - length 
> + len: 32768
> 2021-09-08 13:35:29,529 INFO  [main] io.Text (Text.java:ensureCapacity(287)) 
> - >>> enhancing capacity from 36864 to 43008
> 2021-09-08 13:35:29,529 INFO  [main] io.Text (Text.java:append(251)) - 
> length: 32768,  len: 4096, utf8ArraySize: 4096, bytes.length: 43008
> 2021-09-08 13:35:29,529 INFO  [main] io.Text (Text.java:append(253)) - length 
> + (length >> 1): 49152
> 2021-09-08 13:35:29,529 INFO  [main] io.Text (Text.java:append(254)) - length 
> + len: 36864
> 2021-09-08 13:35:29,529 INFO  [main] io.Text (Text.java:ensureCapacity(287)) 
> - >>> enhancing capacity from 43008 to 49152
> ...
> {noformat}
> After a certain number of {{append()}} calls, subsequent capacity increments 
> are small.
> It's because the difference between two {{length + (length >> 1)}} values is 
> always 6144 bytes. Because the size of the backing array is trailing behind 
> the calculated value, the increment will also be 6144 bytes. This means that 
> new arrays are constantly created.
> Suggested solution: don't calculate the capacity in advance based on length. 
> Instead, pass the required minimum to {{ensureCapacity()}}. Then the 
> increment should depend on the actual size of the byte array if the desired 
> capacity is larger.






[jira] [Created] (HADOOP-17901) Performance degradation in Text.append() after HADOOP-16951

2021-09-08, Peter Bacsko (Jira)
Peter Bacsko created HADOOP-17901:
-

 Summary: Performance degradation in Text.append() after 
HADOOP-16951
 Key: HADOOP-17901
 URL: https://issues.apache.org/jira/browse/HADOOP-17901
 Project: Hadoop Common
  Issue Type: Bug
  Components: common
Reporter: Peter Bacsko
Assignee: Peter Bacsko


We discovered a serious performance degradation in {{Text.append()}}.

The problem is that the logic that grows the backing array does not work as 
intended.
It's very difficult to spot, so I added extra logs to see what happens.

Let's add 4096 bytes of textual data in a loop:
{noformat}
  public static void main(String[] args) {
Text text = new Text();
String toAppend = RandomStringUtils.randomAscii(4096);

for(int i = 0; i < 100; i++) {
  text.append(toAppend.getBytes(), 0, 4096);
}
  }
{noformat}

With some debug printouts, we can observe:
{noformat}
2021-09-08 13:35:29,528 INFO  [main] io.Text (Text.java:append(251)) - length: 
24576,  len: 4096, utf8ArraySize: 4096, bytes.length: 30720
2021-09-08 13:35:29,528 INFO  [main] io.Text (Text.java:append(253)) - length + 
(length >> 1): 36864
2021-09-08 13:35:29,528 INFO  [main] io.Text (Text.java:append(254)) - length + 
len: 28672
2021-09-08 13:35:29,528 INFO  [main] io.Text (Text.java:ensureCapacity(287)) - 
>>> enhancing capacity from 30720 to 36864
2021-09-08 13:35:29,528 INFO  [main] io.Text (Text.java:append(251)) - length: 
28672,  len: 4096, utf8ArraySize: 4096, bytes.length: 36864
2021-09-08 13:35:29,528 INFO  [main] io.Text (Text.java:append(253)) - length + 
(length >> 1): 43008
2021-09-08 13:35:29,529 INFO  [main] io.Text (Text.java:append(254)) - length + 
len: 32768
2021-09-08 13:35:29,529 INFO  [main] io.Text (Text.java:ensureCapacity(287)) - 
>>> enhancing capacity from 36864 to 43008
2021-09-08 13:35:29,529 INFO  [main] io.Text (Text.java:append(251)) - length: 
32768,  len: 4096, utf8ArraySize: 4096, bytes.length: 43008
2021-09-08 13:35:29,529 INFO  [main] io.Text (Text.java:append(253)) - length + 
(length >> 1): 49152
2021-09-08 13:35:29,529 INFO  [main] io.Text (Text.java:append(254)) - length + 
len: 36864
2021-09-08 13:35:29,529 INFO  [main] io.Text (Text.java:ensureCapacity(287)) - 
>>> enhancing capacity from 43008 to 49152
...
{noformat}

After a certain number of {{append()}} calls, subsequent capacity increments 
are small.

This is because the difference between two consecutive {{length + (length >> 1)}} 
values is always 6144 bytes: {{length}} grows by only 4096 per call, so 1.5x of 
it grows by 1.5 * 4096 = 6144. Since the size of the backing array trails behind 
the calculated value, each increment is also just 6144 bytes, which means a new 
array is allocated on practically every call.

Suggested solution: don't calculate the capacity in advance based on length. 
Instead, pass the required minimum to {{ensureCapacity()}}. Then the increment 
should depend on the actual size of the byte array if the desired capacity is 
larger.
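
As a sketch, the suggested shape could look like this (illustrative names, 
assuming {{bytes}} and {{length}} fields similar to those in Text; not the 
final patch):

{code}
import java.util.Arrays;

// Sketch of the suggested fix; not the code of the actual patch.
class AppendSketch {
  private byte[] bytes = new byte[16];
  private int length;

  void append(byte[] utf8, int start, int len) {
    // Pass only the required minimum; let ensureCapacity pick the growth.
    ensureCapacity(length + len);
    System.arraycopy(utf8, start, bytes, length, len);
    length += len;
  }

  private void ensureCapacity(int capacity) {
    if (bytes.length < capacity) {
      // Grow relative to the actual backing array, not the logical length,
      // so every reallocation gains at least 50% of the current array size.
      int newSize = Math.max(capacity, bytes.length + (bytes.length >> 1));
      bytes = Arrays.copyOf(bytes, newSize);
    }
  }
}
{code}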






[jira] [Commented] (HADOOP-17573) Fix compilation error of OBSFileSystem in trunk

2021-03-10, Peter Bacsko (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-17573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17298800#comment-17298800
 ] 

Peter Bacsko commented on HADOOP-17573:
---

I've just seen this problem. The PR fixed the build locally.

+1 from me.

> Fix compilation error of OBSFileSystem in trunk
> ---
>
> Key: HADOOP-17573
> URL: https://issues.apache.org/jira/browse/HADOOP-17573
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Masatake Iwasaki
>Assignee: Masatake Iwasaki
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> {noformat}
> [ERROR] Failed to execute goal 
> org.apache.maven.plugins:maven-compiler-plugin:3.1:compile (default-compile) 
> on project hadoop-huaweicloud: Compilation failure
> [ERROR] 
> /home/centos/srcs/hadoop/hadoop-cloud-storage-project/hadoop-huaweicloud/src/main/java/org/apache/hadoop/fs/obs/OBSFileSystem.java:[396,58]
>  incompatible types: org.apache.hadoop.util.BlockingThreadPoolExecutorService 
> cannot be converted to 
> org.apache.hadoop.thirdparty.com.google.common.util.concurrent.ListeningExecutorService
> {noformat}






[jira] [Commented] (HADOOP-17324) Don't relocate org.bouncycastle in shaded client jars

2020-11-11, Peter Bacsko (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-17324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17230071#comment-17230071
 ] 

Peter Bacsko commented on HADOOP-17324:
---

[~csun] I think the commit 
https://github.com/apache/hadoop/commit/2522bf2f9b0c720eab099fef27bd3d22460ad5d0
 introduced a compilation problem:

{noformat}
[INFO] Apache Hadoop Client Aggregator  SKIPPED
[INFO] Apache Hadoop Client API ... SKIPPED
[INFO] Apache Hadoop Client Runtime ... SKIPPED
[INFO] Apache Hadoop Client Test Minicluster .. SKIPPED
[INFO] Apache Hadoop Client Packaging Invariants .. SKIPPED
[INFO] Apache Hadoop Client Packaging Invariants for Test . SKIPPED
[INFO] Apache Hadoop Client Packaging Integration Tests ... FAILURE [  2.050 s]
[INFO] Apache Hadoop Client Modules 3.4.0-SNAPSHOT  SUCCESS [  1.455 s]
[INFO] 
[INFO] BUILD FAILURE
[INFO] 
[INFO] Total time: 3.012 s (Wall Clock)
[INFO] Finished at: 2020-11-11T17:29:17+01:00
[INFO] 
[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-compiler-plugin:3.1:testCompile 
(default-testCompile) on project hadoop-client-integration-tests: Compilation 
failure: Compilation failure: 
[ERROR] 
/home/bacskop/repos/hadoop/hadoop-client-modules/hadoop-client-integration-tests/src/test/java/org/apache/hadoop/example/ITUseMiniCluster.java:[47,37]
 package org.apache.hadoop.yarn.server does not exist
[ERROR] 
/home/bacskop/repos/hadoop/hadoop-client-modules/hadoop-client-integration-tests/src/test/java/org/apache/hadoop/example/ITUseMiniCluster.java:[59,11]
 cannot find symbol
[ERROR]   symbol:   class MiniYARNCluster
[ERROR]   location: class org.apache.hadoop.example.ITUseMiniCluster
[ERROR] 
/home/bacskop/repos/hadoop/hadoop-client-modules/hadoop-client-integration-tests/src/test/java/org/apache/hadoop/example/ITUseMiniCluster.java:[82,23]
 cannot find symbol
[ERROR]   symbol:   class MiniYARNCluster
[ERROR]   location: class org.apache.hadoop.example.ITUseMiniCluster
[ERROR] -> [Help 1]
[ERROR] 
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e 
switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR] 
[ERROR] For more information about the errors and possible solutions, please 
read the following articles:
[ERROR] [Help 1] 
http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
[ERROR] 
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR]   mvn  -rf :hadoop-client-integration-tests
{noformat}

Could you please investigate this?

> Don't relocate org.bouncycastle in shaded client jars
> -
>
> Key: HADOOP-17324
> URL: https://issues.apache.org/jira/browse/HADOOP-17324
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 3.3.0
>Reporter: Chao Sun
>Assignee: Chao Sun
>Priority: Critical
>  Labels: pull-request-available
>  Time Spent: 4h
>  Remaining Estimate: 0h
>
> When downstream apps depend on {{hadoop-client-api}}, 
> {{hadoop-client-runtime}} and {{hadoop-client-minicluster}}, it seems the 
> {{MiniYARNCluster}} could have issue because 
> {{org.apache.hadoop.shaded.org.bouncycastle.operator.OperatorCreationException}}
>  is not in any of the above jars. 
> {code}
> Error:  Caused by: sbt.ForkMain$ForkError: java.lang.ClassNotFoundException: 
> org.apache.hadoop.shaded.org.bouncycastle.operator.OperatorCreationException
> Error:at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
> Error:at java.lang.ClassLoader.loadClass(ClassLoader.java:419)
> Error:at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:352)
> Error:at java.lang.ClassLoader.loadClass(ClassLoader.java:352)
> Error:at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceInit(ResourceManager.java:862)
> Error:at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
> Error:at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.createAndInitActiveServices(ResourceManager.java:1296)
> Error:at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceInit(ResourceManager.java:339)
> Error:at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
> Error:at 
> org.apache.hadoop.yarn.server.MiniYARNCluster.initResourceManager(MiniYARNCluster.java:353)
> Error:at 
> org.apache.hadoop.yarn.server.MiniYARNCluster.access$200(MiniYARNCluster.java:127)
> 

[jira] [Commented] (HADOOP-16683) Disable retry of FailoverOnNetworkExceptionRetry in case of wrapped AccessControlException

2019-11-15, Peter Bacsko (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-16683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16975157#comment-16975157
 ] 

Peter Bacsko commented on HADOOP-16683:
---

[~adam.antal] we don't have build results for branch-3.2. The trick is to 
upload a patch, wait until the build starts, then upload the next one.

> Disable retry of FailoverOnNetworkExceptionRetry in case of wrapped 
> AccessControlException
> --
>
> Key: HADOOP-16683
> URL: https://issues.apache.org/jira/browse/HADOOP-16683
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: common
>Affects Versions: 3.3.0
>Reporter: Adam Antal
>Assignee: Adam Antal
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: HADOOP-16683.001.patch, HADOOP-16683.002.patch, 
> HADOOP-16683.003.patch, HADOOP-16683.branch-3.1.001.patch, 
> HADOOP-16683.branch-3.2.001.patch
>
>
> Follow up patch on HADOOP-16580.
> We successfully disabled the retry in case of an AccessControlException which 
> has resolved some of the cases, but in other cases AccessControlException is 
> wrapped inside another IOException and you can only get the original 
> exception by calling getCause().
> Let's add this extra case as well.






[jira] [Commented] (HADOOP-16683) Disable retry of FailoverOnNetworkExceptionRetry in case of wrapped AccessControlException

2019-11-06, Peter Bacsko (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-16683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16968412#comment-16968412
 ] 

Peter Bacsko commented on HADOOP-16683:
---

+1 (non-binding)

> Disable retry of FailoverOnNetworkExceptionRetry in case of wrapped 
> AccessControlException
> --
>
> Key: HADOOP-16683
> URL: https://issues.apache.org/jira/browse/HADOOP-16683
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: common
>Affects Versions: 3.3.0
>Reporter: Adam Antal
>Assignee: Adam Antal
>Priority: Major
> Attachments: HADOOP-16683.001.patch, HADOOP-16683.002.patch, 
> HADOOP-16683.003.patch
>
>
> Follow up patch on HADOOP-16580.
> We successfully disabled the retry in case of an AccessControlException which 
> has resolved some of the cases, but in other cases AccessControlException is 
> wrapped inside another IOException and you can only get the original 
> exception by calling getCause().
> Let's add this extra case as well.






[jira] [Commented] (HADOOP-16683) Disable retry of FailoverOnNetworkExceptionRetry in case of wrapped AccessControlException

2019-11-06, Peter Bacsko (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-16683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16968247#comment-16968247
 ] 

Peter Bacsko commented on HADOOP-16683:
---

Just a question: do we know for sure that {{AccessControlException}} can only 
be wrapped inside an {{IOException}}? Perhaps we should guard ourselves more 
aggressively and examine the entire exception chain (I believe this is what I 
did with {{SaslException}}).

> Disable retry of FailoverOnNetworkExceptionRetry in case of wrapped 
> AccessControlException
> --
>
> Key: HADOOP-16683
> URL: https://issues.apache.org/jira/browse/HADOOP-16683
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: common
>Affects Versions: 3.3.0
>Reporter: Adam Antal
>Assignee: Adam Antal
>Priority: Major
> Attachments: HADOOP-16683.001.patch, HADOOP-16683.002.patch
>
>
> Follow up patch on HADOOP-16580.
> We successfully disabled the retry in case of an AccessControlException which 
> has resolved some of the cases, but in other cases AccessControlException is 
> wrapped inside another IOException and you can only get the original 
> exception by calling getCause().
> Let's add this extra case as well.






[jira] [Commented] (HADOOP-16683) Disable retry of FailoverOnNetworkExceptionRetry in case of wrapped AccessControlException

2019-11-05, Peter Bacsko (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-16683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16967666#comment-16967666
 ] 

Peter Bacsko commented on HADOOP-16683:
---

[~adam.antal] I think the return value of this function should be {{boolean}} 
instead of {{Throwable}}, because you don't really do much with the returned 
object:

{{private static Throwable getWrappedAccessControlException(Exception e)}}
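
Something along these lines, sketched with a hypothetical helper name that 
also walks the whole cause chain, as suggested in the comment above:

{code}
import org.apache.hadoop.security.AccessControlException;

// Hypothetical sketch: report whether an AccessControlException occurs
// anywhere in the cause chain, instead of returning the Throwable itself.
private static boolean hasWrappedAccessControlException(Exception e) {
  Throwable cause = e;
  while (cause != null) {
    if (cause instanceof AccessControlException) {
      return true;
    }
    cause = cause.getCause();
  }
  return false;
}
{code}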
  

> Disable retry of FailoverOnNetworkExceptionRetry in case of wrapped 
> AccessControlException
> --
>
> Key: HADOOP-16683
> URL: https://issues.apache.org/jira/browse/HADOOP-16683
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: common
>Affects Versions: 3.3.0
>Reporter: Adam Antal
>Assignee: Adam Antal
>Priority: Major
> Attachments: HADOOP-16683.001.patch
>
>
> Follow up patch on HADOOP-16580.
> We successfully disabled the retry in case of an AccessControlException which 
> has resolved some of the cases, but in other cases AccessControlException is 
> wrapped inside another IOException and you can only get the original 
> exception by calling getCause().
> Let's add this extra case as well.






[jira] [Commented] (HADOOP-16580) Disable retry of FailoverOnNetworkExceptionRetry in case of AccessControlException

2019-09-23, Peter Bacsko (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-16580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16935841#comment-16935841
 ] 

Peter Bacsko commented on HADOOP-16580:
---

+1 (non-binding)

> Disable retry of FailoverOnNetworkExceptionRetry in case of 
> AccessControlException
> --
>
> Key: HADOOP-16580
> URL: https://issues.apache.org/jira/browse/HADOOP-16580
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: common
>Affects Versions: 3.3.0
>Reporter: Adam Antal
>Assignee: Adam Antal
>Priority: Major
> Attachments: HADOOP-16580.001.patch, HADOOP-16580.002.patch
>
>
> HADOOP-14982 handled the case where a SaslException is thrown. The issue 
> still persists, since the exception that is thrown is an 
> *AccessControlException* because user has no kerberos credentials. 
> My suggestion is that we should add this case as well to 
> {{FailoverOnNetworkExceptionRetry}}.






[jira] [Commented] (HADOOP-16211) Update guava to 27.0-jre in hadoop-project branch-3.2

2019-06-13, Peter Bacsko (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16863090#comment-16863090
 ] 

Peter Bacsko commented on HADOOP-16211:
---

On my machine, {{TestTimelineReaderWebServicesHBaseStorage}} is so bad that 
every single testcase fails. Looking at the Jenkins build result, the same 
thing happened. Wow, that's really bad. Created JIRA: YARN-9622

> Update guava to 27.0-jre in hadoop-project branch-3.2
> -
>
> Key: HADOOP-16211
> URL: https://issues.apache.org/jira/browse/HADOOP-16211
> Project: Hadoop Common
>  Issue Type: Sub-task
>Affects Versions: 3.2.0
>Reporter: Gabor Bota
>Assignee: Gabor Bota
>Priority: Major
> Attachments: HADOOP-16211-branch-3.2.001.patch, 
> HADOOP-16211-branch-3.2.002.patch, HADOOP-16211-branch-3.2.003.patch, 
> HADOOP-16211-branch-3.2.004.patch, HADOOP-16211-branch-3.2.005.patch, 
> HADOOP-16211-branch-3.2.006.patch
>
>
> com.google.guava:guava should be upgraded to 27.0-jre due to a newly found 
> CVE, CVE-2018-10237.
> This is a sub-task for branch-3.2 from HADOOP-15960 to track issues on that 
> particular branch. 






[jira] [Commented] (HADOOP-16213) Update guava to 27.0-jre in hadoop-project branch-3.1

2019-06-13, Peter Bacsko (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16863060#comment-16863060
 ] 

Peter Bacsko commented on HADOOP-16213:
---

Created YARN-9621 to track testDistributedShellWithPlacementConstraint failure.

> Update guava to 27.0-jre in hadoop-project branch-3.1
> -
>
> Key: HADOOP-16213
> URL: https://issues.apache.org/jira/browse/HADOOP-16213
> Project: Hadoop Common
>  Issue Type: Sub-task
>Affects Versions: 3.1.0, 3.1.1, 3.1.2
>Reporter: Gabor Bota
>Assignee: Gabor Bota
>Priority: Critical
> Attachments: HADOOP-16213-branch-3.1.001.patch, 
> HADOOP-16213-branch-3.1.002.patch, HADOOP-16213-branch-3.1.003.patch, 
> HADOOP-16213-branch-3.1.004.patch, HADOOP-16213-branch-3.1.005.patch, 
> HADOOP-16213-branch-3.1.006.patch
>
>
> com.google.guava:guava should be upgraded to 27.0-jre due to a newly found 
> CVE, CVE-2018-10237.
> This is a sub-task for branch-3.1 from HADOOP-15960 to track issues on that 
> particular branch. 






[jira] [Commented] (HADOOP-16211) Update guava to 27.0-jre in hadoop-project branch-3.2

2019-06-13, Peter Bacsko (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16863035#comment-16863035
 ] 

Peter Bacsko commented on HADOOP-16211:
---

https://issues.apache.org/jira/browse/YARN-8672 addressed the failed test in 
hadoop.yarn.server.nodemanager.containermanager.TestContainerManager; it just 
hasn't been backported to 3.2.

> Update guava to 27.0-jre in hadoop-project branch-3.2
> -
>
> Key: HADOOP-16211
> URL: https://issues.apache.org/jira/browse/HADOOP-16211
> Project: Hadoop Common
>  Issue Type: Sub-task
>Affects Versions: 3.2.0
>Reporter: Gabor Bota
>Assignee: Gabor Bota
>Priority: Major
> Attachments: HADOOP-16211-branch-3.2.001.patch, 
> HADOOP-16211-branch-3.2.002.patch, HADOOP-16211-branch-3.2.003.patch, 
> HADOOP-16211-branch-3.2.004.patch, HADOOP-16211-branch-3.2.005.patch, 
> HADOOP-16211-branch-3.2.006.patch
>
>
> com.google.guava:guava should be upgraded to 27.0-jre due to a newly found 
> CVE, CVE-2018-10237.
> This is a sub-task for branch-3.2 from HADOOP-15960 to track issues on that 
> particular branch. 






[jira] [Updated] (HADOOP-16238) Add the possibility to set SO_REUSEADDR in IPC Server Listener

2019-05-06, Peter Bacsko (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Bacsko updated HADOOP-16238:
--
Attachment: HADOOP-16238-005.patch

> Add the possibility to set SO_REUSEADDR in IPC Server Listener
> -
>
> Key: HADOOP-16238
> URL: https://issues.apache.org/jira/browse/HADOOP-16238
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: ipc
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
>Priority: Minor
> Attachments: HADOOP-16238-001.patch, HADOOP-16238-002.patch, 
> HADOOP-16238-003.patch, HADOOP-16238-004.patch, HADOOP-16238-005.patch
>
>
> Currently we can't enable SO_REUSEADDR in the IPC Server. In some 
> circumstances, this would be desirable, see explanation here:
> [https://developer.ibm.com/tutorials/l-sockpit/#pitfall-3-address-in-use-error-eaddrinuse-]
> Rarely it also causes problems in a test case 
> {{TestMiniMRClientCluster.testRestart}}:
> {noformat}
> 2019-04-04 11:21:31,896 INFO [main] service.AbstractService 
> (AbstractService.java:noteFailure(273)) - Service 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService failed in state 
> STARTED; cause: org.apache.hadoop.yarn.exceptions.YarnRuntimeException: 
> java.net.BindException: Problem binding to [test-host:35491] 
> java.net.BindException: Address already in use; For more details see: 
> http://wiki.apache.org/hadoop/BindException
> org.apache.hadoop.yarn.exceptions.YarnRuntimeException: 
> java.net.BindException: Problem binding to [test-host:35491] 
> java.net.BindException: Address already in use; For more details see: 
> http://wiki.apache.org/hadoop/BindException
>  at 
> org.apache.hadoop.yarn.factories.impl.pb.RpcServerFactoryPBImpl.getServer(RpcServerFactoryPBImpl.java:138)
>  at 
> org.apache.hadoop.yarn.ipc.HadoopYarnProtoRPC.getServer(HadoopYarnProtoRPC.java:65)
>  at org.apache.hadoop.yarn.ipc.YarnRPC.getServer(YarnRPC.java:54)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.startServer(AdminService.java:178)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.serviceStart(AdminService.java:165)
>  at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
>  at 
> org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:121)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceStart(ResourceManager.java:1244)
>  at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
>  at 
> org.apache.hadoop.yarn.server.MiniYARNCluster.startResourceManager(MiniYARNCluster.java:355)
>  at 
> org.apache.hadoop.yarn.server.MiniYARNCluster.access$300(MiniYARNCluster.java:127)
>  at 
> org.apache.hadoop.yarn.server.MiniYARNCluster$ResourceManagerWrapper.serviceStart(MiniYARNCluster.java:493)
>  at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
>  at 
> org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:121)
>  at 
> org.apache.hadoop.yarn.server.MiniYARNCluster.serviceStart(MiniYARNCluster.java:312)
>  at 
> org.apache.hadoop.mapreduce.v2.MiniMRYarnCluster.serviceStart(MiniMRYarnCluster.java:210)
>  at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
>  at 
> org.apache.hadoop.mapred.MiniMRYarnClusterAdapter.restart(MiniMRYarnClusterAdapter.java:73)
>  at 
> org.apache.hadoop.mapred.TestMiniMRClientCluster.testRestart(TestMiniMRClientCluster.java:114)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>  at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62){noformat}
>  
> At least for testing, having this socket option enabled is beneficial. We 
> could enable this with a new property like {{ipc.server.reuseaddr}}.
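
A rough sketch of how the Listener could wire this up (hypothetical helper; 
the property name follows the suggestion above, and the default of true 
matches the v5 patch discussed below):

{code}
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.channels.ServerSocketChannel;
import org.apache.hadoop.conf.Configuration;

// Sketch only: guard SO_REUSEADDR behind the new property before bind().
static ServerSocketChannel openListener(Configuration conf,
    String bindAddress, int port) throws IOException {
  ServerSocketChannel acceptChannel = ServerSocketChannel.open();
  boolean reuseAddr = conf.getBoolean("ipc.server.reuseaddr", true);
  acceptChannel.socket().setReuseAddress(reuseAddr);
  acceptChannel.socket().bind(new InetSocketAddress(bindAddress, port));
  return acceptChannel;
}
{code}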






[jira] [Commented] (HADOOP-16238) Add the possibility to set SO_REUSEADDR in IPC Server Listener

2019-05-06, Peter Bacsko (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16833641#comment-16833641
 ] 

Peter Bacsko commented on HADOOP-16238:
---

I uploaded patch v5 where the default is "true".

> Add the possibility to set SO_REUSEADDR in IPC Server Listener
> -
>
> Key: HADOOP-16238
> URL: https://issues.apache.org/jira/browse/HADOOP-16238
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: ipc
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
>Priority: Minor
> Attachments: HADOOP-16238-001.patch, HADOOP-16238-002.patch, 
> HADOOP-16238-003.patch, HADOOP-16238-004.patch, HADOOP-16238-005.patch
>
>
> Currently we can't enable SO_REUSEADDR in the IPC Server. In some 
> circumstances, this would be desirable, see explanation here:
> [https://developer.ibm.com/tutorials/l-sockpit/#pitfall-3-address-in-use-error-eaddrinuse-]
> Rarely it also causes problems in a test case 
> {{TestMiniMRClientCluster.testRestart}}:
> {noformat}
> 2019-04-04 11:21:31,896 INFO [main] service.AbstractService 
> (AbstractService.java:noteFailure(273)) - Service 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService failed in state 
> STARTED; cause: org.apache.hadoop.yarn.exceptions.YarnRuntimeException: 
> java.net.BindException: Problem binding to [test-host:35491] 
> java.net.BindException: Address already in use; For more details see: 
> http://wiki.apache.org/hadoop/BindException
> org.apache.hadoop.yarn.exceptions.YarnRuntimeException: 
> java.net.BindException: Problem binding to [test-host:35491] 
> java.net.BindException: Address already in use; For more details see: 
> http://wiki.apache.org/hadoop/BindException
>  at 
> org.apache.hadoop.yarn.factories.impl.pb.RpcServerFactoryPBImpl.getServer(RpcServerFactoryPBImpl.java:138)
>  at 
> org.apache.hadoop.yarn.ipc.HadoopYarnProtoRPC.getServer(HadoopYarnProtoRPC.java:65)
>  at org.apache.hadoop.yarn.ipc.YarnRPC.getServer(YarnRPC.java:54)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.startServer(AdminService.java:178)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.serviceStart(AdminService.java:165)
>  at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
>  at 
> org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:121)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceStart(ResourceManager.java:1244)
>  at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
>  at 
> org.apache.hadoop.yarn.server.MiniYARNCluster.startResourceManager(MiniYARNCluster.java:355)
>  at 
> org.apache.hadoop.yarn.server.MiniYARNCluster.access$300(MiniYARNCluster.java:127)
>  at 
> org.apache.hadoop.yarn.server.MiniYARNCluster$ResourceManagerWrapper.serviceStart(MiniYARNCluster.java:493)
>  at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
>  at 
> org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:121)
>  at 
> org.apache.hadoop.yarn.server.MiniYARNCluster.serviceStart(MiniYARNCluster.java:312)
>  at 
> org.apache.hadoop.mapreduce.v2.MiniMRYarnCluster.serviceStart(MiniMRYarnCluster.java:210)
>  at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
>  at 
> org.apache.hadoop.mapred.MiniMRYarnClusterAdapter.restart(MiniMRYarnClusterAdapter.java:73)
>  at 
> org.apache.hadoop.mapred.TestMiniMRClientCluster.testRestart(TestMiniMRClientCluster.java:114)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>  at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62){noformat}
>  
> At least for testing, having this socket option enabled is beneficial. We 
> could enable this with a new property like {{ipc.server.reuseaddr}}.






[jira] [Commented] (HADOOP-16238) Add the possibility to set SO_REUSEADDR in IPC Server Listener

2019-04-18, Peter Bacsko (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16820794#comment-16820794
 ] 

Peter Bacsko commented on HADOOP-16238:
---

[~jojochuang] could you review this patch please?

> Add the possibility to set SO_REUSEADDR in IPC Server Listener
> -
>
> Key: HADOOP-16238
> URL: https://issues.apache.org/jira/browse/HADOOP-16238
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: ipc
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
>Priority: Minor
> Attachments: HADOOP-16238-001.patch, HADOOP-16238-002.patch, 
> HADOOP-16238-003.patch, HADOOP-16238-004.patch
>
>
> Currently we can't enable SO_REUSEADDR in the IPC Server. In some 
> circumstances, this would be desirable, see explanation here:
> [https://developer.ibm.com/tutorials/l-sockpit/#pitfall-3-address-in-use-error-eaddrinuse-]
> Rarely it also causes problems in a test case 
> {{TestMiniMRClientCluster.testRestart}}:
> {noformat}
> 2019-04-04 11:21:31,896 INFO [main] service.AbstractService 
> (AbstractService.java:noteFailure(273)) - Service 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService failed in state 
> STARTED; cause: org.apache.hadoop.yarn.exceptions.YarnRuntimeException: 
> java.net.BindException: Problem binding to [test-host:35491] 
> java.net.BindException: Address already in use; For more details see: 
> http://wiki.apache.org/hadoop/BindException
> org.apache.hadoop.yarn.exceptions.YarnRuntimeException: 
> java.net.BindException: Problem binding to [test-host:35491] 
> java.net.BindException: Address already in use; For more details see: 
> http://wiki.apache.org/hadoop/BindException
>  at 
> org.apache.hadoop.yarn.factories.impl.pb.RpcServerFactoryPBImpl.getServer(RpcServerFactoryPBImpl.java:138)
>  at 
> org.apache.hadoop.yarn.ipc.HadoopYarnProtoRPC.getServer(HadoopYarnProtoRPC.java:65)
>  at org.apache.hadoop.yarn.ipc.YarnRPC.getServer(YarnRPC.java:54)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.startServer(AdminService.java:178)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.serviceStart(AdminService.java:165)
>  at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
>  at 
> org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:121)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceStart(ResourceManager.java:1244)
>  at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
>  at 
> org.apache.hadoop.yarn.server.MiniYARNCluster.startResourceManager(MiniYARNCluster.java:355)
>  at 
> org.apache.hadoop.yarn.server.MiniYARNCluster.access$300(MiniYARNCluster.java:127)
>  at 
> org.apache.hadoop.yarn.server.MiniYARNCluster$ResourceManagerWrapper.serviceStart(MiniYARNCluster.java:493)
>  at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
>  at 
> org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:121)
>  at 
> org.apache.hadoop.yarn.server.MiniYARNCluster.serviceStart(MiniYARNCluster.java:312)
>  at 
> org.apache.hadoop.mapreduce.v2.MiniMRYarnCluster.serviceStart(MiniMRYarnCluster.java:210)
>  at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
>  at 
> org.apache.hadoop.mapred.MiniMRYarnClusterAdapter.restart(MiniMRYarnClusterAdapter.java:73)
>  at 
> org.apache.hadoop.mapred.TestMiniMRClientCluster.testRestart(TestMiniMRClientCluster.java:114)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>  at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62){noformat}
>  
> At least for testing, having this socket option enabled is beneficial. We 
> could enable this with a new property like {{ipc.server.reuseaddr}}.






[jira] [Commented] (HADOOP-16238) Add the possibility to set SO_REUSEADDR in IPC Server Listener

2019-04-15, Peter Bacsko (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16817970#comment-16817970
 ] 

Peter Bacsko commented on HADOOP-16238:
---

[~adam.antal] based on a quick analysis, this socket option is disabled on 
every major OS, so setting it to false by default doesn't break anything. 
Having said that, a 100% safe solution is simply not to touch the option at 
all when it isn't defined in the config. If we want to be extra safe, that's 
a viable approach.

> Add the possibility to set SO_REUSEADDR in IPC Server Listener
> -
>
> Key: HADOOP-16238
> URL: https://issues.apache.org/jira/browse/HADOOP-16238
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: ipc
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
>Priority: Minor
> Attachments: HADOOP-16238-001.patch, HADOOP-16238-002.patch, 
> HADOOP-16238-003.patch, HADOOP-16238-004.patch
>
>
> Currently we can't enable SO_REUSEADDR in the IPC Server. In some 
> circumstances, this would be desirable, see explanation here:
> [https://developer.ibm.com/tutorials/l-sockpit/#pitfall-3-address-in-use-error-eaddrinuse-]
> Rarely it also causes problems in a test case 
> {{TestMiniMRClientCluster.testRestart}}:
> {noformat}
> 2019-04-04 11:21:31,896 INFO [main] service.AbstractService 
> (AbstractService.java:noteFailure(273)) - Service 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService failed in state 
> STARTED; cause: org.apache.hadoop.yarn.exceptions.YarnRuntimeException: 
> java.net.BindException: Problem binding to [test-host:35491] 
> java.net.BindException: Address already in use; For more details see: 
> http://wiki.apache.org/hadoop/BindException
> org.apache.hadoop.yarn.exceptions.YarnRuntimeException: 
> java.net.BindException: Problem binding to [test-host:35491] 
> java.net.BindException: Address already in use; For more details see: 
> http://wiki.apache.org/hadoop/BindException
>  at 
> org.apache.hadoop.yarn.factories.impl.pb.RpcServerFactoryPBImpl.getServer(RpcServerFactoryPBImpl.java:138)
>  at 
> org.apache.hadoop.yarn.ipc.HadoopYarnProtoRPC.getServer(HadoopYarnProtoRPC.java:65)
>  at org.apache.hadoop.yarn.ipc.YarnRPC.getServer(YarnRPC.java:54)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.startServer(AdminService.java:178)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.serviceStart(AdminService.java:165)
>  at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
>  at 
> org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:121)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceStart(ResourceManager.java:1244)
>  at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
>  at 
> org.apache.hadoop.yarn.server.MiniYARNCluster.startResourceManager(MiniYARNCluster.java:355)
>  at 
> org.apache.hadoop.yarn.server.MiniYARNCluster.access$300(MiniYARNCluster.java:127)
>  at 
> org.apache.hadoop.yarn.server.MiniYARNCluster$ResourceManagerWrapper.serviceStart(MiniYARNCluster.java:493)
>  at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
>  at 
> org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:121)
>  at 
> org.apache.hadoop.yarn.server.MiniYARNCluster.serviceStart(MiniYARNCluster.java:312)
>  at 
> org.apache.hadoop.mapreduce.v2.MiniMRYarnCluster.serviceStart(MiniMRYarnCluster.java:210)
>  at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
>  at 
> org.apache.hadoop.mapred.MiniMRYarnClusterAdapter.restart(MiniMRYarnClusterAdapter.java:73)
>  at 
> org.apache.hadoop.mapred.TestMiniMRClientCluster.testRestart(TestMiniMRClientCluster.java:114)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>  at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62){noformat}
>  
> At least for testing, having this socket option enabled is beneficial. We 
> could enable this with a new property like {{ipc.server.reuseaddr}}.






[jira] [Commented] (HADOOP-16238) Add the possibility to set SO_REUSEADDR in IPC Server Listener

2019-04-10 Thread Peter Bacsko (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16814468#comment-16814468
 ] 

Peter Bacsko commented on HADOOP-16238:
---

Thanks [~wilfreds], handled the newline stuff.

> Add the possibility to set SO_REUSEADDR in IPC Server Listener
> -
>
> Key: HADOOP-16238
> URL: https://issues.apache.org/jira/browse/HADOOP-16238
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: ipc
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
>Priority: Minor
> Attachments: HADOOP-16238-001.patch, HADOOP-16238-002.patch, 
> HADOOP-16238-003.patch, HADOOP-16238-004.patch
>
>
> Currently we can't enable SO_REUSEADDR in the IPC Server. In some 
> circumstances, this would be desirable; see the explanation here:
> [https://developer.ibm.com/tutorials/l-sockpit/#pitfall-3-address-in-use-error-eaddrinuse-]
> Occasionally it also causes problems in the test case 
> {{TestMiniMRClientCluster.testRestart}}:
> {noformat}
> 2019-04-04 11:21:31,896 INFO [main] service.AbstractService 
> (AbstractService.java:noteFailure(273)) - Service 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService failed in state 
> STARTED; cause: org.apache.hadoop.yarn.exceptions.YarnRuntimeException: 
> java.net.BindException: Problem binding to [test-host:35491] 
> java.net.BindException: Address already in use; For more details see: 
> http://wiki.apache.org/hadoop/BindException
> org.apache.hadoop.yarn.exceptions.YarnRuntimeException: 
> java.net.BindException: Problem binding to [test-host:35491] 
> java.net.BindException: Address already in use; For more details see: 
> http://wiki.apache.org/hadoop/BindException
>  at 
> org.apache.hadoop.yarn.factories.impl.pb.RpcServerFactoryPBImpl.getServer(RpcServerFactoryPBImpl.java:138)
>  at 
> org.apache.hadoop.yarn.ipc.HadoopYarnProtoRPC.getServer(HadoopYarnProtoRPC.java:65)
>  at org.apache.hadoop.yarn.ipc.YarnRPC.getServer(YarnRPC.java:54)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.startServer(AdminService.java:178)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.serviceStart(AdminService.java:165)
>  at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
>  at 
> org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:121)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceStart(ResourceManager.java:1244)
>  at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
>  at 
> org.apache.hadoop.yarn.server.MiniYARNCluster.startResourceManager(MiniYARNCluster.java:355)
>  at 
> org.apache.hadoop.yarn.server.MiniYARNCluster.access$300(MiniYARNCluster.java:127)
>  at 
> org.apache.hadoop.yarn.server.MiniYARNCluster$ResourceManagerWrapper.serviceStart(MiniYARNCluster.java:493)
>  at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
>  at 
> org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:121)
>  at 
> org.apache.hadoop.yarn.server.MiniYARNCluster.serviceStart(MiniYARNCluster.java:312)
>  at 
> org.apache.hadoop.mapreduce.v2.MiniMRYarnCluster.serviceStart(MiniMRYarnCluster.java:210)
>  at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
>  at 
> org.apache.hadoop.mapred.MiniMRYarnClusterAdapter.restart(MiniMRYarnClusterAdapter.java:73)
>  at 
> org.apache.hadoop.mapred.TestMiniMRClientCluster.testRestart(TestMiniMRClientCluster.java:114)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>  at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62){noformat}
>  
> At least for testing, having this socket option enabled is beneficial. We 
> could enable this with a new property like {{ipc.server.reuseaddr}}.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-16238) Add the possibility to set SO_REUSEADDR in IPC Server Listener

2019-04-10 Thread Peter Bacsko (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Bacsko updated HADOOP-16238:
--
Attachment: HADOOP-16238-004.patch

> Add the possibility to set SO_REUSEADDR in IPC Server Listener
> -
>
> Key: HADOOP-16238
> URL: https://issues.apache.org/jira/browse/HADOOP-16238
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: ipc
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
>Priority: Minor
> Attachments: HADOOP-16238-001.patch, HADOOP-16238-002.patch, 
> HADOOP-16238-003.patch, HADOOP-16238-004.patch
>
>
> Currently we can't enable SO_REUSEADDR in the IPC Server. In some 
> circumstances, this would be desirable; see the explanation here:
> [https://developer.ibm.com/tutorials/l-sockpit/#pitfall-3-address-in-use-error-eaddrinuse-]
> Occasionally it also causes problems in the test case 
> {{TestMiniMRClientCluster.testRestart}}:
> {noformat}
> 2019-04-04 11:21:31,896 INFO [main] service.AbstractService 
> (AbstractService.java:noteFailure(273)) - Service 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService failed in state 
> STARTED; cause: org.apache.hadoop.yarn.exceptions.YarnRuntimeException: 
> java.net.BindException: Problem binding to [test-host:35491] 
> java.net.BindException: Address already in use; For more details see: 
> http://wiki.apache.org/hadoop/BindException
> org.apache.hadoop.yarn.exceptions.YarnRuntimeException: 
> java.net.BindException: Problem binding to [test-host:35491] 
> java.net.BindException: Address already in use; For more details see: 
> http://wiki.apache.org/hadoop/BindException
>  at 
> org.apache.hadoop.yarn.factories.impl.pb.RpcServerFactoryPBImpl.getServer(RpcServerFactoryPBImpl.java:138)
>  at 
> org.apache.hadoop.yarn.ipc.HadoopYarnProtoRPC.getServer(HadoopYarnProtoRPC.java:65)
>  at org.apache.hadoop.yarn.ipc.YarnRPC.getServer(YarnRPC.java:54)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.startServer(AdminService.java:178)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.serviceStart(AdminService.java:165)
>  at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
>  at 
> org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:121)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceStart(ResourceManager.java:1244)
>  at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
>  at 
> org.apache.hadoop.yarn.server.MiniYARNCluster.startResourceManager(MiniYARNCluster.java:355)
>  at 
> org.apache.hadoop.yarn.server.MiniYARNCluster.access$300(MiniYARNCluster.java:127)
>  at 
> org.apache.hadoop.yarn.server.MiniYARNCluster$ResourceManagerWrapper.serviceStart(MiniYARNCluster.java:493)
>  at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
>  at 
> org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:121)
>  at 
> org.apache.hadoop.yarn.server.MiniYARNCluster.serviceStart(MiniYARNCluster.java:312)
>  at 
> org.apache.hadoop.mapreduce.v2.MiniMRYarnCluster.serviceStart(MiniMRYarnCluster.java:210)
>  at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
>  at 
> org.apache.hadoop.mapred.MiniMRYarnClusterAdapter.restart(MiniMRYarnClusterAdapter.java:73)
>  at 
> org.apache.hadoop.mapred.TestMiniMRClientCluster.testRestart(TestMiniMRClientCluster.java:114)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>  at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62){noformat}
>  
> At least for testing, having this socket option enabled is beneficial. We 
> could enable this with a new property like {{ipc.server.reuseaddr}}.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-16237) Fix new findbugs issues after update guava to 27.0-jre in hadoop-project trunk

2019-04-05 Thread Peter Bacsko (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16810775#comment-16810775
 ] 

Peter Bacsko commented on HADOOP-16237:
---

*CosmosDBDocumentStoreReader* / *CosmosDBDocumentStoreWriter*

 
{noformat}
if (client == null) {
  synchronized (this) {
    if (client == null) {
      LOG.info("Creating Cosmos DB Client...");
      client = DocumentStoreUtils.createCosmosDBClient(conf);
    }
  }
}{noformat}
To me this looks like the standard double-checked locking (DCL) pattern, and 
with {{client}} being non-volatile it's faulty: another thread can observe a 
partially constructed object. So either make it volatile, or we should consider 
making {{client}} non-static - is it expensive to create? Do we really need to 
cache it once it's created?
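
A minimal sketch of the volatile variant, assuming the field stays static (the 
method name, the {{DocumentClient}} type and the class-level lock are my 
assumptions, not the actual code; note that the snippet above locks on {{this}}, 
which doesn't match a static field anyway):
{noformat}
// Sketch only: DCL publishes safely only if the cached field is volatile.
private static volatile DocumentClient client;

private DocumentClient getClient(Configuration conf) {
  DocumentClient local = client;  // single volatile read on the fast path
  if (local == null) {
    synchronized (CosmosDBDocumentStoreWriter.class) {  // static field => class-level lock
      local = client;
      if (local == null) {
        LOG.info("Creating Cosmos DB Client...");
        local = DocumentStoreUtils.createCosmosDBClient(conf);
        client = local;
      }
    }
  }
  return local;
}
{noformat}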

*FlowRunDocument.aggregate*
{noformat}
LOG.error("Unknown TimelineMetricOperation."){noformat}
I vote for WARN level. If it's really an error, then we probably should throw 
an exception, no? 
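
Roughly what I mean, as a sketch (assuming an SLF4J-style logger and an {{op}} 
variable holding the unknown operation; neither is the actual code):
{noformat}
// Option 1: downgrade to a warning and keep going.
LOG.warn("Unknown TimelineMetricOperation: {}", op);

// Option 2: if an unknown operation really is an error, fail fast instead:
// throw new IllegalArgumentException("Unknown TimelineMetricOperation: " + op);
{noformat}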

 

*FlowRunDocument.aggregateMetrics(Map)*

I think FindBugs is right, this can be enhanced. We retrieve the {{keySet()}} 
from {{metricSubDocMap}} then perform {{get()}}  on it if {{metrics}} happens 
to contain the same key. Operating on the {{EntrySet}} is definitely better 
here (although I have no idea whether it really speeds up things). 
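
A self-contained illustration of the rewrite FindBugs suggests (the real types 
in {{FlowRunDocument}} differ; {{Long}} stands in for the metric types here):
{noformat}
import java.util.HashMap;
import java.util.Map;

public class EntrySetExample {
  public static void main(String[] args) {
    Map<String, Long> metricSubDocMap = new HashMap<>();
    Map<String, Long> metrics = new HashMap<>();
    metricSubDocMap.put("cpu", 1L);
    metrics.put("cpu", 2L);

    // Instead of keySet() + get() (two hash lookups per key), walk the
    // entrySet() once and probe the other map a single time.
    for (Map.Entry<String, Long> e : metricSubDocMap.entrySet()) {
      Long other = metrics.get(e.getKey());
      if (other != null) {
        System.out.println(e.getKey() + " -> " + (e.getValue() + other));
      }
    }
  }
}
{noformat}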

> Fix new findbugs issues after update guava to 27.0-jre in hadoop-project trunk
> --
>
> Key: HADOOP-16237
> URL: https://issues.apache.org/jira/browse/HADOOP-16237
> Project: Hadoop Common
>  Issue Type: Sub-task
>Affects Versions: 3.3.0
>Reporter: Gabor Bota
>Assignee: Gabor Bota
>Priority: Critical
> Attachments: 
> branch-findbugs-hadoop-common-project_hadoop-kms-warnings.html, 
> branch-findbugs-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager-warnings.html,
>  
> branch-findbugs-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager-warnings.html,
>  
> branch-findbugs-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-timelineservice-documentstore-warnings.html
>
>
> There are a bunch of new findbugs issues in the build after committing the 
> guava update.
> Mostly in yarn, but we have to check and handle those.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-16238) Add the possibility to set SO_REUSEADDR in IPC Server Listener

2019-04-05 Thread Peter Bacsko (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16810746#comment-16810746
 ] 

Peter Bacsko commented on HADOOP-16238:
---

[~wilfreds] / [~ste...@apache.org] could you review this patch?

> Add the possibility to set SO_REUSEADDR in IPC Server Listener
> -
>
> Key: HADOOP-16238
> URL: https://issues.apache.org/jira/browse/HADOOP-16238
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: ipc
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
>Priority: Minor
> Attachments: HADOOP-16238-001.patch, HADOOP-16238-002.patch, 
> HADOOP-16238-003.patch
>
>
> Currently we can't enable SO_REUSEADDR in the IPC Server. In some 
> circumstances, this would be desirable; see the explanation here:
> [https://developer.ibm.com/tutorials/l-sockpit/#pitfall-3-address-in-use-error-eaddrinuse-]
> Occasionally it also causes problems in the test case 
> {{TestMiniMRClientCluster.testRestart}}:
> {noformat}
> 2019-04-04 11:21:31,896 INFO [main] service.AbstractService 
> (AbstractService.java:noteFailure(273)) - Service 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService failed in state 
> STARTED; cause: org.apache.hadoop.yarn.exceptions.YarnRuntimeException: 
> java.net.BindException: Problem binding to [test-host:35491] 
> java.net.BindException: Address already in use; For more details see: 
> http://wiki.apache.org/hadoop/BindException
> org.apache.hadoop.yarn.exceptions.YarnRuntimeException: 
> java.net.BindException: Problem binding to [test-host:35491] 
> java.net.BindException: Address already in use; For more details see: 
> http://wiki.apache.org/hadoop/BindException
>  at 
> org.apache.hadoop.yarn.factories.impl.pb.RpcServerFactoryPBImpl.getServer(RpcServerFactoryPBImpl.java:138)
>  at 
> org.apache.hadoop.yarn.ipc.HadoopYarnProtoRPC.getServer(HadoopYarnProtoRPC.java:65)
>  at org.apache.hadoop.yarn.ipc.YarnRPC.getServer(YarnRPC.java:54)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.startServer(AdminService.java:178)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.serviceStart(AdminService.java:165)
>  at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
>  at 
> org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:121)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceStart(ResourceManager.java:1244)
>  at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
>  at 
> org.apache.hadoop.yarn.server.MiniYARNCluster.startResourceManager(MiniYARNCluster.java:355)
>  at 
> org.apache.hadoop.yarn.server.MiniYARNCluster.access$300(MiniYARNCluster.java:127)
>  at 
> org.apache.hadoop.yarn.server.MiniYARNCluster$ResourceManagerWrapper.serviceStart(MiniYARNCluster.java:493)
>  at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
>  at 
> org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:121)
>  at 
> org.apache.hadoop.yarn.server.MiniYARNCluster.serviceStart(MiniYARNCluster.java:312)
>  at 
> org.apache.hadoop.mapreduce.v2.MiniMRYarnCluster.serviceStart(MiniMRYarnCluster.java:210)
>  at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
>  at 
> org.apache.hadoop.mapred.MiniMRYarnClusterAdapter.restart(MiniMRYarnClusterAdapter.java:73)
>  at 
> org.apache.hadoop.mapred.TestMiniMRClientCluster.testRestart(TestMiniMRClientCluster.java:114)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>  at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62){noformat}
>  
> At least for testing, having this socket option enabled is beneficial. We 
> could enable this with a new property like {{ipc.server.reuseaddr}}.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-16238) Add the possibility to set SO_REUSEADDR in IPC Server Listener

2019-04-05 Thread Peter Bacsko (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Bacsko updated HADOOP-16238:
--
Attachment: HADOOP-16238-003.patch

> Add the possibility to set SO_REUSEADDR in IPC Server Listener
> -
>
> Key: HADOOP-16238
> URL: https://issues.apache.org/jira/browse/HADOOP-16238
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: ipc
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
>Priority: Minor
> Attachments: HADOOP-16238-001.patch, HADOOP-16238-002.patch, 
> HADOOP-16238-003.patch
>
>
> Currently we can't enable SO_REUSEADDR in the IPC Server. In some 
> circumstances, this would be desirable; see the explanation here:
> [https://developer.ibm.com/tutorials/l-sockpit/#pitfall-3-address-in-use-error-eaddrinuse-]
> Occasionally it also causes problems in the test case 
> {{TestMiniMRClientCluster.testRestart}}:
> {noformat}
> 2019-04-04 11:21:31,896 INFO [main] service.AbstractService 
> (AbstractService.java:noteFailure(273)) - Service 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService failed in state 
> STARTED; cause: org.apache.hadoop.yarn.exceptions.YarnRuntimeException: 
> java.net.BindException: Problem binding to [test-host:35491] 
> java.net.BindException: Address already in use; For more details see: 
> http://wiki.apache.org/hadoop/BindException
> org.apache.hadoop.yarn.exceptions.YarnRuntimeException: 
> java.net.BindException: Problem binding to [test-host:35491] 
> java.net.BindException: Address already in use; For more details see: 
> http://wiki.apache.org/hadoop/BindException
>  at 
> org.apache.hadoop.yarn.factories.impl.pb.RpcServerFactoryPBImpl.getServer(RpcServerFactoryPBImpl.java:138)
>  at 
> org.apache.hadoop.yarn.ipc.HadoopYarnProtoRPC.getServer(HadoopYarnProtoRPC.java:65)
>  at org.apache.hadoop.yarn.ipc.YarnRPC.getServer(YarnRPC.java:54)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.startServer(AdminService.java:178)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.serviceStart(AdminService.java:165)
>  at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
>  at 
> org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:121)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceStart(ResourceManager.java:1244)
>  at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
>  at 
> org.apache.hadoop.yarn.server.MiniYARNCluster.startResourceManager(MiniYARNCluster.java:355)
>  at 
> org.apache.hadoop.yarn.server.MiniYARNCluster.access$300(MiniYARNCluster.java:127)
>  at 
> org.apache.hadoop.yarn.server.MiniYARNCluster$ResourceManagerWrapper.serviceStart(MiniYARNCluster.java:493)
>  at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
>  at 
> org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:121)
>  at 
> org.apache.hadoop.yarn.server.MiniYARNCluster.serviceStart(MiniYARNCluster.java:312)
>  at 
> org.apache.hadoop.mapreduce.v2.MiniMRYarnCluster.serviceStart(MiniMRYarnCluster.java:210)
>  at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
>  at 
> org.apache.hadoop.mapred.MiniMRYarnClusterAdapter.restart(MiniMRYarnClusterAdapter.java:73)
>  at 
> org.apache.hadoop.mapred.TestMiniMRClientCluster.testRestart(TestMiniMRClientCluster.java:114)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>  at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62){noformat}
>  
> At least for testing, having this socket option enabled is beneficial. We 
> could enable this with a new property like {{ipc.server.reuseaddr}}.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-16238) Add the possibility to set SO_REUSEADDR in IPC Server Listener

2019-04-04 Thread Peter Bacsko (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Bacsko updated HADOOP-16238:
--
Attachment: HADOOP-16238-002.patch

> Add the possibility to set SO_REUSEADDR in IPC Server Listener
> -
>
> Key: HADOOP-16238
> URL: https://issues.apache.org/jira/browse/HADOOP-16238
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
>Priority: Minor
> Attachments: HADOOP-16238-001.patch, HADOOP-16238-002.patch
>
>
> Currently we can't enable SO_REUSEADDR in the IPC Server. In some 
> circumstances, this would be desirable; see the explanation here:
> [https://developer.ibm.com/tutorials/l-sockpit/#pitfall-3-address-in-use-error-eaddrinuse-]
> Occasionally it also causes problems in the test case 
> {{TestMiniMRClientCluster.testRestart}}:
> {noformat}
> 2019-04-04 11:21:31,896 INFO [main] service.AbstractService 
> (AbstractService.java:noteFailure(273)) - Service 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService failed in state 
> STARTED; cause: org.apache.hadoop.yarn.exceptions.YarnRuntimeException: 
> java.net.BindException: Problem binding to [test-host:35491] 
> java.net.BindException: Address already in use; For more details see: 
> http://wiki.apache.org/hadoop/BindException
> org.apache.hadoop.yarn.exceptions.YarnRuntimeException: 
> java.net.BindException: Problem binding to [test-host:35491] 
> java.net.BindException: Address already in use; For more details see: 
> http://wiki.apache.org/hadoop/BindException
>  at 
> org.apache.hadoop.yarn.factories.impl.pb.RpcServerFactoryPBImpl.getServer(RpcServerFactoryPBImpl.java:138)
>  at 
> org.apache.hadoop.yarn.ipc.HadoopYarnProtoRPC.getServer(HadoopYarnProtoRPC.java:65)
>  at org.apache.hadoop.yarn.ipc.YarnRPC.getServer(YarnRPC.java:54)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.startServer(AdminService.java:178)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.serviceStart(AdminService.java:165)
>  at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
>  at 
> org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:121)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceStart(ResourceManager.java:1244)
>  at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
>  at 
> org.apache.hadoop.yarn.server.MiniYARNCluster.startResourceManager(MiniYARNCluster.java:355)
>  at 
> org.apache.hadoop.yarn.server.MiniYARNCluster.access$300(MiniYARNCluster.java:127)
>  at 
> org.apache.hadoop.yarn.server.MiniYARNCluster$ResourceManagerWrapper.serviceStart(MiniYARNCluster.java:493)
>  at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
>  at 
> org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:121)
>  at 
> org.apache.hadoop.yarn.server.MiniYARNCluster.serviceStart(MiniYARNCluster.java:312)
>  at 
> org.apache.hadoop.mapreduce.v2.MiniMRYarnCluster.serviceStart(MiniMRYarnCluster.java:210)
>  at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
>  at 
> org.apache.hadoop.mapred.MiniMRYarnClusterAdapter.restart(MiniMRYarnClusterAdapter.java:73)
>  at 
> org.apache.hadoop.mapred.TestMiniMRClientCluster.testRestart(TestMiniMRClientCluster.java:114)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>  at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62){noformat}
>  
> At least for testing, having this socket option enabled is beneficial. We 
> could enable this with a new property like {{ipc.server.reuseaddr}}.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-16238) Add the possibility to set SO_REUSEADDR in IPC Server Listener

2019-04-04 Thread Peter Bacsko (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Bacsko updated HADOOP-16238:
--
Status: Patch Available  (was: Open)

> Add the possibility to set SO_REUSEADDR in IPC Server Listener
> -
>
> Key: HADOOP-16238
> URL: https://issues.apache.org/jira/browse/HADOOP-16238
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
>Priority: Minor
> Attachments: HADOOP-16238-001.patch
>
>
> Currently we can't enable SO_REUSEADDR in the IPC Server. In some 
> circumstances, this would be desirable; see the explanation here:
> [https://developer.ibm.com/tutorials/l-sockpit/#pitfall-3-address-in-use-error-eaddrinuse-]
> Occasionally it also causes problems in the test case 
> {{TestMiniMRClientCluster.testRestart}}:
> {noformat}
> 2019-04-04 11:21:31,896 INFO [main] service.AbstractService 
> (AbstractService.java:noteFailure(273)) - Service 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService failed in state 
> STARTED; cause: org.apache.hadoop.yarn.exceptions.YarnRuntimeException: 
> java.net.BindException: Problem binding to [test-host:35491] 
> java.net.BindException: Address already in use; For more details see: 
> http://wiki.apache.org/hadoop/BindException
> org.apache.hadoop.yarn.exceptions.YarnRuntimeException: 
> java.net.BindException: Problem binding to [test-host:35491] 
> java.net.BindException: Address already in use; For more details see: 
> http://wiki.apache.org/hadoop/BindException
>  at 
> org.apache.hadoop.yarn.factories.impl.pb.RpcServerFactoryPBImpl.getServer(RpcServerFactoryPBImpl.java:138)
>  at 
> org.apache.hadoop.yarn.ipc.HadoopYarnProtoRPC.getServer(HadoopYarnProtoRPC.java:65)
>  at org.apache.hadoop.yarn.ipc.YarnRPC.getServer(YarnRPC.java:54)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.startServer(AdminService.java:178)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.serviceStart(AdminService.java:165)
>  at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
>  at 
> org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:121)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceStart(ResourceManager.java:1244)
>  at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
>  at 
> org.apache.hadoop.yarn.server.MiniYARNCluster.startResourceManager(MiniYARNCluster.java:355)
>  at 
> org.apache.hadoop.yarn.server.MiniYARNCluster.access$300(MiniYARNCluster.java:127)
>  at 
> org.apache.hadoop.yarn.server.MiniYARNCluster$ResourceManagerWrapper.serviceStart(MiniYARNCluster.java:493)
>  at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
>  at 
> org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:121)
>  at 
> org.apache.hadoop.yarn.server.MiniYARNCluster.serviceStart(MiniYARNCluster.java:312)
>  at 
> org.apache.hadoop.mapreduce.v2.MiniMRYarnCluster.serviceStart(MiniMRYarnCluster.java:210)
>  at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
>  at 
> org.apache.hadoop.mapred.MiniMRYarnClusterAdapter.restart(MiniMRYarnClusterAdapter.java:73)
>  at 
> org.apache.hadoop.mapred.TestMiniMRClientCluster.testRestart(TestMiniMRClientCluster.java:114)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>  at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62){noformat}
>  
> At least for testing, having this socket option enabled is beneficial. We 
> could enable this with a new property like {{ipc.server.reuseaddr}}.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-16238) Add the possibility to set SO_REUSEADDR in IPC Server Listener

2019-04-04 Thread Peter Bacsko (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Bacsko updated HADOOP-16238:
--
Attachment: HADOOP-16238-001.patch

> Add the possibility to set SO_REUSEADDR in IPC Server Listener
> -
>
> Key: HADOOP-16238
> URL: https://issues.apache.org/jira/browse/HADOOP-16238
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
>Priority: Minor
> Attachments: HADOOP-16238-001.patch
>
>
> Currently we can't enable SO_REUSEADDR in the IPC Server. In some 
> circumstances, this would be desirable; see the explanation here:
> [https://developer.ibm.com/tutorials/l-sockpit/#pitfall-3-address-in-use-error-eaddrinuse-]
> Occasionally it also causes problems in the test case 
> {{TestMiniMRClientCluster.testRestart}}:
> {noformat}
> 2019-04-04 11:21:31,896 INFO [main] service.AbstractService 
> (AbstractService.java:noteFailure(273)) - Service 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService failed in state 
> STARTED; cause: org.apache.hadoop.yarn.exceptions.YarnRuntimeException: 
> java.net.BindException: Problem binding to [test-host:35491] 
> java.net.BindException: Address already in use; For more details see: 
> http://wiki.apache.org/hadoop/BindException
> org.apache.hadoop.yarn.exceptions.YarnRuntimeException: 
> java.net.BindException: Problem binding to [test-host:35491] 
> java.net.BindException: Address already in use; For more details see: 
> http://wiki.apache.org/hadoop/BindException
>  at 
> org.apache.hadoop.yarn.factories.impl.pb.RpcServerFactoryPBImpl.getServer(RpcServerFactoryPBImpl.java:138)
>  at 
> org.apache.hadoop.yarn.ipc.HadoopYarnProtoRPC.getServer(HadoopYarnProtoRPC.java:65)
>  at org.apache.hadoop.yarn.ipc.YarnRPC.getServer(YarnRPC.java:54)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.startServer(AdminService.java:178)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.serviceStart(AdminService.java:165)
>  at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
>  at 
> org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:121)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceStart(ResourceManager.java:1244)
>  at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
>  at 
> org.apache.hadoop.yarn.server.MiniYARNCluster.startResourceManager(MiniYARNCluster.java:355)
>  at 
> org.apache.hadoop.yarn.server.MiniYARNCluster.access$300(MiniYARNCluster.java:127)
>  at 
> org.apache.hadoop.yarn.server.MiniYARNCluster$ResourceManagerWrapper.serviceStart(MiniYARNCluster.java:493)
>  at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
>  at 
> org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:121)
>  at 
> org.apache.hadoop.yarn.server.MiniYARNCluster.serviceStart(MiniYARNCluster.java:312)
>  at 
> org.apache.hadoop.mapreduce.v2.MiniMRYarnCluster.serviceStart(MiniMRYarnCluster.java:210)
>  at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
>  at 
> org.apache.hadoop.mapred.MiniMRYarnClusterAdapter.restart(MiniMRYarnClusterAdapter.java:73)
>  at 
> org.apache.hadoop.mapred.TestMiniMRClientCluster.testRestart(TestMiniMRClientCluster.java:114)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>  at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62){noformat}
>  
> At least for testing, having this socket option enabled is beneficial. We 
> could enable this with a new property like {{ipc.server.reuseaddr}}.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Created] (HADOOP-16238) Add the possibility to set SO_REUSEADDR in IPC Server Listener

2019-04-04 Thread Peter Bacsko (JIRA)
Peter Bacsko created HADOOP-16238:
-

 Summary: Add the possibility to set SO_REUSEADDR in IPC Server 
Listener
 Key: HADOOP-16238
 URL: https://issues.apache.org/jira/browse/HADOOP-16238
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Peter Bacsko
Assignee: Peter Bacsko


Currently we can't enable SO_REUSEADDR in the IPC Server. In some 
circumstances, this would be desirable; see the explanation here:

[https://developer.ibm.com/tutorials/l-sockpit/#pitfall-3-address-in-use-error-eaddrinuse-]

Occasionally it also causes problems in the test case 
{{TestMiniMRClientCluster.testRestart}}:
{noformat}
2019-04-04 11:21:31,896 INFO [main] service.AbstractService 
(AbstractService.java:noteFailure(273)) - Service 
org.apache.hadoop.yarn.server.resourcemanager.AdminService failed in state 
STARTED; cause: org.apache.hadoop.yarn.exceptions.YarnRuntimeException: 
java.net.BindException: Problem binding to [test-host:35491] 
java.net.BindException: Address already in use; For more details see: 
http://wiki.apache.org/hadoop/BindException
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.net.BindException: 
Problem binding to [test-host:35491] java.net.BindException: Address already in 
use; For more details see: http://wiki.apache.org/hadoop/BindException
 at 
org.apache.hadoop.yarn.factories.impl.pb.RpcServerFactoryPBImpl.getServer(RpcServerFactoryPBImpl.java:138)
 at 
org.apache.hadoop.yarn.ipc.HadoopYarnProtoRPC.getServer(HadoopYarnProtoRPC.java:65)
 at org.apache.hadoop.yarn.ipc.YarnRPC.getServer(YarnRPC.java:54)
 at 
org.apache.hadoop.yarn.server.resourcemanager.AdminService.startServer(AdminService.java:178)
 at 
org.apache.hadoop.yarn.server.resourcemanager.AdminService.serviceStart(AdminService.java:165)
 at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
 at 
org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:121)
 at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceStart(ResourceManager.java:1244)
 at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
 at 
org.apache.hadoop.yarn.server.MiniYARNCluster.startResourceManager(MiniYARNCluster.java:355)
 at 
org.apache.hadoop.yarn.server.MiniYARNCluster.access$300(MiniYARNCluster.java:127)
 at 
org.apache.hadoop.yarn.server.MiniYARNCluster$ResourceManagerWrapper.serviceStart(MiniYARNCluster.java:493)
 at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
 at 
org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:121)
 at 
org.apache.hadoop.yarn.server.MiniYARNCluster.serviceStart(MiniYARNCluster.java:312)
 at 
org.apache.hadoop.mapreduce.v2.MiniMRYarnCluster.serviceStart(MiniMRYarnCluster.java:210)
 at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
 at 
org.apache.hadoop.mapred.MiniMRYarnClusterAdapter.restart(MiniMRYarnClusterAdapter.java:73)
 at 
org.apache.hadoop.mapred.TestMiniMRClientCluster.testRestart(TestMiniMRClientCluster.java:114)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62){noformat}
 

At least for testing, having this socket option enabled is beneficial. We could 
enable this with a new property like {{ipc.server.reuseaddr}}.
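
As a rough, self-contained sketch of the idea (illustrative only, not the 
actual patch; the real server would read the flag from {{Configuration}} inside 
the Listener rather than from a system property):
{noformat}
import java.net.InetSocketAddress;
import java.nio.channels.ServerSocketChannel;

public class ReuseAddrBindExample {
  public static void main(String[] args) throws Exception {
    // Illustrative flag; stands in for an ipc.server.reuseaddr config entry.
    boolean reuseAddr = Boolean.parseBoolean(
        System.getProperty("ipc.server.reuseaddr", "false"));

    ServerSocketChannel acceptChannel = ServerSocketChannel.open();
    acceptChannel.configureBlocking(false);
    if (reuseAddr) {
      // SO_REUSEADDR must be set before bind(): it lets the listener rebind
      // immediately even while old connections on the same port linger in
      // TIME_WAIT -- the situation behind "Address already in use" above.
      acceptChannel.socket().setReuseAddress(true);
    }
    acceptChannel.socket().bind(new InetSocketAddress("localhost", 0), 128);
    System.out.println("Bound to " + acceptChannel.socket().getLocalSocketAddress());
    acceptChannel.close();
  }
}
{noformat}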



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15822) zstd compressor can fail with a small output buffer

2018-10-09 Thread Peter Bacsko (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16643416#comment-16643416
 ] 

Peter Bacsko commented on HADOOP-15822:
---

[~jlowe] you were right, it's not related to zstandard. I reproduced this with 
other codecs and with compression disabled. It's possibly an edge case.

> zstd compressor can fail with a small output buffer
> ---
>
> Key: HADOOP-15822
> URL: https://issues.apache.org/jira/browse/HADOOP-15822
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 2.9.0, 3.0.0
>Reporter: Jason Lowe
>Assignee: Jason Lowe
>Priority: Major
> Attachments: HADOOP-15822.001.patch, HADOOP-15822.002.patch
>
>
> TestZStandardCompressorDecompressor fails a couple of tests on my machine 
> with the latest zstd library (1.3.5).  Compression can fail to successfully 
> finalize the stream when a small output buffer is used resulting in a failed 
> to init error, and decompression with a direct buffer can fail with an 
> invalid src size error.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15822) zstd compressor can fail with a small output buffer

2018-10-08 Thread Peter Bacsko (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16642106#comment-16642106
 ] 

Peter Bacsko commented on HADOOP-15822:
---

No, I still haven't had the time to try it with other codecs. But tomorrow 
I'll run a test with no compression/snappy/lz4/etc.

> zstd compressor can fail with a small output buffer
> ---
>
> Key: HADOOP-15822
> URL: https://issues.apache.org/jira/browse/HADOOP-15822
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 2.9.0, 3.0.0
>Reporter: Jason Lowe
>Assignee: Jason Lowe
>Priority: Major
> Attachments: HADOOP-15822.001.patch, HADOOP-15822.002.patch
>
>
> TestZStandardCompressorDecompressor fails a couple of tests on my machine 
> with the latest zstd library (1.3.5).  Compression can fail to successfully 
> finalize the stream when a small output buffer is used resulting in a failed 
> to init error, and decompression with a direct buffer can fail with an 
> invalid src size error.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HADOOP-15822) zstd compressor can fail with a small output buffer

2018-10-08 Thread Peter Bacsko (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16642000#comment-16642000
 ] 

Peter Bacsko edited comment on HADOOP-15822 at 10/8/18 3:28 PM:


I reproduced the problem. This is what happens if the sort buffer is 2047 MiB.

{noformat}
...
2018-10-08 08:15:04,126 INFO [main] org.apache.hadoop.mapred.MapTask: Spilling 
map output
2018-10-08 08:15:04,126 INFO [main] org.apache.hadoop.mapred.MapTask: bufstart 
= 1267927860; bufend = 2082571562; bufvoid = 2146435072
2018-10-08 08:15:04,126 INFO [main] org.apache.hadoop.mapred.MapTask: kvstart = 
316981960(1267927840); kvend = 91355880(365423520); length = 225626081/134152192
2018-10-08 08:15:04,126 INFO [main] org.apache.hadoop.mapred.MapTask: (EQUATOR) 
-1997752227 kvi 37170708(148682832)
2018-10-08 08:16:24,712 INFO [SpillThread] org.apache.hadoop.mapred.MapTask: 
Finished spill 20
2018-10-08 08:16:24,712 INFO [main] org.apache.hadoop.mapred.MapTask: (RESET) 
equator -1997752227 kv 37170708(148682832) kvi 37170708(148682832)
2018-10-08 08:16:24,713 INFO [main] org.apache.hadoop.mapred.MapTask: Starting 
flush of map output
2018-10-08 08:16:24,713 INFO [main] org.apache.hadoop.mapred.MapTask: (RESET) 
equator -1997752227 kv 37170708(148682832) kvi 37170708(148682832)
2018-10-08 08:16:24,727 INFO [main] org.apache.hadoop.mapred.Merger: Merging 21 
sorted segments
2018-10-08 08:16:24,735 INFO [main] org.apache.hadoop.io.compress.CodecPool: 
Got brand-new decompressor [.zst]
2018-10-08 08:16:24,736 INFO [main] org.apache.hadoop.io.compress.CodecPool: 
Got brand-new decompressor [.zst]
2018-10-08 08:16:24,738 INFO [main] org.apache.hadoop.io.compress.CodecPool: 
Got brand-new decompressor [.zst]
2018-10-08 08:16:24,739 INFO [main] org.apache.hadoop.io.compress.CodecPool: 
Got brand-new decompressor [.zst]
2018-10-08 08:16:24,741 INFO [main] org.apache.hadoop.io.compress.CodecPool: 
Got brand-new decompressor [.zst]
2018-10-08 08:16:24,742 INFO [main] org.apache.hadoop.io.compress.CodecPool: 
Got brand-new decompressor [.zst]
2018-10-08 08:16:24,743 INFO [main] org.apache.hadoop.io.compress.CodecPool: 
Got brand-new decompressor [.zst]
2018-10-08 08:16:24,744 INFO [main] org.apache.hadoop.io.compress.CodecPool: 
Got brand-new decompressor [.zst]
2018-10-08 08:16:24,745 INFO [main] org.apache.hadoop.io.compress.CodecPool: 
Got brand-new decompressor [.zst]
2018-10-08 08:16:24,746 INFO [main] org.apache.hadoop.io.compress.CodecPool: 
Got brand-new decompressor [.zst]
2018-10-08 08:16:24,748 INFO [main] org.apache.hadoop.io.compress.CodecPool: 
Got brand-new decompressor [.zst]
2018-10-08 08:16:24,749 INFO [main] org.apache.hadoop.io.compress.CodecPool: 
Got brand-new decompressor [.zst]
2018-10-08 08:16:24,750 INFO [main] org.apache.hadoop.io.compress.CodecPool: 
Got brand-new decompressor [.zst]
2018-10-08 08:16:24,752 INFO [main] org.apache.hadoop.io.compress.CodecPool: 
Got brand-new decompressor [.zst]
2018-10-08 08:16:24,753 INFO [main] org.apache.hadoop.io.compress.CodecPool: 
Got brand-new decompressor [.zst]
2018-10-08 08:16:24,754 INFO [main] org.apache.hadoop.io.compress.CodecPool: 
Got brand-new decompressor [.zst]
2018-10-08 08:16:24,755 INFO [main] org.apache.hadoop.io.compress.CodecPool: 
Got brand-new decompressor [.zst]
2018-10-08 08:16:24,756 INFO [main] org.apache.hadoop.io.compress.CodecPool: 
Got brand-new decompressor [.zst]
2018-10-08 08:16:24,757 INFO [main] org.apache.hadoop.io.compress.CodecPool: 
Got brand-new decompressor [.zst]
2018-10-08 08:16:24,769 INFO [main] org.apache.hadoop.io.compress.CodecPool: 
Got brand-new decompressor [.zst]
2018-10-08 08:16:24,770 INFO [main] org.apache.hadoop.mapred.Merger: Down to 
the last merge-pass, with 21 segments left of total size: 35310116 bytes
2018-10-08 08:16:30,104 WARN [main] org.apache.hadoop.mapred.YarnChild: 
Exception running child : java.lang.ArrayIndexOutOfBoundsException
at java.lang.System.arraycopy(Native Method)
at 
org.apache.hadoop.mapred.MapTask$MapOutputBuffer$Buffer.write(MapTask.java:1469)
at 
org.apache.hadoop.mapred.MapTask$MapOutputBuffer$Buffer.write(MapTask.java:1365)
at java.io.DataOutputStream.writeByte(DataOutputStream.java:153)
at org.apache.hadoop.io.WritableUtils.writeVLong(WritableUtils.java:273)
at org.apache.hadoop.io.WritableUtils.writeVInt(WritableUtils.java:253)
at org.apache.hadoop.io.Text.write(Text.java:330)
at 
org.apache.hadoop.io.serializer.WritableSerialization$WritableSerializer.serialize(WritableSerialization.java:98)
at 
org.apache.hadoop.io.serializer.WritableSerialization$WritableSerializer.serialize(WritableSerialization.java:82)
at 
org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:1163)
at 
org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:727)
at 

[jira] [Commented] (HADOOP-15822) zstd compressor can fail with a small output buffer

2018-10-08 Thread Peter Bacsko (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16642000#comment-16642000
 ] 

Peter Bacsko commented on HADOOP-15822:
---

I reproduced the problem. This is what happens if the sort buffer is 2047 MiB.

{noformat}
...
2018-10-08 08:15:04,126 INFO [main] org.apache.hadoop.mapred.MapTask: Spilling 
map output
2018-10-08 08:15:04,126 INFO [main] org.apache.hadoop.mapred.MapTask: bufstart 
= 1267927860; bufend = 2082571562; bufvoid = 2146435072
2018-10-08 08:15:04,126 INFO [main] org.apache.hadoop.mapred.MapTask: kvstart = 
316981960(1267927840); kvend = 91355880(365423520); length = 225626081/134152192
2018-10-08 08:15:04,126 INFO [main] org.apache.hadoop.mapred.MapTask: (EQUATOR) 
-1997752227 kvi 37170708(148682832)
2018-10-08 08:16:24,712 INFO [SpillThread] org.apache.hadoop.mapred.MapTask: 
Finished spill 20
2018-10-08 08:16:24,712 INFO [main] org.apache.hadoop.mapred.MapTask: (RESET) 
equator -1997752227 kv 37170708(148682832) kvi 37170708(148682832)
2018-10-08 08:16:24,713 INFO [main] org.apache.hadoop.mapred.MapTask: Starting 
flush of map output
2018-10-08 08:16:24,713 INFO [main] org.apache.hadoop.mapred.MapTask: (RESET) 
equator -1997752227 kv 37170708(148682832) kvi 37170708(148682832)
2018-10-08 08:16:24,727 INFO [main] org.apache.hadoop.mapred.Merger: Merging 21 
sorted segments
2018-10-08 08:16:24,735 INFO [main] org.apache.hadoop.io.compress.CodecPool: 
Got brand-new decompressor [.zst]
2018-10-08 08:16:24,736 INFO [main] org.apache.hadoop.io.compress.CodecPool: 
Got brand-new decompressor [.zst]
2018-10-08 08:16:24,738 INFO [main] org.apache.hadoop.io.compress.CodecPool: 
Got brand-new decompressor [.zst]
2018-10-08 08:16:24,739 INFO [main] org.apache.hadoop.io.compress.CodecPool: 
Got brand-new decompressor [.zst]
2018-10-08 08:16:24,741 INFO [main] org.apache.hadoop.io.compress.CodecPool: 
Got brand-new decompressor [.zst]
2018-10-08 08:16:24,742 INFO [main] org.apache.hadoop.io.compress.CodecPool: 
Got brand-new decompressor [.zst]
2018-10-08 08:16:24,743 INFO [main] org.apache.hadoop.io.compress.CodecPool: 
Got brand-new decompressor [.zst]
2018-10-08 08:16:24,744 INFO [main] org.apache.hadoop.io.compress.CodecPool: 
Got brand-new decompressor [.zst]
2018-10-08 08:16:24,745 INFO [main] org.apache.hadoop.io.compress.CodecPool: 
Got brand-new decompressor [.zst]
2018-10-08 08:16:24,746 INFO [main] org.apache.hadoop.io.compress.CodecPool: 
Got brand-new decompressor [.zst]
2018-10-08 08:16:24,748 INFO [main] org.apache.hadoop.io.compress.CodecPool: 
Got brand-new decompressor [.zst]
2018-10-08 08:16:24,749 INFO [main] org.apache.hadoop.io.compress.CodecPool: 
Got brand-new decompressor [.zst]
2018-10-08 08:16:24,750 INFO [main] org.apache.hadoop.io.compress.CodecPool: 
Got brand-new decompressor [.zst]
2018-10-08 08:16:24,752 INFO [main] org.apache.hadoop.io.compress.CodecPool: 
Got brand-new decompressor [.zst]
2018-10-08 08:16:24,753 INFO [main] org.apache.hadoop.io.compress.CodecPool: 
Got brand-new decompressor [.zst]
2018-10-08 08:16:24,754 INFO [main] org.apache.hadoop.io.compress.CodecPool: 
Got brand-new decompressor [.zst]
2018-10-08 08:16:24,755 INFO [main] org.apache.hadoop.io.compress.CodecPool: 
Got brand-new decompressor [.zst]
2018-10-08 08:16:24,756 INFO [main] org.apache.hadoop.io.compress.CodecPool: 
Got brand-new decompressor [.zst]
2018-10-08 08:16:24,757 INFO [main] org.apache.hadoop.io.compress.CodecPool: 
Got brand-new decompressor [.zst]
2018-10-08 08:16:24,769 INFO [main] org.apache.hadoop.io.compress.CodecPool: 
Got brand-new decompressor [.zst]
2018-10-08 08:16:24,770 INFO [main] org.apache.hadoop.mapred.Merger: Down to 
the last merge-pass, with 21 segments left of total size: 35310116 bytes
2018-10-08 08:16:30,104 WARN [main] org.apache.hadoop.mapred.YarnChild: 
Exception running child : java.lang.ArrayIndexOutOfBoundsException
at java.lang.System.arraycopy(Native Method)
at 
org.apache.hadoop.mapred.MapTask$MapOutputBuffer$Buffer.write(MapTask.java:1469)
at 
org.apache.hadoop.mapred.MapTask$MapOutputBuffer$Buffer.write(MapTask.java:1365)
at java.io.DataOutputStream.writeByte(DataOutputStream.java:153)
at org.apache.hadoop.io.WritableUtils.writeVLong(WritableUtils.java:273)
at org.apache.hadoop.io.WritableUtils.writeVInt(WritableUtils.java:253)
at org.apache.hadoop.io.Text.write(Text.java:330)
at 
org.apache.hadoop.io.serializer.WritableSerialization$WritableSerializer.serialize(WritableSerialization.java:98)
at 
org.apache.hadoop.io.serializer.WritableSerialization$WritableSerializer.serialize(WritableSerialization.java:82)
at 
org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:1163)
at 
org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:727)
at 

[jira] [Comment Edited] (HADOOP-15822) zstd compressor can fail with a small output buffer

2018-10-05 Thread Peter Bacsko (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16640338#comment-16640338
 ] 

Peter Bacsko edited comment on HADOOP-15822 at 10/5/18 8:58 PM:


[~jlowe] what a strange coincidence. I was also testing zstandard today and set 
{{mapreduce.task.io.sort.mb}} to 2047, which I believe is the maximum. The 
mapper was running on a 10 GiB zstd-compressed text file and then failed: the 
{{equator}} became a negative number and {{collect()}} threw an 
{{ArrayIndexOutOfBoundsException}}. Mapper output compression was also enabled; 
probably that's what really matters here. It failed after roughly 40 minutes.

I'm not sure whether it's zstd or not, because I haven't had the time to try it 
with other codecs, but it's something worth keeping in mind.


was (Author: pbacsko):
[~jlowe] what strange coincidence. I was also testing zstandard today and set 
{{mapreduce.task.io.sort.mb}} to 2047, which is the max I guess. Now the mapper 
was running on a 10GiB zstd compressed text file and then failed. The 
{{equator}} became a negative number and {{collect()}} threw 
{{ArrayIndexOutOfBoundsException}}. Mapper output compression was also enabled, 
probably that's what really matters here. It failed after like 40 minutes.

I'm not sure whether it's zstd or not, because I haven't had the time to try it 
with other codecs, but it's something worth keeping in mind.

> zstd compressor can fail with a small output buffer
> ---
>
> Key: HADOOP-15822
> URL: https://issues.apache.org/jira/browse/HADOOP-15822
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 2.9.0, 3.0.0
>Reporter: Jason Lowe
>Assignee: Jason Lowe
>Priority: Major
> Attachments: HADOOP-15822.001.patch, HADOOP-15822.002.patch
>
>
> TestZStandardCompressorDecompressor fails a couple of tests on my machine 
> with the latest zstd library (1.3.5).  Compression can fail to successfully 
> finalize the stream when a small output buffer is used resulting in a failed 
> to init error, and decompression with a direct buffer can fail with an 
> invalid src size error.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15822) zstd compressor can fail with a small output buffer

2018-10-05 Thread Peter Bacsko (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16640338#comment-16640338
 ] 

Peter Bacsko commented on HADOOP-15822:
---

[~jlowe] what a strange coincidence. I was also testing zstandard today and set 
{{mapreduce.task.io.sort.mb}} to 2047, which I believe is the maximum. The 
mapper was running on a 10 GiB zstd-compressed text file and then failed: the 
{{equator}} became a negative number and {{collect()}} threw an 
{{ArrayIndexOutOfBoundsException}}. Mapper output compression was also enabled; 
probably that's what really matters here. It failed after roughly 40 minutes.

I'm not sure whether it's zstd or not, because I haven't had the time to try it 
with other codecs, but it's something worth keeping in mind.
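
To show why 2047 is suspicious regardless of the codec, here is a small 
illustration using the numbers from the reproduction log above (this is not the 
actual {{MapTask}} code, just plain int arithmetic):
{noformat}
public class SortBufferOverflowExample {
  public static void main(String[] args) {
    int sortMb = 2047;
    int bufvoid = sortMb * 1024 * 1024;  // 2146435072, matches bufvoid in the log
    System.out.println("headroom to Integer.MAX_VALUE: "
        + (Integer.MAX_VALUE - bufvoid));  // only 1048575 bytes left

    // Buffer offsets are plain ints, so arithmetic near the top of the
    // buffer wraps negative -- consistent with the negative equator seen.
    int bufend = 2082571562;             // bufend from the log
    int advance = 100 * 1024 * 1024;     // a further 100 MiB of map output
    int wrapped = bufend + advance;      // overflows int
    System.out.println("bufend + advance = " + wrapped + " (negative!)");
  }
}
{noformat}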

> zstd compressor can fail with a small output buffer
> ---
>
> Key: HADOOP-15822
> URL: https://issues.apache.org/jira/browse/HADOOP-15822
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 2.9.0, 3.0.0
>Reporter: Jason Lowe
>Assignee: Jason Lowe
>Priority: Major
> Attachments: HADOOP-15822.001.patch, HADOOP-15822.002.patch
>
>
> TestZStandardCompressorDecompressor fails a couple of tests on my machine 
> with the latest zstd library (1.3.5).  Compression can fail to successfully 
> finalize the stream when a small output buffer is used resulting in a failed 
> to init error, and decompression with a direct buffer can fail with an 
> invalid src size error.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-14982) Clients using FailoverOnNetworkExceptionRetry can go into a loop if they're used without authenticating with Kerberos in HA env

2017-10-31 Thread Peter Bacsko (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Bacsko updated HADOOP-14982:
--
Attachment: HADOOP-14982-003.patch

> Clients using FailoverOnNetworkExceptionRetry can go into a loop if they're 
> used without authenticating with Kerberos in HA env
> ---
>
> Key: HADOOP-14982
> URL: https://issues.apache.org/jira/browse/HADOOP-14982
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: common
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
> Attachments: HADOOP-14892-001.patch, HADOOP-14892-002.patch, 
> HADOOP-14982-003.patch
>
>
> If HA is configured for the Resource Manager in a secure environment, the 
> mapred client goes into a retry loop if the user is not authenticated with 
> Kerberos.
> {noformat}
> [root@pb6sec-1 ~]# mapred job -list
> 17/10/25 06:37:43 INFO client.ConfiguredRMFailoverProxyProvider: Failing over 
> to rm36
> 17/10/25 06:37:43 WARN ipc.Client: Exception encountered while connecting to 
> the server : javax.security.sasl.SaslException: GSS initiate failed [Caused 
> by GSSException: No valid credentials provided (Mechanism level: Failed to 
> find any Kerberos tgt)]
> 17/10/25 06:37:43 INFO retry.RetryInvocationHandler: java.io.IOException: 
> Failed on local exception: java.io.IOException: 
> javax.security.sasl.SaslException: GSS initiate failed [Caused by 
> GSSException: No valid credentials provided (Mechanism level: Failed to find 
> any Kerberos tgt)]; Host Details : local host is: 
> "host_redacted/IP_redacted"; destination host is: "com.host2.redacted:8032; , 
> while invoking ApplicationClientProtocolPBClientImpl.getApplications over 
> rm36 after 1 failover attempts. Trying to failover after sleeping for 160ms.
> 17/10/25 06:37:43 INFO client.ConfiguredRMFailoverProxyProvider: Failing over 
> to rm25
> 17/10/25 06:37:43 INFO retry.RetryInvocationHandler: 
> java.net.ConnectException: Call From host_redacted/IP_redacted to 
> com.host.redacted:8032 failed on connection exception: 
> java.net.ConnectException: Connection refused; For more details see:  
> http://wiki.apache.org/hadoop/ConnectionRefused, while invoking 
> ApplicationClientProtocolPBClientImpl.getApplications over rm25 after 2 
> failover attempts. Trying to failover after sleeping for 582ms.
> 17/10/25 06:37:44 INFO client.ConfiguredRMFailoverProxyProvider: Failing over 
> to rm36
> 17/10/25 06:37:44 WARN ipc.Client: Exception encountered while connecting to 
> the server : javax.security.sasl.SaslException: GSS initiate failed [Caused 
> by GSSException: No valid credentials provided (Mechanism level: Failed to 
> find any Kerberos tgt)]
> 17/10/25 06:37:44 INFO retry.RetryInvocationHandler: java.io.IOException: 
> Failed on local exception: java.io.IOException: 
> javax.security.sasl.SaslException: GSS initiate failed [Caused by 
> GSSException: No valid credentials provided (Mechanism level: Failed to find 
> any Kerberos tgt)]; Host Details : local host is: 
> "host_redacted/IP_redacted"; destination host is: "com.host2.redacted:8032; , 
> while invoking ApplicationClientProtocolPBClientImpl.getApplications over 
> rm36 after 3 failover attempts. Trying to failover after sleeping for 977ms.
> 17/10/25 06:37:45 INFO client.ConfiguredRMFailoverProxyProvider: Failing over 
> to rm25
> 17/10/25 06:37:45 INFO retry.RetryInvocationHandler: 
> java.net.ConnectException: Call From host_redacted/IP_redacted to 
> com.host.redacted:8032 failed on connection exception: 
> java.net.ConnectException: Connection refused; For more details see:  
> http://wiki.apache.org/hadoop/ConnectionRefused, while invoking 
> ApplicationClientProtocolPBClientImpl.getApplications over rm25 after 4 
> failover attempts. Trying to failover after sleeping for 1667ms.
> 17/10/25 06:37:46 INFO client.ConfiguredRMFailoverProxyProvider: Failing over 
> to rm36
> 17/10/25 06:37:46 WARN ipc.Client: Exception encountered while connecting to 
> the server : javax.security.sasl.SaslException: GSS initiate failed [Caused 
> by GSSException: No valid credentials provided (Mechanism level: Failed to 
> find any Kerberos tgt)]
> 17/10/25 06:37:46 INFO retry.RetryInvocationHandler: java.io.IOException: 
> Failed on local exception: java.io.IOException: 
> javax.security.sasl.SaslException: GSS initiate failed [Caused by 
> GSSException: No valid credentials provided (Mechanism level: Failed to find 
> any Kerberos tgt)]; Host Details : local host is: 
> "host_redacted/IP_redacted"; destination host is: "com.host2.redacted:8032; , 
> while invoking ApplicationClientProtocolPBClientImpl.getApplications over 
> rm36 after 5 failover attempts. Trying to failover after sleeping for 2776ms.
> 17/10/25 06:37:49 

[jira] [Updated] (HADOOP-14982) Clients using FailoverOnNetworkExceptionRetry can go into a loop if they're used without authenticating with Kerberos in HA env

2017-10-31 Thread Peter Bacsko (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Bacsko updated HADOOP-14982:
--
Attachment: HADOOP-14892-002.patch

> Clients using FailoverOnNetworkExceptionRetry can go into a loop if they're 
> used without authenticating with kerberos in HA env
> ---
>
> Key: HADOOP-14982
> URL: https://issues.apache.org/jira/browse/HADOOP-14982
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: common
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
> Attachments: HADOOP-14892-001.patch, HADOOP-14892-002.patch
>
>
> If HA is configured for the Resource Manager in a secure environment, the
> mapred client goes into a loop if the user is not authenticated with
> Kerberos.

[jira] [Commented] (HADOOP-14982) Clients using FailoverOnNetworkExceptionRetry can go into a loop if they're used without authenticating with kerberos in HA env

2017-10-27 Thread Peter Bacsko (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1657#comment-1657
 ] 

Peter Bacsko commented on HADOOP-14982:
---

[~daryn] how do you get 1011 lines of output? I set the logging level to DEBUG, 
and even then it's only 215 lines (with Hadoop 3).

> Clients using FailoverOnNetworkExceptionRetry can go into a loop if they're 
> used without authenticating with kerberos in HA env
> ---
>
> Key: HADOOP-14982
> URL: https://issues.apache.org/jira/browse/HADOOP-14982
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: common
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
> Attachments: HADOOP-14892-001.patch
>
>
> If HA is configured for the Resource Manager in a secure environment, the
> mapred client goes into a loop if the user is not authenticated with
> Kerberos.

[jira] [Commented] (HADOOP-14982) Clients using FailoverOnNetworkExceptionRetry can go into a loop if they're used without authenticating with kerberos in HA env

2017-10-26 Thread Peter Bacsko (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16220762#comment-16220762
 ] 

Peter Bacsko commented on HADOOP-14982:
---

Thanks [~daryn], will modify the patch accordingly.
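
A plausible shape for such a fix is to treat SASL/Kerberos authentication failures as non-retriable, since no amount of retrying or failing over can supply missing credentials. A minimal self-contained sketch of that idea; this is a hypothetical simplification, not the actual patch, and all names here are invented for illustration:

{code:java}
import javax.security.sasl.SaslException;

public class FailFastOnSaslError {

    /** Returns true if any link in the cause chain is a SaslException. */
    static boolean isSaslFailure(Throwable t) {
        for (Throwable cur = t; cur != null; cur = cur.getCause()) {
            if (cur instanceof SaslException) {
                return true;
            }
        }
        return false;
    }

    /** A call against the currently active ResourceManager (hypothetical). */
    interface RmCall {
        void invoke() throws Exception;
    }

    /**
     * Retries with failover on other errors, but gives up immediately on a
     * SASL failure anywhere in the cause chain.
     */
    static void invokeWithFailover(RmCall call, int maxFailovers) throws Exception {
        for (int failovers = 0; ; failovers++) {
            try {
                call.invoke();
                return;
            } catch (Exception e) {
                if (isSaslFailure(e) || failovers >= maxFailovers) {
                    throw e;
                }
                // Illustrative backoff; the real policy derives this from config.
                Thread.sleep(100L * (failovers + 1));
            }
        }
    }
}
{code}

The key detail is walking the cause chain: as the log shows, the SaslException arrives wrapped inside java.io.IOException, so a plain instanceof check on the top-level exception would miss it.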

> Clients using FailoverOnNetworkExceptionRetry can go into a loop if they're 
> used without authenticating with kerberos in HA env
> ---
>
> Key: HADOOP-14982
> URL: https://issues.apache.org/jira/browse/HADOOP-14982
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: common
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
> Attachments: HADOOP-14892-001.patch
>
>
> If HA is configured for the Resource Manager in a secure environment, the
> mapred client goes into a loop if the user is not authenticated with
> Kerberos.

[jira] [Updated] (HADOOP-14982) Clients using FailoverOnNetworkExceptionRetry can go into a loop if they're used without authenticating with kerberos in HA env

2017-10-26 Thread Peter Bacsko (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Bacsko updated HADOOP-14982:
--
Status: Patch Available  (was: Open)

> Clients using FailoverOnNetworkExceptionRetry can go into a loop if they're 
> used without authenticating with kerberos in HA env
> ---
>
> Key: HADOOP-14982
> URL: https://issues.apache.org/jira/browse/HADOOP-14982
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: common
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
> Attachments: HADOOP-14892-001.patch
>
>
> If HA is configured for the Resource Manager in a secure environment, the
> mapred client goes into a loop if the user is not authenticated with
> Kerberos.

[jira] [Updated] (HADOOP-14982) Clients using FailoverOnNetworkExceptionRetry can go into a loop if they're used without authenticating with kerberos in HA env

2017-10-26 Thread Peter Bacsko (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Bacsko updated HADOOP-14982:
--
Attachment: HADOOP-14892-001.patch

> Clients using FailoverOnNetworkExceptionRetry can go into a loop if they're 
> used without authenticating with kerberos in HA env
> ---
>
> Key: HADOOP-14982
> URL: https://issues.apache.org/jira/browse/HADOOP-14982
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: common
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
> Attachments: HADOOP-14892-001.patch
>
>
> If HA is configured for the Resource Manager in a secure environment, the
> mapred client goes into a loop if the user is not authenticated with
> Kerberos.

[jira] [Created] (HADOOP-14982) Clients using FailoverOnNetworkExceptionRetry can go into a loop if they're used without authenticating with kerberos in HA env

2017-10-25 Thread Peter Bacsko (JIRA)
Peter Bacsko created HADOOP-14982:
-

 Summary: Clients using FailoverOnNetworkExceptionRetry can go into 
a loop if they're used without authenticating with kerberos in HA env
 Key: HADOOP-14982
 URL: https://issues.apache.org/jira/browse/HADOOP-14982
 Project: Hadoop Common
  Issue Type: Bug
  Components: common
Reporter: Peter Bacsko
Assignee: Peter Bacsko


If HA is configured for the Resource Manager in a secure environment, the mapred 
client goes into a loop if the user is not authenticated with Kerberos.

{noformat}
[root@pb6sec-1 ~]# mapred job -list
17/10/25 06:37:43 INFO client.ConfiguredRMFailoverProxyProvider: Failing over 
to rm36
17/10/25 06:37:43 WARN ipc.Client: Exception encountered while connecting to 
the server : javax.security.sasl.SaslException: GSS initiate failed [Caused by 
GSSException: No valid credentials provided (Mechanism level: Failed to find 
any Kerberos tgt)]
17/10/25 06:37:43 INFO retry.RetryInvocationHandler: java.io.IOException: 
Failed on local exception: java.io.IOException: 
javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: 
No valid credentials provided (Mechanism level: Failed to find any Kerberos 
tgt)]; Host Details : local host is: "host_redacted/IP_redacted"; destination 
host is: "com.host2.redacted:8032; , while invoking 
ApplicationClientProtocolPBClientImpl.getApplications over rm36 after 1 
failover attempts. Trying to failover after sleeping for 160ms.
17/10/25 06:37:43 INFO client.ConfiguredRMFailoverProxyProvider: Failing over 
to rm25
17/10/25 06:37:43 INFO retry.RetryInvocationHandler: java.net.ConnectException: 
Call From host_redacted/IP_redacted to com.host.redacted:8032 failed on 
connection exception: java.net.ConnectException: Connection refused; For more 
details see:  http://wiki.apache.org/hadoop/ConnectionRefused, while invoking 
ApplicationClientProtocolPBClientImpl.getApplications over rm25 after 2 
failover attempts. Trying to failover after sleeping for 582ms.
17/10/25 06:37:44 INFO client.ConfiguredRMFailoverProxyProvider: Failing over 
to rm36
17/10/25 06:37:44 WARN ipc.Client: Exception encountered while connecting to 
the server : javax.security.sasl.SaslException: GSS initiate failed [Caused by 
GSSException: No valid credentials provided (Mechanism level: Failed to find 
any Kerberos tgt)]
17/10/25 06:37:44 INFO retry.RetryInvocationHandler: java.io.IOException: 
Failed on local exception: java.io.IOException: 
javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: 
No valid credentials provided (Mechanism level: Failed to find any Kerberos 
tgt)]; Host Details : local host is: "host_redacted/IP_redacted"; destination 
host is: "com.host2.redacted:8032; , while invoking 
ApplicationClientProtocolPBClientImpl.getApplications over rm36 after 3 
failover attempts. Trying to failover after sleeping for 977ms.
17/10/25 06:37:45 INFO client.ConfiguredRMFailoverProxyProvider: Failing over 
to rm25
17/10/25 06:37:45 INFO retry.RetryInvocationHandler: java.net.ConnectException: 
Call From host_redacted/IP_redacted to com.host.redacted:8032 failed on 
connection exception: java.net.ConnectException: Connection refused; For more 
details see:  http://wiki.apache.org/hadoop/ConnectionRefused, while invoking 
ApplicationClientProtocolPBClientImpl.getApplications over rm25 after 4 
failover attempts. Trying to failover after sleeping for 1667ms.
17/10/25 06:37:46 INFO client.ConfiguredRMFailoverProxyProvider: Failing over 
to rm36
17/10/25 06:37:46 WARN ipc.Client: Exception encountered while connecting to 
the server : javax.security.sasl.SaslException: GSS initiate failed [Caused by 
GSSException: No valid credentials provided (Mechanism level: Failed to find 
any Kerberos tgt)]
17/10/25 06:37:46 INFO retry.RetryInvocationHandler: java.io.IOException: 
Failed on local exception: java.io.IOException: 
javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: 
No valid credentials provided (Mechanism level: Failed to find any Kerberos 
tgt)]; Host Details : local host is: "host_redacted/IP_redacted"; destination 
host is: "com.host2.redacted:8032; , while invoking 
ApplicationClientProtocolPBClientImpl.getApplications over rm36 after 5 
failover attempts. Trying to failover after sleeping for 2776ms.
17/10/25 06:37:49 INFO client.ConfiguredRMFailoverProxyProvider: Failing over 
to rm25
17/10/25 06:37:49 INFO retry.RetryInvocationHandler: java.net.ConnectException: 
Call From host_redacted/IP_redacted to com.host.redacted:8032 failed on 
connection exception: java.net.ConnectException: Connection refused; For more 
details see:  http://wiki.apache.org/hadoop/ConnectionRefused, while invoking 
ApplicationClientProtocolPBClientImpl.getApplications over rm25 after 6 
failover attempts. Trying to failover after sleeping for 1055ms.