[jira] [Resolved] (HADOOP-15360) Log some more helpful information when catch RuntimeException or Error in IPC.Server

2018-04-03 Thread He Xiaoqiao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-15360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

He Xiaoqiao resolved HADOOP-15360.
--
Resolution: Not A Problem

Closing the issue since it is not a problem.

> Log some more helpful information when catch RuntimeException or Error in 
> IPC.Server 
> -
>
> Key: HADOOP-15360
> URL: https://issues.apache.org/jira/browse/HADOOP-15360
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: ipc
>Reporter: He Xiaoqiao
>Priority: Major
>
> IPC.Server#logException does not print the exception stack trace when it 
> catches a RuntimeException or Error, for instance:
> {code:java}
> 2018-03-28 21:52:25,385 WARN org.apache.hadoop.ipc.Server: IPC Server handler 
> 17 on 8020, call org.apache.hadoop.hdfs.protocol.ClientProtocol.getFileInfo 
> from *.*.*.*:59326 Call#46 Retry#0 java.lang.ArrayIndexOutOfBoundsException: 0
> {code}
> This log message is not helpful for debugging. I think it is necessary to 
> print a more helpful message, or the full stack trace, when the exception is 
> a RuntimeException or Error.
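A minimal sketch of the behavior proposed above: attach the full stack trace when the caught Throwable is a RuntimeException or Error, and keep a one-line summary otherwise. This is a simplification for illustration, not the actual IPC.Server#logException code.

```java
// Sketch only: full stack trace for unexpected RuntimeException/Error,
// one-line summary for expected (checked) exceptions.
import java.io.PrintWriter;
import java.io.StringWriter;

public class LogExceptionSketch {

    /** Builds the log message for a caught Throwable. */
    static String format(String callInfo, Throwable t) {
        if (t instanceof RuntimeException || t instanceof Error) {
            // Unexpected failure: include the full stack trace for debugging.
            StringWriter sw = new StringWriter();
            t.printStackTrace(new PrintWriter(sw));
            return callInfo + "\n" + sw;
        }
        // Expected exceptions stay a one-line summary, as today.
        return callInfo + " " + t;
    }
}
```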



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Created] (HADOOP-15360) Log some more helpful information when catch RuntimeException or Error in IPC.Server

2018-04-03 Thread He Xiaoqiao (JIRA)
He Xiaoqiao created HADOOP-15360:


 Summary: Log some more helpful information when catch 
RuntimeException or Error in IPC.Server 
 Key: HADOOP-15360
 URL: https://issues.apache.org/jira/browse/HADOOP-15360
 Project: Hadoop Common
  Issue Type: Improvement
  Components: ipc
Reporter: He Xiaoqiao


IPC.Server#logException does not print the exception stack trace when it 
catches a RuntimeException or Error, for instance:
{code:java}
2018-03-28 21:52:25,385 WARN org.apache.hadoop.ipc.Server: IPC Server handler 
17 on 8020, call org.apache.hadoop.hdfs.protocol.ClientProtocol.getFileInfo 
from *.*.*.*:59326 Call#46 Retry#0 java.lang.ArrayIndexOutOfBoundsException: 0
{code}
This log message is not helpful for debugging. I think it is necessary to print 
a more helpful message, or the full stack trace, when the exception is a 
RuntimeException or Error.






[jira] [Commented] (HADOOP-14999) AliyunOSS: provide one asynchronous multi-part based uploading mechanism

2018-04-03 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16424863#comment-16424863
 ] 

genericqa commented on HADOOP-14999:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} docker {color} | {color:red} 18m 
36s{color} | {color:red} Docker failed to build yetus/hadoop:dbd69cb. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | HADOOP-14999 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12917477/HADOOP-14999-branch-2.001.patch
 |
| Console output | 
https://builds.apache.org/job/PreCommit-HADOOP-Build/14430/console |
| Powered by | Apache Yetus 0.8.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> AliyunOSS: provide one asynchronous multi-part based uploading mechanism
> 
>
> Key: HADOOP-14999
> URL: https://issues.apache.org/jira/browse/HADOOP-14999
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/oss
>Affects Versions: 3.0.0-beta1
>Reporter: Genmao Yu
>Assignee: Genmao Yu
>Priority: Major
> Attachments: HADOOP-14999-branch-2.001.patch, HADOOP-14999.001.patch, 
> HADOOP-14999.002.patch, HADOOP-14999.003.patch, HADOOP-14999.004.patch, 
> HADOOP-14999.005.patch, HADOOP-14999.006.patch, HADOOP-14999.007.patch, 
> HADOOP-14999.008.patch, HADOOP-14999.009.patch, HADOOP-14999.010.patch, 
> HADOOP-14999.011.patch, asynchronous_file_uploading.pdf, 
> diff-between-patch7-and-patch8.txt
>
>
> This mechanism is designed for uploading files in parallel and asynchronously:
>  - improve the performance of uploading files to the OSS server. First, this 
> mechanism splits the result into multiple small blocks and uploads them in 
> parallel. Then, producing the result and uploading blocks happen 
> asynchronously.
>  - avoid buffering too large a result on local disk. To cite an extreme 
> example, a task may output 100GB or even more; we may need to write this 
> 100GB to local disk and then upload it. Sometimes this is inefficient and 
> limited by disk space.
> This patch reuses {{SemaphoredDelegatingExecutor}} as the executor service and 
> depends on HADOOP-15039.
> The attached {{asynchronous_file_uploading.pdf}} illustrates the difference 
> between the previous {{AliyunOSSOutputStream}} and 
> {{AliyunOSSBlockOutputStream}}, i.e. this asynchronous multi-part based 
> uploading mechanism.
> 1. {{AliyunOSSOutputStream}}: we need to write the whole result to local 
> disk before we can upload it to OSS. This poses two problems:
>  - if the output file is too large, it will run out of local disk space.
>  - if the output file is too large, the task will wait a long time to upload 
> the result to OSS before finishing, wasting compute resources.
> 2. {{AliyunOSSBlockOutputStream}}: we cut the task output into small blocks, 
> i.e. some small local files, and each block is packaged into an upload task. 
> These tasks are submitted to a {{SemaphoredDelegatingExecutor}}, which 
> uploads the blocks in parallel and improves performance greatly.
> 3. Each task will retry 3 times to upload its block to Aliyun OSS. If one of 
> those tasks fails, the whole file upload fails and we abort the current 
> upload.
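The scheme in points 2 and 3 can be sketched roughly as follows. All class and method names here are hypothetical, and a plain fixed thread pool stands in for Hadoop's {{SemaphoredDelegatingExecutor}}; this is an illustration of the idea, not the hadoop-aliyun API.

```java
// Hypothetical sketch of block-based asynchronous upload: output is cut
// into blocks, each block becomes a task on a bounded executor, and each
// task retries up to 3 times before failing the whole upload.
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class BlockUploadSketch {
    static final int MAX_RETRIES = 3;

    interface PartUploader {
        void upload(int partNum, byte[] block) throws Exception;
    }

    /** Retries a single block upload up to MAX_RETRIES times. */
    static void uploadWithRetry(PartUploader u, int partNum, byte[] block)
            throws Exception {
        for (int attempt = 1; ; attempt++) {
            try {
                u.upload(partNum, block);
                return;
            } catch (Exception e) {
                if (attempt == MAX_RETRIES) {
                    throw e;  // give up: this fails the whole file upload
                }
            }
        }
    }

    /** Submits one task per block; the bounded pool caps concurrency. */
    static void uploadAll(PartUploader u, List<byte[]> blocks)
            throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Future<?>> futures = new ArrayList<>();
            int part = 1;
            for (byte[] b : blocks) {
                final int p = part++;
                futures.add(pool.submit(() -> {
                    uploadWithRetry(u, p, b);
                    return null;
                }));
            }
            for (Future<?> f : futures) {
                f.get();  // propagate the first task failure, if any
            }
        } finally {
            pool.shutdown();
        }
    }
}
```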






[jira] [Updated] (HADOOP-14999) AliyunOSS: provide one asynchronous multi-part based uploading mechanism

2018-04-03 Thread Genmao Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Genmao Yu updated HADOOP-14999:
---
Attachment: HADOOP-14999-branch-2.001.patch







[jira] [Updated] (HADOOP-14999) AliyunOSS: provide one asynchronous multi-part based uploading mechanism

2018-04-03 Thread Genmao Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Genmao Yu updated HADOOP-14999:
---
Attachment: (was: HADOOP-14999-branch-2.001.patch)







[jira] [Commented] (HADOOP-7050) proxyuser host/group config properties don't work if user name as DOT in it

2018-04-03 Thread Quanlong Huang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-7050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16424792#comment-16424792
 ] 

Quanlong Huang commented on HADOOP-7050:


[~k4kaliazz], I have a similar problem to yours. I failed to launch HiveServer2 
on Linux since my username also contains a dot (.). Finally, I disabled 
impersonation by setting hive.server2.enable.doAs to false in 
hive-site.xml.
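The workaround described above corresponds to a hive-site.xml entry like this (illustrative fragment):

```xml
<!-- hive-site.xml: disable impersonation so HiveServer2 runs queries as
     the server user instead of the (dot containing) end user. -->
<property>
  <name>hive.server2.enable.doAs</name>
  <value>false</value>
</property>
```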

> proxyuser host/group config properties don't work if user name as DOT in it
> ---
>
> Key: HADOOP-7050
> URL: https://issues.apache.org/jira/browse/HADOOP-7050
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: security
>Reporter: Alejandro Abdelnur
>Priority: Major
>
> If the user name contains a DOT, e.g. "foo.bar", the proxy user 
> configuration fails to be read properly and does not kick in.
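The failure mode follows from the shape of the configuration keys: proxyuser settings embed the user name in the property name itself, so a dot inside the user name is indistinguishable from the key's own separators. A hypothetical core-site.xml fragment (host and group values are placeholders):

```xml
<!-- core-site.xml: proxyuser settings are keyed by user name, in the
     form hadoop.proxyuser.USER.hosts / hadoop.proxyuser.USER.groups.
     With a user named "foo.bar", the parser cannot tell where the user
     name ends, so these settings never kick in. Illustrative values. -->
<property>
  <name>hadoop.proxyuser.foo.bar.hosts</name>
  <value>host1</value>
</property>
<property>
  <name>hadoop.proxyuser.foo.bar.groups</name>
  <value>group1</value>
</property>
```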






[jira] [Commented] (HADOOP-15359) IPC client hang in kerberized cluster due to JDK deadlock

2018-04-03 Thread Xiao Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16424727#comment-16424727
 ] 

Xiao Chen commented on HADOOP-15359:


bq. JDK-7092821
Yes, I wasn't accurate; I've updated the description. That jira is the closest 
I can find. But if that's taken care of (i.e. either the synchronized method 
isn't called any more, or the lock is removed), then there is no deadlock in 
this case anyway.

One of its linked jiras also mentioned " In order to alleviate this problem 
applications should cache the result of the Cipher.getInstance() call per 
thread and reinitialise (Cipher.init(...)) the cached copy instead of calling 
Cipher.getInstance() again. " but the caller here would be krb5...
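The per-thread caching suggested in that quote can be sketched as follows. This is illustrative only: the transformation string is an arbitrary example, and, as noted above, the contended caller in this jira is inside the JDK's krb5 code, where an application-side cache does not help.

```java
// Sketch of the mitigation suggested in the linked JDK issue: cache the
// result of Cipher.getInstance() per thread and re-init() the cached
// copy, instead of contending on the provider lock on every call.
import javax.crypto.Cipher;

public class CachedCipher {
    private static final ThreadLocal<Cipher> CIPHER =
        ThreadLocal.withInitial(() -> {
            try {
                // Example transformation; chosen for illustration only.
                return Cipher.getInstance("AES/CBC/PKCS5Padding");
            } catch (Exception e) {
                throw new IllegalStateException(e);
            }
        });

    /** Returns this thread's cached Cipher; callers re-init() it per use. */
    public static Cipher get() {
        return CIPHER.get();
    }
}
```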

> IPC client hang in kerberized cluster due to JDK deadlock
> -
>
> Key: HADOOP-15359
> URL: https://issues.apache.org/jira/browse/HADOOP-15359
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: ipc
>Affects Versions: 2.6.0, 2.8.0, 3.0.0
>Reporter: Xiao Chen
>Priority: Major
> Attachments: 1.jstack, 2.jstack
>
>
> In recent internal testing, we found a DFS client hang. Further 
> inspection of the jstack shows the following:
> {noformat}
> "IPC Client (552936351) connection toHOSTNAME:8020 from PRINCIPAL" #7468 
> daemon prio=5 os_prio=0 tid=0x7f6bb306c000 nid=0x1c76e waiting for 
> monitor entry [0x7f6bc2bd6000]
>java.lang.Thread.State: BLOCKED (on object monitor)
> at java.security.Provider.getService(Provider.java:1035)
> - waiting to lock <0x80277040> (a sun.security.provider.Sun)
> at 
> sun.security.jca.ProviderList$ServiceList.tryGet(ProviderList.java:444)
> at 
> sun.security.jca.ProviderList$ServiceList.access$200(ProviderList.java:376)
> at 
> sun.security.jca.ProviderList$ServiceList$1.hasNext(ProviderList.java:486)
> at javax.crypto.Cipher.getInstance(Cipher.java:513)
> at 
> sun.security.krb5.internal.crypto.dk.Des3DkCrypto.getCipher(Des3DkCrypto.java:202)
> at sun.security.krb5.internal.crypto.dk.DkCrypto.dr(DkCrypto.java:484)
> at sun.security.krb5.internal.crypto.dk.DkCrypto.dk(DkCrypto.java:447)
> at 
> sun.security.krb5.internal.crypto.dk.DkCrypto.calculateChecksum(DkCrypto.java:413)
> at 
> sun.security.krb5.internal.crypto.Des3.calculateChecksum(Des3.java:59)
> at 
> sun.security.jgss.krb5.CipherHelper.calculateChecksum(CipherHelper.java:231)
> at 
> sun.security.jgss.krb5.MessageToken.getChecksum(MessageToken.java:466)
> at 
> sun.security.jgss.krb5.MessageToken.verifySignAndSeqNumber(MessageToken.java:374)
> at 
> sun.security.jgss.krb5.WrapToken.getDataFromBuffer(WrapToken.java:284)
> at sun.security.jgss.krb5.WrapToken.getData(WrapToken.java:209)
> at sun.security.jgss.krb5.WrapToken.getData(WrapToken.java:182)
> at sun.security.jgss.krb5.Krb5Context.unwrap(Krb5Context.java:1053)
> at sun.security.jgss.GSSContextImpl.unwrap(GSSContextImpl.java:403)
> at 
> com.sun.security.sasl.gsskerb.GssKrb5Base.unwrap(GssKrb5Base.java:77)
> at 
> org.apache.hadoop.security.SaslRpcClient$WrappedInputStream.readNextRpcPacket(SaslRpcClient.java:617)
> at 
> org.apache.hadoop.security.SaslRpcClient$WrappedInputStream.read(SaslRpcClient.java:583)
> - locked <0x83444878> (a java.nio.HeapByteBuffer)
> at java.io.FilterInputStream.read(FilterInputStream.java:133)
> at 
> org.apache.hadoop.ipc.Client$Connection$PingInputStream.read(Client.java:553)
> at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
> at java.io.BufferedInputStream.read(BufferedInputStream.java:265)
> - locked <0x834448c0> (a java.io.BufferedInputStream)
> at java.io.DataInputStream.readInt(DataInputStream.java:387)
> at 
> org.apache.hadoop.ipc.Client$Connection.receiveRpcResponse(Client.java:1113)
> at org.apache.hadoop.ipc.Client$Connection.run(Client.java:1006)
> {noformat}
> and at the end of jstack:
> {noformat}
> Found one Java-level deadlock:
> =
> "IPC Parameter Sending Thread #29":
>   waiting to lock monitor 0x17ff49f8 (object 0x80277040, a 
> sun.security.provider.Sun),
>   which is held by UNKNOWN_owner_addr=0x50607000
> Java stack information for the threads listed above:
> ===
> "IPC Parameter Sending Thread #29":
> at java.security.Provider.getService(Provider.java:1035)
> - waiting to lock <0x80277040> (a sun.security.provider.Sun)
> at 
> sun.security.jca.ProviderList$ServiceList.tryGet(ProviderList.java:437)
> at 
> 

[jira] [Updated] (HADOOP-15359) IPC client hang in kerberized cluster due to JDK deadlock

2018-04-03 Thread Xiao Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-15359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Chen updated HADOOP-15359:
---
Priority: Major  (was: Critical)


[jira] [Updated] (HADOOP-15359) IPC client hang in kerberized cluster due to JDK deadlock

2018-04-03 Thread Xiao Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-15359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Chen updated HADOOP-15359:
---
Description: 
In recent internal testing, we found a DFS client hang. Further 
inspection of the jstack shows the following:

{noformat}
"IPC Client (552936351) connection toHOSTNAME:8020 from PRINCIPAL" #7468 daemon 
prio=5 os_prio=0 tid=0x7f6bb306c000 nid=0x1c76e waiting for monitor entry 
[0x7f6bc2bd6000]
   java.lang.Thread.State: BLOCKED (on object monitor)
at java.security.Provider.getService(Provider.java:1035)
- waiting to lock <0x80277040> (a sun.security.provider.Sun)
at 
sun.security.jca.ProviderList$ServiceList.tryGet(ProviderList.java:444)
at 
sun.security.jca.ProviderList$ServiceList.access$200(ProviderList.java:376)
at 
sun.security.jca.ProviderList$ServiceList$1.hasNext(ProviderList.java:486)
at javax.crypto.Cipher.getInstance(Cipher.java:513)
at 
sun.security.krb5.internal.crypto.dk.Des3DkCrypto.getCipher(Des3DkCrypto.java:202)
at sun.security.krb5.internal.crypto.dk.DkCrypto.dr(DkCrypto.java:484)
at sun.security.krb5.internal.crypto.dk.DkCrypto.dk(DkCrypto.java:447)
at 
sun.security.krb5.internal.crypto.dk.DkCrypto.calculateChecksum(DkCrypto.java:413)
at 
sun.security.krb5.internal.crypto.Des3.calculateChecksum(Des3.java:59)
at 
sun.security.jgss.krb5.CipherHelper.calculateChecksum(CipherHelper.java:231)
at 
sun.security.jgss.krb5.MessageToken.getChecksum(MessageToken.java:466)
at 
sun.security.jgss.krb5.MessageToken.verifySignAndSeqNumber(MessageToken.java:374)
at 
sun.security.jgss.krb5.WrapToken.getDataFromBuffer(WrapToken.java:284)
at sun.security.jgss.krb5.WrapToken.getData(WrapToken.java:209)
at sun.security.jgss.krb5.WrapToken.getData(WrapToken.java:182)
at sun.security.jgss.krb5.Krb5Context.unwrap(Krb5Context.java:1053)
at sun.security.jgss.GSSContextImpl.unwrap(GSSContextImpl.java:403)
at com.sun.security.sasl.gsskerb.GssKrb5Base.unwrap(GssKrb5Base.java:77)
at 
org.apache.hadoop.security.SaslRpcClient$WrappedInputStream.readNextRpcPacket(SaslRpcClient.java:617)
at 
org.apache.hadoop.security.SaslRpcClient$WrappedInputStream.read(SaslRpcClient.java:583)
- locked <0x83444878> (a java.nio.HeapByteBuffer)
at java.io.FilterInputStream.read(FilterInputStream.java:133)
at 
org.apache.hadoop.ipc.Client$Connection$PingInputStream.read(Client.java:553)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
at java.io.BufferedInputStream.read(BufferedInputStream.java:265)
- locked <0x834448c0> (a java.io.BufferedInputStream)
at java.io.DataInputStream.readInt(DataInputStream.java:387)
at 
org.apache.hadoop.ipc.Client$Connection.receiveRpcResponse(Client.java:1113)
at org.apache.hadoop.ipc.Client$Connection.run(Client.java:1006)
{noformat}

and at the end of jstack:
{noformat}
Found one Java-level deadlock:
=
"IPC Parameter Sending Thread #29":
  waiting to lock monitor 0x17ff49f8 (object 0x80277040, a 
sun.security.provider.Sun),
  which is held by UNKNOWN_owner_addr=0x50607000

Java stack information for the threads listed above:
===
"IPC Parameter Sending Thread #29":
at java.security.Provider.getService(Provider.java:1035)
- waiting to lock <0x80277040> (a sun.security.provider.Sun)
at 
sun.security.jca.ProviderList$ServiceList.tryGet(ProviderList.java:437)
at 
sun.security.jca.ProviderList$ServiceList.access$200(ProviderList.java:376)
at 
sun.security.jca.ProviderList$ServiceList$1.hasNext(ProviderList.java:486)
at javax.crypto.SecretKeyFactory.nextSpi(SecretKeyFactory.java:293)
- locked <0x834386b8> (a java.lang.Object)
at javax.crypto.SecretKeyFactory.(SecretKeyFactory.java:121)
at javax.crypto.SecretKeyFactory.getInstance(SecretKeyFactory.java:160)
at 
sun.security.krb5.internal.crypto.dk.Des3DkCrypto.getCipher(Des3DkCrypto.java:187)
at sun.security.krb5.internal.crypto.dk.DkCrypto.dr(DkCrypto.java:484)
at sun.security.krb5.internal.crypto.dk.DkCrypto.dk(DkCrypto.java:447)
at 
sun.security.krb5.internal.crypto.dk.DkCrypto.calculateChecksum(DkCrypto.java:413)
at 
sun.security.krb5.internal.crypto.Des3.calculateChecksum(Des3.java:59)
at 
sun.security.jgss.krb5.CipherHelper.calculateChecksum(CipherHelper.java:231)
at 
sun.security.jgss.krb5.MessageToken.getChecksum(MessageToken.java:466)
at 
sun.security.jgss.krb5.MessageToken.genSignAndSeqNumber(MessageToken.java:315)
at sun.security.jgss.krb5.WrapToken.(WrapToken.java:422)
at 

[jira] [Commented] (HADOOP-15359) IPC client hang in kerberized cluster due to JDK deadlock

2018-04-03 Thread Wei-Chiu Chuang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16424683#comment-16424683
 ] 

Wei-Chiu Chuang commented on HADOOP-15359:
--

JDK-7092821 mentions it as a scalability bottleneck rather than a deadlock. Not 
sure how the JDK determines a deadlock, though.
 HADOOP-13836 (Securing Hadoop RPC using SSL) should help with this in the long 
run, since it would eventually no longer depend on JDK SASL. It improves RPC 
performance as well.


[jira] [Updated] (HADOOP-15359) IPC client hang in kerberized cluster due to JDK deadlock

2018-04-03 Thread Xiao Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-15359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Chen updated HADOOP-15359:
---
Attachment: 2.jstack
1.jstack

> IPC client hang in kerberized cluster due to JDK deadlock
> -
>
> Key: HADOOP-15359
> URL: https://issues.apache.org/jira/browse/HADOOP-15359
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: ipc
>Affects Versions: 2.6.0, 2.8.0, 3.0.0
>Reporter: Xiao Chen
>Priority: Critical
> Attachments: 1.jstack, 2.jstack
>
>
> In recent internal testing, we found a DFS client hang. Inspecting the 
> jstack output shows the following:
> {noformat}
> "IPC Client (552936351) connection to HOSTNAME:8020 from PRINCIPAL" #7468 
> daemon prio=5 os_prio=0 tid=0x7f6bb306c000 nid=0x1c76e waiting for 
> monitor entry [0x7f6bc2bd6000]
>java.lang.Thread.State: BLOCKED (on object monitor)
> at java.security.Provider.getService(Provider.java:1035)
> - waiting to lock <0x80277040> (a sun.security.provider.Sun)
> at 
> sun.security.jca.ProviderList$ServiceList.tryGet(ProviderList.java:444)
> at 
> sun.security.jca.ProviderList$ServiceList.access$200(ProviderList.java:376)
> at 
> sun.security.jca.ProviderList$ServiceList$1.hasNext(ProviderList.java:486)
> at javax.crypto.Cipher.getInstance(Cipher.java:513)
> at 
> sun.security.krb5.internal.crypto.dk.Des3DkCrypto.getCipher(Des3DkCrypto.java:202)
> at sun.security.krb5.internal.crypto.dk.DkCrypto.dr(DkCrypto.java:484)
> at sun.security.krb5.internal.crypto.dk.DkCrypto.dk(DkCrypto.java:447)
> at 
> sun.security.krb5.internal.crypto.dk.DkCrypto.calculateChecksum(DkCrypto.java:413)
> at 
> sun.security.krb5.internal.crypto.Des3.calculateChecksum(Des3.java:59)
> at 
> sun.security.jgss.krb5.CipherHelper.calculateChecksum(CipherHelper.java:231)
> at 
> sun.security.jgss.krb5.MessageToken.getChecksum(MessageToken.java:466)
> at 
> sun.security.jgss.krb5.MessageToken.verifySignAndSeqNumber(MessageToken.java:374)
> at 
> sun.security.jgss.krb5.WrapToken.getDataFromBuffer(WrapToken.java:284)
> at sun.security.jgss.krb5.WrapToken.getData(WrapToken.java:209)
> at sun.security.jgss.krb5.WrapToken.getData(WrapToken.java:182)
> at sun.security.jgss.krb5.Krb5Context.unwrap(Krb5Context.java:1053)
> at sun.security.jgss.GSSContextImpl.unwrap(GSSContextImpl.java:403)
> at 
> com.sun.security.sasl.gsskerb.GssKrb5Base.unwrap(GssKrb5Base.java:77)
> at 
> org.apache.hadoop.security.SaslRpcClient$WrappedInputStream.readNextRpcPacket(SaslRpcClient.java:617)
> at 
> org.apache.hadoop.security.SaslRpcClient$WrappedInputStream.read(SaslRpcClient.java:583)
> - locked <0x83444878> (a java.nio.HeapByteBuffer)
> at java.io.FilterInputStream.read(FilterInputStream.java:133)
> at 
> org.apache.hadoop.ipc.Client$Connection$PingInputStream.read(Client.java:553)
> at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
> at java.io.BufferedInputStream.read(BufferedInputStream.java:265)
> - locked <0x834448c0> (a java.io.BufferedInputStream)
> at java.io.DataInputStream.readInt(DataInputStream.java:387)
> at 
> org.apache.hadoop.ipc.Client$Connection.receiveRpcResponse(Client.java:1113)
> at org.apache.hadoop.ipc.Client$Connection.run(Client.java:1006)
> {noformat}
> and at the end of jstack:
> {noformat}
> Found one Java-level deadlock:
> =
> "IPC Parameter Sending Thread #29":
>   waiting to lock monitor 0x17ff49f8 (object 0x80277040, a 
> sun.security.provider.Sun),
>   which is held by UNKNOWN_owner_addr=0x50607000
> Java stack information for the threads listed above:
> ===
> "IPC Parameter Sending Thread #29":
> at java.security.Provider.getService(Provider.java:1035)
> - waiting to lock <0x80277040> (a sun.security.provider.Sun)
> at 
> sun.security.jca.ProviderList$ServiceList.tryGet(ProviderList.java:437)
> at 
> sun.security.jca.ProviderList$ServiceList.access$200(ProviderList.java:376)
> at 
> sun.security.jca.ProviderList$ServiceList$1.hasNext(ProviderList.java:486)
> at javax.crypto.SecretKeyFactory.nextSpi(SecretKeyFactory.java:293)
> - locked <0x834386b8> (a java.lang.Object)
> at javax.crypto.SecretKeyFactory.<init>(SecretKeyFactory.java:121)
> at 
> javax.crypto.SecretKeyFactory.getInstance(SecretKeyFactory.java:160)
> at 
> sun.security.krb5.internal.crypto.dk.Des3DkCrypto.getCipher(Des3DkCrypto.java:187)
> at 
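
The two traces above form a classic lock-ordering cycle: each thread holds one of two monitors (the `sun.security.provider.Sun` instance and the `SecretKeyFactory` serialization lock) and waits for the other. As an illustration only, the sketch below models the deadlock-free discipline with plain stand-in objects: every thread acquires both monitors in one global order, so the lookup always completes. All names here are hypothetical, not JDK internals, and application code cannot actually impose such an order on JDK-internal locks; the trace is diagnostic.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

public class LockOrderDemo {
    // Hypothetical stand-ins for the two monitors seen in the jstack
    // (the sun.security.provider.Sun instance and the SecretKeyFactory
    // serialization lock). These are NOT the real JDK objects.
    private static final Object PROVIDER_LOCK = new Object();
    private static final Object FACTORY_LOCK = new Object();

    // Deadlock-free discipline: every thread takes both monitors in the
    // same global order. The reported hang arises when two threads take
    // them in opposite orders.
    static int lookupService() {
        synchronized (PROVIDER_LOCK) {
            synchronized (FACTORY_LOCK) {
                return 1; // placeholder for the actual service lookup
            }
        }
    }

    public static int runConcurrently(int threads) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        List<Future<Integer>> results = new ArrayList<>();
        for (int i = 0; i < threads; i++) {
            results.add(pool.submit(LockOrderDemo::lookupService));
        }
        int completed = 0;
        for (Future<Integer> f : results) {
            // With consistent ordering every task finishes; a cycle like
            // the one in the report would make this time out instead.
            completed += f.get(5, TimeUnit.SECONDS);
        }
        pool.shutdown();
        return completed;
    }
}
```

In the real report the ordering is inverted between the IPC reader thread and the parameter-sending thread, which is exactly the condition the JVM's deadlock detector prints at the end of the jstack.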

[jira] [Commented] (HADOOP-15359) IPC client hang in kerberized cluster due to JDK deadlock

2018-04-03 Thread Xiao Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16424643#comment-16424643
 ] 

Xiao Chen commented on HADOOP-15359:


Attached 2 sample jstacks.


[jira] [Updated] (HADOOP-15359) IPC client hang in kerberized cluster due to JDK deadlock

2018-04-03 Thread Xiao Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-15359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Chen updated HADOOP-15359:
---
Affects Version/s: 2.6.0


[jira] [Updated] (HADOOP-15359) IPC client hang in kerberized cluster due to JDK deadlock

2018-04-03 Thread Xiao Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-15359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Chen updated HADOOP-15359:
---
Description: 

[jira] [Updated] (HADOOP-15359) IPC client hang in kerberized cluster due to JDK deadlock

2018-04-03 Thread Xiao Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-15359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Chen updated HADOOP-15359:
---
Priority: Critical  (was: Major)


[jira] [Created] (HADOOP-15359) IPC client could run into JDK deadlock

2018-04-03 Thread Xiao Chen (JIRA)
Xiao Chen created HADOOP-15359:
--

 Summary: IPC client could run into JDK deadlock
 Key: HADOOP-15359
 URL: https://issues.apache.org/jira/browse/HADOOP-15359
 Project: Hadoop Common
  Issue Type: Bug
  Components: ipc
Affects Versions: 3.0.0, 2.8.0
Reporter: Xiao Chen



[jira] [Updated] (HADOOP-15359) IPC client hang in kerberized cluster due to JDK deadlock

2018-04-03 Thread Xiao Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-15359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Chen updated HADOOP-15359:
---
Summary: IPC client hang in kerberized cluster due to JDK deadlock  (was: 
IPC client could run into JDK deadlock)


[jira] [Commented] (HADOOP-15357) Configuration.getPropsWithPrefix no longer does variable substitution

2018-04-03 Thread Jim Brennan (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16424565#comment-16424565
 ] 

Jim Brennan commented on HADOOP-15357:
--

[~lmccay], [~asuresh], I believe this patch is ready for review.  cc: [~jlowe]

> Configuration.getPropsWithPrefix no longer does variable substitution
> -
>
> Key: HADOOP-15357
> URL: https://issues.apache.org/jira/browse/HADOOP-15357
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Major
> Attachments: HADOOP-15357.001.patch, HADOOP-15357.002.patch
>
>
> Before [HADOOP-13556], Configuration.getPropsWithPrefix() used the 
> Configuration.get() method to get the value of the variables.   After 
> [HADOOP-13556], it now uses props.getProperty().
> The difference is that Configuration.get() does deprecation handling and more 
> importantly variable substitution on the value.  So if a property has a 
> variable specified with ${variable_name}, it will no longer be expanded when 
> retrieved via getPropsWithPrefix().
> Was this change in behavior intentional?  I am using this function in the fix 
> for [MAPREDUCE-7069], but we do want variable expansion to happen.
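The behavior difference can be illustrated without Hadoop at all — a minimal sketch over plain java.util.Properties, where getWithSubstitution mimics Configuration.get()'s ${var} expansion and rawPropsWithPrefix mimics the post-HADOOP-13556 props.getProperty() lookup (property names and values are made up for illustration):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PrefixSubstDemo {
    private static final Pattern VAR = Pattern.compile("\\$\\{([^}]+)\\}");

    // Mimics Configuration.get(): raw lookup plus ${var} expansion.
    static String getWithSubstitution(Properties props, String name) {
        String raw = props.getProperty(name);
        if (raw == null) {
            return null;
        }
        Matcher m = VAR.matcher(raw);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            String sub = props.getProperty(m.group(1));
            // Leave unknown variables untouched, expand known ones.
            m.appendReplacement(sb,
                Matcher.quoteReplacement(sub == null ? m.group() : sub));
        }
        m.appendTail(sb);
        return sb.toString();
    }

    // Post-HADOOP-13556 style: raw values, no expansion.
    static Map<String, String> rawPropsWithPrefix(Properties props, String prefix) {
        Map<String, String> out = new HashMap<>();
        for (String name : props.stringPropertyNames()) {
            if (name.startsWith(prefix)) {
                out.put(name.substring(prefix.length()), props.getProperty(name));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("base.dir", "/data");
        props.setProperty("myapp.log.dir", "${base.dir}/logs");

        // Raw lookup leaves the variable unexpanded...
        System.out.println(rawPropsWithPrefix(props, "myapp.").get("log.dir"));
        // ...while a Configuration.get()-style lookup expands it.
        System.out.println(getWithSubstitution(props, "myapp.log.dir"));
    }
}
```

The sketch prints `${base.dir}/logs` for the raw lookup and `/data/logs` for the substituting one, which is exactly the gap callers of getPropsWithPrefix() now see.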



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15357) Configuration.getPropsWithPrefix no longer does variable substitution

2018-04-03 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16424553#comment-16424553
 ] 

genericqa commented on HADOOP-15357:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  9m 
36s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 25m 
24s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 29m 
39s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
53s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
13s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 24s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
32s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
56s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 26m 
19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 26m 
19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
53s{color} | {color:green} hadoop-common-project/hadoop-common: The patch 
generated 0 new + 243 unchanged - 1 fixed = 243 total (was 244) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
5s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m  7s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
56s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  8m  
7s{color} | {color:green} hadoop-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
36s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}131m 36s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8620d2b |
| JIRA Issue | HADOOP-15357 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12917415/HADOOP-15357.002.patch
 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 4777a3250d9f 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 
14:43:09 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 5a174f8 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_162 |
| findbugs | v3.1.0-RC1 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HADOOP-Build/14429/testReport/ |
| Max. process+thread count | 1512 (vs. ulimit of 1) |
| modules | C: hadoop-common-project/hadoop-common U: 
hadoop-common-project/hadoop-common |
| Console output | 
https://builds.apache.org/job/PreCommit-HADOOP-Build/14429/console |
| Powered by | Apache Yetus 0.8.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.




[jira] [Commented] (HADOOP-14987) Improve KMSClientProvider log around delegation token checking

2018-04-03 Thread Xiao Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16424445#comment-16424445
 ] 

Xiao Chen commented on HADOOP-14987:


The conflicts were trivial so I resolved them on the fly. Can you get the diff 
from git history?

> Improve KMSClientProvider log around delegation token checking
> --
>
> Key: HADOOP-14987
> URL: https://issues.apache.org/jira/browse/HADOOP-14987
> Project: Hadoop Common
>  Issue Type: Improvement
>Affects Versions: 2.7.3
>Reporter: Xiaoyu Yao
>Assignee: Xiaoyu Yao
>Priority: Major
> Fix For: 3.0.0, 2.10.0, 2.8.4, 2.9.2
>
> Attachments: HADOOP-14987.001.patch, HADOOP-14987.002.patch, 
> HADOOP-14987.003.patch, HADOOP-14987.004.patch, HADOOP-14987.005.patch
>
>
> KMSClientProvider#containsKmsDt uses SecurityUtil.buildTokenService(addr) to 
> build the key to look for KMS-DT from the UGI's token map. The token lookup 
> key here varies depending  on the KMSClientProvider's configuration value for 
> hadoop.security.token.service.use_ip. In certain cases, the token obtained 
> with non-matching hadoop.security.token.service.use_ip setting will not be 
> recognized by KMSClientProvider. This ticket is opened to improve logs for 
> troubleshooting KMS delegation token related issues like this.  
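The lookup mismatch described above can be sketched with plain strings — a simplified stand-in for SecurityUtil.buildTokenService, with a hypothetical host name and IP (both referring to the same KMS server):

```java
import java.util.HashMap;
import java.util.Map;

public class TokenServiceKeyDemo {
    // Simplified stand-in for SecurityUtil.buildTokenService(addr): the
    // service key is "ip:port" when use_ip is true, "host:port" otherwise.
    static String buildTokenService(String host, String ip, int port, boolean useIp) {
        return (useIp ? ip : host) + ":" + port;
    }

    public static void main(String[] args) {
        // Hypothetical KMS endpoint.
        String host = "kms.example.com", ip = "10.0.0.5";
        int port = 9600;

        // A token obtained by a client configured with use_ip=true ends up
        // in the UGI token map under the IP-based key...
        Map<String, String> ugiTokens = new HashMap<>();
        ugiTokens.put(buildTokenService(host, ip, port, true), "KMS-DT");

        // ...so a containsKmsDt-style lookup by a client configured with
        // use_ip=false misses it, even though the token is present.
        System.out.println("use_ip=true lookup:  "
            + ugiTokens.get(buildTokenService(host, ip, port, true)));
        System.out.println("use_ip=false lookup: "
            + ugiTokens.get(buildTokenService(host, ip, port, false)));
    }
}
```

The second lookup returns null: the token is in the map, just under a key the misconfigured client never builds — which is why better logging of the lookup key helps troubleshooting.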






[jira] [Commented] (HADOOP-14987) Improve KMSClientProvider log around delegation token checking

2018-04-03 Thread Rushabh S Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16424395#comment-16424395
 ] 

Rushabh S Shah commented on HADOOP-14987:
-

[~xiaochen]:Do you mind attaching the latest patch that you committed for 
branch-2* ?

> Improve KMSClientProvider log around delegation token checking
> --






[jira] [Commented] (HADOOP-14987) Improve KMSClientProvider log around delegation token checking

2018-04-03 Thread Xiao Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16424391#comment-16424391
 ] 

Xiao Chen commented on HADOOP-14987:


Due to conflicts from another jira, I cherry-picked this to branch-2, 
branch-2.9, branch-2.8. Compiled and ran the changed test locally before 
pushing.

> Improve KMSClientProvider log around delegation token checking
> --






[jira] [Updated] (HADOOP-14987) Improve KMSClientProvider log around delegation token checking

2018-04-03 Thread Xiao Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Chen updated HADOOP-14987:
---
Fix Version/s: 2.9.2
   2.8.4
   2.10.0

> Improve KMSClientProvider log around delegation token checking
> --






[jira] [Commented] (HADOOP-15357) Configuration.getPropsWithPrefix no longer does variable substitution

2018-04-03 Thread Jim Brennan (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16424387#comment-16424387
 ] 

Jim Brennan commented on HADOOP-15357:
--

Renamed local variable to fix the check-style issue and submitted a new patch.

> Configuration.getPropsWithPrefix no longer does variable substitution
> -






[jira] [Updated] (HADOOP-15357) Configuration.getPropsWithPrefix no longer does variable substitution

2018-04-03 Thread Jim Brennan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-15357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jim Brennan updated HADOOP-15357:
-
Attachment: HADOOP-15357.002.patch

> Configuration.getPropsWithPrefix no longer does variable substitution
> -






[jira] [Commented] (HADOOP-15357) Configuration.getPropsWithPrefix no longer does variable substitution

2018-04-03 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16424316#comment-16424316
 ] 

genericqa commented on HADOOP-15357:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 15m 
32s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 25m 
34s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 27m 
57s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
52s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
6s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m  2s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
29s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
56s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 26m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 26m 
43s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 51s{color} | {color:orange} hadoop-common-project/hadoop-common: The patch 
generated 1 new + 243 unchanged - 1 fixed = 244 total (was 244) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
3s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m  4s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
54s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  8m 
49s{color} | {color:green} hadoop-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
35s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}136m 25s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8620d2b |
| JIRA Issue | HADOOP-15357 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12917380/HADOOP-15357.001.patch
 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 513dc5057496 3.13.0-137-generic #186-Ubuntu SMP Mon Dec 4 
19:09:19 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 93d47a0 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_162 |
| findbugs | v3.1.0-RC1 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-HADOOP-Build/14428/artifact/out/diff-checkstyle-hadoop-common-project_hadoop-common.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HADOOP-Build/14428/testReport/ |
| Max. process+thread count | 1717 (vs. ulimit of 1) |
| modules | C: hadoop-common-project/hadoop-common U: 
hadoop-common-project/hadoop-common |
| Console output | 
https://builds.apache.org/job/PreCommit-HADOOP-Build/14428/console |
| Powered by | Apache Yetus 0.8.0-SNAPSHOT   http://yetus.apache.org |

[jira] [Commented] (HADOOP-14445) Delegation tokens are not shared between KMS instances

2018-04-03 Thread Xiao Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16424241#comment-16424241
 ] 

Xiao Chen commented on HADOOP-14445:


Thanks for the prompt review Rushabh.
bq.  test failure 
I found it yesterday when looking at another jira, and it turns out to be 
HADOOP-15355. Committed that one last night, so next run should clear off.

I'll correct the checkstyle after a final round of manual testing with real 
clusters. Will also provide branch-2 patches.

> Delegation tokens are not shared between KMS instances
> --
>
> Key: HADOOP-14445
> URL: https://issues.apache.org/jira/browse/HADOOP-14445
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: kms
>Affects Versions: 2.8.0, 3.0.0-alpha1
> Environment: CDH5.7.4, Kerberized, SSL, KMS-HA, at rest encryption
>Reporter: Wei-Chiu Chuang
>Assignee: Xiao Chen
>Priority: Major
> Attachments: HADOOP-14445-branch-2.8.002.patch, 
> HADOOP-14445-branch-2.8.patch, HADOOP-14445.002.patch, 
> HADOOP-14445.003.patch, HADOOP-14445.004.patch, HADOOP-14445.05.patch, 
> HADOOP-14445.06.patch, HADOOP-14445.07.patch, HADOOP-14445.08.patch, 
> HADOOP-14445.09.patch, HADOOP-14445.10.patch, HADOOP-14445.11.patch
>
>
> As discovered in HADOOP-14441, KMS HA using LoadBalancingKMSClientProvider does 
> not share delegation tokens. (a client uses the KMS address/port as the key for 
> the delegation token)
> {code:title=DelegationTokenAuthenticatedURL#openConnection}
> if (!creds.getAllTokens().isEmpty()) {
> InetSocketAddress serviceAddr = new InetSocketAddress(url.getHost(),
> url.getPort());
> Text service = SecurityUtil.buildTokenService(serviceAddr);
> dToken = creds.getToken(service);
> {code}
> But KMS doc states:
> {quote}
> Delegation Tokens
> Similar to HTTP authentication, KMS uses Hadoop Authentication for delegation 
> tokens too.
> Under HA, A KMS instance must verify the delegation token given by another 
> KMS instance, by checking the shared secret used to sign the delegation 
> token. To do this, all KMS instances must be able to retrieve the shared 
> secret from ZooKeeper.
> {quote}
> We should either update the KMS documentation, or fix this code to share 
> delegation tokens.
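The "not shared" symptom follows directly from the quoted openConnection snippet: the lookup key is built from the address of the KMS instance actually being contacted. A minimal sketch, with hypothetical host names for a two-instance HA deployment:

```java
import java.util.HashMap;
import java.util.Map;

public class KmsHaTokenDemo {
    // Per the quoted openConnection snippet, the token lookup key is built
    // from the URL being contacted ("host:port" here for brevity).
    static String serviceKey(String host, int port) {
        return host + ":" + port;
    }

    public static void main(String[] args) {
        // Hypothetical KMS HA pair behind LoadBalancingKMSClientProvider.
        String kms1 = "kms1.example.com", kms2 = "kms2.example.com";
        int port = 9600;

        // The client fetched its delegation token from kms1, so the
        // credentials map is keyed by kms1's address only.
        Map<String, String> creds = new HashMap<>();
        creds.put(serviceKey(kms1, port), "KMS-DT");

        // When the next call is routed to kms2, the key differs and the
        // lookup misses, even though kms2 could verify the token.
        System.out.println("kms1: " + creds.get(serviceKey(kms1, port)));
        System.out.println("kms2: " + creds.get(serviceKey(kms2, port)));
    }
}
```

The kms2 lookup returns null, so the client re-authenticates instead of reusing the token — despite the doc's claim that any instance can verify a token signed with the shared ZooKeeper secret.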






[jira] [Commented] (HADOOP-14758) S3GuardTool.prune to handle UnsupportedOperationException

2018-04-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16424204#comment-16424204
 ] 

Hudson commented on HADOOP-14758:
-

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #13919 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/13919/])
HADOOP-14758. S3GuardTool.prune to handle UnsupportedOperationException. 
(stevel: rev 5a174f8ac6e5f170b427b30bf72ef33f90c20d91)
* (edit) 
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/s3guard/S3GuardTool.java


> S3GuardTool.prune to handle UnsupportedOperationException
> -
>
> Key: HADOOP-14758
> URL: https://issues.apache.org/jira/browse/HADOOP-14758
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.0.0-beta1
>Reporter: Steve Loughran
>Assignee: Gabor Bota
>Priority: Trivial
> Fix For: 3.2.0
>
> Attachments: HADOOP-14758.001.patch
>
>
> {{MetadataStore.prune()}} may throw {{UnsupportedOperationException}} if not 
> supported.
> {{S3GuardTool.prune}} should recognise this, catch it, and treat it 
> differently from any other failure, e.g. inform the user and return 0 as it's a no-op
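The requested handling can be sketched with a minimal stand-in for the MetadataStore contract (interface and method names here are simplified, not the real S3Guard API):

```java
public class PruneHandlingDemo {
    // Minimal stand-in for the MetadataStore.prune() contract.
    interface MetadataStore {
        void prune(long modTime);
    }

    // Sketch of the requested behavior: an unsupported prune is reported
    // and treated as a successful no-op (exit code 0), while any other
    // exception would still propagate as a real failure.
    static int runPrune(MetadataStore store, long modTime) {
        try {
            store.prune(modTime);
            return 0;
        } catch (UnsupportedOperationException e) {
            System.err.println("prune not supported by this metadata store; nothing to do");
            return 0;  // no-op, not an error
        }
    }

    public static void main(String[] args) {
        MetadataStore unsupported = modTime -> {
            throw new UnsupportedOperationException("prune");
        };
        System.out.println("exit code: " + runPrune(unsupported, 0L));
    }
}
```

The key design point is scoping the catch to UnsupportedOperationException only, so genuine prune failures still surface as errors.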






[jira] [Assigned] (HADOOP-13500) Concurrency issues when using Configuration iterator

2018-04-03 Thread Ajay Kumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-13500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ajay Kumar reassigned HADOOP-13500:
---

Assignee: (was: Ajay Kumar)

> Concurrency issues when using Configuration iterator
> 
>
> Key: HADOOP-13500
> URL: https://issues.apache.org/jira/browse/HADOOP-13500
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: conf
>Reporter: Jason Lowe
>Priority: Major
>
> It is possible to encounter a ConcurrentModificationException while trying to 
> iterate a Configuration object.  The iterator method tries to walk the 
> underlying Properties object without proper synchronization, so another thread 
> simultaneously calling the set method can trigger it.
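The failure mode can be reproduced deterministically with a plain HashMap standing in for the fail-fast backing store (the Hashtable-backed Properties of the JDKs current at the time behaved the same way; since JDK 9 Properties is ConcurrentHashMap-backed, hence the plain map here):

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Map;

public class ConfigIterDemo {
    // Simplified model of the race: a set() structurally modifies the
    // backing map while an iterator over it is live. A same-thread
    // modification triggers the same fail-fast exception deterministically
    // that two unsynchronized threads hit intermittently.
    static boolean iterateWhileModifying() {
        Map<String, String> props = new HashMap<>();
        props.put("a", "1");
        props.put("b", "2");
        try {
            for (Map.Entry<String, String> e : props.entrySet()) {
                props.put("c", "3");  // structural modification mid-iteration
            }
            return false;
        } catch (ConcurrentModificationException ex) {
            return true;  // the fail-fast iterator detected the change
        }
    }

    public static void main(String[] args) {
        System.out.println(iterateWhileModifying()
            ? "ConcurrentModificationException raised"
            : "no exception");
    }
}
```

In the real bug the second thread's set() only occasionally lands mid-iteration, which is why the exception is intermittent in practice.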






[jira] [Updated] (HADOOP-14758) S3GuardTool.prune to handle UnsupportedOperationException

2018-04-03 Thread Steve Loughran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-14758:

   Resolution: Fixed
Fix Version/s: 3.2.0
   Status: Resolved  (was: Patch Available)

+1,

committed —thanks!

> S3GuardTool.prune to handle UnsupportedOperationException
> -






[jira] [Commented] (HADOOP-14445) Delegation tokens are not shared between KMS instances

2018-04-03 Thread Rushabh S Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16424166#comment-16424166
 ] 

Rushabh S Shah commented on HADOOP-14445:
-

Thanks [~xiaochen] for the latest patch.
It looks good.
bq.  If KMS server is old, you'd get an old token. 
Thanks for catching that. I totally missed that.

There is one test failure in the latest run. 
{noformat}
org.apache.hadoop.conf.TestCommonConfigurationFields.testCompareXmlAgainstConfigurationClass

Failing for the past 1 build (Since Failed#14425 )
Took 0.2 sec.
Error Message
core-default.xml has 2 properties missing in  class 
org.apache.hadoop.fs.CommonConfigurationKeys  class 
org.apache.hadoop.fs.CommonConfigurationKeysPublic  class 
org.apache.hadoop.fs.local.LocalConfigKeys  class 
org.apache.hadoop.fs.ftp.FtpConfigKeys  class 
org.apache.hadoop.ha.SshFenceByTcpPort  class 
org.apache.hadoop.security.LdapGroupsMapping  class 
org.apache.hadoop.ha.ZKFailoverController  class 
org.apache.hadoop.security.ssl.SSLFactory  class 
org.apache.hadoop.security.CompositeGroupsMapping  class 
org.apache.hadoop.io.erasurecode.CodecUtil  class 
org.apache.hadoop.security.RuleBasedLdapGroupsMapping Entries:   
hadoop.security.key.default.bitlength  hadoop.security.key.default.cipher 
expected:<0> but was:<2>
{noformat}
I can't think of a way that your latest patch can introduce this failure.
The hadoop-common build is fairly stable compared to hadoop-hdfs. Can you 
please double check whether your patch introduced this failure?
If not, can you please find out which jira is responsible?

Also there are couple of checkstyle warnings in TestKMS.java regarding unused 
import.


If the test failure is not related, then you can make the checkstyle changes 
while committing.
Also can you upload the new patch after committing and resolving the jira.
 I know some people had concerns that it is difficult to correlate the commit 
with the last patch if they are not the same.

+1 (non-binding) pending confirming test failure.
Thanks a lot for the good work here.

> Delegation tokens are not shared between KMS instances
> --






[jira] [Updated] (HADOOP-15357) Configuration.getPropsWithPrefix no longer does variable substitution

2018-04-03 Thread Jim Brennan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-15357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jim Brennan updated HADOOP-15357:
-
Status: Patch Available  (was: Open)

> Configuration.getPropsWithPrefix no longer does variable substitution
> -






[jira] [Updated] (HADOOP-15357) Configuration.getPropsWithPrefix no longer does variable substitution

2018-04-03 Thread Jim Brennan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-15357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jim Brennan updated HADOOP-15357:
-
Attachment: HADOOP-15357.001.patch

> Configuration.getPropsWithPrefix no longer does variable substitution
> -
>
> Key: HADOOP-15357
> URL: https://issues.apache.org/jira/browse/HADOOP-15357
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Major
> Attachments: HADOOP-15357.001.patch
>
>
> Before [HADOOP-13556], Configuration.getPropsWithPrefix() used the 
> Configuration.get() method to retrieve property values. After 
> [HADOOP-13556], it now uses props.getProperty().
> The difference is that Configuration.get() does deprecation handling and, 
> more importantly, variable substitution on the value. So if a property has a 
> variable specified with ${variable_name}, it will no longer be expanded when 
> retrieved via getPropsWithPrefix().
> Was this change in behavior intentional? I am using this function in the fix 
> for [MAPREDUCE-7069], but we do want variable expansion to happen.






[jira] [Reopened] (HADOOP-13500) Concurrency issues when using Configuration iterator

2018-04-03 Thread Jason Lowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-13500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe reopened HADOOP-13500:
-

This is not a duplicate of HADOOP-13556.  That JIRA only changed the 
getPropsWithPrefix method, which was not involved in the error reported by 
this JIRA or TEZ-3413.  AFAICT, iterating a shared Configuration object is 
still unsafe.

> Concurrency issues when using Configuration iterator
> 
>
> Key: HADOOP-13500
> URL: https://issues.apache.org/jira/browse/HADOOP-13500
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: conf
>Reporter: Jason Lowe
>Assignee: Ajay Kumar
>Priority: Major
>
> It is possible to encounter a ConcurrentModificationException while trying to 
> iterate a Configuration object.  The iterator method walks the underlying 
> Properties object without proper synchronization, so another thread 
> simultaneously calling the set method can trigger it.
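The hazard described above can be reproduced deterministically in a single thread with a plain HashMap standing in for the Properties object backing Configuration; the fail-fast iterator throws as soon as the map is mutated mid-iteration, which is the same failure a concurrent set() can trigger:

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Map;

public class IteratorHazardSketch {
    // Returns true if mutating the map mid-iteration raised CME.
    static boolean triggersCme() {
        Map<String, String> props = new HashMap<>();
        props.put("k1", "v1");
        props.put("k2", "v2");
        try {
            for (String key : props.keySet()) {
                props.put("k3", "v3");   // simulates set() from another thread
            }
        } catch (ConcurrentModificationException expected) {
            return true;                 // fail-fast iterator detected it
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println("hazard reproduced: " + triggersCme());

        // One common mitigation: iterate a snapshot while the original mutates.
        Map<String, String> props = new HashMap<>();
        props.put("k1", "v1");
        for (String key : new HashMap<>(props).keySet()) {
            props.put("k2", "v2");       // safe: we iterate the copy
        }
        System.out.println("snapshot iteration completed");
    }
}
```

Snapshotting (or synchronizing the iterator against set) is the general shape of a fix; which one suits Configuration is what this JIRA is about.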






[jira] [Assigned] (HADOOP-15357) Configuration.getPropsWithPrefix no longer does variable substitution

2018-04-03 Thread Jim Brennan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-15357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jim Brennan reassigned HADOOP-15357:


Assignee: Jim Brennan

> Configuration.getPropsWithPrefix no longer does variable substitution
> -
>
> Key: HADOOP-15357
> URL: https://issues.apache.org/jira/browse/HADOOP-15357
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Major
>
> Before [HADOOP-13556], Configuration.getPropsWithPrefix() used the 
> Configuration.get() method to retrieve property values. After 
> [HADOOP-13556], it now uses props.getProperty().
> The difference is that Configuration.get() does deprecation handling and, 
> more importantly, variable substitution on the value. So if a property has a 
> variable specified with ${variable_name}, it will no longer be expanded when 
> retrieved via getPropsWithPrefix().
> Was this change in behavior intentional? I am using this function in the fix 
> for [MAPREDUCE-7069], but we do want variable expansion to happen.






[jira] [Commented] (HADOOP-15357) Configuration.getPropsWithPrefix no longer does variable substitution

2018-04-03 Thread Jim Brennan (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16424056#comment-16424056
 ] 

Jim Brennan commented on HADOOP-15357:
--

[~lmccay], thanks for the prompt replies.  I will be happy to put up a patch 
later today!

> Configuration.getPropsWithPrefix no longer does variable substitution
> -
>
> Key: HADOOP-15357
> URL: https://issues.apache.org/jira/browse/HADOOP-15357
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Jim Brennan
>Priority: Major
>
> Before [HADOOP-13556], Configuration.getPropsWithPrefix() used the 
> Configuration.get() method to retrieve property values. After 
> [HADOOP-13556], it now uses props.getProperty().
> The difference is that Configuration.get() does deprecation handling and, 
> more importantly, variable substitution on the value. So if a property has a 
> variable specified with ${variable_name}, it will no longer be expanded when 
> retrieved via getPropsWithPrefix().
> Was this change in behavior intentional? I am using this function in the fix 
> for [MAPREDUCE-7069], but we do want variable expansion to happen.






[jira] [Updated] (HADOOP-15358) SFTPConnectionPool connections leakage

2018-04-03 Thread Mikhail Pryakhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-15358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Pryakhin updated HADOOP-15358:
--
Description: 
Methods of SFTPFileSystem operate on poolable ChannelSftp instances, and some 
of them are chained together, resulting in multiple connections being 
established to the SFTP server to accomplish one compound action. Those 
methods are listed below:
 # mkdirs method
The public mkdirs method acquires a new ChannelSftp from the pool [1] and then 
recursively creates directories, checking for directory existence beforehand 
by calling the exists method [2], which delegates to the 
getFileStatus(ChannelSftp channel, Path file) method [3], and so on until it 
ends up returning the FileStatus instance [4]. The resource leakage occurs in 
the getWorkingDirectory method, which calls the getHomeDirectory method [5], 
which in turn establishes a new connection to the SFTP server instead of 
reusing the already created connection. As the mkdirs method is recursive, 
this results in creating a huge number of connections.
 # open method [6]. This method returns an instance of FSDataInputStream, 
which consumes an SFTPInputStream instance that doesn't return the acquired 
ChannelSftp instance to the pool but instead closes it [7]. This leads to 
establishing another connection to the SFTP server when the next method is 
called on the FileSystem instance.


[1] 
https://github.com/apache/hadoop/blob/736ceab2f58fb9ab5907c5b5110bd44384038e6b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/sftp/SFTPFileSystem.java#L658

[2] 
https://github.com/apache/hadoop/blob/736ceab2f58fb9ab5907c5b5110bd44384038e6b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/sftp/SFTPFileSystem.java#L321

[3] 
https://github.com/apache/hadoop/blob/736ceab2f58fb9ab5907c5b5110bd44384038e6b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/sftp/SFTPFileSystem.java#L202

[4] 
https://github.com/apache/hadoop/blob/736ceab2f58fb9ab5907c5b5110bd44384038e6b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/sftp/SFTPFileSystem.java#L290

[5] 
https://github.com/apache/hadoop/blob/736ceab2f58fb9ab5907c5b5110bd44384038e6b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/sftp/SFTPFileSystem.java#L640

[6] 
https://github.com/apache/hadoop/blob/736ceab2f58fb9ab5907c5b5110bd44384038e6b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/sftp/SFTPFileSystem.java#L504

[7] 
https://github.com/apache/hadoop/blob/736ceab2f58fb9ab5907c5b5110bd44384038e6b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/sftp/SFTPInputStream.java#L123
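One direction for fixing the leak pattern above is to acquire a channel once per compound operation, thread it through the helpers, and return it to the pool in a finally block. The sketch below is a toy model of that idea: Channel and ChannelPool are stand-ins for ChannelSftp and SFTPConnectionPool, not the real API.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class PooledChannelSketch {
    static class Channel {}

    static class ChannelPool {
        private final Deque<Channel> idle = new ArrayDeque<>();
        int created = 0;

        Channel acquire() {
            Channel c = idle.poll();
            if (c == null) { created++; c = new Channel(); }
            return c;
        }

        void release(Channel c) { idle.push(c); }
    }

    // Helpers take the caller's channel instead of acquiring their own.
    static boolean exists(Channel ch, String path) { return false; }
    static void mkdirOne(Channel ch, String path) { /* e.g. SSH_FXP_MKDIR */ }

    static void mkdirs(ChannelPool pool, String[] segments) {
        Channel ch = pool.acquire();
        try {
            for (String seg : segments) {
                if (!exists(ch, seg)) {   // reuses ch; no extra connection
                    mkdirOne(ch, seg);
                }
            }
        } finally {
            pool.release(ch);             // always returned, never leaked
        }
    }

    public static void main(String[] args) {
        ChannelPool pool = new ChannelPool();
        mkdirs(pool, new String[] {"/a", "/a/b", "/a/b/c"});
        System.out.println("connections created: " + pool.created);
    }
}
```

Under this shape the recursive/looping mkdirs path costs one connection total; the open/SFTPInputStream case would similarly need the stream to release rather than close its channel on close().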

  was:
Methods of SFTPFileSystem operate on poolable ChannelSftp instances, thus some 
methods of SFTPFileSystem are chained together resulting in establishing 
multiple connections to the SFTP server to accomplish one compound action, 
those methods are listed below:
 # mkdirs method
the public mkdirs method acquires a new ChannelSftp [from the 
pool|[https://github.com/apache/hadoop/blob/736ceab2f58fb9ab5907c5b5110bd44384038e6b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/sftp/SFTPFileSystem.java#L658]]
and then recursively creates directories, checking for the directory existence 
beforehand by calling the method 
[exists|[https://github.com/apache/hadoop/blob/736ceab2f58fb9ab5907c5b5110bd44384038e6b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/sftp/SFTPFileSystem.java#L321]
] which delegates to the getFileStatus(ChannelSftp channel, Path file) 
[method|[https://github.com/apache/hadoop/blob/736ceab2f58fb9ab5907c5b5110bd44384038e6b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/sftp/SFTPFileSystem.java#L202]]
 and so on until it ends up in returning the [FilesStatus 
instance|[https://github.com/apache/hadoop/blob/736ceab2f58fb9ab5907c5b5110bd44384038e6b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/sftp/SFTPFileSystem.java#L290]].
 The resource leakage occurs in the method getWorkingDirectory which calls the 
getHomeDirectory 
[method|[https://github.com/apache/hadoop/blob/736ceab2f58fb9ab5907c5b5110bd44384038e6b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/sftp/SFTPFileSystem.java#L640]]
 which in turn establishes a new connection to the sftp server instead of using 
an already created connection. As the mkdirs method is recursive this results 
in creating a huge number of connections.
 # open 
[method|[https://github.com/apache/hadoop/blob/736ceab2f58fb9ab5907c5b5110bd44384038e6b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/sftp/SFTPFileSystem.java#L504]].
 This method returns an instance of FSDataInputStream which consumes 
SFTPInputStream instance which doesn't return an acquired ChannelSftp 
instance back to the pool but instead closes it. This leads to establishing 
another connection to an SFTP server when the next method is called on the 
FileSystem instance.

[jira] [Updated] (HADOOP-15358) SFTPConnectionPool connections leakage

2018-04-03 Thread Mikhail Pryakhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-15358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Pryakhin updated HADOOP-15358:
--
Description: 
Methods of SFTPFileSystem operate on poolable ChannelSftp instances, thus some 
methods of SFTPFileSystem are chained together resulting in establishing 
multiple connections to the SFTP server to accomplish one compound action, 
those methods are listed below:
 # mkdirs method
the public mkdirs method acquires a new ChannelSftp [from the 
pool|[https://github.com/apache/hadoop/blob/736ceab2f58fb9ab5907c5b5110bd44384038e6b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/sftp/SFTPFileSystem.java#L658]]
and then recursively creates directories, checking for the directory existence 
beforehand by calling the method 
[exists|[https://github.com/apache/hadoop/blob/736ceab2f58fb9ab5907c5b5110bd44384038e6b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/sftp/SFTPFileSystem.java#L321]
] which delegates to the getFileStatus(ChannelSftp channel, Path file) 
[method|[https://github.com/apache/hadoop/blob/736ceab2f58fb9ab5907c5b5110bd44384038e6b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/sftp/SFTPFileSystem.java#L202]]
 and so on until it ends up in returning the [FilesStatus 
instance|[https://github.com/apache/hadoop/blob/736ceab2f58fb9ab5907c5b5110bd44384038e6b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/sftp/SFTPFileSystem.java#L290]].
 The resource leakage occurs in the method getWorkingDirectory which calls the 
getHomeDirectory 
[method|[https://github.com/apache/hadoop/blob/736ceab2f58fb9ab5907c5b5110bd44384038e6b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/sftp/SFTPFileSystem.java#L640]]
 which in turn establishes a new connection to the sftp server instead of using 
an already created connection. As the mkdirs method is recursive this results 
in creating a huge number of connections.
 # open 
[method|[https://github.com/apache/hadoop/blob/736ceab2f58fb9ab5907c5b5110bd44384038e6b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/sftp/SFTPFileSystem.java#L504]].
 This method returns an instance of FSDataInputStream which consumes 
SFTPInputStream instance which doesn't return an acquired ChannelSftp instance 
back to the pool but instead it 
[closes|[https://github.com/apache/hadoop/blob/736ceab2f58fb9ab5907c5b5110bd44384038e6b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/sftp/SFTPInputStream.java#L123]]
 it. This leads to establishing another connection to an SFTP server when the 
next method is called on the FileSystem instance.

 

  was:
Methods of SFTPFileSystem operate on poolable ChannelSftp instances, thus some 
methods of SFTPFileSystem are chained together resulting in establishing 
multiple connections to the SFTP server to accomplish one compound action, 
those methods are listed below:
 # mkdirs method
the public mkdirs method acquires a new ChannelSftp from the pool [1]
and then recursively creates directories, checking for the directory existence 
beforehand by calling the method exists[2] which delegates to the 
getFileStatus(ChannelSftp channel, Path file) method [3] and so on until it 
ends up in returning the FilesStatus instance [4]. The resource leakage occurs 
in the method getWorkingDirectory which calls the getHomeDirectory method [5] 
which in turn establishes a new connection to the sftp server instead of using 
an already created connection. As the mkdirs method is recursive this results 
in creating a huge number of connections.
 # open method [6]   This method returns an instance of FSDataInputStream which 
consumes SFTPInputStream instance which doesn't return an acquired ChannelSftp 
instance back to the pool but instead it closes it[7]. This leads to 
establishing another connection to an SFTP server when the next method is 
called on the FileSystem instance.


[1] 
https://github.com/apache/hadoop/blob/736ceab2f58fb9ab5907c5b5110bd44384038e6b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/sftp/SFTPFileSystem.java#L658

[2] 
https://github.com/apache/hadoop/blob/736ceab2f58fb9ab5907c5b5110bd44384038e6b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/sftp/SFTPFileSystem.java#L321

[3] 
https://github.com/apache/hadoop/blob/736ceab2f58fb9ab5907c5b5110bd44384038e6b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/sftp/SFTPFileSystem.java#L202

[4] 
https://github.com/apache/hadoop/blob/736ceab2f58fb9ab5907c5b5110bd44384038e6b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/sftp/SFTPFileSystem.java#L290

[5] 
https://github.com/apache/hadoop/blob/736ceab2f58fb9ab5907c5b5110bd44384038e6b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/sftp/SFTPFileSystem.java#L640

[6] 
https://github.com/apache/hadoop/blob/736ceab2f58fb9ab5907c5b5110bd44384038e6b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/sftp/SFTPFileSystem.java#L504

[7] 
https://github.com/apache/hadoop/blob/736ceab2f58fb9ab5907c5b5110bd44384038e6b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/sftp/SFTPInputStream.java#L123

[jira] [Created] (HADOOP-15358) SFTPConnectionPool connections leakage

2018-04-03 Thread Mikhail Pryakhin (JIRA)
Mikhail Pryakhin created HADOOP-15358:
-

 Summary: SFTPConnectionPool connections leakage
 Key: HADOOP-15358
 URL: https://issues.apache.org/jira/browse/HADOOP-15358
 Project: Hadoop Common
  Issue Type: Bug
  Components: fs
Affects Versions: 3.0.0
Reporter: Mikhail Pryakhin


Methods of SFTPFileSystem operate on poolable ChannelSftp instances, and some 
of them are chained together, resulting in multiple connections being 
established to the SFTP server to accomplish one compound action. Those 
methods are listed below:
 # mkdirs method
The public mkdirs method acquires a new ChannelSftp from the pool [1] and then 
recursively creates directories, checking for directory existence beforehand 
by calling the exists method [2], which delegates to the 
getFileStatus(ChannelSftp channel, Path file) method [3], and so on until it 
ends up returning the FileStatus instance [4]. The resource leakage occurs in 
the getWorkingDirectory method, which calls the getHomeDirectory method [5], 
which in turn establishes a new connection to the SFTP server instead of 
reusing the already created connection. As the mkdirs method is recursive, 
this results in creating a huge number of connections.
 # open method [6]. This method returns an instance of FSDataInputStream, 
which consumes an SFTPInputStream instance that doesn't return the acquired 
ChannelSftp instance to the pool but instead closes it [7]. This leads to 
establishing another connection to the SFTP server when the next method is 
called on the FileSystem instance.


[1] 
https://github.com/apache/hadoop/blob/736ceab2f58fb9ab5907c5b5110bd44384038e6b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/sftp/SFTPFileSystem.java#L658

[2] 
https://github.com/apache/hadoop/blob/736ceab2f58fb9ab5907c5b5110bd44384038e6b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/sftp/SFTPFileSystem.java#L321

[3] 
https://github.com/apache/hadoop/blob/736ceab2f58fb9ab5907c5b5110bd44384038e6b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/sftp/SFTPFileSystem.java#L202

[4] 
https://github.com/apache/hadoop/blob/736ceab2f58fb9ab5907c5b5110bd44384038e6b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/sftp/SFTPFileSystem.java#L290

[5] 
https://github.com/apache/hadoop/blob/736ceab2f58fb9ab5907c5b5110bd44384038e6b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/sftp/SFTPFileSystem.java#L640

[6] 
https://github.com/apache/hadoop/blob/736ceab2f58fb9ab5907c5b5110bd44384038e6b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/sftp/SFTPFileSystem.java#L504

[7] 
https://github.com/apache/hadoop/blob/736ceab2f58fb9ab5907c5b5110bd44384038e6b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/sftp/SFTPInputStream.java#L123






[jira] [Commented] (HADOOP-14999) AliyunOSS: provide one asynchronous multi-part based uploading mechanism

2018-04-03 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16423924#comment-16423924
 ] 

genericqa commented on HADOOP-14999:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} docker {color} | {color:red}  2m 
15s{color} | {color:red} Docker failed to build yetus/hadoop:dbd69cb. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | HADOOP-14999 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12917350/HADOOP-14999-branch-2.001.patch
 |
| Console output | 
https://builds.apache.org/job/PreCommit-HADOOP-Build/14427/console |
| Powered by | Apache Yetus 0.8.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> AliyunOSS: provide one asynchronous multi-part based uploading mechanism
> 
>
> Key: HADOOP-14999
> URL: https://issues.apache.org/jira/browse/HADOOP-14999
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/oss
>Affects Versions: 3.0.0-beta1
>Reporter: Genmao Yu
>Assignee: Genmao Yu
>Priority: Major
> Attachments: HADOOP-14999-branch-2.001.patch, HADOOP-14999.001.patch, 
> HADOOP-14999.002.patch, HADOOP-14999.003.patch, HADOOP-14999.004.patch, 
> HADOOP-14999.005.patch, HADOOP-14999.006.patch, HADOOP-14999.007.patch, 
> HADOOP-14999.008.patch, HADOOP-14999.009.patch, HADOOP-14999.010.patch, 
> HADOOP-14999.011.patch, asynchronous_file_uploading.pdf, 
> diff-between-patch7-and-patch8.txt
>
>
> This mechanism is designed for uploading files in parallel and asynchronously:
>  - improve the performance of uploading files to the OSS server. First, this 
> mechanism splits the result into multiple small blocks and uploads them in 
> parallel. Then, producing the result and uploading blocks are asynchronous.
>  - avoid buffering an overly large result on local disk. To cite an extreme 
> example, a task may output 100GB or even more; we would need to write this 
> 100GB to local disk and then upload it, which is inefficient and limited by 
> disk space.
> This patch reuses {{SemaphoredDelegatingExecutor}} as the executor service 
> and depends on HADOOP-15039.
> The attached {{asynchronous_file_uploading.pdf}} illustrates the difference 
> between the previous {{AliyunOSSOutputStream}} and 
> {{AliyunOSSBlockOutputStream}}, i.e. this asynchronous multi-part based 
> uploading mechanism.
> 1. {{AliyunOSSOutputStream}}: we need to write the whole result to local 
> disk before we can upload it to OSS. This poses two problems:
>  - if the output file is too large, it will run out of local disk space.
>  - if the output file is too large, the task will wait a long time to upload 
> the result to OSS before finishing, wasting compute resources.
> 2. {{AliyunOSSBlockOutputStream}}: we cut the task output into small blocks, 
> i.e. small local files, and each block is packaged into an upload task. These 
> tasks are submitted to {{SemaphoredDelegatingExecutor}}, which uploads the 
> blocks in parallel, greatly improving performance.
> 3. Each task will retry 3 times to upload its block to Aliyun OSS. If any of 
> these tasks fails, the whole file upload fails and we abort the current 
> upload.
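The block-splitting scheme can be sketched with a plain fixed thread pool standing in for {{SemaphoredDelegatingExecutor}}, and a stub uploadPart in place of the real OSS multi-part call; both names here are placeholders, not the patch's actual code:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class MultiPartUploadSketch {
    // Stub for the real multi-part upload call; pretends to return an ETag.
    static String uploadPart(int partNumber, byte[] block) {
        return "etag-" + partNumber;
    }

    // Cuts data into fixed-size blocks and uploads them in parallel.
    static List<String> upload(byte[] data, int blockSize, int parallelism) {
        ExecutorService pool = Executors.newFixedThreadPool(parallelism);
        try {
            List<Future<String>> parts = new ArrayList<>();
            for (int off = 0, part = 1; off < data.length; off += blockSize, part++) {
                final int partNumber = part;
                final byte[] block = Arrays.copyOfRange(
                        data, off, Math.min(off + blockSize, data.length));
                parts.add(pool.submit(() -> uploadPart(partNumber, block)));
            }
            List<String> etags = new ArrayList<>();
            for (Future<String> f : parts) {
                try {
                    etags.add(f.get());   // a failed part fails the whole upload
                } catch (InterruptedException | ExecutionException e) {
                    throw new RuntimeException("upload failed, aborting", e);
                }
            }
            return etags;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // 10 bytes with 4-byte blocks -> 3 parts uploaded by 2 workers.
        System.out.println(upload(new byte[10], 4, 2));
    }
}
```

The bounded pool is where the semaphore-based executor matters in the real patch: it caps how many blocks are buffered and in flight at once.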






[jira] [Commented] (HADOOP-14999) AliyunOSS: provide one asynchronous multi-part based uploading mechanism

2018-04-03 Thread Genmao Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16423921#comment-16423921
 ] 

Genmao Yu commented on HADOOP-14999:


[~Sammi]

attach HADOOP-14999-branch-2.001.patch for branch-2

> AliyunOSS: provide one asynchronous multi-part based uploading mechanism
> 
>
> Key: HADOOP-14999
> URL: https://issues.apache.org/jira/browse/HADOOP-14999
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/oss
>Affects Versions: 3.0.0-beta1
>Reporter: Genmao Yu
>Assignee: Genmao Yu
>Priority: Major
> Attachments: HADOOP-14999-branch-2.001.patch, HADOOP-14999.001.patch, 
> HADOOP-14999.002.patch, HADOOP-14999.003.patch, HADOOP-14999.004.patch, 
> HADOOP-14999.005.patch, HADOOP-14999.006.patch, HADOOP-14999.007.patch, 
> HADOOP-14999.008.patch, HADOOP-14999.009.patch, HADOOP-14999.010.patch, 
> HADOOP-14999.011.patch, asynchronous_file_uploading.pdf, 
> diff-between-patch7-and-patch8.txt
>
>
> This mechanism is designed for uploading files in parallel and asynchronously:
>  - improve the performance of uploading files to the OSS server. First, this 
> mechanism splits the result into multiple small blocks and uploads them in 
> parallel. Then, producing the result and uploading blocks are asynchronous.
>  - avoid buffering an overly large result on local disk. To cite an extreme 
> example, a task may output 100GB or even more; we would need to write this 
> 100GB to local disk and then upload it, which is inefficient and limited by 
> disk space.
> This patch reuses {{SemaphoredDelegatingExecutor}} as the executor service 
> and depends on HADOOP-15039.
> The attached {{asynchronous_file_uploading.pdf}} illustrates the difference 
> between the previous {{AliyunOSSOutputStream}} and 
> {{AliyunOSSBlockOutputStream}}, i.e. this asynchronous multi-part based 
> uploading mechanism.
> 1. {{AliyunOSSOutputStream}}: we need to write the whole result to local 
> disk before we can upload it to OSS. This poses two problems:
>  - if the output file is too large, it will run out of local disk space.
>  - if the output file is too large, the task will wait a long time to upload 
> the result to OSS before finishing, wasting compute resources.
> 2. {{AliyunOSSBlockOutputStream}}: we cut the task output into small blocks, 
> i.e. small local files, and each block is packaged into an upload task. These 
> tasks are submitted to {{SemaphoredDelegatingExecutor}}, which uploads the 
> blocks in parallel, greatly improving performance.
> 3. Each task will retry 3 times to upload its block to Aliyun OSS. If any of 
> these tasks fails, the whole file upload fails and we abort the current 
> upload.






[jira] [Updated] (HADOOP-14999) AliyunOSS: provide one asynchronous multi-part based uploading mechanism

2018-04-03 Thread Genmao Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Genmao Yu updated HADOOP-14999:
---
Attachment: HADOOP-14999-branch-2.001.patch

> AliyunOSS: provide one asynchronous multi-part based uploading mechanism
> 
>
> Key: HADOOP-14999
> URL: https://issues.apache.org/jira/browse/HADOOP-14999
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/oss
>Affects Versions: 3.0.0-beta1
>Reporter: Genmao Yu
>Assignee: Genmao Yu
>Priority: Major
> Attachments: HADOOP-14999-branch-2.001.patch, HADOOP-14999.001.patch, 
> HADOOP-14999.002.patch, HADOOP-14999.003.patch, HADOOP-14999.004.patch, 
> HADOOP-14999.005.patch, HADOOP-14999.006.patch, HADOOP-14999.007.patch, 
> HADOOP-14999.008.patch, HADOOP-14999.009.patch, HADOOP-14999.010.patch, 
> HADOOP-14999.011.patch, asynchronous_file_uploading.pdf, 
> diff-between-patch7-and-patch8.txt
>
>
> This mechanism is designed for uploading files in parallel and asynchronously:
>  - improve the performance of uploading files to the OSS server. First, this 
> mechanism splits the result into multiple small blocks and uploads them in 
> parallel. Then, producing the result and uploading blocks are asynchronous.
>  - avoid buffering an overly large result on local disk. To cite an extreme 
> example, a task may output 100GB or even more; we would need to write this 
> 100GB to local disk and then upload it, which is inefficient and limited by 
> disk space.
> This patch reuses {{SemaphoredDelegatingExecutor}} as the executor service 
> and depends on HADOOP-15039.
> The attached {{asynchronous_file_uploading.pdf}} illustrates the difference 
> between the previous {{AliyunOSSOutputStream}} and 
> {{AliyunOSSBlockOutputStream}}, i.e. this asynchronous multi-part based 
> uploading mechanism.
> 1. {{AliyunOSSOutputStream}}: we need to write the whole result to local 
> disk before we can upload it to OSS. This poses two problems:
>  - if the output file is too large, it will run out of local disk space.
>  - if the output file is too large, the task will wait a long time to upload 
> the result to OSS before finishing, wasting compute resources.
> 2. {{AliyunOSSBlockOutputStream}}: we cut the task output into small blocks, 
> i.e. small local files, and each block is packaged into an upload task. These 
> tasks are submitted to {{SemaphoredDelegatingExecutor}}, which uploads the 
> blocks in parallel, greatly improving performance.
> 3. Each task will retry 3 times to upload its block to Aliyun OSS. If any of 
> these tasks fails, the whole file upload fails and we abort the current 
> upload.






[jira] [Commented] (HADOOP-14651) Update okhttp version to 2.7.5

2018-04-03 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16423823#comment-16423823
 ] 

genericqa commented on HADOOP-14651:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} docker {color} | {color:red}  2m 
18s{color} | {color:red} Docker failed to build yetus/hadoop:dbd69cb. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | HADOOP-14651 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12917344/HADOOP-14651-branch-2.0.004.patch
 |
| Console output | 
https://builds.apache.org/job/PreCommit-HADOOP-Build/14426/console |
| Powered by | Apache Yetus 0.8.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Update okhttp version to 2.7.5
> --
>
> Key: HADOOP-14651
> URL: https://issues.apache.org/jira/browse/HADOOP-14651
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/adl
>Affects Versions: 3.0.0-beta1
>Reporter: Ray Chiang
>Assignee: Ray Chiang
>Priority: Major
> Fix For: 3.1.0
>
> Attachments: HADOOP-14651-branch-2.0.004.patch, 
> HADOOP-14651-branch-2.0.004.patch, HADOOP-14651-branch-3.0.004.patch, 
> HADOOP-14651-branch-3.0.004.patch, HADOOP-14651.001.patch, 
> HADOOP-14651.002.patch, HADOOP-14651.003.patch, HADOOP-14651.004.patch
>
>
> The current artifact is:
> com.squareup.okhttp:okhttp:2.4.0
> That version could either be bumped to 2.7.5 (the latest of that line), or we 
> could move to the latest artifact:
> com.squareup.okhttp3:okhttp:3.8.1
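The two options above are one-line coordinate changes in a Maven POM; a sketch of the dependency entries involved (the exact Hadoop module that declares them is not shown here):

```xml
<!-- Current coordinates: -->
<dependency>
  <groupId>com.squareup.okhttp</groupId>
  <artifactId>okhttp</artifactId>
  <version>2.4.0</version>
</dependency>

<!-- Option 1: stay on the 2.x line -->
<dependency>
  <groupId>com.squareup.okhttp</groupId>
  <artifactId>okhttp</artifactId>
  <version>2.7.5</version>
</dependency>

<!-- Option 2: move to the okhttp3 artifact (different groupId and packages) -->
<dependency>
  <groupId>com.squareup.okhttp3</groupId>
  <artifactId>okhttp</artifactId>
  <version>3.8.1</version>
</dependency>
```

Note that option 2 changes the groupId and Java package names, so it is a source-level migration rather than a drop-in bump.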






[jira] [Updated] (HADOOP-14651) Update okhttp version to 2.7.5

2018-04-03 Thread Steve Loughran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-14651:

Status: Patch Available  (was: Open)

> Update okhttp version to 2.7.5
> --
>
> Key: HADOOP-14651
> URL: https://issues.apache.org/jira/browse/HADOOP-14651
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/adl
>Affects Versions: 3.0.0-beta1
>Reporter: Ray Chiang
>Assignee: Ray Chiang
>Priority: Major
> Fix For: 3.1.0
>
> Attachments: HADOOP-14651-branch-2.0.004.patch, 
> HADOOP-14651-branch-2.0.004.patch, HADOOP-14651-branch-3.0.004.patch, 
> HADOOP-14651-branch-3.0.004.patch, HADOOP-14651.001.patch, 
> HADOOP-14651.002.patch, HADOOP-14651.003.patch, HADOOP-14651.004.patch
>
>
> The current artifact is:
> com.squareup.okhttp:okhttp:2.4.0
> That version could either be bumped to 2.7.5 (the latest of that line), or we 
> could move to the latest artifact:
> com.squareup.okhttp3:okhttp:3.8.1






[jira] [Updated] (HADOOP-14651) Update okhttp version to 2.7.5

2018-04-03 Thread Steve Loughran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-14651:

Attachment: HADOOP-14651-branch-2.0.004.patch

> Update okhttp version to 2.7.5
> --
>
> Key: HADOOP-14651
> URL: https://issues.apache.org/jira/browse/HADOOP-14651
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/adl
>Affects Versions: 3.0.0-beta1
>Reporter: Ray Chiang
>Assignee: Ray Chiang
>Priority: Major
> Fix For: 3.1.0
>
> Attachments: HADOOP-14651-branch-2.0.004.patch, 
> HADOOP-14651-branch-2.0.004.patch, HADOOP-14651-branch-3.0.004.patch, 
> HADOOP-14651-branch-3.0.004.patch, HADOOP-14651.001.patch, 
> HADOOP-14651.002.patch, HADOOP-14651.003.patch, HADOOP-14651.004.patch
>
>
> The current artifact is:
> com.squareup.okhttp:okhttp:2.4.0
> That version could either be bumped to 2.7.5 (the latest of that line), or we 
> could move to the latest artifact:
> com.squareup.okhttp3:okhttp:3.8.1






[jira] [Updated] (HADOOP-14651) Update okhttp version to 2.7.5

2018-04-03 Thread Steve Loughran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-14651:

Status: Open  (was: Patch Available)

> Update okhttp version to 2.7.5
> --
>
> Key: HADOOP-14651
> URL: https://issues.apache.org/jira/browse/HADOOP-14651
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/adl
>Affects Versions: 3.0.0-beta1
>Reporter: Ray Chiang
>Assignee: Ray Chiang
>Priority: Major
> Fix For: 3.1.0
>
> Attachments: HADOOP-14651-branch-2.0.004.patch, 
> HADOOP-14651-branch-3.0.004.patch, HADOOP-14651-branch-3.0.004.patch, 
> HADOOP-14651.001.patch, HADOOP-14651.002.patch, HADOOP-14651.003.patch, 
> HADOOP-14651.004.patch
>
>
> The current artifact is:
> com.squareup.okhttp:okhttp:2.4.0
> That version could either be bumped to 2.7.5 (the latest of that line), or we 
> could move to the latest artifact:
> com.squareup.okhttp3:okhttp:3.8.1






[jira] [Commented] (HADOOP-15355) TestCommonConfigurationFields is broken by HADOOP-15312

2018-04-03 Thread LiXin Ge (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16423574#comment-16423574
 ] 

LiXin Ge commented on HADOOP-15355:
---

Thanks [~xiaochen] for reviewing and committing this!

> TestCommonConfigurationFields is broken by HADOOP-15312
> ---
>
> Key: HADOOP-15355
> URL: https://issues.apache.org/jira/browse/HADOOP-15355
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: test
>Affects Versions: 2.10.0
>Reporter: Konstantin Shvachko
>Assignee: LiXin Ge
>Priority: Major
> Fix For: 2.10.0, 3.2.0, 3.0.2, 3.1.1
>
> Attachments: HADOOP-15355.001.patch, HADOOP-15355.002.patch
>
>
> TestCommonConfigurationFields is failing after HADOOP-15312.






[jira] [Commented] (HADOOP-15312) Undocumented KeyProvider configuration keys

2018-04-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16423546#comment-16423546
 ] 

Hudson commented on HADOOP-15312:
-

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #13916 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/13916/])
HADOOP-15355. TestCommonConfigurationFields is broken by HADOOP-15312. (xiao: 
rev 1077392eaad303ddd82bcbe259a4045d8a028c20)
* (edit) 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/crypto/key/KeyProvider.java
* (edit) 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/CommonConfigurationKeysPublic.java


> Undocumented KeyProvider configuration keys
> ---
>
> Key: HADOOP-15312
> URL: https://issues.apache.org/jira/browse/HADOOP-15312
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Wei-Chiu Chuang
>Assignee: LiXin Ge
>Priority: Major
> Fix For: 3.1.0, 2.10.0, 3.0.2
>
> Attachments: HADOOP-15312.001.patch, HADOOP-15312.002.patch, 
> HADOOP-15312.003.patch
>
>
> Via HADOOP-14445, I found two undocumented configuration keys: 
> hadoop.security.key.default.bitlength and hadoop.security.key.default.cipher






[jira] [Commented] (HADOOP-15355) TestCommonConfigurationFields is broken by HADOOP-15312

2018-04-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16423545#comment-16423545
 ] 

Hudson commented on HADOOP-15355:
-

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #13916 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/13916/])
HADOOP-15355. TestCommonConfigurationFields is broken by HADOOP-15312. (xiao: 
rev 1077392eaad303ddd82bcbe259a4045d8a028c20)
* (edit) 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/crypto/key/KeyProvider.java
* (edit) 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/CommonConfigurationKeysPublic.java


> TestCommonConfigurationFields is broken by HADOOP-15312
> ---
>
> Key: HADOOP-15355
> URL: https://issues.apache.org/jira/browse/HADOOP-15355
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: test
>Affects Versions: 2.10.0
>Reporter: Konstantin Shvachko
>Assignee: LiXin Ge
>Priority: Major
> Fix For: 2.10.0, 3.2.0, 3.0.2, 3.1.1
>
> Attachments: HADOOP-15355.001.patch, HADOOP-15355.002.patch
>
>
> TestCommonConfigurationFields is failing after HADOOP-15312.






[jira] [Commented] (HADOOP-15317) Improve NetworkTopology chooseRandom's loop

2018-04-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16423526#comment-16423526
 ] 

Hudson commented on HADOOP-15317:
-

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #13915 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/13915/])
HADOOP-15317. Improve NetworkTopology chooseRandom's loop. (xiao: rev 
57374c4737ab0fccf52dae3cea911fc6bd90e1b7)
* (edit) 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/net/NetworkTopology.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/net/TestNetworkTopology.java


> Improve NetworkTopology chooseRandom's loop
> ---
>
> Key: HADOOP-15317
> URL: https://issues.apache.org/jira/browse/HADOOP-15317
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Xiao Chen
>Assignee: Xiao Chen
>Priority: Major
> Fix For: 2.10.0, 2.8.4, 3.2.0, 3.0.2, 3.1.1, 2.9.2
>
> Attachments: HADOOP-15317.01.patch, HADOOP-15317.02.patch, 
> HADOOP-15317.03.patch, HADOOP-15317.04.patch, HADOOP-15317.05.patch, 
> HADOOP-15317.06.patch, Screen Shot 2018-03-28 at 7.23.32 PM.png
>
>
> Recently we found a postmortem case where the ANN appeared to be stuck in an 
> infinite loop. From the logs, it had just gone through a rolling restart, and 
> DNs were getting registered.
> Later the NN became unresponsive, and the stacktrace shows it inside a 
> do-while loop in {{NetworkTopology#chooseRandom}} - part of what was done in 
> HDFS-10320.
> Going through the code and logs, I'm not able to come up with any theory as 
> to why this is happening (I considered incorrect locking, and the Node object 
> being modified outside of NetworkTopology; both seem impossible), but we 
> should eliminate this loop.
> stacktrace:
> {noformat}
>  Stack:
> java.util.HashMap.hash(HashMap.java:338)
> java.util.HashMap.containsKey(HashMap.java:595)
> java.util.HashSet.contains(HashSet.java:203)
> org.apache.hadoop.net.NetworkTopology.chooseRandom(NetworkTopology.java:786)
> org.apache.hadoop.net.NetworkTopology.chooseRandom(NetworkTopology.java:732)
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseDataNode(BlockPlacementPolicyDefault.java:757)
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseRandom(BlockPlacementPolicyDefault.java:692)
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseRandom(BlockPlacementPolicyDefault.java:666)
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseLocalRack(BlockPlacementPolicyDefault.java:573)
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTargetInOrder(BlockPlacementPolicyDefault.java:461)
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:368)
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:243)
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:115)
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4AdditionalDatanode(BlockManager.java:1596)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalDatanode(FSNamesystem.java:3599)
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getAdditionalDatanode(NameNodeRpcServer.java:717)
> {noformat}
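
The unbounded do-while described above can be made safe by capping the number of random probes and then falling back to a deterministic scan, so the method terminates even when the exclusion set covers every candidate. A minimal sketch of that idea (hypothetical `chooseRandomBounded` helper, not the actual NetworkTopology code):

```java
import java.util.List;
import java.util.Random;
import java.util.Set;

public class BoundedChoose {
    private static final Random RANDOM = new Random();

    /**
     * Pick a random element of {@code candidates} that is not in
     * {@code excluded}. Random probing is capped at the candidate count,
     * after which a single linear scan runs, so the method always
     * terminates - even if everything is excluded (returns null then).
     */
    static <T> T chooseRandomBounded(List<T> candidates, Set<T> excluded) {
        int maxProbes = candidates.size();  // cap on random attempts
        for (int i = 0; i < maxProbes; i++) {
            T pick = candidates.get(RANDOM.nextInt(candidates.size()));
            if (!excluded.contains(pick)) {
                return pick;  // fast path: random probe found a valid node
            }
        }
        // Deterministic fallback: one pass over all candidates.
        for (T pick : candidates) {
            if (!excluded.contains(pick)) {
                return pick;
            }
        }
        return null;  // nothing available outside the excluded set
    }
}
```

The key design point is that the loop bound is fixed up front, trading a small bias in the fallback path for a hard termination guarantee.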






[jira] [Updated] (HADOOP-15355) TestCommonConfigurationFields is broken by HADOOP-15312

2018-04-03 Thread Xiao Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-15355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Chen updated HADOOP-15355:
---
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 3.1.1
   3.0.2
   3.2.0
   2.10.0
   Status: Resolved  (was: Patch Available)

Committed this to trunk, branch-3.[0-1], branch-2.
Thanks [~GeLiXin] and [~shv]!

> TestCommonConfigurationFields is broken by HADOOP-15312
> ---
>
> Key: HADOOP-15355
> URL: https://issues.apache.org/jira/browse/HADOOP-15355
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: test
>Affects Versions: 2.10.0
>Reporter: Konstantin Shvachko
>Assignee: LiXin Ge
>Priority: Major
> Fix For: 2.10.0, 3.2.0, 3.0.2, 3.1.1
>
> Attachments: HADOOP-15355.001.patch, HADOOP-15355.002.patch
>
>
> TestCommonConfigurationFields is failing after HADOOP-15312.






[jira] [Commented] (HADOOP-15355) TestCommonConfigurationFields is broken by HADOOP-15312

2018-04-03 Thread Xiao Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16423520#comment-16423520
 ] 

Xiao Chen commented on HADOOP-15355:


+1, let's fix the broken test. Committing this.

> TestCommonConfigurationFields is broken by HADOOP-15312
> ---
>
> Key: HADOOP-15355
> URL: https://issues.apache.org/jira/browse/HADOOP-15355
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: test
>Affects Versions: 2.10.0
>Reporter: Konstantin Shvachko
>Assignee: LiXin Ge
>Priority: Major
> Attachments: HADOOP-15355.001.patch, HADOOP-15355.002.patch
>
>
> TestCommonConfigurationFields is failing after HADOOP-15312.






[jira] [Updated] (HADOOP-15317) Improve NetworkTopology chooseRandom's loop

2018-04-03 Thread Xiao Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-15317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Chen updated HADOOP-15317:
---
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 2.9.2
   3.1.1
   3.0.2
   3.2.0
   2.8.4
   2.10.0
   Status: Resolved  (was: Patch Available)

> Improve NetworkTopology chooseRandom's loop
> ---
>
> Key: HADOOP-15317
> URL: https://issues.apache.org/jira/browse/HADOOP-15317
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Xiao Chen
>Assignee: Xiao Chen
>Priority: Major
> Fix For: 2.10.0, 2.8.4, 3.2.0, 3.0.2, 3.1.1, 2.9.2
>
> Attachments: HADOOP-15317.01.patch, HADOOP-15317.02.patch, 
> HADOOP-15317.03.patch, HADOOP-15317.04.patch, HADOOP-15317.05.patch, 
> HADOOP-15317.06.patch, Screen Shot 2018-03-28 at 7.23.32 PM.png
>
>
> Recently we found a postmortem case where the ANN appeared to be stuck in an 
> infinite loop. From the logs, it had just gone through a rolling restart, and 
> DNs were getting registered.
> Later the NN became unresponsive, and the stacktrace shows it inside a 
> do-while loop in {{NetworkTopology#chooseRandom}} - part of what was done in 
> HDFS-10320.
> Going through the code and logs, I'm not able to come up with any theory as 
> to why this is happening (I considered incorrect locking, and the Node object 
> being modified outside of NetworkTopology; both seem impossible), but we 
> should eliminate this loop.
> stacktrace:
> {noformat}
>  Stack:
> java.util.HashMap.hash(HashMap.java:338)
> java.util.HashMap.containsKey(HashMap.java:595)
> java.util.HashSet.contains(HashSet.java:203)
> org.apache.hadoop.net.NetworkTopology.chooseRandom(NetworkTopology.java:786)
> org.apache.hadoop.net.NetworkTopology.chooseRandom(NetworkTopology.java:732)
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseDataNode(BlockPlacementPolicyDefault.java:757)
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseRandom(BlockPlacementPolicyDefault.java:692)
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseRandom(BlockPlacementPolicyDefault.java:666)
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseLocalRack(BlockPlacementPolicyDefault.java:573)
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTargetInOrder(BlockPlacementPolicyDefault.java:461)
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:368)
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:243)
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:115)
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4AdditionalDatanode(BlockManager.java:1596)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalDatanode(FSNamesystem.java:3599)
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getAdditionalDatanode(NameNodeRpcServer.java:717)
> {noformat}






[jira] [Commented] (HADOOP-15317) Improve NetworkTopology chooseRandom's loop

2018-04-03 Thread Xiao Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16423516#comment-16423516
 ] 

Xiao Chen commented on HADOOP-15317:


Pushed this to all branches (trunk, branch-3.[0-1], branch-2, branch-2.[8-9]) 
to match HDFS-10320. 

Thanks Ajay and Eddy for the reviews!

> Improve NetworkTopology chooseRandom's loop
> ---
>
> Key: HADOOP-15317
> URL: https://issues.apache.org/jira/browse/HADOOP-15317
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Xiao Chen
>Assignee: Xiao Chen
>Priority: Major
> Fix For: 2.10.0, 2.8.4, 3.2.0, 3.0.2, 3.1.1, 2.9.2
>
> Attachments: HADOOP-15317.01.patch, HADOOP-15317.02.patch, 
> HADOOP-15317.03.patch, HADOOP-15317.04.patch, HADOOP-15317.05.patch, 
> HADOOP-15317.06.patch, Screen Shot 2018-03-28 at 7.23.32 PM.png
>
>
> Recently we found a postmortem case where the ANN appeared to be stuck in an 
> infinite loop. From the logs, it had just gone through a rolling restart, and 
> DNs were getting registered.
> Later the NN became unresponsive, and the stacktrace shows it inside a 
> do-while loop in {{NetworkTopology#chooseRandom}} - part of what was done in 
> HDFS-10320.
> Going through the code and logs, I'm not able to come up with any theory as 
> to why this is happening (I considered incorrect locking, and the Node object 
> being modified outside of NetworkTopology; both seem impossible), but we 
> should eliminate this loop.
> stacktrace:
> {noformat}
>  Stack:
> java.util.HashMap.hash(HashMap.java:338)
> java.util.HashMap.containsKey(HashMap.java:595)
> java.util.HashSet.contains(HashSet.java:203)
> org.apache.hadoop.net.NetworkTopology.chooseRandom(NetworkTopology.java:786)
> org.apache.hadoop.net.NetworkTopology.chooseRandom(NetworkTopology.java:732)
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseDataNode(BlockPlacementPolicyDefault.java:757)
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseRandom(BlockPlacementPolicyDefault.java:692)
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseRandom(BlockPlacementPolicyDefault.java:666)
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseLocalRack(BlockPlacementPolicyDefault.java:573)
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTargetInOrder(BlockPlacementPolicyDefault.java:461)
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:368)
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:243)
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:115)
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4AdditionalDatanode(BlockManager.java:1596)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalDatanode(FSNamesystem.java:3599)
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getAdditionalDatanode(NameNodeRpcServer.java:717)
> {noformat}


