[jira] [Commented] (HADOOP-12739) Deadlock with OrcInputFormat split threads and Jets3t connections, since, NativeS3FileSystem does not release connections with seek()

2017-06-21 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16058143#comment-16058143
 ] 

Hadoop QA commented on HADOOP-12739:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 18s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 14m 30s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 20s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 14s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 22s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 30s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 14s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 17s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  0m 11s{color} | {color:orange} hadoop-tools/hadoop-aws: The patch generated 2 new + 36 unchanged - 0 fixed = 38 total (was 36) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 19s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  0s{color} | {color:red} The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 23s{color} | {color:green} hadoop-aws in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 18s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 20m 23s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:14b5c93 |
| JIRA Issue | HADOOP-12739 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12784120/HADOOP-12739.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  findbugs  checkstyle  |
| uname | Linux d1c184f826a3 3.13.0-116-generic #163-Ubuntu SMP Fri Mar 31 14:13:22 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / e806c6e |
| Default Java | 1.8.0_131 |
| findbugs | v3.1.0-RC1 |
| checkstyle | https://builds.apache.org/job/PreCommit-HADOOP-Build/12588/artifact/patchprocess/diff-checkstyle-hadoop-tools_hadoop-aws.txt |
| whitespace | https://builds.apache.org/job/PreCommit-HADOOP-Build/12588/artifact/patchprocess/whitespace-eol.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HADOOP-Build/12588/testReport/ |
| modules | C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws |
| Console output | https://builds.apache.org/job/PreCommit-HADOOP-Build/12588/console |
| Powered by | Apache Yetus 0.5.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Deadlock with OrcInputFormat split threads and Jets3t connections, since, 
> NativeS3FileSystem does not release connections with seek()
> -
>
> Key: HADOOP-12739
> URL: https://is

[jira] [Commented] (HADOOP-12739) Deadlock with OrcInputFormat split threads and Jets3t connections, since, NativeS3FileSystem does not release connections with seek()

2017-06-21 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16058093#comment-16058093
 ] 

Steve Loughran commented on HADOOP-12739:
-

I'm really tempted to close this as a wontfix, just because we're trying to 
move everyone on to S3A.

S3A has a lot of performance updates for reading columnar data, where seek() 
performance is a key feature.

Can you upgrade to Hadoop 2.8?

> Deadlock with OrcInputFormat split threads and Jets3t connections, since, 
> NativeS3FileSystem does not release connections with seek()
> -
>
> Key: HADOOP-12739
> URL: https://issues.apache.org/jira/browse/HADOOP-12739
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/s3
>Affects Versions: 2.6.0, 2.7.0
>Reporter: Pavan Srinivas
>Assignee: Pavan Srinivas
> Attachments: 11600.txt, HADOOP-12739.patch
>
>
> Recently, we came across a deadlock in OrcInputFormat while 
> computing splits. 
> - For split computation, ORC needs the file listing and file sizes. 
> - Multiple threads are started to list the files and, if the data is 
> located in S3, NativeS3FileSystem is used. 
> - NativeS3FileSystem in turn uses the JetS3t library to talk to AWS and to 
> maintain a connection pool. 
> - When the number of threads from OrcInputFormat exceeds JetS3t's maximum 
> number of connections, a deadlock occurs. Stack trace: 
> {code}
> "ORC_GET_SPLITS #5" daemon prio=10 tid=0x7f8568108800 nid=0x1e29 in Object.wait() [0x7f8565696000]
>    java.lang.Thread.State: WAITING (on object monitor)
>   at java.lang.Object.wait(Native Method)
>   - waiting on <0xdf9ed450> (a org.apache.commons.httpclient.MultiThreadedHttpConnectionManager$ConnectionPool)
>   at org.apache.commons.httpclient.MultiThreadedHttpConnectionManager.doGetConnection(MultiThreadedHttpConnectionManager.java:518)
>   - locked <0xdf9ed450> (a org.apache.commons.httpclient.MultiThreadedHttpConnectionManager$ConnectionPool)
>   at org.apache.commons.httpclient.MultiThreadedHttpConnectionManager.getConnectionWithTimeout(MultiThreadedHttpConnectionManager.java:416)
>   at org.apache.commons.httpclient.HttpMethodDirector.executeMethod(HttpMethodDirector.java:153)
>   at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:397)
>   at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:323)
>   at org.jets3t.service.impl.rest.httpclient.RestStorageService.performRequest(RestStorageService.java:370)
>   at org.jets3t.service.impl.rest.httpclient.RestStorageService.performRestGet(RestStorageService.java:929)
>   at org.jets3t.service.impl.rest.httpclient.RestStorageService.getObjectImpl(RestStorageService.java:2007)
>   at org.jets3t.service.impl.rest.httpclient.RestStorageService.getObjectImpl(RestStorageService.java:1944)
>   at org.jets3t.service.S3Service.getObject(S3Service.java:2625)
>   at org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.retrieve(Jets3tNativeFileSystemStore.java:254)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
>   at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
>   at org.apache.hadoop.fs.s3native.$Proxy12.retrieve(Unknown Source)
>   at org.apache.hadoop.fs.s3native.NativeS3FileSystem$NativeS3FsInputStream.reopen(NativeS3FileSystem.java:269)
>   - locked <0xdb01eec0> (a org.apache.hadoop.fs.s3native.NativeS3FileSystem$NativeS3FsInputStream)
>   at org.apache.hadoop.fs.s3native.NativeS3FileSystem$NativeS3FsInputStream.seek(NativeS3FileSystem.java:258)
>   - locked <0xdb01eec0> (a org.apache.hadoop.fs.s3native.NativeS3FileSystem$NativeS3FsInputStream)
>   at org.apache.hadoop.fs.BufferedFSInputStream.seek(BufferedFSInputStream.java:98)
>   at org.apache.hadoop.fs.FSDataInputStream.seek(FSDataInputStream.java:63)
>   - locked <0xdb01ee70> (a org.apache.hadoop.fs.FSDataInputStream)
>   at org.apache.hadoop.hive.ql.io.orc.ReaderImpl.extractMetaInfoFromFooter(ReaderImpl.java:329)
>   at org.apache.hadoop.hive.ql.io.orc.ReaderImpl.<init>(ReaderImpl.java:292)
>   at org.apache.hadoop.hive.ql.io.orc.OrcFile.createReader(OrcFile.java:197)
>   at 
> o

[jira] [Commented] (HADOOP-12739) Deadlock with OrcInputFormat split threads and Jets3t connections, since, NativeS3FileSystem does not release connections with seek()

2016-01-25 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15115004#comment-15115004
 ] 

Steve Loughran commented on HADOOP-12739:
-

If there's something which scares us, it's patches for s3n. Something always 
breaks somewhere else. So while I don't doubt your discovery of the bug, I 
worry about the implications for fixing it. In particular, we know that the 
latest jets3t uses an http client lib which close()s connections by reading in 
the rest of the stream ... not what we want to do when seeking a few bytes in a 
many GB file. I don't know if the patch here makes that any worse, or just 
hurts seek more.
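
For readers following along, the failure mode reported in this issue can be pictured with a toy model. This is only a sketch under stated assumptions: a `Semaphore` stands in for the bounded JetS3t connection pool, and `SeekPoolSketch`, `PooledStream` and `successfulSeeks` are hypothetical names, not Hadoop or JetS3t APIs. It uses a timed acquire so the demo reports a failure instead of hanging the way a blocking pool would.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

/** Toy model of a bounded connection pool; not the JetS3t or Hadoop API. */
public class SeekPoolSketch {

    /** A stream that holds one pooled "connection" (a semaphore permit) while open. */
    static final class PooledStream {
        private final Semaphore pool;
        private final boolean releaseBeforeReopen;
        private boolean holdsConnection;

        PooledStream(Semaphore pool, boolean releaseBeforeReopen) {
            this.pool = pool;
            this.releaseBeforeReopen = releaseBeforeReopen;
        }

        /** Returns false if no connection became available within the timeout. */
        boolean seek(long newPos) throws InterruptedException {
            if (releaseBeforeReopen && holdsConnection) {
                pool.release();              // hand the old connection back first
                holdsConnection = false;
            }
            // Reopening at the new offset needs a connection from the pool.
            if (!pool.tryAcquire(100, TimeUnit.MILLISECONDS)) {
                return false;                // a blocking pool would wait here indefinitely
            }
            holdsConnection = true;          // in the leaky variant the old permit is never returned
            return true;
        }
    }

    /** Seeks one stream repeatedly on a pool of the given size; returns how many seeks got a connection. */
    static int successfulSeeks(int poolSize, int seeks, boolean releaseBeforeReopen)
            throws InterruptedException {
        Semaphore pool = new Semaphore(poolSize);
        PooledStream in = new PooledStream(pool, releaseBeforeReopen);
        int ok = 0;
        for (int i = 0; i < seeks; i++) {
            if (in.seek(i * 1024L)) {
                ok++;
            }
        }
        return ok;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("leaky seek:     " + successfulSeeks(2, 5, false));
        System.out.println("releasing seek: " + successfulSeeks(2, 5, true));
    }
}
```

In this model a seek that reopens without releasing exhausts a pool of two after two seeks, while the releasing variant never holds more than one connection. How the real connection is released (drained versus aborted) is exactly the cost concern raised above and is not modelled here.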

# The patch submission process for object stores is documented at the link 
below; please reassure us that you ran the whole AWS test suite and that it passed: 
https://wiki.apache.org/hadoop/HowToContribute#Submitting_patches_against_object_stores_such_as_Amazon_S3.2C_OpenStack_Swift_and_Microsoft_Azure

# What happens on S3A? It's the better-performing FS, and there's actually a 
pending patch there for lazy seeks: the input stream isn't even opened until 
the read.
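
The lazy-seek idea can be sketched roughly as follows. This is an illustration only, not the pending S3A patch: `LazySeekStream` and `RangeOpener` are hypothetical names, and an in-memory byte array stands in for a ranged GET against the object store. `seek()` just records the target offset; the backing stream is opened on the next `read()`.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

/** Toy lazy-seek stream; RangeOpener is a hypothetical stand-in for a ranged GET. */
public class LazySeekSketch {

    interface RangeOpener {
        InputStream open(long offset) throws IOException;
    }

    static final class LazySeekStream extends InputStream {
        private final RangeOpener opener;
        private InputStream wrapped;   // null until the first read
        private long wrappedPos;       // offset the wrapped stream is currently at
        private long targetPos;        // offset requested by the last seek()
        private int opens;             // how many times the backing stream was opened

        LazySeekStream(RangeOpener opener) {
            this.opener = opener;
        }

        /** Records the new position only; performs no I/O. */
        void seek(long pos) {
            targetPos = pos;
        }

        @Override
        public int read() throws IOException {
            // Open (or reopen) only when a read actually needs data at a new offset.
            if (wrapped == null || wrappedPos != targetPos) {
                if (wrapped != null) {
                    wrapped.close();
                }
                wrapped = opener.open(targetPos);
                wrappedPos = targetPos;
                opens++;
            }
            int b = wrapped.read();
            if (b >= 0) {
                wrappedPos++;
                targetPos++;
            }
            return b;
        }

        @Override
        public void close() throws IOException {
            if (wrapped != null) {
                wrapped.close();
                wrapped = null;
            }
        }

        int opens() {
            return opens;
        }
    }

    /** Seeks twice, then reads once; returns {opens before read, opens after read, byte read}. */
    static int[] demo() throws IOException {
        byte[] data = "0123456789".getBytes(StandardCharsets.US_ASCII);
        RangeOpener opener = offset ->
            new ByteArrayInputStream(data, (int) offset, data.length - (int) offset);
        try (LazySeekStream in = new LazySeekStream(opener)) {
            in.seek(3);
            in.seek(7);
            int opensBeforeRead = in.opens();
            int firstByte = in.read();
            return new int[] {opensBeforeRead, in.opens(), firstByte};
        }
    }

    public static void main(String[] args) throws IOException {
        int[] r = demo();
        System.out.println("opens before read=" + r[0] + ", after read=" + r[1]
            + ", byte=" + (char) r[2]);
    }
}
```

With this shape, back-to-back seeks cost nothing until data is requested, which is the property that matters for footer-then-stripe access patterns such as ORC's.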



[jira] [Commented] (HADOOP-12739) Deadlock with OrcInputFormat split threads and Jets3t connections, since, NativeS3FileSystem does not release connections with seek()

2016-01-25 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15114999#comment-15114999
 ] 

Hadoop QA commented on HADOOP-12739:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s {color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s {color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 58s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 12s {color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 14s {color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 9s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 20s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 13s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 31s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 13s {color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 16s {color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 14s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 10s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 10s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 11s {color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 11s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 10s {color} | {color:red} hadoop-tools/hadoop-aws: patch generated 2 new + 35 unchanged - 0 fixed = 37 total (was 35) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 17s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 11s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s {color} | {color:red} The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 41s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 11s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 13s {color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 9s {color} | {color:green} hadoop-aws in the patch passed with JDK v1.8.0_66. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 12s {color} | {color:green} hadoop-aws in the patch passed with JDK v1.7.0_91. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 17s {color} | {color:green} Patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 14m 7s {color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:0ca8df7 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12784120/HADOOP-12739.patch |
| JIRA Issue | HADOOP-12739 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  findbugs  checkstyle  |
| uname | Linux 260ba1b14c51 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x8