[ 
https://issues.apache.org/jira/browse/AVRO-1144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13724816#comment-13724816
 ] 

Hudson commented on AVRO-1144:
------------------------------

SUCCESS: Integrated in AvroJava #387 (See [https://builds.apache.org/job/AvroJava/387/])
AVRO-1144 Deadlock with FSInput and Hadoop NativeS3FileSystem (scottcarey: rev 1508713)
* /avro/trunk/CHANGES.txt
* /avro/trunk/lang/java/mapred/src/main/java/org/apache/avro/mapred/FsInput.java

                
> Deadlock with FSInput and Hadoop NativeS3FileSystem.
> ----------------------------------------------------
>
>                 Key: AVRO-1144
>                 URL: https://issues.apache.org/jira/browse/AVRO-1144
>             Project: Avro
>          Issue Type: Bug
>          Components: java
>    Affects Versions: 1.7.0
>         Environment: Hadoop 1.0.3
>            Reporter: Shawn Smith
>            Assignee: Scott Carey
>             Fix For: 1.7.5
>
>         Attachments: AVRO-1144.patch
>
>
> Deadlock can occur when using org.apache.avro.mapred.FsInput to read files 
> from S3 using the Hadoop NativeS3FileSystem and multiple threads.
> There are a lot of components involved, but the basic cause is pretty simple: 
> Apache Commons HttpClient can deadlock waiting for a free HTTP connection 
> when the number of threads downloading from S3 is greater than or equal to 
> the maximum allowed HTTP connections per host.
> I've filed this bug against Avro because the bug is easiest to fix in Avro.  
> Swap the order of the FileSystem.open() and FileSystem.getFileStatus() calls 
> in the FSInput constructor:
> {noformat}
> /** Construct given a path and a configuration. */
> public FsInput(Path path, Configuration conf) throws IOException {
>   this.stream = path.getFileSystem(conf).open(path);
>   this.len = path.getFileSystem(conf).getFileStatus(path).getLen();
> }
> {noformat}
> to
> {noformat}
> /** Construct given a path and a configuration. */
> public FsInput(Path path, Configuration conf) throws IOException {
>   this.len = path.getFileSystem(conf).getFileStatus(path).getLen();
>   this.stream = path.getFileSystem(conf).open(path);
> }
> {noformat}
> Here's what triggers the deadlock:
> * FSInput calls FileSystem.open() which calls Jets3t to connect to S3 and 
> open an HTTP connection for downloading content.  This acquires an HTTP 
> connection but does not release it.
> * FSInput calls FileSystem.getFileStatus() which calls Jets3t to connect to 
> S3 and perform a HEAD request to get object metadata.  This attempts to 
> acquire a second HTTP connection.
> * Jets3t uses Apache Commons HTTP Client which limits the number of 
> simultaneous HTTP connections to a given host.  Let's say this maximum is 4 
> (the default)...  If 4 threads all call the FSInput constructor concurrently, 
> the 4 FileSystem.open() calls can acquire all 4 available connections and the 
> FileSystem.getFileStatus() calls block forever waiting for a thread to 
> release an HTTP connection back to the connection pool.
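
The trigger described in the list above can be imitated without S3 at all. The following is a minimal sketch, not the HttpClient or Jets3t code: it assumes a `java.util.concurrent.Semaphore` is a fair stand-in for the per-host connection pool, and it uses a bounded wait where the real pool waits forever. The names `OpenFirstDeadlockDemo`, `countStalled` and `WAIT_MS` are hypothetical.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Simulates the open()-then-getFileStatus() ordering against a bounded pool. */
class OpenFirstDeadlockDemo {
    /** Bounded wait used only so the demo terminates; the real wait is unbounded. */
    static final long WAIT_MS = 200;

    /** Returns how many workers could not get every connection they needed. */
    static int countStalled(int maxConnections, int threads) {
        Semaphore pool = new Semaphore(maxConnections);      // stand-in for the HTTP pool
        CountDownLatch allOpened = new CountDownLatch(threads);
        CountDownLatch allChecked = new CountDownLatch(threads);
        AtomicInteger stalled = new AtomicInteger();
        ExecutorService workers = Executors.newFixedThreadPool(threads);

        for (int i = 0; i < threads; i++) {
            workers.execute(() -> {
                boolean opened = false;
                try {
                    // "open()": take a connection and keep it for the stream.
                    opened = pool.tryAcquire(WAIT_MS, TimeUnit.MILLISECONDS);
                    allOpened.countDown();
                    allOpened.await();                       // force the worst-case interleaving
                    // "getFileStatus()": needs a second connection for the HEAD request.
                    if (opened && pool.tryAcquire(WAIT_MS, TimeUnit.MILLISECONDS)) {
                        pool.release();                      // HEAD finished, connection returned
                    } else {
                        stalled.incrementAndGet();           // would block forever in the real pool
                    }
                    // Hold the stream until every worker has made its attempt.
                    allChecked.countDown();
                    allChecked.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    if (opened) {
                        pool.release();                      // "close()" the stream
                    }
                }
            });
        }

        workers.shutdown();
        try {
            if (!workers.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("simulation did not finish");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return stalled.get();
    }
}
```

Under these assumptions, `countStalled(4, 4)` reports all four workers stalled, `countStalled(4, 3)` reports none because one connection is always left over for the HEAD requests, and `countStalled(1, 1)` stalls its single worker, which corresponds to the one-connection case.
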
> A simple way to reproduce this problem is to create 
> "jets3t.properties" in your classpath with "httpclient.max-connections=1".  
> Then try to open a file using FSInput and the Native S3 file system (new 
> Path("s3n://<bucket>/<path>")).  It will hang indefinitely inside the FSInput 
> constructor.
> Swapping the order of the open() and getFileStatus() calls ensures that a 
> given thread using FSInput has at most one outstanding connection to S3 at a 
> time.  As a result, one thread should always be able to make progress, 
> avoiding deadlock.
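
To illustrate why the order matters even for a single thread, here is a small sketch of both constructor orderings against a pool with one connection. It is not the FsInput source: `StatusFirstDemo`, `PooledStore`, `Stream` and `Input` are hypothetical names, the object length of 42 is made up, and the store throws when no connection is free instead of blocking as the real connection manager does.

```java
import java.util.concurrent.Semaphore;

/** Sketch: metadata lookup before open() versus open() before metadata lookup. */
class StatusFirstDemo {

    /** Hypothetical handle for an open stream that holds one pooled connection. */
    interface Stream {
        void close();
    }

    /** Hypothetical store in which every call needs a pooled connection. */
    static final class PooledStore {
        private final Semaphore pool;

        PooledStore(int maxConnections) {
            this.pool = new Semaphore(maxConnections);
        }

        /** Like getFileStatus(): borrows a connection and returns it before returning. */
        long length() {
            if (!pool.tryAcquire()) {
                throw new IllegalStateException("no free connection for HEAD");
            }
            try {
                return 42L;                  // made-up object length
            } finally {
                pool.release();
            }
        }

        /** Like open(): borrows a connection and keeps it until close(). */
        Stream open() {
            if (!pool.tryAcquire()) {
                throw new IllegalStateException("no free connection for GET");
            }
            return pool::release;
        }

        int free() {
            return pool.availablePermits();
        }
    }

    /** Mirrors the two constructor orderings discussed in the issue. */
    static final class Input {
        final long len;
        final Stream stream;

        Input(PooledStore store, boolean statusFirst) {
            if (statusFirst) {
                this.len = store.length();   // connection borrowed and returned
                this.stream = store.open();  // only now is a connection held
            } else {
                this.stream = store.open();  // connection held from here on
                this.len = store.length();   // needs a second one; fails if the pool is exhausted
            }
        }
    }
}
```

With `new PooledStore(1)`, constructing `new Input(store, true)` succeeds and holds exactly one connection until `stream.close()` is called, while `new Input(store, false)` fails at the length lookup. In the real stack the second case hangs rather than throwing.
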
> Here's a sample stack trace of a deadlocked thread:
> {noformat}
> "pool-10-thread-3" prio=5 tid=11026f800 nid=0x116a04000 in Object.wait() [116a02000]
>    java.lang.Thread.State: WAITING (on object monitor)
>       at java.lang.Object.wait(Native Method)
>       - waiting on <785892cc0> (a org.apache.commons.httpclient.MultiThreadedHttpConnectionManager$ConnectionPool)
>       at org.apache.commons.httpclient.MultiThreadedHttpConnectionManager.doGetConnection(MultiThreadedHttpConnectionManager.java:518)
>       - locked <785892cc0> (a org.apache.commons.httpclient.MultiThreadedHttpConnectionManager$ConnectionPool)
>       at org.apache.commons.httpclient.MultiThreadedHttpConnectionManager.getConnectionWithTimeout(MultiThreadedHttpConnectionManager.java:416)
>       at org.apache.commons.httpclient.HttpMethodDirector.executeMethod(HttpMethodDirector.java:153)
>       at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:397)
>       at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:323)
>       at org.jets3t.service.impl.rest.httpclient.RestS3Service.performRequest(RestS3Service.java:357)
>       at org.jets3t.service.impl.rest.httpclient.RestS3Service.performRestHead(RestS3Service.java:652)
>       at org.jets3t.service.impl.rest.httpclient.RestS3Service.getObjectImpl(RestS3Service.java:1556)
>       at org.jets3t.service.impl.rest.httpclient.RestS3Service.getObjectDetailsImpl(RestS3Service.java:1492)
>       at org.jets3t.service.S3Service.getObjectDetails(S3Service.java:1793)
>       at org.jets3t.service.S3Service.getObjectDetails(S3Service.java:1225)
>       at org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.retrieveMetadata(Jets3tNativeFileSystemStore.java:111)
>       at sun.reflect.GeneratedMethodAccessor8.invoke(Unknown Source)
>       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>       at java.lang.reflect.Method.invoke(Method.java:597)
>       at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
>       at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
>       at org.apache.hadoop.fs.s3native.$Proxy25.retrieveMetadata(Unknown Source)
>       at org.apache.hadoop.fs.s3native.NativeS3FileSystem.getFileStatus(NativeS3FileSystem.java:326)
>       at org.apache.avro.mapred.FsInput.<init>(FsInput.java:38)
>       at org.apache.crunch.io.avro.AvroFileReaderFactory.read(AvroFileReaderFactory.java:70)
>       at org.apache.crunch.io.CompositePathIterable$2.<init>(CompositePathIterable.java:80)
>       at org.apache.crunch.io.CompositePathIterable.iterator(CompositePathIterable.java:78)
>       at com.example.load.BulkLoader$1.run(BulkLoadCommand.java:109)
>       at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
>       at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>       at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>       at java.lang.Thread.run(Thread.java:680)
> {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
