Hi James,

Sorry, you are right. I searched for close() and didn't notice the S3Object was declared in a try-with-resources block. The weirder thing is that we only allow a single task to be scheduled at a time, which led me to think it was leaking resources.
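
For anyone following along, here is a minimal sketch of what try-with-resources guarantees. It does not use the AWS SDK: FakeS3Object is a hypothetical stand-in I made up that just records whether close() was called, so this only illustrates the language behavior and says nothing about where the pool timeout actually comes from.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

class TryWithResourcesDemo {

    // Hypothetical stand-in for S3Object (not the AWS SDK class):
    // it wraps a stream and records whether close() was called.
    static final class FakeS3Object implements AutoCloseable {
        private final InputStream content = new ByteArrayInputStream(new byte[] {1, 2, 3});
        boolean closed = false;

        InputStream getObjectContent() {
            return content;
        }

        @Override
        public void close() throws IOException {
            closed = true;
            content.close();
        }
    }

    // Normal path: the resource is closed when the try block ends.
    static boolean closedAfterNormalUse() throws IOException {
        FakeS3Object observed = null;
        try (FakeS3Object obj = new FakeS3Object()) {
            observed = obj;
            obj.getObjectContent().read();
        }
        return observed.closed;
    }

    // Failure path: the resource is closed before the catch block runs.
    static boolean closedAfterException() throws IOException {
        FakeS3Object observed = null;
        try (FakeS3Object obj = new FakeS3Object()) {
            observed = obj;
            throw new IllegalStateException("simulated failure while importing content");
        } catch (IllegalStateException e) {
            // close() has already been called by the time we get here.
        }
        return observed.closed;
    }

    public static void main(String[] args) throws IOException {
        System.out.println("closed after normal use: " + closedAfterNormalUse());
        System.out.println("closed after exception:  " + closedAfterException());
    }
}
```

Both lines print true, which matches James's reading: a resource declared in the try header is closed on the normal path and on the exception path.
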
I'll continue investigating and see if I can find out what's going on.

Thanks!

On 2017-03-22 16:34 (-0400), James Wing <[email protected]> wrote:
> David,
>
> Can you clarify which part of the FetchS3Object code looks problematic to
> you? From a quick look, I found one use of S3Object in FetchS3Object.java,
> line ~106:
>
>     try (final S3Object s3Object = client.getObject(request)) {
>         flowFile = session.importFrom(s3Object.getObjectContent(),
>             flowFile);
>         attributes.put("s3.bucket", s3Object.getBucketName());
>
> I believe declaring the variable within the try block will lead to its
> proper and certain closure, but I'm not 100% on all the fine print with
> that. Is this what you are referring to, and does it not work as I hope?
>
> https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-aws-bundle/nifi-aws-processors/src/main/java/org/apache/nifi/processors/aws/s3/FetchS3Object.java#L106
>
> Thanks,
>
> James
>
>
> On Wed, Mar 22, 2017 at 12:41 PM, David Hesson <[email protected]> wrote:
>
> > Greetings,
> >
> > In investigating a connection pool issue we were having during development,
> > I was checking the FetchS3Object code to see how it reads content from S3.
> > I don't see a close()
> > <http://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/services/s3/model/S3Object.html#close-->
> > invocation on the S3Object in the FetchS3Object processor. I believe this
> > can lead to leaks on that object.
> >
> > We were seeing logs like the following after trying to process some 90k
> > objects from S3:
> > INFO [Timer-Driven Process Thread-55] com.amazonaws.http.AmazonHttpClient
> > Unable to execute HTTP request: Timeout waiting for connection from pool
> >
> > Is the S3Object not closed because the stream content is lazily loaded
> > later in the flow (when accessed)? I didn't check the processSession
> > implementation which reads the input stream. Just figured I'd ask and see
> > if you all were aware, or whether this is for some reason by design.
> >
> > Thanks,
> > dh
> >
>
