Anderson Vaz created CAMEL-14929:
------------------------------------

             Summary: camel-aws2-s3 - Doesn't support stream download of large 
files. 
                 Key: CAMEL-14929
                 URL: https://issues.apache.org/jira/browse/CAMEL-14929
             Project: Camel
          Issue Type: Improvement
          Components: camel-aws2
    Affects Versions: 3.2.0
            Reporter: Anderson Vaz


Hi,

The component `*camel-aws2-s3*` should support streaming consume/download so 
that large files can be copied or downloaded from S3. The current 
implementation either reads the whole contents of the input stream into memory 
or discards the stream entirely, giving the next components no chance to 
manipulate it. This does not seem to be an ideal implementation.

The issue is essentially in class 
`*org.apache.camel.component.aws2.s3.AWS2S3Endpoint*`, between lines *169* and 
*178* and between lines *201* and *212*.

The logic on lines 169 to 178 is: 
 * if the parameter `*includeBody*` is *true*, it consumes the S3 stream into 
memory, which is not ideal for large files.
 * if the parameter `*includeBody*` is *false*, it does not consume the S3 
stream, but the stream is then lost. I couldn't find any other way to access 
it, so in this case the S3 stream is opened for nothing. This doesn't seem 
reasonable either. I think the raw S3 stream should be put in the `*body*` so 
the next component in the pipeline can consume it.
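
To illustrate why a raw stream in the body would help, here is a minimal sketch of how a downstream step could copy such a stream in fixed-size chunks, so memory use stays bounded by the buffer size regardless of the object size. It uses only `java.io`; the class name `StreamingCopySketch` and the method `copy` are hypothetical and are not part of Camel or the AWS SDK.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;

// Hypothetical helper, not part of camel-aws2-s3.
final class StreamingCopySketch {

    /**
     * Copies everything from 'in' to 'out' using a buffer of 'bufferSize' bytes.
     * Only one buffer's worth of data is held in memory at a time.
     * Returns the total number of bytes copied. Does not close either stream.
     */
    static long copy(InputStream in, OutputStream out, int bufferSize) {
        byte[] buffer = new byte[bufferSize];
        long total = 0;
        try {
            int read;
            // read() returns -1 at end of stream
            while ((read = in.read(buffer)) != -1) {
                out.write(buffer, 0, read);
                total += read;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return total;
    }
}
```

In a route, the `InputStream` would be whatever the consumer placed in the body; the sketch deliberately leaves closing to the caller.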

The logic on lines 201 to 212 is:
 * if the parameter `*includeBody*` is *false*, it surprisingly closes the S3 
input stream, confirming that there is no way to consume it afterwards.
 * if the parameter `*includeBody*` is *true*, the S3 input stream is left 
open; however, there is no way to access it, because it is created on line 77 
of `*org.apache.camel.component.aws2.s3.AWS2S3Consumer*` and, unless it is 
included in the body, it is lost afterwards.

The ideal behaviour I think would be:
 * if `*includeBody*` is *true*, then read the S3 input stream into memory, 
save the contents in the body and close the stream.
 * if `*includeBody*` is *false*, then put the raw S3 input stream in the body 
and don't close it.
 * if `*autoCloseBody*` is *true*, then schedule the S3 input stream to be 
closed when the exchange is finished.
 * if `*autoCloseBody*` is *false*, then leave it to the caller to close the 
stream, although I'm not sure how this can be done in the current 
implementation.
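
The proposed behaviour could be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual `AWS2S3Endpoint` code: `SketchExchange` is a hypothetical stand-in for a Camel exchange (a body plus completion callbacks), and `populateBody` is a hypothetical method name. It requires Java 9+ for `InputStream.readAllBytes()`.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the proposed includeBody/autoCloseBody handling.
final class S3BodySketch {

    /** Hypothetical stand-in for an exchange: a body and on-completion tasks. */
    public static final class SketchExchange {
        public Object body;
        private final List<Runnable> onCompletion = new ArrayList<>();

        public void addOnCompletion(Runnable task) {
            onCompletion.add(task);
        }

        /** Simulates the exchange finishing: runs all registered tasks. */
        public void complete() {
            onCompletion.forEach(Runnable::run);
        }
    }

    public static void populateBody(SketchExchange exchange, InputStream s3Stream,
                                    boolean includeBody, boolean autoCloseBody) {
        if (includeBody) {
            // Read everything into memory, store the bytes, and close the stream.
            try (InputStream in = s3Stream) {
                exchange.body = in.readAllBytes();
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
            return;
        }
        // Hand the raw stream to the next step without consuming or closing it.
        exchange.body = s3Stream;
        if (autoCloseBody) {
            // Close the stream only once the exchange is finished.
            exchange.addOnCompletion(() -> {
                try {
                    s3Stream.close();
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        }
        // With autoCloseBody == false, closing is left to whoever reads the body.
    }
}
```

In this sketch, `includeBody=false` with `autoCloseBody=true` leaves the stream readable for the whole route and closes it when `complete()` runs.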

 

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
