[
https://issues.apache.org/jira/browse/NIFI-2631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15459061#comment-15459061
]
ASF GitHub Bot commented on NIFI-2631:
--------------------------------------
Github user gresockj commented on the issue:
https://github.com/apache/nifi/pull/917
Yeah, I'm on a cluster. I'll keep digging.
> ListS3 improvements: "Use versions" and "Commit mode"
> -----------------------------------------------------
>
> Key: NIFI-2631
> URL: https://issues.apache.org/jira/browse/NIFI-2631
> Project: Apache NiFi
> Issue Type: Improvement
> Affects Versions: 0.7.0
> Reporter: Joseph Gresock
> Assignee: Joseph Gresock
> Priority: Minor
> Fix For: 1.1.0, 0.8.0
>
>
> Our team needs to be able to list individual versions in S3. We also ran
> into a use case where a bucket with many objects (over 1 million in our case)
> seemed to cause ListS3 to run forever. The S3 list command finished in a few
> minutes, but we believe it was taking a very long time for NiFi to commit all
> the flow files at once.
> To handle this use case, we added a Commit Mode property to ListS3 that
> lets you specify whether to commit "Per page" or "Once". With "Per page",
> flow files are correctly emitted as the S3 paging progresses.
> We also implemented support for S3 List Versions, which includes the
> "s3.version" and "s3.isLatest" attributes if applicable. The "s3.version"
> attribute can in turn be used in the FetchS3Object processor.
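
The difference between the "Per page" and "Once" commit modes described above can be pictured with a small simulation. This is only a sketch in plain Python, not NiFi code: FakeSession, pages, list_objects, and the "per_page"/"once" strings are hypothetical names made up for illustration, and the real ListS3 implementation may be structured differently.

```python
from typing import Iterator, List


class FakeSession:
    """Hypothetical stand-in for a processor session; NOT the NiFi API."""

    def __init__(self) -> None:
        self.pending: List[str] = []                  # emitted but not yet committed
        self.committed_batches: List[List[str]] = []  # one entry per commit

    def emit(self, key: str) -> None:
        self.pending.append(key)

    def commit(self) -> None:
        # Committing with nothing pending is a no-op.
        if self.pending:
            self.committed_batches.append(self.pending)
            self.pending = []


def pages(keys: List[str], page_size: int) -> Iterator[List[str]]:
    """Yield the listing in fixed-size pages, as a paged list call would."""
    for start in range(0, len(keys), page_size):
        yield keys[start:start + page_size]


def list_objects(session: FakeSession, keys: List[str],
                 page_size: int, commit_mode: str) -> List[List[str]]:
    """Emit one item per key, committing per page or once at the end."""
    if commit_mode not in ("per_page", "once"):
        raise ValueError(f"unknown commit mode: {commit_mode}")
    for page in pages(keys, page_size):
        for key in page:
            session.emit(key)
        if commit_mode == "per_page":
            session.commit()  # downstream sees this page right away
    session.commit()          # "once": a single commit covering everything
    return session.committed_batches
```

With five keys and a page size of two, "per_page" produces three small commits while "once" produces a single commit holding all five items. On a bucket with over a million objects, that single commit is the step suspected of taking so long.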