[jira] [Commented] (NIFI-7579) Create a GetS3Object Processor
[ https://issues.apache.org/jira/browse/NIFI-7579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17152860#comment-17152860 ]

Pierre Villard commented on NIFI-7579:
--------------------------------------

I think the point here is that GenerateFlowFile (where you can set flow file attributes as dynamic properties) -> FetchS3Object would provide the exact feature you're looking for. Another option would be to make the incoming connection on FetchS3Object optional, but I'm not sure this would provide a good user experience.

> Create a GetS3Object Processor
> ------------------------------
>
>                 Key: NIFI-7579
>                 URL: https://issues.apache.org/jira/browse/NIFI-7579
>             Project: Apache NiFi
>          Issue Type: New Feature
>            Reporter: ArpStorm1
>            Assignee: YoungGyu Chun
>            Priority: Major
>
> Sometimes the client needs to get only a specific object or a subset of objects
> from its bucket. Currently, the only way to do this is to use the ListS3 processor
> followed by the FetchS3Object processor. A GetS3Object processor for such cases
> would be useful.

--
This message was sent by Atlassian Jira (v8.3.4#803005)
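The GenerateFlowFile -> FetchS3Object suggestion can be pictured with a small in-memory sketch. This is not NiFi code and does not use any S3 client: `CountingStore`, `generate_flow_file`, `fetch_for`, and the `filename` attribute name are all hypothetical stand-ins chosen for illustration. The idea it shows is that a flow file carrying only attributes is enough to drive a fetch of a known key, so no list call is ever issued.

```python
from typing import Dict, List


class CountingStore:
    """Hypothetical in-memory object store that counts list and get calls."""

    def __init__(self, objects: Dict[str, bytes]) -> None:
        self._objects = dict(objects)
        self.list_calls = 0
        self.get_calls = 0

    def list_keys(self) -> List[str]:
        # The potentially expensive operation the reporter wants to avoid.
        self.list_calls += 1
        return sorted(self._objects)

    def get(self, key: str) -> bytes:
        self.get_calls += 1
        return self._objects[key]


def generate_flow_file(attributes: Dict[str, str]) -> Dict[str, str]:
    """Stands in for a generator step: emits a flow file that carries only attributes."""
    return dict(attributes)


def fetch_for(flow_file: Dict[str, str], store: CountingStore) -> bytes:
    """Stands in for a fetch step: reads the object key from an attribute."""
    # "filename" is an assumed attribute name for this sketch.
    return store.get(flow_file["filename"])


if __name__ == "__main__":
    store = CountingStore({"reports/2020-06.csv": b"data", "other.bin": b"x"})
    flow_file = generate_flow_file({"filename": "reports/2020-06.csv"})
    content = fetch_for(flow_file, store)
    print(content, store.list_calls, store.get_calls)
```

In this sketch the store's list counter stays at zero, which is the property the reporter is asking for; whether the real processors behave the same way depends on how they are configured.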
[jira] [Commented] (NIFI-7579) Create a GetS3Object Processor
[ https://issues.apache.org/jira/browse/NIFI-7579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17148915#comment-17148915 ]

ArpStorm1 commented on NIFI-7579:
---------------------------------

The problem with the List/Fetch pattern for S3 is the need to first list all the objects, and the list operation can be very heavy. S3 is a common object storage standard today, implemented by vendors other than Amazon as well. Using the ListS3 processor can put a heavy workload on the backend storage, resulting in slow responses that can fail the entire flow. Sometimes that can be avoided by getting the exact object the user needs. GetS3Object does not have to be the solution; implementing this logic in the FetchS3Object processor might be enough.
[jira] [Commented] (NIFI-7579) Create a GetS3Object Processor
[ https://issues.apache.org/jira/browse/NIFI-7579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17148342#comment-17148342 ]

Wouter de Vries commented on NIFI-7579:
---------------------------------------

[~ArpStorm1] do you have a specific reason why that should not happen? As far as I can see, the code of this new processor would be by and large identical to the existing FetchS3Object processor. Is that not the case?
[jira] [Commented] (NIFI-7579) Create a GetS3Object Processor
[ https://issues.apache.org/jira/browse/NIFI-7579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17148081#comment-17148081 ]

Mark Payne commented on NIFI-7579:
----------------------------------

I'm not sure that we should introduce another processor just to avoid needing to connect a List/Fetch pair of processors. The pattern of GetXYZ is an older pattern and most of the newer processors that are responsible for gathering files/blobs of data and the like tend to follow the List/Fetch pattern. This pattern has proven to provide many advantages over the Get pattern. It allows for easy and powerful filtering of data before fetching the data. It separates the concerns of listing and maintaining state about what's been seen from actually gathering data. It provides a very powerful mechanism for distributing the data and processing load across the cluster. It makes it far easier to handle flows that are more batch-oriented, with the introduction of NIFI-7476.

I would be a -1 on adding a new processor just to avoid needing to connect an upstream List processor. It would mean additional code that must be maintained and would lead to confusion for users when trying to determine which Processor they need, especially for newer users.
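The separation of concerns described in this comment can be sketched in a few lines. This is an illustrative model only, not NiFi's implementation: the `Lister` class, the `fetch` and `run_once` functions, and the dict used as a bucket are hypothetical names introduced here. It shows a stateful lister that emits only keys it has not seen, a filter applied before any content is read, and a stateless fetcher that could run on any node.

```python
from typing import Callable, Dict, Iterable, List, Set

# Hypothetical in-memory stand-in for a bucket: object key -> content.
Bucket = Dict[str, bytes]


class Lister:
    """Lists keys and remembers which ones it has already emitted."""

    def __init__(self) -> None:
        self._seen: Set[str] = set()

    def list_new(self, bucket: Bucket) -> List[str]:
        # Only keys not emitted by an earlier run are returned.
        new_keys = sorted(k for k in bucket if k not in self._seen)
        self._seen.update(new_keys)
        return new_keys


def fetch(bucket: Bucket, keys: Iterable[str]) -> Dict[str, bytes]:
    """Fetches content only for the keys it is handed; keeps no state."""
    return {k: bucket[k] for k in keys}


def run_once(
    bucket: Bucket, lister: Lister, keep: Callable[[str], bool]
) -> Dict[str, bytes]:
    """One pass of list -> filter -> fetch."""
    listed = lister.list_new(bucket)
    wanted = [k for k in listed if keep(k)]  # filtering happens before fetching
    return fetch(bucket, wanted)


if __name__ == "__main__":
    bucket: Bucket = {"logs/a.csv": b"1", "logs/b.txt": b"2"}
    lister = Lister()
    is_csv = lambda key: key.endswith(".csv")
    print(run_once(bucket, lister, is_csv))
    bucket["logs/c.csv"] = b"3"
    print(run_once(bucket, lister, is_csv))
```

Because the fetch step holds no state, the filtered key list could be split across workers in any way, which is the load-distribution property the comment refers to.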
[jira] [Commented] (NIFI-7579) Create a GetS3Object Processor
[ https://issues.apache.org/jira/browse/NIFI-7579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17147862#comment-17147862 ]

ArpStorm1 commented on NIFI-7579:
---------------------------------

I don't think this behavior should be merged into the FetchS3Object processor. My suggestion is to create a processor that does not need an upstream connection to work with S3.
[jira] [Commented] (NIFI-7579) Create a GetS3Object Processor
[ https://issues.apache.org/jira/browse/NIFI-7579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17147661#comment-17147661 ]

Wouter de Vries commented on NIFI-7579:
---------------------------------------

Would that not be solved with the GenerateFlowFile processor? If that does not solve it, I would still argue that this behavior should be merged with the FetchS3Object processor, so that it supports being triggered with and without an upstream connection.
[jira] [Commented] (NIFI-7579) Create a GetS3Object Processor
[ https://issues.apache.org/jira/browse/NIFI-7579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17147223#comment-17147223 ]

ArpStorm1 commented on NIFI-7579:
---------------------------------

FetchS3Object depends on an upstream connection, so you can't start your flow with it. That forces users to put a processor such as ListS3 in front of it. But sometimes the user wants to start the workflow by fetching objects.
[jira] [Commented] (NIFI-7579) Create a GetS3Object Processor
[ https://issues.apache.org/jira/browse/NIFI-7579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17146562#comment-17146562 ]

Mark Payne commented on NIFI-7579:
----------------------------------

Can you explain how you see a GetS3Object processor being different from FetchS3Object?