[jira] [Commented] (NIFI-7579) Create a GetS3Object Processor

2020-07-07 Thread Pierre Villard (Jira)


[ 
https://issues.apache.org/jira/browse/NIFI-7579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17152860#comment-17152860
 ] 

Pierre Villard commented on NIFI-7579:
--

I think the point being made here is that

GenerateFlowFile (where you can set flow file attributes as dynamic 
properties) -> FetchS3Object

would provide the exact feature you're looking for.

Another option would be to make the incoming connection on FetchS3Object 
optional, but I'm not sure this would provide a good user experience.
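
To make the GenerateFlowFile -> FetchS3Object suggestion concrete, here is a 
toy model of the idea in plain Python. It is only a sketch: it does not use 
the NiFi or AWS APIs, the in-memory `store` dict is hypothetical, and the 
`${s3.bucket}` and `${filename}` property values are assumed for illustration.

```python
import re
from dataclasses import dataclass, field


@dataclass
class FlowFile:
    """Toy stand-in for a NiFi flow file: attributes plus content."""
    attributes: dict = field(default_factory=dict)
    content: bytes = b""


def generate_flow_file(dynamic_properties):
    """Mimics GenerateFlowFile: each dynamic property becomes an attribute."""
    return FlowFile(attributes=dict(dynamic_properties))


def evaluate(expression, attributes):
    """Resolves simple ${name} references only; the real Expression Language
    supports far more than this."""
    return re.sub(r"\$\{([^}]+)\}", lambda m: attributes[m.group(1)], expression)


def fetch_s3_object(flow_file, store, bucket="${s3.bucket}", object_key="${filename}"):
    """Mimics the fetch step: bucket and key are resolved from the incoming
    flow file's attributes, then exactly one object is read."""
    bucket_name = evaluate(bucket, flow_file.attributes)
    key = evaluate(object_key, flow_file.attributes)
    return FlowFile(attributes=dict(flow_file.attributes),
                    content=store[bucket_name][key])


# Hypothetical in-memory "bucket" used instead of a real S3 endpoint.
store = {"reports": {"2020/07/summary.csv": b"id,total\n1,42\n"}}

# The generated flow file carries no content, only the attributes that
# tell the fetch step which object to read. No listing is involved.
trigger = generate_flow_file({"s3.bucket": "reports",
                              "filename": "2020/07/summary.csv"})
fetched = fetch_s3_object(trigger, store)
```

The generated flow file acts purely as a trigger, which is why this chain can 
start a flow without a List processor in front of the fetch.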

> Create a GetS3Object Processor
> --
>
> Key: NIFI-7579
> URL: https://issues.apache.org/jira/browse/NIFI-7579
> Project: Apache NiFi
> Issue Type: New Feature
> Reporter: ArpStorm1
> Assignee: YoungGyu Chun
> Priority: Major
>
> Sometimes the client needs to get only a specific object or a subset of 
> objects from its bucket. Currently, the only way to do this is to use the 
> ListS3 processor followed by the FetchS3Object processor. Creating a 
> GetS3Object processor for such cases would be great.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (NIFI-7579) Create a GetS3Object Processor

2020-06-30 Thread ArpStorm1 (Jira)


[ 
https://issues.apache.org/jira/browse/NIFI-7579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17148915#comment-17148915
 ] 

ArpStorm1 commented on NIFI-7579:
-

The problem with the List/Fetch pattern for S3 is the need to first list 
all the objects, and the list operation can be very heavy.
S3 is a common standard for object storage today, and Amazon is not the only 
one to implement it.
Using the ListS3 processor can create a heavy workload on the backend storage, 
resulting in slow responses that can fail the entire flow.
Sometimes that can be avoided by getting the exact object the user needs.
GetS3Object does not have to be the solution - maybe implementing this logic 
in the FetchS3Object processor would be enough.
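
A rough way to see the cost difference is to count requests against a toy 
object store. This is only a sketch under stated assumptions: `CountingStore` 
is a hypothetical in-memory class, not an S3 client; the page size of 1000 
matches the maximum number of keys an Amazon S3 list request returns, and 
other S3-compatible backends may differ. Narrowing a listing by prefix is not 
modelled.

```python
PAGE_SIZE = 1000  # Amazon S3 returns at most 1000 keys per list request


class CountingStore:
    """Hypothetical in-memory object store that counts requests."""

    def __init__(self, keys):
        self._objects = {k: f"data for {k}".encode() for k in keys}
        self._sorted_keys = sorted(self._objects)
        self.list_requests = 0
        self.get_requests = 0

    def list_page(self, start):
        """Returns one page of keys and the next start index (None when done)."""
        self.list_requests += 1
        end = start + PAGE_SIZE
        page = self._sorted_keys[start:end]
        next_start = end if end < len(self._sorted_keys) else None
        return page, next_start

    def get(self, key):
        self.get_requests += 1
        return self._objects[key]


def list_then_fetch(store, wanted):
    """List/Fetch without a prefix: walk every page, then fetch the match."""
    matches = []
    start = 0
    while start is not None:
        page, start = store.list_page(start)
        matches.extend(k for k in page if k == wanted)
    return [store.get(k) for k in matches]


def fetch_directly(store, wanted):
    """The exact key is already known, so no listing is needed."""
    return store.get(wanted)


keys = [f"logs/{i:05d}.json" for i in range(25_000)]
wanted = "logs/12345.json"

via_listing = CountingStore(keys)
list_then_fetch(via_listing, wanted)

direct = CountingStore(keys)
fetch_directly(direct, wanted)
```

In this model the listing route issues 25 list requests plus one get for a 
bucket of 25,000 keys, while the direct route issues a single get.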



[jira] [Commented] (NIFI-7579) Create a GetS3Object Processor

2020-06-30 Thread Wouter de Vries (Jira)


[ 
https://issues.apache.org/jira/browse/NIFI-7579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17148342#comment-17148342
 ] 

Wouter de Vries commented on NIFI-7579:
---

[~ArpStorm1] do you have a specific reason why that should not happen?

As far as I can see the code of this new processor would be by and large 
identical to the existing FetchS3Object processor, is that not the case?



[jira] [Commented] (NIFI-7579) Create a GetS3Object Processor

2020-06-29 Thread Mark Payne (Jira)


[ 
https://issues.apache.org/jira/browse/NIFI-7579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17148081#comment-17148081
 ] 

Mark Payne commented on NIFI-7579:
--

I'm not sure that we should introduce another processor just to avoid needing 
to connect a List/Fetch pair of processors. The pattern of GetXYZ is an older 
pattern and most of the newer processors that are responsible for gathering 
files/blobs of data and the like tend to follow the List/Fetch pattern. This 
pattern has proven to provide many advantages over the Get pattern. It allows 
for easy and powerful filtering of data before fetching the data. It separates 
the concerns of listing and maintaining state about what's been seen from 
actually gathering data. It provides a very powerful mechanism for distributing 
the data and processing load across the cluster. It makes it far easier to 
handle flows that are more batch-oriented, with the introduction of NIFI-7476.

I would be a -1 on adding a new processor just to avoid needing to connect an 
upstream List processor. It would mean additional code that must be maintained 
and would lead to confusion for users when trying to determine which Processor 
they need, especially for newer users.
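
The List/Fetch advantages described above (filtering before fetching, keeping 
listing state separate from data retrieval, and spreading the fetch load 
across nodes) can be sketched in a few lines of plain Python. This is a toy 
model, not NiFi code: the function names, the dict-based `store`, and the 
round-robin distribution are all assumptions made for illustration.

```python
def list_objects(store, already_seen):
    """List step: emit only keys not seen before and record them as state."""
    new_keys = sorted(k for k in store if k not in already_seen)
    already_seen.update(new_keys)
    return new_keys


def distribute(keys, node_count):
    """Spread the listing across nodes round-robin (one list per node)."""
    return [keys[i::node_count] for i in range(node_count)]


def fetch(store, keys):
    """Fetch step: retrieves content only for the keys it is handed."""
    return {k: store[k] for k in keys}


# Hypothetical bucket contents.
store = {"a.csv": b"1", "b.log": b"2", "c.csv": b"3", "d.csv": b"4"}
seen = set()

listing = list_objects(store, seen)

# Filtering happens on the cheap listing, before any content is fetched.
csv_only = [k for k in listing if k.endswith(".csv")]

# Each node fetches only its share of the filtered listing.
per_node = distribute(csv_only, node_count=2)
fetched = [fetch(store, keys) for keys in per_node]

# A second listing finds nothing new, because the state lives in the List step.
second_listing = list_objects(store, seen)
```

Because the fetch step only ever receives keys, the same fetch logic works 
whether the keys come from a listing or from some other upstream source.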



[jira] [Commented] (NIFI-7579) Create a GetS3Object Processor

2020-06-29 Thread ArpStorm1 (Jira)


[ 
https://issues.apache.org/jira/browse/NIFI-7579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17147862#comment-17147862
 ] 

ArpStorm1 commented on NIFI-7579:
-

I don't think this behavior should be merged into the FetchS3Object processor. 
My suggestion is to create a processor that does not need an upstream 
connection to work with S3.



[jira] [Commented] (NIFI-7579) Create a GetS3Object Processor

2020-06-29 Thread Wouter de Vries (Jira)


[ 
https://issues.apache.org/jira/browse/NIFI-7579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17147661#comment-17147661
 ] 

Wouter de Vries commented on NIFI-7579:
---

Would that not be solved with the GenerateFlowFile processor?

If that does not solve it, I would still argue that this behavior should be 
merged with the FetchS3Object processor, so that it supports being triggered 
with and without an upstream connection.



[jira] [Commented] (NIFI-7579) Create a GetS3Object Processor

2020-06-28 Thread ArpStorm1 (Jira)


[ 
https://issues.apache.org/jira/browse/NIFI-7579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17147223#comment-17147223
 ] 

ArpStorm1 commented on NIFI-7579:
-

FetchS3Object depends on an upstream connection, so you can't start your flow 
with it. That forces users to put a processor such as ListS3 in front of it. 
But sometimes the user wants to start the workflow by fetching objects.



[jira] [Commented] (NIFI-7579) Create a GetS3Object Processor

2020-06-26 Thread Mark Payne (Jira)


[ 
https://issues.apache.org/jira/browse/NIFI-7579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17146562#comment-17146562
 ] 

Mark Payne commented on NIFI-7579:
--

Can you explain how you see a GetS3Object processor being different than 
FetchS3Object?
