Cross-posted to the NiFi Users Mailing List
<http://apache-nifi-users-list.2361937.n4.nabble.com/>.

The team I work in does a good deal of work with the NiFi S3 Processors,
among others, and writes some custom processors of its own. We have a
use-case requirement for a variation on the ListS3 Processor, similar to
the one Martijn Dekkers describes in this post
<http://apache-nifi-users-list.2361937.n4.nabble.com/Listing-S3-tp5777p5850.html>.


For context, the reader may wish to read the entire thread from the
beginning
<http://apache-nifi-users-list.2361937.n4.nabble.com/Listing-S3-td5777.html#a5850>.

In our case we would like the processor to accept incoming FlowFiles and
be able to change the S3 bucket it "listens to" by making the s3.bucket
attribute modifiable using the NiFi Expression Language, while continuing
to maintain the internal state of the Processor. At the same time, we would
prevent the prefix property from being updated, making it a fixed value for
as long as the Processor is running. In other words, we want a
WatchMultipleS3Buckets Processor that maintains state for multiple Buckets.

Making this work requires a change in the state management behavior of the
processor. The currentKeys field is a Set that holds the unique "keys"
(filenames), one for each file entry that the StateMap is tracking. Each
key is the *filename* associated with the S3 Object.

In practice this means that our new processor would modify the state store
logic. Currently, the value of each entry in the StateMap is simply the
filename of the S3 object. Under our suggested change, each value in the
StateMap's HashMap would instead be *bucketName + some delimiter +
filename* of the S3 Object.
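
As a minimal sketch of that composite value, here is one way it could be
built and parsed in plain Java. The class name BucketKey, its method names,
and the choice of '/' as the delimiter are all assumptions made for
illustration; none of them come from the NiFi code base. The '/' is
convenient because S3 bucket names cannot contain it, so the first '/'
always marks the end of the bucket name even when the filename itself
contains slashes.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper, not part of NiFi: builds and parses
// "bucketName + delimiter + filename" state values.
final class BucketKey {
    // Assumed delimiter. Bucket names cannot contain '/', so the first
    // occurrence always separates the bucket from the filename.
    static final char DELIMITER = '/';

    private BucketKey() {
    }

    // Joins bucket and filename into a single state value.
    static String compose(String bucket, String filename) {
        return bucket + DELIMITER + filename;
    }

    // Everything before the first delimiter is the bucket name.
    static String bucketOf(String composite) {
        return composite.substring(0, composite.indexOf(DELIMITER));
    }

    // Everything after the first delimiter is the filename,
    // which may itself contain '/' characters.
    static String filenameOf(String composite) {
        return composite.substring(composite.indexOf(DELIMITER) + 1);
    }

    public static void main(String[] args) {
        // Stand-in for the currentKeys Set described above.
        Set<String> currentKeys = new HashSet<>();
        currentKeys.add(compose("bucket-a", "data.csv"));
        currentKeys.add(compose("bucket-b", "data.csv"));
        // The same filename in two buckets is tracked as two entries.
        System.out.println(currentKeys.size()); // prints 2
    }
}
```

With values of this form, the same filename appearing in two different
buckets no longer collides in the tracked set.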

Our team is working on this variation, *WatchMultipleS3Buckets*, and we
would like to offer to contribute the effort back as follows. Since there
will be a great deal of common code between the current ListS3 Processor
and our newly proposed WatchMultipleS3Buckets Processor, I propose a
refactoring to create a new abstract class, *AbstractS3WatchProcessor*, with
the existing ListS3 and the newly created WatchMultipleS3Buckets as
subclasses of this new AbstractS3WatchProcessor.
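
To illustrate the shape of that refactoring, here is a rough sketch in plain
Java with no NiFi dependencies. Only the name AbstractS3WatchProcessor comes
from the proposal; the stateValue and recordIfNew methods, the two "Sketch"
subclasses, and the '/' delimiter are hypothetical stand-ins, and a real
implementation would extend NiFi's own processor classes instead.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical shape of the proposed base class (plain Java, no NiFi API).
abstract class AbstractS3WatchProcessor {
    // Values already recorded in state; stands in for the currentKeys field.
    protected final Set<String> currentKeys = new LinkedHashSet<>();

    // Subclasses decide how a listed object is represented in state.
    protected abstract String stateValue(String bucket, String filename);

    // Shared logic: returns true only the first time an object is seen.
    public boolean recordIfNew(String bucket, String filename) {
        return currentKeys.add(stateValue(bucket, filename));
    }
}

// Stand-in for the existing single-bucket behavior:
// the state value is the filename alone.
class ListS3Sketch extends AbstractS3WatchProcessor {
    @Override
    protected String stateValue(String bucket, String filename) {
        return filename;
    }
}

// Stand-in for the proposed processor:
// the state value is bucketName + delimiter + filename.
class WatchMultipleS3BucketsSketch extends AbstractS3WatchProcessor {
    @Override
    protected String stateValue(String bucket, String filename) {
        return bucket + "/" + filename;
    }
}
```

In this sketch the only difference between the two subclasses is how the
state value is formed, which is the part of the logic the proposal changes.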

Is this additional Processor and modification something the community would
be interested in? We are asking because we want to know whether this is a
direction the community would like to take with the existing ListS3
processor. We will be happy to do the work to contribute this Processor
variation back to the project, but would prefer not to put in the extra
work of contributing *if that is not the direction desired* by the NiFi
Maintainers.

If the new Processor contribution is desired, should I simply go ahead
and add a new item to the NiFi project JIRA here
<https://issues.apache.org/jira/projects/NIFI> and then follow section 6
<https://cwiki.apache.org/confluence/display/NIFI/Contributor+Guide#ContributorGuide-providingCodeOrDocumentationContributionProvidingcodeordocumentationcontributions>
(Providing code or documentation contributions) of the Contributor Guide?

Thanks

--aramcodez
