Agreed, excellent write-up. When this thread started I had forgotten about the prior discussions of using SQS instead of a ListS3 processor. I am familiar with S3 but less so with SQS, and Adam's article makes it very accessible.
If SQS is preferred to a ListS3 processor, should the ListS3-related tickets be closed?

On Mon, Feb 1, 2016 at 9:28 AM, Mark Payne <[email protected]> wrote:

> Adam,
>
> Just read through your post - fantastic write-up! Just wanted to say
> thanks for sharing. This is a question we've seen a few times in the
> last couple of weeks, and this is a great resource to point people to.
>
> Thanks
> -Mark
>
> > On Jan 31, 2016, at 1:57 AM, Adam Lamar <[email protected]> wrote:
> >
> > Kyle/Joe,
> >
> > I've been meaning to document this process myself, and just finished
> > a post with some details:
> >
> > https://adamlamar.github.io/2016-01-30-monitoring-an-s3-bucket-in-apache-nifi/
> >
> > Hope that helps,
> > Adam
> >
> > On 1/30/16 9:29 PM, Joe Witt wrote:
> >> Kyle,
> >>
> >> The ideal case for communicating how to do this would be both a
> >> template and an associated doc. Great for a blog or wiki page or
> >> something. We can of course give you perms to write to a wiki page on
> >> the nifi wiki if interested. The template itself can also be
> >> annotated with comments that show up right in the flow itself. That
> >> may be a fine option too.
> >>
> >> Thanks
> >> Joe
> >>
> >> On Sat, Jan 30, 2016 at 2:52 PM, Kyle Burke <[email protected]> wrote:
> >>> Joe/Joe,
> >>> Thanks for the response. It makes sense to use SNS and SQS to
> >>> respond to S3 file changes. I’m going to see if my company will give
> >>> me access to those Amazon services. I found an article that explains
> >>> how to set up this functionality in the Amazon console. Once that’s
> >>> set up, it seems pretty straightforward to use GetSQS/DeleteSQS. I
> >>> suspect many will want this functionality, but I’m not sure what’s
> >>> the best method (i.e. template or user doc) to explain how to solve
> >>> this in nifi. I’ll be happy to submit something if you let me know
> >>> the right method.
> >>>
> >>> http://docs.aws.amazon.com/AmazonS3/latest/dev/NotificationHowTo.html
> >>>
> >>> Respectfully,
> >>>
> >>> Kyle Burke | Data Science Engineer
> >>> IgnitionOne - Marketing Technology. Simplified.
> >>> Office: 1545 Peachtree St NE, Suite 500 | Atlanta, GA | 30309
> >>>
> >>>
> >>> From: Joe Witt
> >>> Reply-To: "[email protected]"
> >>> Date: Saturday, January 30, 2016 at 2:06 PM
> >>> To: "[email protected]"
> >>> Subject: Re: ListS3 processor?
> >>>
> >>> Kyle
> >>>
> >>> Let us know if that doesn't get you what you need. We have a decent
> >>> set of templates but I didn't see one that demonstrates interaction
> >>> with Amazon services.
> >>>
> >>> Thanks
> >>> Joe
> >>>
> >>> On Jan 30, 2016 12:56 PM, "Joey Frazee" <[email protected]> wrote:
> >>>> Kyle,
> >>>>
> >>>> I think you can do what you want right now without ListS3 by using
> >>>> S3 event notifications. You can configure an event notification to
> >>>> publish to SQS and then use GetSQS to retrieve the events and
> >>>> FetchS3Object to get the JSON file, and the rest of the flow could
> >>>> be written as you have in mind.
> >>>>
> >>>> Depending on your scale, this might be preferable because it's
> >>>> slow/expensive to do listings on S3 prefixes that have a lot of
> >>>> file matches.
> >>>>
> >>>>
> >>>> -joey
> >>>>
> >>>> On Jan 30, 2016, at 11:40 AM, Joe Skora <[email protected]> wrote:
> >>>>
> >>>> Kyle,
> >>>>
> >>>> Processors exist to Put, Fetch, and Delete S3Objects, but ListS3 is
> >>>> in the backlog on ticket NIFI-840 at the moment. It should fit the
> >>>> List/Fetch metaphor like the List/Fetch processor pairs for xFile,
> >>>> xHDFS, xSFTP, etc.
> >>>>
> >>>> Regards,
> >>>> Joe Skora
> >>>>
> >>>> On Sat, Jan 30, 2016 at 10:14 AM, Kyle Burke <[email protected]>
> >>>> wrote:
> >>>>> All,
> >>>>> I'm trying to get Nifi set up to move data around S3. My first
> >>>>> attempt is to just monitor an S3 folder where json files are
> >>>>> placed and then copy the file, convert it to Avro, and then drop
> >>>>> it in a different S3 folder. The documentation is pretty slim for
> >>>>> working with S3. I can't seem to get it working and was wondering
> >>>>> if anyone had any S3 examples for monitoring an S3 folder (i.e.
> >>>>> something like a ListS3 processor similar to what is available on
> >>>>> a local file system?)
> >>>>>
> >>>>> Respectfully,
> >>>>>
> >>>>> Kyle Burke | Data Science Engineer
> >>>>> IgnitionOne - Marketing Technology. Simplified.
> >>>>> Office: 1545 Peachtree St NE, Suite 500 | Atlanta, GA | 30309
> >>>>> Direct: 404.961.3918
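
For anyone reading this thread later: the SQS approach Joey describes (S3 event notification, then GetSQS, then FetchS3Object) hinges on pulling the bucket and object key out of each notification message. Here is a minimal Python sketch of that step. It assumes the standard S3 event notification JSON layout (a `Records` array with `s3.bucket.name` and `s3.object.key`) and that the queue receives the raw notification rather than an SNS envelope. The function name, the sample bucket, and the sample key are hypothetical, and filtering on `ObjectCreated:` events is my assumption about what this flow would care about.

```python
import json
from urllib.parse import unquote_plus


def extract_s3_objects(message_body):
    """Return (bucket, key) pairs found in one S3 event notification body."""
    event = json.loads(message_body)
    objects = []
    # The test message S3 sends when a notification is first configured
    # has no "Records" array, so fall back to an empty list.
    for record in event.get("Records", []):
        # Keep only object-creation events (an assumption for this flow).
        if not record.get("eventName", "").startswith("ObjectCreated:"):
            continue
        bucket = record["s3"]["bucket"]["name"]
        # Keys are URL-encoded in the notification; spaces arrive as "+".
        key = unquote_plus(record["s3"]["object"]["key"])
        objects.append((bucket, key))
    return objects


# Hypothetical message, trimmed to the fields the function reads.
sample = json.dumps({
    "Records": [
        {
            "eventName": "ObjectCreated:Put",
            "s3": {
                "bucket": {"name": "example-bucket"},
                "object": {"key": "incoming/my+file.json"},
            },
        }
    ]
})

print(extract_s3_objects(sample))
```

Inside NiFi the same extraction would be done by a processor between GetSQS and FetchS3Object rather than by a script; the sketch is only meant to show the message shape and the key decoding.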
