Ryan,

There are a couple of AWS-related bundles in the build, though I've not looked into them very closely and don't know if they have the S3 support you're looking for.

You should, however, be able to get the PutHDFS processor to do what you want. The Hadoop FileSystem API can be used to interact with S3 using s3n: and s3a: path URIs. The HDFS processors have jets3t in their dependencies, though I don't see hadoop-aws, so I'm not sure whether you'd have to drop extra jars into the lib directory to use the Hadoop S3 URIs.
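For reference, pointing the HDFS processors at S3 mostly comes down to making your AWS credentials visible through the Hadoop configuration the processor loads. Here's a minimal sketch of a core-site.xml fragment; the property names are the standard Hadoop ones for the two schemes, and the bucket/key values are placeholders:

```xml
<!-- core-site.xml fragment: S3 credentials for the s3a:// and s3n:// schemes.
     Values are placeholders; point the processor's Hadoop Configuration
     Resources property at a file containing entries like these. -->
<configuration>
  <!-- s3a (requires the hadoop-aws classes on the classpath) -->
  <property>
    <name>fs.s3a.access.key</name>
    <value>YOUR_ACCESS_KEY</value>
  </property>
  <property>
    <name>fs.s3a.secret.key</name>
    <value>YOUR_SECRET_KEY</value>
  </property>
  <!-- s3n (jets3t-based native S3 filesystem) -->
  <property>
    <name>fs.s3n.awsAccessKeyId</name>
    <value>YOUR_ACCESS_KEY</value>
  </property>
  <property>
    <name>fs.s3n.awsSecretAccessKey</name>
    <value>YOUR_SECRET_KEY</value>
  </property>
</configuration>
```

With that in place, a directory value like s3a://my-bucket/path should resolve through the Hadoop FileSystem API -- assuming the filesystem implementation for that scheme is actually on the classpath, which is the open question above.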

Once you've sorted out the library issue (or confirmed that the libraries are already there), you can also use the Kite processor to write to partitioned datasets stored in S3 as either Parquet or Avro.

I hope that helps!

rb

On 07/16/2015 06:04 PM, Ryan Hendrickson wrote:
Hi all,
    I thought there was an Amazon S3 Processor.. I dug around the Jira
Issues.. Looks like one was discussed:
https://issues.apache.org/jira/browse/NIFI-25

    Although I don't see it in 0.2.0, or 0.1.0... Am I missing something
simple here, like an extra nar for extensions?

Thanks,
Ryan



--
Ryan Blue
Software Engineer
Cloudera, Inc.
