Take a look at the MergeRecord processor; you can use it before
PutParquet to create appropriately sized files.
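
As a rough sizing aid, here is a minimal Python sketch for estimating how many records to merge to reach a target file size. It assumes you measure the average bytes per record yourself (for example, the size of one of the existing small Parquet files divided by its record count), and that you then use the result for a record-count threshold in MergeRecord such as "Minimum Number of Records". The function name `records_for_target` is hypothetical, and the estimate is approximate because larger Parquet files often compress differently from small ones.

```python
import math

MB = 1024 * 1024  # bytes per mebibyte


def records_for_target(target_bytes: int, avg_record_bytes: float) -> int:
    """Return the smallest record count whose estimated size reaches target_bytes.

    avg_record_bytes is an estimate you supply, e.g. measured from existing
    output files (file size / number of records).
    """
    if target_bytes <= 0 or avg_record_bytes <= 0:
        raise ValueError("target_bytes and avg_record_bytes must be positive")
    return math.ceil(target_bytes / avg_record_bytes)


# Example: a 512 MB target with an assumed average of 200 bytes per record.
print(records_for_target(512 * MB, 200))
```

Treat the number as a starting point and adjust it after looking at the sizes of the files that actually land in S3.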
On Tue, Dec 5, 2017 at 10:36 PM Madhukar Thota wrote:
Thanks Joey,
It worked. Do you know how to control the Parquet file size when it writes
to S3? I see a lot of small files in S3. Is it possible to write files of
either 512 MB or 1 GB?
On Tue, Dec 5, 2017 at 8:57 PM, Joey Frazee wrote:
PutParquet doesn't include the AWS S3 SDK itself, but it provides an
"Additional Classpath Resources" property that you need to point at a directory
containing all the S3 dependencies. I just tested this the other day with the
following jars:
aws-java-sdk-1.7.4.jar
hadoop-aws-2.7.3.jar
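
Before pointing "Additional Classpath Resources" at a directory, it can help to confirm that both jars are actually there. The following is a minimal Python sketch of such a check; the helper name `missing_jars` and the example path `/opt/nifi/s3-libs` are hypothetical, and the jar names are simply the two listed above.

```python
from pathlib import Path

# The two jars named in this thread; adjust if you use other versions.
REQUIRED_JARS = ("aws-java-sdk-1.7.4.jar", "hadoop-aws-2.7.3.jar")


def missing_jars(directory, required=REQUIRED_JARS):
    """Return the required jar names that are not present in directory."""
    path = Path(directory)
    if not path.is_dir():
        # A missing directory means every required jar is missing.
        return list(required)
    present = {p.name for p in path.glob("*.jar")}
    return [name for name in required if name not in present]


# Hypothetical directory; use the one set in "Additional Classpath Resources".
print(missing_jars("/opt/nifi/s3-libs"))
```

An empty list means both jars were found; anything else names the files still to be copied into the directory.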
Hi,
Is it possible to use the PutParquet processor to write files into S3? I tried
setting the S3 bucket in the core-site.xml file, but I am getting *No FileSystem
for scheme: s3a*
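*core-site.xml*
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>s3a://testing</value>
  </property>
  <property>
    <name>fs.s3a.access.key</name>
    <value></value>
  </property>
  <property>
    <name>fs.s3a.secret.key</name>
    <value></value>
  </property>
</configuration>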