Thanks Joey, it worked. Do you know how to control the Parquet file size when PutParquet writes to S3? I see a lot of small files in S3. Is it possible to write files of either 512 MB or 1 GB?

On Tue, Dec 5, 2017 at 8:57 PM, Joey Frazee <[email protected]> wrote:

> PutParquet doesn't have the AWS S3 SDK included in it itself but it
> provides an "Additional Classpath Resources" property that you need to
> point at a directory with all the S3 dependencies. I just tested this the
> other day with the following jars:
>
> aws-java-sdk-1.7.4.jar
> hadoop-aws-2.7.3.jar
> hadoop-common-2.7.3.jar
> httpclient-4.5.3.jar
> httpcore-4.4.4.jar
> jackson-annotations-2.6.0.jar
> jackson-core-2.6.1.jar
> jackson-databind-2.6.1.jar
>
> So just grab those from maven central and you should be good to go.
>
> -joey
>
> On Dec 5, 2017, 6:53 PM -0600, Madhukar Thota <[email protected]>,
> wrote:
>
> > Hi
> >
> > Is it possible to use PutParquet processor to write files into S3? I tried
> > by setting s3 bucket in core-site.xml file but i am getting *No
> > FileSystem for scheme: s3a*
> >
> > *core-site.xml*
> >
> > <?xml version="1.0"?>
> > <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
> > <!--
> >   Licensed to the Apache Software Foundation (ASF) under one or more
> >   contributor license agreements. See the NOTICE file distributed with
> >   this work for additional information regarding copyright ownership.
> >   The ASF licenses this file to You under the Apache License, Version 2.0
> >   (the "License"); you may not use this file except in compliance with
> >   the License. You may obtain a copy of the License at
> >
> >       http://www.apache.org/licenses/LICENSE-2.0
> >
> >   Unless required by applicable law or agreed to in writing, software
> >   distributed under the License is distributed on an "AS IS" BASIS,
> >   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
> >   See the License for the specific language governing permissions and
> >   limitations under the License.
> > -->
> >
> > <!-- Put site-specific property overrides in this file. -->
> >
> > <configuration>
> >   <property>
> >     <name>fs.defaultFS</name>
> >     <value>s3a://testing</value>
> >   </property>
> >   <property>
> >     <name>fs.s3a.access.key</name>
> >     <value>xxxxxxxxxxxxxxxx</value>
> >   </property>
> >   <property>
> >     <name>fs.s3a.secret.key</name>
> >     <value>xxxxxxxxxxxxxxxxxxx</value>
> >   </property>
> > </configuration>
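
For anyone following the jar list quoted above, here is a rough sketch of how the eight jars could be fetched from Maven Central into one directory that "Additional Classpath Resources" then points at. The Maven group IDs and the repo1.maven.org base URL are my assumptions (the thread only gives the jar file names), and `download_all` and the target directory are hypothetical names, so treat this as a starting point rather than a tested recipe.

```python
from pathlib import Path
from urllib.request import urlretrieve

# Assumed base URL for Maven Central; not stated in the thread.
MAVEN_CENTRAL = "https://repo1.maven.org/maven2"

# (groupId, artifactId, version). Artifact names and versions come from the
# jar list in the thread; the group IDs are assumptions.
JARS = [
    ("com.amazonaws", "aws-java-sdk", "1.7.4"),
    ("org.apache.hadoop", "hadoop-aws", "2.7.3"),
    ("org.apache.hadoop", "hadoop-common", "2.7.3"),
    ("org.apache.httpcomponents", "httpclient", "4.5.3"),
    ("org.apache.httpcomponents", "httpcore", "4.4.4"),
    ("com.fasterxml.jackson.core", "jackson-annotations", "2.6.0"),
    ("com.fasterxml.jackson.core", "jackson-core", "2.6.1"),
    ("com.fasterxml.jackson.core", "jackson-databind", "2.6.1"),
]


def jar_name(artifact, version):
    """File name of a jar, e.g. hadoop-aws-2.7.3.jar."""
    return f"{artifact}-{version}.jar"


def jar_url(group, artifact, version):
    """Build the download URL using the standard Maven repository layout."""
    group_path = group.replace(".", "/")
    return f"{MAVEN_CENTRAL}/{group_path}/{artifact}/{version}/{jar_name(artifact, version)}"


def download_all(target_dir):
    """Download every jar in JARS into target_dir, skipping files already present."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    for group, artifact, version in JARS:
        dest = target / jar_name(artifact, version)
        if not dest.exists():
            urlretrieve(jar_url(group, artifact, version), str(dest))
    return sorted(p.name for p in target.glob("*.jar"))
```

Calling `download_all("/some/dir/s3-libs")` (the path is a placeholder) and then setting "Additional Classpath Resources" on PutParquet to that same directory should match the setup Joey describes.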
