Re: Splitting data input to Distcp

2012-05-04 Thread Pedro Figueiredo
On 3 May 2012, at 23:47, Himanshu Vijay wrote:
> Pedro,
>
> Thanks for the response. Unfortunately I am running it on in-house cluster
> and from there I need to upload to S3.
>

Hi,

Last night I was thinking about this... what happens if you copy s3://region.elasticmapreduce/libs/s3distcp/1.0

Re: Splitting data input to Distcp

2012-05-03 Thread Himanshu Vijay
Pedro,

Thanks for the response. Unfortunately I am running it on in-house cluster and from there I need to upload to S3.

-Himanshu

On Wed, May 2, 2012 at 2:03 PM, Pedro Figueiredo wrote:
>
> On 2 May 2012, at 18:29, Himanshu Vijay wrote:
>
> > Hi,
> >
> > I have 100 files each of ~3 GB. I need

Re: Splitting data input to Distcp

2012-05-02 Thread Pedro Figueiredo
On 2 May 2012, at 18:29, Himanshu Vijay wrote:
> Hi,
>
> I have 100 files each of ~3 GB. I need to distcp them to S3 but copying
> fails because of large size of files. The files are not gzipped so they are
> splittable. Is there a way or property to tell Distcp to first split the
> input files