There is no option to split files in ExportSnapshot, because you don't
need it.

Take a look at
http://docs.aws.amazon.com/AmazonS3/latest/dev/UploadingObjects.html
"With a single PUT operation you can upload objects up to 5 GB in size"
"Using the Multipart upload API you can upload large objects, up to 5 TB."

You just have to configure the S3 connector to use multipart upload,
and you'll be able to upload files larger than 5 GB.
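
For example, something like this (a sketch, assuming your HBase version
parses generic -D options via ToolRunner; the snapshot name and bucket
are placeholders):

  # export the snapshot with multipart uploads enabled on the s3n connector
  hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot \
    -D fs.s3n.multipart.uploads.enabled=true \
    -snapshot my-snapshot \
    -copy-to s3n://my-bucket/hbase \
    -mappers 16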

Matteo
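
P.S. if you prefer core-site.xml over command-line flags, the equivalent
settings (including the block-size parameters Ted mentions below) would
look something like this; the values shown are just the core-default.xml
defaults, for illustration:

  <property>
    <name>fs.s3n.multipart.uploads.enabled</name>
    <value>true</value>
  </property>
  <property>
    <!-- size of each part of a multipart upload -->
    <name>fs.s3n.multipart.uploads.block.size</name>
    <value>67108864</value>
  </property>
  <property>
    <!-- part size used when copying objects via multipart -->
    <name>fs.s3n.multipart.copy.block.size</name>
    <value>5368709120</value>
  </property>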


On Thu, Feb 4, 2016 at 7:50 PM, Vishnu Amdiyala <[email protected]>
wrote:

> Thank you guys for the quick response. My question is: how do I generate
> part files out of these HFiles to upload to S3? The ExportSnapshot tool
> I use doesn't allow more mappers than the number of files [correct me if
> I am wrong]. So how will I be able to generate splits out of each bulk
> file > 5 GB?
>
>
> On Thu, Feb 4, 2016 at 7:14 PM, Ted Yu <[email protected]> wrote:
>
> > Vishnu:
> > Please take a look at
> > hadoop-common-project/hadoop-common/src/main/resources/core-default.xml
> > for multipart-related config parameters (other than the one mentioned by
> > Matteo):
> >
> > fs.s3n.multipart.uploads.block.size
> > fs.s3n.multipart.copy.block.size
> >
> > Cheers
> >
> > On Thu, Feb 4, 2016 at 7:00 PM, Matteo Bertozzi <[email protected]>
> > wrote:
> >
> > > The multipart upload is handled by the S3 connector.
> > > You can tune your connector to use multipart:
> > > fs.s3n.multipart.uploads.enabled = true
> > >
> > > Matteo
> > >
> > >
> > > On Thu, Feb 4, 2016 at 6:34 PM, Vishnu Amdiyala <[email protected]>
> > > wrote:
> > >
> > > > Hi,
> > > >
> > > > I am trying to back up snapshots of an HBase table to an S3 bucket,
> > > > where each HFile is sized > 5 GB, which fails due to S3's 5 GB
> > > > single-PUT limitation. The ExportSnapshot source says the number of
> > > > mappers is capped at the total number of files. Is there a way to use
> > > > this tool to split files and upload to S3 in parts?
> > > >
> > > >
> > > > Thanks!
> > > > Vishnu
> > > >
> > >
> >
>
