there is no option in ExportSnapshot to split files, because you don't need one.
take a look at http://docs.aws.amazon.com/AmazonS3/latest/dev/UploadingObjects.html
"With a single PUT operation you can upload objects up to 5 GB in size"
"Using the Multipart upload API you can upload large objects, up to 5 TB."
you just have to configure the s3 connector to use multipart, and you'll be
able to upload files > 5 GB (see the config sketch below the quoted thread).

Matteo

On Thu, Feb 4, 2016 at 7:50 PM, Vishnu Amdiyala <[email protected]> wrote:

> Thank you guys for the quick response. My question is: how do I generate
> part files out of these HFiles to upload to S3? The ExportSnapshot tool
> that I use doesn't allow more mappers than the number of files [correct me
> if I am wrong]. So, how will I be able to generate splits out of each bulk
> file > 5 GB?
>
>
> On Thu, Feb 4, 2016 at 7:14 PM, Ted Yu <[email protected]> wrote:
>
> > Vishnu:
> > Please take a look at
> > hadoop-common-project/hadoop-common/src/main/resources/core-default.xml
> > for multipart-related config parameters (other than the one mentioned by
> > Matteo):
> >
> > fs.s3n.multipart.uploads.block.size
> > fs.s3n.multipart.copy.block.size
> >
> > Cheers
> >
> > On Thu, Feb 4, 2016 at 7:00 PM, Matteo Bertozzi <[email protected]>
> > wrote:
> >
> > > the multipart upload is on the s3 connector.
> > > you can tune your connector to use multipart:
> > > fs.s3n.multipart.uploads.enabled = true
> > >
> > > Matteo
> > >
> > >
> > > On Thu, Feb 4, 2016 at 6:34 PM, Vishnu Amdiyala <[email protected]>
> > > wrote:
> > >
> > > > Hi,
> > > >
> > > > I am trying to back up snapshots of an HBase table to an S3 bucket,
> > > > in which each HFile is sized > 5 GB; the export fails due to S3's
> > > > 5 GB single-PUT limit. The ExportSnapshot source says that the number
> > > > of mappers is capped at the total number of files. Is there a way to
> > > > use this tool to split files and upload to S3 in parts?
> > > >
> > > > Thanks!
> > > > Vishnu
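
For reference, a minimal core-site.xml sketch of the properties mentioned
above, assuming the s3n connector; the 64 MB block size is only an
illustrative value (it is the core-default.xml default), not a tuning
recommendation:

  <property>
    <name>fs.s3n.multipart.uploads.enabled</name>
    <value>true</value>
  </property>
  <property>
    <!-- part size in bytes for multipart uploads; 64 MB is illustrative -->
    <name>fs.s3n.multipart.uploads.block.size</name>
    <value>67108864</value>
  </property>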
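
And a hypothetical ExportSnapshot invocation against an s3n destination
(the snapshot name, bucket, and mapper count are placeholders; the -D flag
is an alternative to setting the property in core-site.xml):

  hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot \
    -Dfs.s3n.multipart.uploads.enabled=true \
    -snapshot my_snapshot \
    -copy-to s3n://my-bucket/hbase \
    -mappers 16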
