It's Apache Hadoop, but I see multi-part upload is in the works for this as well. https://issues.apache.org/jira/browse/HADOOP-9454
Another solution would be to limit the size of the HFiles to 5GB, but I don't know yet what effect this would have on cluster performance.

On Thu, Oct 10, 2013 at 7:32 PM, Nick Dimiduk <[email protected]> wrote:
> On Tue, Oct 8, 2013 at 6:35 AM, Adrian Sandulescu <
> [email protected]> wrote:
>
> > Imports work great, but only when using the s3n:// protocol (which means
> > an HFile limit of 5GB).
>
> Are you using Apache Hadoop or an EMR build? From what I recall, EMR ships
> a customized s3n implementation that uses the multi-part upload feature of
> S3 to chunk files and bypass the old 5GB limitation. You might give that a
> shot.
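A sketch of the 5GB-cap idea: HBase doesn't cap HFile size directly, but the region split threshold (`hbase.hregion.max.filesize`) bounds it indirectly, since a major compaction writes roughly one HFile per column family per region. Assuming hbase-site.xml is the place this is set in your deployment, the change might look like:

```xml
<!-- hbase-site.xml: keep regions at or under 5 GB so the HFiles produced
     by flushes/compactions stay under the s3n single-PUT limit.
     Note: this is a sketch; smaller regions mean more regions per
     RegionServer, which is the cluster-performance trade-off in question. -->
<property>
  <name>hbase.hregion.max.filesize</name>
  <value>5368709120</value> <!-- 5 GB in bytes -->
</property>
```

This bounds region size rather than individual HFile size, so a region sitting right at the threshold before a major compaction could still briefly exceed it; leaving some headroom below 5GB may be safer.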
