Re: Reading ORC Files from S3

David Rosenstrauch Tue, 29 Sep 2015 09:27:19 -0700

Great!  I'll follow up with you guys off-list.

DR


On 09/29/2015 02:00 AM, Gopal Vijayaraghavan wrote:

Hi,

OK, well that was easy.  Figured out my issue and managed to get ORC
working over s3a.  And got a huge speed-up over s3n!  (On the order of
10x!)


Cool! S3n is rather old now, while the aws-sdk updates keep s3a moving.

So yeah, I'm game for testing some new code when/if you're feeling
motivated to work on this.  Feel free to email me off-list and we can
get into the details.


+Rajesh - who's actively chasing down the ORC + S3 changes today.

Your email came at an opportune moment, since Rajesh's ORC changes landed
in hive-2.0 branch today

https://github.com/apache/hive/commit/a4c43f0335b33a75d2e9f3dc53b3cd33f8f115cf


Cheers,
Gopal


On 09/28/2015 10:43 PM, David Rosenstrauch wrote:

Super helpful response - thanks so much!  At least I know I'm not crazy
now!  (And shouldn't spend any more time on tweaks trying to get this to
work on s3n.)

Let me try to start testing this using out-of-the-box s3a protocol.  (I
haven't been able to get that to work at all yet - keep getting "Unable
to load AWS credentials from any provider in the chain" errors.)  Once
I'm able to get that far I'd be up for trying to test some new code. (As
long as it doesn't wind up taking too much time.)

Will report back soon.

Thanks again!

DR

On 09/28/2015 06:14 PM, Gopal Vijayaraghavan wrote:

avail.  I was hoping perhaps someone on the list here might
be able to shed some light as to why we're having these problems and/or
have some suggestions on how we might be able to work around them.

...

   (I.e., theoretically ORC should be able to skip reading large portions
of the index files by jumping directly to the index records that match
the supplied search criteria. (Or at least jumping to a stripe close to
them.))  But this is proving not to be the case.


Not theoretically. ORC does that and that's the issue.

S3n is badly broken for a columnar format & even S3A is missing a couple
of features which are essential to get read performance over HTTP.

Here's one example - every seek() disconnects & re-establishes an SSL
connection in S3 (that fix is a ~2x perf increase for S3a).

https://issues.apache.org/jira/browse/HADOOP-12444
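
To illustrate why reconnecting on every seek() is costly, here is a minimal
Python sketch of a stream that defers the reconnect until data is actually
read. It is a toy model, not the Hadoop code or the patch in the JIRA above:
the class name LazySeekStream and the reconnects counter are hypothetical, and
an in-memory byte string stands in for the S3 object.

```python
class LazySeekStream:
    """Toy stream: seek() only records the target position; a new
    "connection" is opened on read() only if the position has moved."""

    def __init__(self, data: bytes):
        self._data = data        # stands in for the remote object
        self._pos = 0            # position the caller asked for
        self._conn_pos = None    # position of the open connection (None = closed)
        self.reconnects = 0      # number of times a connection was (re)opened

    def seek(self, pos: int) -> None:
        # Cheap: no network activity, just remember where to read next.
        self._pos = pos

    def read(self, n: int) -> bytes:
        if self._conn_pos != self._pos:
            # Pay for a new connection / GET only when it is unavoidable.
            self.reconnects += 1
            self._conn_pos = self._pos
        chunk = self._data[self._pos:self._pos + n]
        self._pos += len(chunk)
        self._conn_pos = self._pos
        return chunk
```

In this model, several seek() calls followed by one read() cost a single
reconnect, and a seek() to the position the stream is already at costs none.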


In another scenario we found that a readFully(colOffset,... colSize) will
open an unbounded reader in S3n instead of reading the fixed chunk off
HTTP.

https://issues.apache.org/jira/browse/HADOOP-11867


The lack of this means that even the short-lived keep-alive gets turned
off by the S3 impl, when doing a forward-seek read pattern, because it is
a recv buffer-dropping disconnect, not a complete request.
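
The difference between a bounded and an unbounded read shows up in the HTTP
Range header of the GET. A minimal Python sketch, assuming standard HTTP byte
ranges with an inclusive end offset; the helper name range_header is
hypothetical and is not part of any S3 client:

```python
from typing import Optional


def range_header(offset: int, length: Optional[int] = None) -> str:
    """Build an HTTP Range header value.

    With a length, the range is bounded to exactly that many bytes.
    Without one, the range is open-ended and runs to the end of the object.
    """
    if offset < 0:
        raise ValueError("offset must be non-negative")
    if length is None:
        return f"bytes={offset}-"
    if length <= 0:
        raise ValueError("length must be positive")
    # HTTP byte ranges are inclusive, hence the -1.
    return f"bytes={offset}-{offset + length - 1}"
```

A bounded request such as range_header(1000, 500) can finish as a complete
request, while the open-ended form has to be abandoned partway through once
the reader has the bytes it wanted.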

The Amazon proprietary S3 drivers are not subject to these problems, so
they work with ORC very well. It's the open source S3 filesystem impls
which are holding us back.

Is ORC simply unable to work efficiently against data stored on S3n?
(I.e., due to network round-trips taking too long.)


S3n is unable to handle any columnar format efficiently - it fires an HTTP
GET for each seek, marked till end of the file. Any format which requires
forward seeks or bounded readers is going to die via TCP window &
round-trip thrashing.
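
A rough way to see the scale of the problem is to add up how many bytes each
strategy asks the server for. The Python sketch below is back-of-the-envelope
arithmetic only: the function name and the example numbers are made up for
illustration, and the open-ended figure is an upper bound, since the client
drops the connection before the server has sent everything.

```python
from typing import Iterable, Tuple


def bytes_requested(file_size: int,
                    reads: Iterable[Tuple[int, int]],
                    bounded: bool) -> int:
    """Total bytes requested for a series of (offset, length) reads.

    bounded=True  : each GET asks for exactly `length` bytes.
    bounded=False : each GET asks for everything from `offset` to end of file.
    """
    total = 0
    for offset, length in reads:
        if bounded:
            total += length
        else:
            total += file_size - offset
    return total


# Hypothetical example: three 1000-byte column chunks in a 1,000,000-byte file.
reads = [(0, 1000), (250_000, 1000), (500_000, 1000)]
bounded_total = bytes_requested(1_000_000, reads, bounded=True)      # 3000
unbounded_total = bytes_requested(1_000_000, reads, bounded=False)   # 2250000
```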


I know what's needed for s3a to work well with columnar readers
(Parquet/ORC/RCFile included) and future proof it so that it will work
fine when HTTP/2 arrives.

If you're interested in being a guinea pig for S3a fixes, it is currently
sitting on my back burner (I'm not a hadoop committer) - the FS fixes are
about two weeks' worth of work for a single motivated dev.

Cheers,
Gopal
