Hello Parth,

I filed JIRAs for S3 performance:
 * https://issues.apache.org/jira/browse/DRILL-4977
 * https://issues.apache.org/jira/browse/DRILL-4976
 * https://issues.apache.org/jira/browse/DRILL-4978

and one for execution of drillbits inside Apache Mesos+Aurora:
 * https://issues.apache.org/jira/browse/DRILL-4979

As a start, I would look first into the latter, as it is a requirement for actually using Drill safely on such a cluster. I have commented with a basic implementation idea; I’d love to get some feedback on it, as this would be my first Drill contribution.

Uwe


On 28.10.16 00:26, Parth Chandra wrote:
Hi Uwe,

   Can you log JIRAs for the performance issues that you encounter while
working on S3? Not many folks are working on optimizing that path, so any
patches that you might be able to contribute would be appreciated.

Parth

On Thu, Oct 6, 2016 at 1:56 PM, Uwe Korn <[email protected]> wrote:

Yes. Performance was much better with a real file system (i.e. I ran
locally on my laptop using the SSD installed there). I don’t expect to have
the exact same performance with S3 as I don’t have things like data
locality there. My use case is mainly querying "cold" datasets, i.e. ones
that are not touched often, and when they are, only a few queries are run on them.


On 06.10.2016 at 22:47, Ted Dunning <[email protected]> wrote:

Have you tried running against a real file system interface? Or even just
against HDFS?



On Thu, Oct 6, 2016 at 12:35 PM, Uwe Korn <[email protected]> wrote:
Hello,

We had some test runs with Drill 1.8 in the last days and wanted to share
the experience with you, as we've made some interesting findings that
astonished us. We ran on our internal company cluster and thus used the
S3 API to access our internal storage cluster, not AWS (the behavior
should still be the same).

Setup experience: Awesome, it took me less than 30 minutes to have a
multi-node Drill setup running on Mesos+Aurora with S3 configured. Really nice.

Performance with the 1.8 release: Awful. Compared to the queries I ran
locally with Drill on a small dataset, runtimes were orders of magnitude
higher than on my laptop. After some debugging, I saw that hadoop-s3a
always requests via S3 the byte range from the position where reading
starts until the end of the file. This gave the following HTTP pattern:
* GET bytes=8k-100M
* GET bytes=2M-100M
* GET bytes=4M-100M
Although the HTTP requests were normally aborted before all the data was
sent by the server, still about 10-15x the size of the input files went
over the network. Using Parquet, I had actually hoped to achieve the
opposite, i.e. that less than the whole file would be transferred (my test
queries only used 2 of 15 columns).

In Hadoop 3.0.0-alpha1 [2], there are a lot of improvements w.r.t. S3
access. Via fs.s3a.experimental.input.fadvise=random you can now select a
new reading mode that only requests via S3 the asked-for range plus a
small readahead buffer. While this keeps the number of requests constant,
we now only transfer the data we actually need. With that, performance is
not amazing but in an acceptable range.

Still, query planning always took at least 35s. This was a side effect of
fs.s3a.experimental.input.fadvise=random: while the Parquet access
specifies quite precisely which ranges it wants to read, the parser for
the metadata cache only requests 8000 bytes at a time and thus caused
several thousand HTTP requests for a single sequential read. As a
workaround, we added a call to FSDataInputStream.setReadahead(metadata-filesize)
to limit the access to a single request. This brought reading the metadata
down to 3s.
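
For illustration, a minimal sketch of that workaround; the names are
illustrative and not the actual Drill code, and it assumes a Hadoop
FileSystem handle pointing at the s3a bucket:

  import java.io.IOException;
  import org.apache.hadoop.fs.FSDataInputStream;
  import org.apache.hadoop.fs.FileStatus;
  import org.apache.hadoop.fs.FileSystem;
  import org.apache.hadoop.fs.Path;

  // Hypothetical helper; the actual change sits in Drill's metadata cache read path.
  static FSDataInputStream openMetadataForFullRead(FileSystem fs, Path metadataFile)
      throws IOException {
    FileStatus status = fs.getFileStatus(metadataFile);
    FSDataInputStream in = fs.open(metadataFile);
    // With fadvise=random, s3a only fetches the requested range plus a small
    // readahead. Setting the readahead to the full file size turns the
    // sequential metadata read into a single S3 request instead of
    // thousands of 8000-byte ones.
    in.setReadahead(status.getLen());
    return in;
  }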

Another problem with the metadata cache was that it was rebuilt on every
query. Drill relies here on the modification timestamp of the directory,
which is not supported by S3 [1], and thus the current time was always
returned as the modification date of the directory.
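
A quick way to see this behaviour (hypothetical path, same FileSystem
handle as above):

  // On a real filesystem this timestamp only changes when the directory
  // content changes; on s3a it is not a stable modification time, so
  // Drill's staleness check for the metadata cache never passes.
  FileStatus dirStatus = fs.getFileStatus(new Path("s3a://our-bucket/our-dataset"));
  System.out.println(dirStatus.getModificationTime());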

These were just our initial, basic findings with Drill. At the moment it
looks promising enough that we will probably do some more usability and
performance testing. If we already did something wrong with the initial S3
tests, it would be nice to get some pointers on what it could have been.
The bad S3 I/O performance was really surprising for us.

Kind regards,
Uwe

[1] https://hadoop.apache.org/docs/r3.0.0-alpha1/hadoop-aws/tools/hadoop-aws/index.html#Warning_2:_Because_Object_stores_dont_track_modification_times_of_directories
[2] From here on, the tests were made with Drill-master+hadoop-3.0.0-alpha1+aws-sdk-1.11.35, i.e. custom Drill and Hadoop builds to have dependencies in newer versions.


