Hello Parth,

I filed JIRAs for S3 performance:
 * https://issues.apache.org/jira/browse/DRILL-4977
 * https://issues.apache.org/jira/browse/DRILL-4976
 * https://issues.apache.org/jira/browse/DRILL-4978

and one for execution of drillbits inside Apache Mesos+Aurora:
 * https://issues.apache.org/jira/browse/DRILL-4979

As a start, I would look first into the latter, as it is a requirement for actually using Drill safely on such a cluster. I have commented with a basic implementation idea; I’d love to get some feedback on it, as this would be my first Drill contribution.

Uwe


On 28.10.16 00:26, Parth Chandra wrote:
Hi Uwe,

   Can you log JIRAs for the performance issues that you encounter while
working on S3? Not many folks are working on optimizing that path, so any
patches that you might be able to contribute would be appreciated.

Parth

On Thu, Oct 6, 2016 at 1:56 PM, Uwe Korn <[email protected]> wrote:

Yes. Performance was much better with a real file system (i.e. I ran
locally on my laptop using the SSD installed there). I don’t expect to have
the exact same performance with S3 as I don’t have things like data
locality there. My use case is mainly querying "cold" datasets, i.e. ones
that are not touched often, and when they are, only a few queries are run on them.


On 06.10.2016 at 22:47, Ted Dunning <[email protected]> wrote:

Have you tried running against a real file system interface? Or even just
against HDFS?



On Thu, Oct 6, 2016 at 12:35 PM, Uwe Korn <[email protected]> wrote:
Hello,

We had some test runs with Drill 1.8 in the last days and wanted to share
the experience with you, as we've made some interesting findings that
astonished us. We ran on our internal company cluster and thus used the
S3 API to access our internal storage cluster, not AWS (the behavior
should still be the same).

Setup experience: Awesome, it took me less than 30 minutes to have a
multi-node Drill setup running on Mesos+Aurora with S3 configured. Really nice.

Performance with the 1.8 release: Awful. Compared to the queries I ran
locally with Drill on a small dataset, runtimes were orders of magnitude
higher than on my laptop. After some debugging, I saw that hadoop-s3a
always requests via S3 the byte range from the position where reading
starts until the end of the file. This gave the following HTTP pattern:
* GET bytes=8k-100M
* GET bytes=2M-100M
* GET bytes=4M-100M
Although the HTTP requests were normally aborted before all the data was
sent by the server, still about 10-15x the size of the input files went
over the network. Using Parquet, I had actually hoped to achieve the
opposite, i.e. that less than the whole file would be transferred (my test
queries only used 2 of 15 columns).

In Hadoop 3.0.0-alpha1 [2], there are a lot of improvements w.r.t. S3
access. Via fs.s3a.experimental.input.fadvise=random you can now select a
new reading mode that only requests via S3 the asked-for range plus a
small readahead buffer. While this keeps the number of requests constant,
we now only transfer the data we actually need. With that, performance is
not amazing but in an acceptable range.

Still, query planning always took at least 35s. This was a side effect of
fs.s3a.experimental.input.fadvise=random: while the Parquet access
specifies quite precisely which ranges it wants to read, the parser for
the metadata cache only requests 8000 bytes at a time and thus caused
several thousand HTTP requests for a single sequential read. As a
workaround, we added a call to FSDataInputStream.setReadahead(metadata-filesize)
to limit the access to a single request. This brought reading the metadata
down to 3s.
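
For illustration, a minimal sketch of that workaround; the names are
illustrative and not the actual Drill code, and it assumes a Hadoop
FileSystem handle pointing at the s3a bucket:

  import java.io.IOException;
  import org.apache.hadoop.fs.FSDataInputStream;
  import org.apache.hadoop.fs.FileStatus;
  import org.apache.hadoop.fs.FileSystem;
  import org.apache.hadoop.fs.Path;

  // Hypothetical helper; the actual change sits in Drill's metadata cache read path.
  static FSDataInputStream openMetadataForFullRead(FileSystem fs, Path metadataFile)
      throws IOException {
    FileStatus status = fs.getFileStatus(metadataFile);
    FSDataInputStream in = fs.open(metadataFile);
    // With fadvise=random, s3a only fetches the requested range plus a small
    // readahead. Setting the readahead to the full file size turns the
    // sequential metadata read into a single S3 request instead of
    // thousands of 8000-byte ones.
    in.setReadahead(status.getLen());
    return in;
  }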

Another problem with the metadata cache was that it was rebuilt on every
query. Drill relies here on the modification timestamp of the directory,
which is not supported by S3 [1], and thus the current time was always
returned as the modification date of the directory.
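
A quick way to see this behaviour (hypothetical path, same FileSystem
handle as above):

  // On a real filesystem this timestamp only changes when the directory
  // content changes; on s3a it is not a stable modification time, so
  // Drill's staleness check for the metadata cache never passes.
  FileStatus dirStatus = fs.getFileStatus(new Path("s3a://our-bucket/our-dataset"));
  System.out.println(dirStatus.getModificationTime());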

These were just our initial, basic findings with Drill. At the moment it
looks promising enough that we will probably do some more usability and
performance testing. If we already did something wrong with the initial S3
tests, it would be nice to get some pointers on what it could have been.
The bad S3 I/O performance was really surprising for us.

Kind regards,
Uwe

[1] https://hadoop.apache.org/docs/r3.0.0-alpha1/hadoop-aws/tools/hadoop-aws/index.html#Warning_2:_Because_Object_stores_dont_track_modification_times_of_directories
[2] From here on, the tests were made with Drill-master+hadoop-3.0.0-alpha1+aws-sdk-1.11.35, i.e. custom Drill and Hadoop builds to have dependencies in newer versions.


