Yes. Performance was much better with a real file system (i.e. I ran locally on 
my laptop using the SSD installed there). I don't expect the exact same 
performance with S3 since I don't have things like data locality there. My use 
case is mainly querying "cold" datasets, i.e. ones that are not touched often 
and, when they are, only see a few queries. 


> On 06.10.2016 at 22:47, Ted Dunning <[email protected]> wrote:
> 
> Have you tried running against a real file system interface? Or even just
> against HDFS?
> 
> 
> 
> On Thu, Oct 6, 2016 at 12:35 PM, Uwe Korn <[email protected]> wrote:
> 
>> Hello,
>> 
>> We had some test runs with Drill 1.8 over the last few days and wanted to
>> share our experience with you, as we made some interesting findings that
>> surprised us. We ran on our internal company cluster and thus used the
>> S3 API to access our internal storage cluster, not AWS (the behavior should
>> still be the same).
>> 
>> Setup experience: Awesome. It took me less than 30 minutes to have a multi-node
>> Drill setup running on Mesos+Aurora with S3 configured. Really nice.
>> 
>> Performance with the 1.8 release: Awful. Compared to the queries I ran
>> locally with Drill on a small dataset, runtimes were orders of magnitude
>> higher. After some debugging, I saw that hadoop-s3a always requests from S3
>> the byte range from the position where we want to start reading until the
>> end of the file. This gave the following HTTP pattern:
>> * GET bytes=8k-100M
>> * GET bytes=2M-100M
>> * GET bytes=4M-100M
>> Although the HTTP requests were normally aborted before all the data was
>> sent by the server, still about 10-15x the size of the input files went
>> over the network. With Parquet, I had actually hoped to achieve the
>> opposite, i.e. that less than the whole file would be transferred (my test
>> queries only used 2 of 15 columns).
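>>
>> As a rough sketch of the access pattern that produces these requests (the
>> bucket and path below are placeholders), under the default sequential
>> policy each positioned read makes s3a request everything from that offset
>> to the end of the object:
>>
>>     import org.apache.hadoop.conf.Configuration;
>>     import org.apache.hadoop.fs.FSDataInputStream;
>>     import org.apache.hadoop.fs.FileSystem;
>>     import org.apache.hadoop.fs.Path;
>>
>>     public class SequentialReadSketch {
>>       public static void main(String[] args) throws Exception {
>>         Configuration conf = new Configuration();  // default read policy
>>         Path file = new Path("s3a://bucket/dataset/part-0.parquet");
>>         FileSystem fs = file.getFileSystem(conf);
>>
>>         byte[] buf = new byte[1024];
>>         try (FSDataInputStream in = fs.open(file)) {
>>           // Each read only needs 1 KB, but the sequential policy requests
>>           // everything from the offset to the end of the ~100M file.
>>           in.readFully(8 * 1024, buf);         // GET bytes=8k-EOF
>>           in.readFully(2 * 1024 * 1024, buf);  // GET bytes=2M-EOF
>>           in.readFully(4 * 1024 * 1024, buf);  // GET bytes=4M-EOF
>>         }
>>       }
>>     }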
>> 
>> In Hadoop 3.0.0-alpha1 [2], there are a lot of improvements w.r.t. S3
>> access. Via fs.s3a.experimental.input.fadvise=random you can now select a
>> new read mode that only requests from S3 the needed range plus a small
>> readahead buffer. While this keeps the number of requests constant, we now
>> only transfer the data we actually need. With that, performance is not
>> amazing but within an acceptable range.
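>>
>> A minimal sketch of the same reads with the random policy enabled (again
>> with placeholder paths; the property can also be set in core-site.xml):
>>
>>     import org.apache.hadoop.conf.Configuration;
>>     import org.apache.hadoop.fs.FSDataInputStream;
>>     import org.apache.hadoop.fs.FileSystem;
>>     import org.apache.hadoop.fs.Path;
>>
>>     public class RandomFadviseSketch {
>>       public static void main(String[] args) throws Exception {
>>         Configuration conf = new Configuration();
>>         // Fetch only the requested range plus a small readahead buffer
>>         // per GET instead of reading to the end of the object.
>>         conf.set("fs.s3a.experimental.input.fadvise", "random");
>>
>>         Path file = new Path("s3a://bucket/dataset/part-0.parquet");
>>         FileSystem fs = file.getFileSystem(conf);
>>         byte[] buf = new byte[1024];
>>         try (FSDataInputStream in = fs.open(file)) {
>>           // Same positioned reads as before, but now each GET covers
>>           // roughly the 1 KB we asked for plus the readahead window.
>>           in.readFully(8 * 1024, buf);
>>           in.readFully(2 * 1024 * 1024, buf);
>>         }
>>       }
>>     }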
>> 
>> Still, query planning always took at least 35s. This was a side effect of
>> fs.s3a.experimental.input.fadvise=random: while the Parquet reader specifies
>> very precisely which ranges it wants to read, the parser for the metadata
>> cache only requests 8000 bytes at a time and thus caused several thousand
>> HTTP requests for a single sequential read. As a workaround, we added a call
>> to FSDataInputStream.setReadahead(metadata-filesize) to reduce this to a
>> single request. This brought reading the metadata down to 3s.
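>>
>> A minimal sketch of that workaround, with a placeholder path for the
>> metadata cache file:
>>
>>     import org.apache.hadoop.conf.Configuration;
>>     import org.apache.hadoop.fs.FSDataInputStream;
>>     import org.apache.hadoop.fs.FileSystem;
>>     import org.apache.hadoop.fs.Path;
>>
>>     public class MetadataReadaheadSketch {
>>       public static void main(String[] args) throws Exception {
>>         Configuration conf = new Configuration();
>>         conf.set("fs.s3a.experimental.input.fadvise", "random");
>>
>>         Path metadata = new Path("s3a://bucket/dataset/.drill.parquet_metadata");
>>         FileSystem fs = metadata.getFileSystem(conf);
>>         long len = fs.getFileStatus(metadata).getLen();
>>
>>         byte[] content = new byte[(int) len];
>>         try (FSDataInputStream in = fs.open(metadata)) {
>>           // With fadvise=random every small read would otherwise become
>>           // its own ranged GET; widening the readahead to the file size
>>           // lets this sequential scan complete in a single request.
>>           in.setReadahead(len);
>>           in.readFully(0, content);
>>         }
>>       }
>>     }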
>> 
>> Another problem with the metadata cache was that it was actually rebuilt
>> on every query. Drill relies here on the modification timestamp of the
>> directory, which is not supported by S3 [1], so the current time was always
>> returned as the directory's modification date.
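>>
>> A small sketch to illustrate the check that goes wrong here; the paths are
>> placeholders and this only mirrors the idea, not Drill's actual code:
>>
>>     import org.apache.hadoop.conf.Configuration;
>>     import org.apache.hadoop.fs.FileSystem;
>>     import org.apache.hadoop.fs.Path;
>>
>>     public class DirMtimeSketch {
>>       public static void main(String[] args) throws Exception {
>>         Configuration conf = new Configuration();
>>         Path dir = new Path("s3a://bucket/dataset");
>>         Path cache = new Path(dir, ".drill.parquet_metadata");
>>         FileSystem fs = dir.getFileSystem(conf);
>>
>>         long dirMtime = fs.getFileStatus(dir).getModificationTime();
>>         long cacheMtime = fs.getFileStatus(cache).getModificationTime();
>>
>>         // On a real file system the directory mtime only moves when files
>>         // are added or removed, so the cache stays valid. On S3 there is no
>>         // directory mtime, so the reported value always looks newer than
>>         // the cache and the metadata is rebuilt on every query.
>>         System.out.println("metadata cache stale? " + (dirMtime > cacheMtime));
>>       }
>>     }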
>> 
>> These were just our initial, basic findings with Drill. At the moment it
>> looks promising enough that we will probably do some more usability and
>> performance testing. If we did something wrong in these initial S3 tests,
>> it would be nice to get some pointers as to what it could have been. The
>> poor S3 I/O performance really surprised us.
>> 
>> Kind regards,
>> Uwe
>> 
>> [1] https://hadoop.apache.org/docs/r3.0.0-alpha1/hadoop-aws/tools/hadoop-aws/index.html#Warning_2:_Because_Object_stores_dont_track_modification_times_of_directories
>> [2] From here on, the tests were run with
>> Drill-master+hadoop-3.0.0-alpha1+aws-sdk-1.11.35,
>> i.e. custom Drill and Hadoop builds to get newer versions of the dependencies.
