Interesting, so day=2017-10-04 doesn't actually filter any data out, right?
--num_hdfs_worker_threads, I believe, controls the size of the thread pool for HDFS metadata operations such as moves and deletions (the argument name isn't very specific). The option that would make a difference in your case is --num_remote_hdfs_io_threads, which controls the number of concurrent remote HDFS I/O operations. The default is 8, which could be too low if you are doing a lot of remote reads.
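For example, a minimal sketch of how you might bump the flag and confirm it took effect (the value 32 is purely illustrative, not a tuned recommendation; <impalad-host> is a placeholder, and this assumes the default debug webserver port of 25000):

    impalad --num_remote_hdfs_io_threads=32 <your existing startup flags>
    # /varz on the debug web UI lists the flag values the daemon is actually running with
    curl -s http://<impalad-host>:25000/varz | grep num_remote_hdfs_io_threads

Since your impalads run outside the datanodes, every read is a remote read, so this pool effectively caps concurrent HDFS reads per daemon.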
On Wed, Dec 6, 2017 at 4:09 PM, Piyush Narang <[email protected]> wrote:

> Thanks for getting back, Mostafa. IO rates while Presto was running seemed to be higher (60-125 MB/s). If it was a slowdown due to GC / HDFS, I'd imagine some of the Presto runs would be affected too, but that doesn't seem to be the case. We've run the query on a few occasions on the same day (fairly close to each other) and also on different days to reduce the influence of the fs cache, and while we do see subsequent runs being a little quicker on both Presto and Impala, the Impala run is always slower by 2x-3x. I'll check the GC metrics on our NameNode to double-check and get back with that.
>
> The number of partitions and files is an interesting dimension. The data is partitioned in this fashion (we currently only have one day's worth of data):
>
> partitionkey1={0|1}/day=2017-10-04/hour={00-23}/platform={US|EU|AS}/partitionKey2={0|1}/partitionKey3={0|1}
>
> 2 * 1 * 24 * 3 * 2 * 2 = 576
>
> Looking at the number of files we end up touching as part of the day=2017-10-04 query, it is ~15K. Do you think this is on the high end given we have only 4 machines (with 48 cores each)? I noticed that there's a start-up Impala option "-num_hdfs_worker_threads". I didn't find much documentation on it, but I'm wondering if bumping it up would potentially help? (Could try this out tomorrow.)
>
> Thanks,
>
> -- Piyush
>
> From: Mostafa Mokhtar <[email protected]>
> Reply-To: "[email protected]" <[email protected]>
> Date: Wednesday, December 6, 2017 at 5:51 PM
> To: "[email protected]" <[email protected]>
> Subject: Re: Debugging slow Impala hdfs-scans
>
> The query profile shows that a lot of time is spent in "TotalStorageWaitTime: 20h4m"; usually this is an indicator that Impala is waiting on IO from HDFS.
>
> The number of files can also be an issue; I recommend checking GC time for the HDFS NameNode.
>
> The filter is not selective, so it is interesting that Presto runs the same query faster. Did you observe IO rates while Presto was running, or was the data read from the file system cache?
>
> On Wed, Dec 6, 2017 at 2:01 PM, Piyush Narang <[email protected]> wrote:
>
> Hi folks,
>
> I'm trying to debug why our new Impala setup performs a bit worse than Presto on the same queries. We're running Impala 2.10.0 and starting it up with 225G of memory (-mem-limit). It runs on the same set of nodes as Presto (4 nodes, 48 cores each, 10G ethernet). We noticed that a couple of queries in Impala were fairly slow in the hdfs-scan stage, so we tried to isolate the behavior with a slightly simpler query:
>
> select max(hour + nb_display) from my_large_parquet_table where day = '2017-10-04';
>
> This query takes around 28-30 mins on Impala and around 8.5 mins on Presto. The input table is 11.64TB and uses Parquet (snappy compressed).
>
> Looking at the performance profile (attached in case anyone's interested), I noticed that during the query Impala seems to be averaging around 1.1-1.2 MB/s per thread, and the total read throughput is around 2.13 MB/s. I tried bumping up the number of scanner threads (from 48 to 96), but that only helped marginally (improved runtimes by ~10s). Running "sar -n DEV 1" on some of our hosts while the query is running seems to show that Impala is reading at a rate of 30-50 MB/s (whereas we see this go up to 60-125 MB/s on our other runs).
>
> We haven't tweaked our Impala setup much beyond the defaults that come out of the box. I'm wondering if I'm missing some tuning settings that would improve read rates from HDFS when Impala runs outside of the datanodes. If anyone has any ideas / suggestions, they'd be welcome. I'm happy to provide more details if needed as well.
>
> Thanks,
>
> -- Piyush
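As a quick sanity check on the partition and file numbers quoted above (a back-of-the-envelope sketch; all inputs are taken from the thread, and the per-file figure uses decimal TB):

    echo $((2 * 1 * 24 * 3 * 2 * 2))   # 576 partitions implied by the layout
    echo $((15000 / 576))              # ~26 files per partition
    echo $((11640000 / 15000))         # 11.64 TB in MB / ~15K files ≈ 776 MB per file

So the files are fairly large on average; with 4 nodes x 48 cores, that is roughly 15000 / (4 * 48) ≈ 78 files per core to work through.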
