For an explanation of why we are rebuilding the metadata cache, take a look at Padma's previous email. Most likely there is a data change in the folder. If not, then the metadata cache should not be getting refreshed and it's a bug.
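For reference, a minimal sketch of the manual refresh command the thread keeps referring to; the dfs plugin name and the /testdata path are assumptions taken from the messages below, not the poster's actual configuration:

    -- Rebuild the Parquet metadata cache files for the table directory
    REFRESH TABLE METADATA dfs.`/testdata`;
    -- Planning for subsequent queries reads the cache instead of every file footer,
    -- and Drill rebuilds the cache automatically if it detects a change under the directory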
Drill currently does not do incremental metadata refreshes. Now let's say you have a table "transactions" (with 100 partitions) and you added a new partition to the "transactions" folder. Drill refreshes the metadata cache for all the partitions (even though we only added 1 partition and nothing has changed in the remaining partitions) automatically for the subsequent query. This might explain why it's taking a long time.

- Rahul

On Mon, Mar 6, 2017 at 9:24 AM, Chetan Kothari <[email protected]> wrote:

> Hi All
>
> Any inputs on this?
>
> Why should creating metadata files recursively take 1457445 ms when refresh metadata on this path is already done?
>
> Regards
> Chetan
>
> -----Original Message-----
> From: Jeena Vinod
> Sent: Sunday, March 5, 2017 10:44 PM
> To: [email protected]
> Subject: RE: Explain Plan for Parquet data is taking a lot of time
>
> Re-attaching the log file as a zip file.
>
> Regards
> Jeena
>
> > -----Original Message-----
> > From: Jeena Vinod
> > Sent: Sunday, March 05, 2017 9:23 PM
> > To: [email protected]
> > Subject: RE: Explain Plan for Parquet data is taking a lot of time
> >
> > Hi Kunal,
> >
> > Thanks for the response.
> > Attaching the log with DEBUG enabled for the mentioned loggers. I had to trim the log for the query, since this mailer allows a max of 1MB.
> >
> > From the log files, the step below seems to be taking the most time. Since refresh metadata on this path is already done, I am unsure what this means.
> > - Creating metadata files recursively took 1457445 ms
> >
> > Also, I have 4-core nodes and the planner.width.max_per_node value is currently 3.
> > I tried with values 6 and 8, but did not see significant improvement in response time. How do we get the optimal value for this property on a cluster?
> >
> > Regards
> > Jeena
> >
> > -----Original Message-----
> > From: Kunal Khatua [mailto:[email protected]]
> > Sent: Thursday, March 02, 2017 7:25 AM
> > To: [email protected]
> > Subject: Re: Explain Plan for Parquet data is taking a lot of time
> >
> > Hi Jeena
> >
> > The JSON profile does not reveal much about why the planning took so long; it only gives you information from the physical plan and when the planning approximately completed (7+ min for the 2-node setup; 15+ min for the 5-node setup).
> >
> > Drillbit logs, however, will give you more information. For this, you'll need to look in the log for entries like:
> >
> > 2017-02-23 14:00:54,609 [27513143-8718-7a47-a2d4-06850755568a:foreman] DEBUG o.a.d.e.p.s.h.DefaultSqlHandler - VOLCANO:Physical Planning (49588ms):
> >
> > You might need to update your logback.xml to pop out this information (by enabling DEBUG level logging for these classes).
> > These are the recommended loggers you can enable DEBUG for:
> > org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler
> > org.apache.drill.exec.work.foreman.Foreman
> >
> > You can then share the drillbit log file from a run with these enabled.
> >
> > Also, your profile JSONs indicate that you have a fairly slow underlying filesystem.
> >
> > I'm seeing an average of
> > 3m45s to read 945K rows (2-node setup)
> > and
> > 2m25s to read 1.5M rows (5-node setup)
> >
> > Your 2-node setup shows 6 fragments processing 29 batches, while the 5-node setup shows 15 fragments processing 46 batches.
> > For the number of rows, the amount of time spent is very high, which makes me believe that your filesystem (Oracle Cloud Storage service) is itself quite slow.
> >
> > For speeding up execution, you can try changing planner.width.max_per_node to a higher value (like the number of cores on the node). This should increase parallelization and the utilization of all the cores by Drill.
> >
> > Kunal Khatua
> > Engineering
> > MapR
> > www.mapr.com
> >
> > ________________________________
> > From: Jeena Vinod <[email protected]>
> > Sent: Tuesday, February 28, 2017 12:24:52 PM
> > To: [email protected]
> > Subject: RE: Explain Plan for Parquet data is taking a lot of time
> >
> > Kindly let me know if there are any pointers on how to improve response time for parquet data here.
> >
> > Regards
> > Jeena
> >
> > -----Original Message-----
> > From: Jeena Vinod
> > Sent: Tuesday, February 28, 2017 4:25 AM
> > To: [email protected]
> > Subject: RE: Explain Plan for Parquet data is taking a lot of time
> >
> > Hi,
> >
> > I have 2 Drill 1.9 installations. One is a 5-node 32GB cluster and the other is a 2-node 16GB cluster, and I am running the same query in both places:
> > select * from `testdata` where <condition> limit 100;
> > testdata is 1GB of uncompressed parquet data.
> >
> > The query response times are as below:
> > 2-node cluster - 13 min
> > 5-node cluster - 19 min
> >
> > I was expecting the 5-node cluster to be faster, but the results say otherwise.
> > In the query profile, as expected, the 5-node cluster has more minor fragments, but the scan time is still higher. Attached the JSON profiles for both.
> > Is this in any way related to the max batches/max records for the row group scan?
> >
> > Any suggestions on how we can get better response time in the 5-node cluster are appreciated.
> >
> > Regards
> > Jeena
> >
> > -----Original Message-----
> > From: Jeena Vinod
> > Sent: Sunday, February 26, 2017 2:22 AM
> > To: [email protected]
> > Subject: RE: Explain Plan for Parquet data is taking a lot of time
> >
> > Please find attached the full JSON profile.
> >
> > Regards
> > Jeena
> >
> > -----Original Message-----
> > From: Padma Penumarthy [mailto:[email protected]]
> > Sent: Saturday, February 25, 2017 3:31 AM
> > To: [email protected]
> > Subject: Re: Explain Plan for Parquet data is taking a lot of time
> >
> > Yes, please do send the JSON profile.
> >
> > Thanks,
> > Padma
> >
> > > On Feb 24, 2017, at 1:56 PM, Jeena Vinod <[email protected]> wrote:
> > >
> > > Thanks for the suggestions.
> > >
> > > I did run the REFRESH TABLE METADATA command on this path before firing the select query.
> > >
> > > In Drill 1.9, there is an improvement in performance. I have 1.9 set up on a 2-node 16GB cluster, and here select * with limit 100 is taking less time than on 1.8, though the number of rows in the ParquetGroupScan remains unchanged. The select query is taking around 8 minutes and the explain plan took around 7 minutes. Also, in the Web console profile, the query stays in the STARTING status for almost 7 minutes.
> > >
> > > Query Plan for 1.9:
> > > 00-00    Screen : rowType = RecordType(ANY *): rowcount = 100.0, cumulative cost = {32810.0 rows, 33110.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 1721
> > > 00-01      Project(*=[$0]) : rowType = RecordType(ANY *): rowcount = 100.0, cumulative cost = {32800.0 rows, 33100.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 1720
> > > 00-02        SelectionVectorRemover : rowType = (DrillRecordRow[*]): rowcount = 100.0, cumulative cost = {32800.0 rows, 33100.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 1719
> > > 00-03          Limit(fetch=[100]) : rowType = (DrillRecordRow[*]): rowcount = 100.0, cumulative cost = {32700.0 rows, 33000.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 1718
> > > 00-04            Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=/testdata/part-r-00000-097f7399-7bfb-4e93-b883-3348655fc658.parquet]], selectionRoot=/testdata, numFiles=1, usedMetadataFile=true, cacheFileRoot=/testdata, columns=[`*`]]]) : rowType = (DrillRecordRow[*]): rowcount = 32600.0, cumulative cost = {32600.0 rows, 32600.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 1717
> > >
> > > And from the query profile, it looks like most of the time is spent in PARQUET_ROW_GROUP_SCAN. I can attach the full JSON profile if it helps.
> > >
> > > Can there be further improvement in performance with 1.9?
> > >
> > > Regards
> > > Jeena
> > >
> > > -----Original Message-----
> > > From: Padma Penumarthy [mailto:[email protected]]
> > > Sent: Friday, February 24, 2017 11:22 PM
> > > To: [email protected]
> > > Subject: Re: Explain Plan for Parquet data is taking a lot of time
> > >
> > > Yes, limit is pushed down to the parquet reader in 1.9. But that will not help with planning time.
> > > It is definitely worth trying with 1.9 though.
> > >
> > > Thanks,
> > > Padma
> > >
> > >> On Feb 24, 2017, at 7:26 AM, Andries Engelbrecht <[email protected]> wrote:
> > >>
> > >> Looks like the metadata cache is being used ("usedMetadataFile=true"). But to be sure, did you perform a REFRESH TABLE METADATA <path to table> on the parquet data?
> > >>
> > >> However, it looks like it is reading a full batch: "rowcount = 32600.0, cumulative cost = {32600.0 rows, 32600.0"
> > >>
> > >> Didn't the limit operator get pushed down to the parquet reader in 1.9?
> > >>
> > >> Perhaps try 1.9 and see if in the ParquetGroupScan the number of rows gets reduced to 100.
> > >>
> > >> Can you look in the query profile to see where time is spent, and also how long it takes before the query starts to run in the WebUI profile?
> > >>
> > >> Best Regards
> > >>
> > >> Andries Engelbrecht
> > >>
> > >> Senior Solutions Architect
> > >> MapR Alliances and Channels Engineering
> > >> [email protected]
> > >>
> > >> ________________________________
> > >> From: Jinfeng Ni <[email protected]>
> > >> Sent: Thursday, February 23, 2017 4:53:34 PM
> > >> To: user
> > >> Subject: Re: Explain Plan for Parquet data is taking a lot of time
> > >>
> > >> The reason the plan shows only one single parquet file is because "LIMIT 100" is applied and filters out the rest of them.
> > >>
> > >> Agreed that parquet metadata caching might help reduce planning time when there are a large number of parquet files.
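A quick way to check the two things raised above (whether the metadata cache is used, and whether the limit is pushed into the scan) is to look at the ParquetGroupScan attributes in the plan output; a minimal sketch, assuming a storage plugin named dfs in place of the elided plugin name from the original question:

    -- Plan only, no execution; look for usedMetadataFile=true and check whether
    -- the rowcount on the Scan drops to 100 on Drill 1.9
    EXPLAIN PLAN FOR
    SELECT * FROM dfs.root.`testdata` LIMIT 100;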
> > >> On Thu, Feb 23, 2017 at 4:44 PM, rahul challapalli <[email protected]> wrote:
> > >>
> > >>> You said there are 2144 parquet files, but the plan suggests that you only have a single parquet file. In any case it's a long time to plan the query.
> > >>> Did you try the metadata caching feature [1]?
> > >>>
> > >>> Also, how many rowgroups and columns are present in the parquet file?
> > >>>
> > >>> [1] https://drill.apache.org/docs/optimizing-parquet-metadata-reading/
> > >>>
> > >>> - Rahul
> > >>>
> > >>> On Thu, Feb 23, 2017 at 4:24 PM, Jeena Vinod <[email protected]> wrote:
> > >>>
> > >>>> Hi,
> > >>>>
> > >>>> Drill is taking 23 minutes for a simple select * query with limit 100 on 1GB of uncompressed parquet data. EXPLAIN PLAN for this query is also taking that long (~23 minutes).
> > >>>>
> > >>>> Query: select * from <plugin>.root.`testdata` limit 100;
> > >>>>
> > >>>> Query Plan:
> > >>>>
> > >>>> 00-00    Screen : rowType = RecordType(ANY *): rowcount = 100.0, cumulative cost = {32810.0 rows, 33110.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 1429
> > >>>>
> > >>>> 00-01      Project(*=[$0]) : rowType = RecordType(ANY *): rowcount = 100.0, cumulative cost = {32800.0 rows, 33100.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 1428
> > >>>>
> > >>>> 00-02        SelectionVectorRemover : rowType = (DrillRecordRow[*]): rowcount = 100.0, cumulative cost = {32800.0 rows, 33100.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 1427
> > >>>>
> > >>>> 00-03          Limit(fetch=[100]) : rowType = (DrillRecordRow[*]): rowcount = 100.0, cumulative cost = {32700.0 rows, 33000.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 1426
> > >>>>
> > >>>> 00-04            Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=/testdata/part-r-00000-097f7399-7bfb-4e93-b883-3348655fc658.parquet]], selectionRoot=/testdata, numFiles=1, usedMetadataFile=true, cacheFileRoot=/testdata, columns=[`*`]]]) : rowType = (DrillRecordRow[*]): rowcount = 32600.0, cumulative cost = {32600.0 rows, 32600.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 1425
> > >>>>
> > >>>> I am using Drill 1.8 and it is set up on a 5-node 32GB cluster, and the data is in Oracle Storage Cloud Service. When I run the same query on a 1GB TSV file in this location, it takes only 38 seconds.
> > >>>>
> > >>>> Also, testdata contains around 2144 .parquet files, each around 500KB.
> > >>>>
> > >>>> Is there any additional configuration required for parquet?
> > >>>>
> > >>>> Kindly suggest how to improve the response time here.
> > >>>>
> > >>>> Regards
> > >>>> Jeena
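On the planner.width.max_per_node question above, a minimal sketch of inspecting and raising the option per session; the value 4 is only an illustration tied to the 4-core nodes mentioned earlier in the thread, not a tuning recommendation:

    -- Inspect the current value
    SELECT * FROM sys.options WHERE name = 'planner.width.max_per_node';
    -- Raise it for the current session, e.g. to the number of cores per node
    ALTER SESSION SET `planner.width.max_per_node` = 4;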
