For an explanation of why we are rebuilding the metadata cache, take a look at Padma's previous email. Most likely there is a data change in the folder. If not, then the metadata cache should not be getting refreshed and it's a bug.
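For reference, a minimal sketch of the manual refresh command the thread keeps referring to; the dfs plugin name and the /testdata path are assumptions taken from the messages below, not the poster's actual configuration:

    -- Rebuild the Parquet metadata cache files for the table directory
    REFRESH TABLE METADATA dfs.`/testdata`;
    -- Planning for subsequent queries reads the cache instead of every file footer,
    -- and Drill rebuilds the cache automatically if it detects a change under the directory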
Drill currently does not do incremental metadata refreshes. Now let's say you have a table "transactions" (with 100 partitions) and you added a new partition to the "transactions" folder. Drill refreshes the metadata cache for all the partitions (even though we only added 1 partition and nothing has changed in the remaining partitions) automatically for the subsequent query. This might explain why it's taking a long time.

- Rahul

On Mon, Mar 6, 2017 at 9:24 AM, Chetan Kothari <[email protected]> wrote:

> Hi All
>
> Any inputs on this?
>
> Why should creating metadata files recursively take 1457445 ms when refresh metadata on this path is already done?
>
> Regards
> Chetan
>
> -----Original Message-----
> From: Jeena Vinod
> Sent: Sunday, March 5, 2017 10:44 PM
> To: [email protected]
> Subject: RE: Explain Plan for Parquet data is taking a lot of time
>
> Re-attaching the log file as a zip file.
>
> Regards
> Jeena
>
> > -----Original Message-----
> > From: Jeena Vinod
> > Sent: Sunday, March 05, 2017 9:23 PM
> > To: [email protected]
> > Subject: RE: Explain Plan for Parquet data is taking a lot of time
> >
> > Hi Kunal,
> >
> > Thanks for the response.
> > Attaching the log with DEBUG enabled for the mentioned loggers. I had to trim the log for the query, since this mailer allows a max of 1MB.
> >
> > From the log files, the step below seems to be taking the most time. Since refresh metadata on this path is already done, I am unsure what this means.
> > - Creating metadata files recursively took 1457445 ms
> >
> > Also, I have 4-core nodes and the planner.width.max_per_node value is currently 3.
> > I tried with values 6 and 8, but did not see significant improvement in response time. How do we get the optimal value for this property on a cluster?
> >
> > Regards
> > Jeena
> >
> > -----Original Message-----
> > From: Kunal Khatua [mailto:[email protected]]
> > Sent: Thursday, March 02, 2017 7:25 AM
> > To: [email protected]
> > Subject: Re: Explain Plan for Parquet data is taking a lot of time
> >
> > Hi Jeena
> >
> > The JSON profile does not reveal much about why the planning took so long; it only gives you information from the physical plan and when the planning approximately completed (7+ min for the 2-node setup; 15+ min for the 5-node setup).
> >
> > Drillbit logs, however, will give you more information. For this, you'll need to look in the log for entries like:
> >
> > 2017-02-23 14:00:54,609 [27513143-8718-7a47-a2d4-06850755568a:foreman] DEBUG o.a.d.e.p.s.h.DefaultSqlHandler - VOLCANO:Physical Planning (49588ms):
> >
> > You might need to update your logback.xml to pop out this information (by enabling DEBUG level logging for these classes).
> > These are the recommended loggers you can enable DEBUG for:
> > org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler
> > org.apache.drill.exec.work.foreman.Foreman
> >
> > You can then share the drillbit log file from a run with these enabled.
> >
> > Also, your profile JSONs indicate that you have a fairly slow underlying filesystem.
> >
> > I'm seeing an average of
> > 3m45s to read 945K rows (2-node setup)
> > and
> > 2m25s to read 1.5M rows (5-node setup)
> >
> > Your 2-node setup shows 6 fragments processing 29 batches, while the 5-node setup shows 15 fragments processing 46 batches.
> > For the number of rows, the amount of time spent is very high, which makes me believe that your filesystem (Oracle Cloud Storage service) is itself quite slow.
> >
> > For speeding up execution, you can try changing planner.width.max_per_node to a higher value (like the number of cores on the node). This should increase parallelization and the utilization of all the cores by Drill.
> >
> > Kunal Khatua
> > Engineering
> > MapR
> > www.mapr.com
> >
> > ________________________________
> > From: Jeena Vinod <[email protected]>
> > Sent: Tuesday, February 28, 2017 12:24:52 PM
> > To: [email protected]
> > Subject: RE: Explain Plan for Parquet data is taking a lot of time
> >
> > Kindly let me know if there are any pointers on how to improve response time for parquet data here.
> >
> > Regards
> > Jeena
> >
> > -----Original Message-----
> > From: Jeena Vinod
> > Sent: Tuesday, February 28, 2017 4:25 AM
> > To: [email protected]
> > Subject: RE: Explain Plan for Parquet data is taking a lot of time
> >
> > Hi,
> >
> > I have 2 Drill 1.9 installations. One is a 5-node 32GB cluster and the other is a 2-node 16GB cluster, and I am running the same query in both places:
> > select * from `testdata` where <condition> limit 100;
> > testdata is 1GB of uncompressed parquet data.
> >
> > The query response times are as below:
> > 2-node cluster - 13 min
> > 5-node cluster - 19 min
> >
> > I was expecting the 5-node cluster to be faster, but the results say otherwise.
> > In the query profile, as expected, the 5-node cluster has more minor fragments, but the scan time is still higher. Attached the JSON profiles for both.
> > Is this in any way related to the max batches/max records for the row group scan?
> >
> > Any suggestions on how we can get better response time in the 5-node cluster are appreciated.
> >
> > Regards
> > Jeena
> >
> > -----Original Message-----
> > From: Jeena Vinod
> > Sent: Sunday, February 26, 2017 2:22 AM
> > To: [email protected]
> > Subject: RE: Explain Plan for Parquet data is taking a lot of time
> >
> > Please find attached the full JSON profile.
> >
> > Regards
> > Jeena
> >
> > -----Original Message-----
> > From: Padma Penumarthy [mailto:[email protected]]
> > Sent: Saturday, February 25, 2017 3:31 AM
> > To: [email protected]
> > Subject: Re: Explain Plan for Parquet data is taking a lot of time
> >
> > Yes, please do send the JSON profile.
> >
> > Thanks,
> > Padma
> >
> > > On Feb 24, 2017, at 1:56 PM, Jeena Vinod <[email protected]> wrote:
> > >
> > > Thanks for the suggestions.
> > >
> > > I did run the REFRESH TABLE METADATA command on this path before firing the select query.
> > >
> > > In Drill 1.9, there is an improvement in performance. I have 1.9 set up on a 2-node 16GB cluster, and here select * with limit 100 is taking less time than on 1.8, though the number of rows in the ParquetGroupScan remains unchanged. The select query is taking around 8 minutes and the explain plan took around 7 minutes. Also, in the Web console profile, the query stays in the STARTING status for almost 7 minutes.
> > >
> > > Query Plan for 1.9:
> > > 00-00    Screen : rowType = RecordType(ANY *): rowcount = 100.0, cumulative cost = {32810.0 rows, 33110.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 1721
> > > 00-01      Project(*=[$0]) : rowType = RecordType(ANY *): rowcount = 100.0, cumulative cost = {32800.0 rows, 33100.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 1720
> > > 00-02        SelectionVectorRemover : rowType = (DrillRecordRow[*]): rowcount = 100.0, cumulative cost = {32800.0 rows, 33100.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 1719
> > > 00-03          Limit(fetch=[100]) : rowType = (DrillRecordRow[*]): rowcount = 100.0, cumulative cost = {32700.0 rows, 33000.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 1718
> > > 00-04            Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=/testdata/part-r-00000-097f7399-7bfb-4e93-b883-3348655fc658.parquet]], selectionRoot=/testdata, numFiles=1, usedMetadataFile=true, cacheFileRoot=/testdata, columns=[`*`]]]) : rowType = (DrillRecordRow[*]): rowcount = 32600.0, cumulative cost = {32600.0 rows, 32600.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 1717
> > >
> > > And from the query profile, it looks like most of the time is spent in PARQUET_ROW_GROUP_SCAN. I can attach the full JSON profile if it helps.
> > >
> > > Can there be further improvement in performance with 1.9?
> > >
> > > Regards
> > > Jeena
> > >
> > > -----Original Message-----
> > > From: Padma Penumarthy [mailto:[email protected]]
> > > Sent: Friday, February 24, 2017 11:22 PM
> > > To: [email protected]
> > > Subject: Re: Explain Plan for Parquet data is taking a lot of time
> > >
> > > Yes, limit is pushed down to the parquet reader in 1.9. But that will not help with planning time.
> > > It is definitely worth trying with 1.9 though.
> > >
> > > Thanks,
> > > Padma
> > >
> > >> On Feb 24, 2017, at 7:26 AM, Andries Engelbrecht <[email protected]> wrote:
> > >>
> > >> Looks like the metadata cache is being used ("usedMetadataFile=true"). But to be sure, did you perform a REFRESH TABLE METADATA <path to table> on the parquet data?
> > >>
> > >> However, it looks like it is reading a full batch: "rowcount = 32600.0, cumulative cost = {32600.0 rows, 32600.0"
> > >>
> > >> Didn't the limit operator get pushed down to the parquet reader in 1.9?
> > >>
> > >> Perhaps try 1.9 and see if in the ParquetGroupScan the number of rows gets reduced to 100.
> > >>
> > >> Can you look in the query profile to see where time is spent, and also how long it takes before the query starts to run in the WebUI profile?
> > >>
> > >> Best Regards
> > >>
> > >> Andries Engelbrecht
> > >>
> > >> Senior Solutions Architect
> > >> MapR Alliances and Channels Engineering
> > >> [email protected]
> > >>
> > >> ________________________________
> > >> From: Jinfeng Ni <[email protected]>
> > >> Sent: Thursday, February 23, 2017 4:53:34 PM
> > >> To: user
> > >> Subject: Re: Explain Plan for Parquet data is taking a lot of time
> > >>
> > >> The reason the plan shows only one single parquet file is because "LIMIT 100" is applied and filters out the rest of them.
> > >>
> > >> Agreed that parquet metadata caching might help reduce planning time when there are a large number of parquet files.
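A quick way to check the two things raised above (whether the metadata cache is used, and whether the limit is pushed into the scan) is to look at the ParquetGroupScan attributes in the plan output; a minimal sketch, assuming a storage plugin named dfs in place of the elided plugin name from the original question:

    -- Plan only, no execution; look for usedMetadataFile=true and check whether
    -- the rowcount on the Scan drops to 100 on Drill 1.9
    EXPLAIN PLAN FOR
    SELECT * FROM dfs.root.`testdata` LIMIT 100;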
> > >> On Thu, Feb 23, 2017 at 4:44 PM, rahul challapalli <[email protected]> wrote:
> > >>
> > >>> You said there are 2144 parquet files, but the plan suggests that you only have a single parquet file. In any case it's a long time to plan the query.
> > >>> Did you try the metadata caching feature [1]?
> > >>>
> > >>> Also, how many rowgroups and columns are present in the parquet file?
> > >>>
> > >>> [1] https://drill.apache.org/docs/optimizing-parquet-metadata-reading/
> > >>>
> > >>> - Rahul
> > >>>
> > >>> On Thu, Feb 23, 2017 at 4:24 PM, Jeena Vinod <[email protected]> wrote:
> > >>>
> > >>>> Hi,
> > >>>>
> > >>>> Drill is taking 23 minutes for a simple select * query with limit 100 on 1GB of uncompressed parquet data. EXPLAIN PLAN for this query is also taking that long (~23 minutes).
> > >>>>
> > >>>> Query: select * from <plugin>.root.`testdata` limit 100;
> > >>>>
> > >>>> Query Plan:
> > >>>>
> > >>>> 00-00    Screen : rowType = RecordType(ANY *): rowcount = 100.0, cumulative cost = {32810.0 rows, 33110.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 1429
> > >>>>
> > >>>> 00-01      Project(*=[$0]) : rowType = RecordType(ANY *): rowcount = 100.0, cumulative cost = {32800.0 rows, 33100.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 1428
> > >>>>
> > >>>> 00-02        SelectionVectorRemover : rowType = (DrillRecordRow[*]): rowcount = 100.0, cumulative cost = {32800.0 rows, 33100.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 1427
> > >>>>
> > >>>> 00-03          Limit(fetch=[100]) : rowType = (DrillRecordRow[*]): rowcount = 100.0, cumulative cost = {32700.0 rows, 33000.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 1426
> > >>>>
> > >>>> 00-04            Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=/testdata/part-r-00000-097f7399-7bfb-4e93-b883-3348655fc658.parquet]], selectionRoot=/testdata, numFiles=1, usedMetadataFile=true, cacheFileRoot=/testdata, columns=[`*`]]]) : rowType = (DrillRecordRow[*]): rowcount = 32600.0, cumulative cost = {32600.0 rows, 32600.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 1425
> > >>>>
> > >>>> I am using Drill 1.8 and it is set up on a 5-node 32GB cluster, and the data is in Oracle Storage Cloud Service. When I run the same query on a 1GB TSV file in this location, it takes only 38 seconds.
> > >>>>
> > >>>> Also, testdata contains around 2144 .parquet files, each around 500KB.
> > >>>>
> > >>>> Is there any additional configuration required for parquet?
> > >>>>
> > >>>> Kindly suggest how to improve the response time here.
> > >>>>
> > >>>> Regards
> > >>>> Jeena
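On the planner.width.max_per_node question above, a minimal sketch of inspecting and raising the option per session; the value 4 is only an illustration tied to the 4-core nodes mentioned earlier in the thread, not a tuning recommendation:

    -- Inspect the current value
    SELECT * FROM sys.options WHERE name = 'planner.width.max_per_node';
    -- Raise it for the current session, e.g. to the number of cores per node
    ALTER SESSION SET `planner.width.max_per_node` = 4;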
