Hi, Another observation is My query had where conditions based on the partition values Before Metadata refresh - Its reading only 4 files After metadata refresh - its reading 102290 files
Thanks, Divya On 17 August 2017 at 13:03, Padma Penumarthy <ppenumar...@mapr.com> wrote: > Does your query have partition filter ? > Execution time is increased most likely because partition pruning is not > happening. > Did you get a chance to look at the logs ? That might give some clues. > > Thanks, > Padma > > > > On Aug 16, 2017, at 9:32 PM, Divya Gehlot <divya.htco...@gmail.com> > wrote: > > > > Hi, > > Even I am surprised . > > I am running Drill version 1.10 on MapR enterprise version. > > *Query *- Selecting all the columns on partitioned parquet table > > > > I observed few things from Query statistics : > > > > Value > > > > Before Refresh Metadata > > > > After Refresh Metadata > > > > Fragments > > > > 1 > > > > 13 > > > > DURATION > > > > 01 min 0.233 sec > > > > 18 min 0.744 sec > > > > PLANNING > > > > 59.818 sec > > > > 33.087 sec > > > > QUEUED > > > > Not Available > > > > Not Available > > > > EXECUTION > > > > 0.415 sec > > > > 17 min 27.657 sec > > > > The planning time is being reduced by approx 60% but the execution time > > increased drastically. > > I would like to understand why the exceution time increases after the > > metadata refresh . > > > > > > Appreciate the help. > > > > Thanks, > > divya > > > > > > On 17 August 2017 at 11:54, Padma Penumarthy <ppenumar...@mapr.com> > wrote: > > > >> Refresh table metadata should help reduce query planning time. > >> It is odd that it went up after you did refresh table metadata. > >> Did you check the logs to see what is happening ? You might have to > >> turn on some debugs if needed. > >> BTW, what version of Drill are you running ? > >> > >> Thanks, > >> Padma > >> > >> > >>> On Aug 16, 2017, at 8:15 PM, Divya Gehlot <divya.htco...@gmail.com> > >> wrote: > >>> > >>> Hi, > >>> I have data in parquet file format . > >>> when I run the query the data and see the execution plan I could see > >>> following > >>> statistics > >>> > >>>> TOTAL FRAGMENTS: 1 > >>>>> DURATION: 01 min 0.233 sec > >>>>> PLANNING: 59.818 sec > >>>>> QUEUED: Not Available > >>>>> EXECUTION: 0.415 sec > >>>> > >>>> > >>> > >>> As its a paquet file format I tried enabling refresh meta data > >>> and run below command > >>> REFRESH TABLE METADATA <path to table> ; > >>> then run the same query again on the same table same data (no changes > in > >>> data) and could find the statistics as show below : > >>> > >>> TOTAL FRAGMENTS: 13 > >>>>> DURATION: 14 min 14.604 sec > >>>>> PLANNING: 33.087 sec > >>>>> QUEUED: Not Available > >>>>> EXECUTION: Not Available > >>>> > >>>> > >>> The query is still running . > >>> > >>> Can somebody help me understand why the query taking so long once I > >> issue > >>> the refresh metadata command. > >>> > >>> Aprreciate the help ! > >>> > >>> Thanks, > >>> Divya > >> > >> > >