Hi,
Another observation:
My query has WHERE conditions on the partition values.
Before the metadata refresh, it reads only 4 files.
After the metadata refresh, it reads 102290 files.
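
One way to confirm whether partition pruning still applies would be to compare the plan before and after the refresh. This is only a minimal sketch: the table path and the dir0 filter are hypothetical placeholders (it assumes the partitions are directory levels), so substitute the real table and partition column.

```sql
-- Hypothetical path and filter value, for illustration only.
EXPLAIN PLAN FOR
SELECT *
FROM dfs.`/path/to/table`
WHERE dir0 = '2017';
-- In the Parquet scan entry of the plan, numFiles should show how many
-- files will be read, and usedMetadataFile whether the metadata cache
-- file was used. A large numFiles despite the filter would suggest
-- that pruning is not happening.
```

Running this once without the metadata cache and once after REFRESH TABLE METADATA should show whether the file count in the plan matches the 4 versus 102290 files seen in the query profile.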

Thanks,
Divya

On 17 August 2017 at 13:03, Padma Penumarthy <ppenumar...@mapr.com> wrote:

> Does your query have a partition filter?
> Execution time most likely increased because partition pruning is not
> happening.
> Did you get a chance to look at the logs? That might give some clues.
>
> Thanks,
> Padma
>
>
> > On Aug 16, 2017, at 9:32 PM, Divya Gehlot <divya.htco...@gmail.com> wrote:
> >
> > Hi,
> > I am surprised too.
> > I am running Drill version 1.10 on the MapR enterprise version.
> > *Query* - Selecting all the columns from a partitioned Parquet table
> >
> > I observed a few things in the query statistics:
> >
> > Value      | Before Refresh Metadata | After Refresh Metadata
> > -----------+-------------------------+-----------------------
> > Fragments  | 1                       | 13
> > DURATION   | 01 min 0.233 sec        | 18 min 0.744 sec
> > PLANNING   | 59.818 sec              | 33.087 sec
> > QUEUED     | Not Available           | Not Available
> > EXECUTION  | 0.415 sec               | 17 min 27.657 sec
> >
> > The planning time is reduced by approx. 45% (59.818 sec to 33.087 sec),
> > but the execution time increased drastically.
> > I would like to understand why the execution time increases after the
> > metadata refresh.
> >
> >
> > Appreciate the help.
> >
> > Thanks,
> > divya
> >
> >
> > On 17 August 2017 at 11:54, Padma Penumarthy <ppenumar...@mapr.com> wrote:
> >
> >> Refresh table metadata should help reduce query planning time.
> >> It is odd that it went up after you did refresh table metadata.
> >> Did you check the logs to see what is happening? You might have to
> >> turn on debug logging if needed.
> >> BTW, what version of Drill are you running?
> >>
> >> Thanks,
> >> Padma
> >>
> >>
> >>> On Aug 16, 2017, at 8:15 PM, Divya Gehlot <divya.htco...@gmail.com> wrote:
> >>>
> >>> Hi,
> >>> I have data in Parquet file format.
> >>> When I run the query and look at the execution plan, I see the
> >>> following statistics:
> >>>
> >>> TOTAL FRAGMENTS: 1
> >>> DURATION: 01 min 0.233 sec
> >>> PLANNING: 59.818 sec
> >>> QUEUED: Not Available
> >>> EXECUTION: 0.415 sec
> >>>
> >>> As it is Parquet file format, I tried refreshing the metadata
> >>> by running the command below:
> >>> REFRESH TABLE METADATA <path to table> ;
> >>> I then ran the same query again on the same table with the same data
> >>> (no changes in data) and got the statistics shown below:
> >>>
> >>> TOTAL FRAGMENTS: 13
> >>> DURATION: 14 min 14.604 sec
> >>> PLANNING: 33.087 sec
> >>> QUEUED: Not Available
> >>> EXECUTION: Not Available
> >>>
> >>> The query is still running.
> >>>
> >>> Can somebody help me understand why the query takes so long once I
> >>> issue the refresh metadata command?
> >>>
> >>> Appreciate the help!
> >>>
> >>> Thanks,
> >>> Divya
> >>
> >>
>
>
