deng2 commented on issue #7592:
URL: https://github.com/apache/druid/issues/7592#issuecomment-700582857


   > I can't find the issue number right now, but essentially the reason for 
this is that 'granularity' for groupBy does not work the way that you would 
expect for a SQL GROUP BY when there's an ORDER BY or a LIMIT clause, so it 
cannot be used by the SQL planner. I think the issue was that dimension 
ordering is applied within each granular bucket for groupBy, rather than across 
all buckets as you would expect with SQL.
   > 
   > However, I would like to optimize the execution of `FLOOR(__time TO DAY)` 
enough that this is not an issue perf-wise. That seems like a good direction to 
me.
   
   
   
   > I can't find the issue number right now, but essentially the reason for 
this is that 'granularity' for groupBy does not work the way that you would 
expect for a SQL GROUP BY when there's an ORDER BY or a LIMIT clause, so it 
cannot be used by the SQL planner. I think the issue was that dimension 
ordering is applied within each granular bucket for groupBy, rather than across 
all buckets as you would expect with SQL.
   > 
   > However, I would like to optimize the execution of `FLOOR(__time TO DAY)` 
enough that this is not an issue perf-wise. That seems like a good direction to 
me.
   
   I am not using order by or limit clause in my SQL. It still uses the all 
granularity. Will we fix this?
   ```
   select
       TIME_FLOOR(__time, 'PT1H', TIME_PARSE('2020-03-01T04:00+12:00'), 
'Pacific/Kwajalein') as __timestamp,
       app, metric_name,
       count(*) as __count,
       sum(metric_value) as __sum
   from viaops_metric_specguide
   where
       (
           (__time >= TIME_PARSE('2020-02-23T04:00+12:00') and __time < 
TIME_PARSE('2020-03-01T04:00+12:00')) or
           (__time >= TIME_PARSE('2020-02-16T04:00+12:00') and __time < 
TIME_PARSE('2020-02-23T04:00+12:00')) or
           (__time >= TIME_PARSE('2020-02-09T04:00+12:00') and __time < 
TIME_PARSE('2020-02-16T04:00+12:00')) or
           (__time >= TIME_PARSE('2020-02-02T04:00+12:00') and __time < 
TIME_PARSE('2020-02-09T04:00+12:00')) or
           (__time >= TIME_PARSE('2020-01-26T04:00+12:00') and __time < 
TIME_PARSE('2020-02-02T04:00+12:00'))
       )
       and metric_name = 'Successful API Attempt'
       and app = 'Quantum'
       and api in (
             '',
             'stbDeviceEdgeRefresh',
             'deviceEdgeDevices',
             'avappsTermsAndConditions',
             'viewingHistoryEdgeViewingHistory',
             'loginEdgeSessionInfo',
             'searchEdgeSavedSearchRecent',
             'searchEdgeUnfiltered',
             'networkSettingsEdgeSettings',
             'viewingHistoryEdgeCollated',
             'watchListEdgeWatchList',
             'lrmEdgeEntitlements',
             'companionDevicesEdgeHandoff',
             'searchedgeV1CatalogUnfiltered',
             'loginEdgeLogin',
             'savedSearchEdgeSavedSearch'
       )
   group by 1, 2, 3
   ```


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to