gianm commented on issue #6066: Sorting rows when rollup is disabled
URL: 
https://github.com/apache/incubator-druid/issues/6066#issuecomment-410322187
 
 
   > In light of what you described, my intuition tells me that this might be a 
bad default in most cases (except when queryGranularity == segmentGranularity). 
Can you think of reasons why sorting on __time would be a hard requirement, or 
could this be changed? Of course we can't just change it as it would have 
complex implications on query perf, but maybe a new ingestion spec flag sortyBy 
that would allow putting __time elsewhere might help a lot in getting better 
compression.
   
   Yeah, there would be a ton of value in this! I think we should do a separate 
issue for it -- I can write one up with some of the considerations that will 
need to be taken into account. There are a lot of assumptions throughout the 
Druid codebase that the `__time` column is always sorted, and it would be some 
work to get through all of those, but I think it is tractable.
   
   FWIW, one workaround is to set queryGranularity = segmentGranularity and 
then create a "long" typed dimension with the real timestamp in it. You won't 
be able to do timeseries queries on it, but you can do groupBys with 
`timestamp_floor` expressions. Druid SQL would also work, with a query like 
`SELECT TIME_FLOOR(MILLIS_TO_TIMESTAMP(real_timestamp), 'PT1H') ...`. For time 
filters you can use a combination of `intervals` and bound filters on your 
"real_timestamp" dimension. It is a little janky (and perf for timeseries-style 
queries won't be as good as using the `__time` column), but it will improve 
your compression and locality.
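
   As a rough illustration of that workaround, here is a minimal Python sketch 
that builds a native groupBy query: an expression virtual column floors 
"real_timestamp" to the hour, `intervals` selects segments via `__time`, and a 
numeric bound filter narrows the rows. The helper names, the datasource name 
"events" and the "hour_bucket" column are made up for the example, and it 
assumes "real_timestamp" holds epoch milliseconds. Since `__time` is floored to 
the segment granularity in this setup, the sketch assumes the caller passes an 
interval aligned to segment boundaries and uses the bound filter for the finer 
range. Treat it as a starting point, not a tested recipe.

```python
import json
from datetime import datetime, timezone


def to_millis(iso_utc: str) -> int:
    """Convert a naive ISO-8601 string, interpreted as UTC, to epoch millis."""
    dt = datetime.fromisoformat(iso_utc).replace(tzinfo=timezone.utc)
    return int(dt.timestamp() * 1000)


def hourly_groupby(datasource: str, segment_interval: str,
                   start: str, end: str) -> dict:
    """Build a groupBy query that buckets rows by hour of real_timestamp.

    segment_interval: ISO interval aligned to segment boundaries (assumption).
    start / end: the finer time range, applied to the long dimension.
    """
    return {
        "queryType": "groupBy",
        "dataSource": datasource,
        # Coarse pruning: intervals still work against the floored __time.
        "intervals": [segment_interval],
        "granularity": "all",
        # Hour bucket computed from the long dimension, not from __time.
        "virtualColumns": [{
            "type": "expression",
            "name": "hour_bucket",
            "expression": "timestamp_floor(\"real_timestamp\", 'PT1H')",
            "outputType": "LONG",
        }],
        "dimensions": [{
            "type": "default",
            "dimension": "hour_bucket",
            "outputName": "hour_bucket",
            "outputType": "LONG",
        }],
        # Fine-grained time filter: start <= real_timestamp < end.
        "filter": {
            "type": "bound",
            "dimension": "real_timestamp",
            "lower": str(to_millis(start)),
            "upper": str(to_millis(end)),
            "upperStrict": True,
            "ordering": "numeric",
        },
        "aggregations": [{"type": "count", "name": "rows"}],
    }


if __name__ == "__main__":
    query = hourly_groupby(
        "events",  # hypothetical datasource name
        "2018-08-01T00:00:00/2018-08-02T00:00:00",
        "2018-08-01T03:00:00",
        "2018-08-01T05:00:00",
    )
    print(json.dumps(query, indent=2))
```

   The printed JSON could then be POSTed to the broker's query endpoint. The 
bound filter uses "numeric" ordering so the millisecond values compare as 
numbers rather than as strings.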
