[GitHub] justinborromeo commented on issue #6088: Scan query: time-ordering
justinborromeo commented on issue #6088: Scan query: time-ordering URL: https://github.com/apache/incubator-druid/issues/6088#issuecomment-459569465 If someone submits a Druid SQL query that orders by timestamp with a limit greater than the threshold, is the ideal behaviour to fail the query completely or to silently turn off time ordering? This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
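The two behaviours being weighed in this comment can be put side by side in a small sketch. Everything here is hypothetical: the `OverLimitPolicy` enum, the `resolve_time_ordering` function, and the assumption that ordering applies whenever the limit is at or below the threshold are illustrations only, not Druid code.

```python
from enum import Enum


class OverLimitPolicy(Enum):
    # Hypothetical names for the two options raised in the comment.
    FAIL = "fail"
    DISABLE_ORDERING = "disable_ordering"


def resolve_time_ordering(limit: int, threshold: int, policy: OverLimitPolicy) -> bool:
    """Return True if time ordering should be applied to the query.

    Raises ValueError when the limit exceeds the threshold and the policy is FAIL.
    """
    if limit <= threshold:
        return True
    if policy is OverLimitPolicy.FAIL:
        # Option 1: reject the query outright so the caller knows ordering is unavailable.
        raise ValueError(
            f"time-ordered scan limit {limit} exceeds threshold {threshold}"
        )
    # Option 2: run the query, but without time ordering.
    return False
```

The trade-off the sketch makes visible: failing is explicit but rejects the query, while disabling ordering returns rows whose order the caller may wrongly trust.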
[GitHub] justinborromeo commented on issue #6088: Scan query: time-ordering
justinborromeo commented on issue #6088: Scan query: time-ordering URL: https://github.com/apache/incubator-druid/issues/6088#issuecomment-459555345 Should time-ordering be supported for legacy mode scan queries? Just wondering since the timestamp format is different.
[GitHub] justinborromeo commented on issue #6088: Scan query: time-ordering
justinborromeo commented on issue #6088: Scan query: time-ordering URL: https://github.com/apache/incubator-druid/issues/6088#issuecomment-459170520 The current output of the scan query is a collection of ScanResultValue objects (String segmentId, List columns, Object events) where each ScanResultValue contains the rows from a specific segment. Time-ordering the scan results means events will no longer be grouped neatly by segment, so I don't think it makes sense to use that format. Would it make sense to have a separate output format for time-ordered scans? An example of such a format would be a collection of events where segmentId is a field of each event. I think changing the output format for _all_ Scan queries is a no-go because it would be a breaking change.
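The alternative format suggested here, a flat collection of events that each carry their segmentId, can be sketched as follows. This is an illustration only: the Python class mirrors the three fields named in the comment, but the field types, the `flatten_with_segment_id` helper, and the `"segmentId"` key name are assumptions rather than Druid's actual classes or wire format.

```python
from dataclasses import dataclass
from typing import Any, Dict, Iterator, List


@dataclass
class ScanResultValue:
    # Mirrors the three fields named in the comment; the Python types are assumptions.
    segment_id: str
    columns: List[str]
    events: List[Dict[str, Any]]


def flatten_with_segment_id(results: List[ScanResultValue]) -> Iterator[Dict[str, Any]]:
    """Yield one dict per event, with the owning segment id added as a field."""
    for result in results:
        for event in result.events:
            # Build a new dict so the original event is not mutated.
            # "segmentId" is a hypothetical key name.
            yield {**event, "segmentId": result.segment_id}
```

Once flattened, events from different segments can be interleaved in time order without losing track of where each one came from.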
[GitHub] justinborromeo commented on issue #6088: Scan query: time-ordering
justinborromeo commented on issue #6088: Scan query: time-ordering URL: https://github.com/apache/incubator-druid/issues/6088#issuecomment-458773614 The way Scan is written now, the "scan query doesn't retain all rows in memory before rows can be returned to client" (http://druid.io/docs/latest/querying/scan-query.html) and the design of the code reflects this. AFAIK, a time-ordering feature would require the broker to store all the returned rows, either in memory or on disk, before it could start returning rows in sorted order. It seems that this would violate the main principle behind Scan. Is this time-ordering feature appropriate for Scan, or am I misunderstanding something? Alternatively, is there a way to do the time-ordering in a streaming manner?
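One possible answer to the streaming question, sketched under an assumption this thread does not confirm: that each segment can emit its own rows already sorted by timestamp. In that case a k-way merge only needs to hold one pending row per segment rather than the full result set. The `merge_time_ordered` function and the `(timestamp, event)` tuple shape are hypothetical and are not Druid APIs.

```python
import heapq
from typing import Any, Dict, Iterable, Iterator, Tuple

# Hypothetical row shape: (timestamp in millis, event fields).
Row = Tuple[int, Dict[str, Any]]


def merge_time_ordered(segment_streams: Iterable[Iterable[Row]]) -> Iterator[Row]:
    """Lazily merge per-segment streams that are each already sorted by timestamp.

    heapq.merge pulls rows on demand and keeps only one pending row per
    input stream, so the merged output can be returned without
    materialising every row first.
    """
    return heapq.merge(*segment_streams, key=lambda row: row[0])


if __name__ == "__main__":
    seg_a = iter([(1, {"page": "a"}), (4, {"page": "b"})])
    seg_b = iter([(2, {"page": "c"}), (3, {"page": "d"})])
    for timestamp, event in merge_time_ordered([seg_a, seg_b]):
        print(timestamp, event)
```

If segments cannot guarantee sorted output, this approach does not apply, and the buffering concern raised in the comment stands.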
[GitHub] justinborromeo commented on issue #6088: Scan query: time-ordering
justinborromeo commented on issue #6088: Scan query: time-ordering URL: https://github.com/apache/incubator-druid/issues/6088#issuecomment-458658358 Would it make sense to have both a timeOrder parameter _and_ the descending parameter in BaseQuery?
[GitHub] justinborromeo commented on issue #6088: Scan query: time-ordering
justinborromeo commented on issue #6088: Scan query: time-ordering URL: https://github.com/apache/incubator-druid/issues/6088#issuecomment-458384495 @gianm How would this look from an interface point of view? Would we be adding a new JSON field, or is there an existing field a user could use to request time-ordered results?