clintropolis commented on code in PR #18262:
URL: https://github.com/apache/druid/pull/18262#discussion_r2221488866


##########
processing/src/main/java/org/apache/druid/data/input/impl/AggregateProjectionSpec.java:
##########
@@ -175,8 +174,8 @@ public String toString()
   private static ProjectionOrdering computeOrdering(VirtualColumns 
virtualColumns, List<DimensionSchema> groupingColumns)
   {
     if (groupingColumns.isEmpty()) {
-      // call it time ordered, there is no grouping columns so there is only 1 
row for this projection
-      return new ProjectionOrdering(Cursors.ascendingTimeOrder(), null);
+      // no ordering since there is only 1 row for this projection
+      return new ProjectionOrdering(List.of(), null);

Review Comment:
   >ah do you mean that if a query is bucket at DAY(__time), and segment is 
bucket at HOUR(__time), then the OrderBy in projection doesn't matter since 
it'd be grouped anyway?
   
   Ish, yea, the projection transformation of the __time column must be finer 
than or equal to the query transformation of the time_column for the projection 
to be able to compute the same query results, so a time_floor(__time, 'P1D') 
would group all of the hourly segment times into the same day (even if the 
projection has no __time transformation because all of the rows would still 
have a timestamp of the beginning of the hour or whatever).
   
   >I'm thinking that projection should inherit the granularity from segment, 
which it seems like it doesn't have that info right now? 
   
   A projection doesn't have an explicit granularity, which was a deliberate 
design decision. It either has a precomputed virtual column with a time_floor 
expression on __time (or even just __time directly from the base table) which 
is stored as a physical column in the segment, or it doesn't. The latter case 
as far as the engines are concerned still has the __time column due to the 
constraints of the segment interval, it just isn't a physical column and 
instead is a constant of min timestamp. The projection schema stored in the 
metadata defines which physical columns are present in the segment for a given 
projection, so a projection itself shouldn't really inherit a segment 
granularity directly if that is what you mean. Instead, the match code should 
consider the interval of the segment to determine if the projection __time 
transformation matter or not (probably updating `PhysicalColumnChecker` to add 
a method to get the data interval? though i havent thought too hard about it 
yet)
   
   In general, we have been downplaying the concept of granularity since it is 
basically an oddly specific way to define a grouping on time for a query, and 
the query engine implementations of it only work when a cursor is time ordered 
(we have some plans to transition this to virtual columns too but haven't got 
around to it yet).
   
   > it should also work with DESC __time too if the query bucket is larger?
   
   the descending thing isn't really directly related to __time matching, and 
would handled in a different place in code, i think by just dropping the 
direction from consideration. A query needing a descending order on __time 
should work if the base table or a projection is ordered by __time in either 
direction since the cursor can be done ascending or descending order (vector 
cursor is not yet, but that isn't this codes concern really). The other 
constraints of projection __time matching must also be true (such as finer than 
or equal to __time transformation), but the code that is done via the virtual 
column comparison because the ordering isn't part of the time_floor expression 
itself. While we don't have any implementations of it yet, the same would also 
be true if we had engines that could take advantage of ordering on things other 
than __time, if we need `order by x desc` then as long as the table is `order 
by x` in any direction we could use it for the query



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to