richardstartin commented on issue #8800:
URL: https://github.com/apache/pinot/issues/8800#issuecomment-1145362843
The inverted-index group-by algorithm is very close to how high-dimensional
cubes could work:
1. Pre-aggregate measures data to reduce cardinality
2. Split dimensions into groups of 3, build bitmap indexes over aggregates
by all 1-, 2-, 3-tuples
3. To group by a dimension
   1. When there is no filter, iterate over the bitmaps of the dimension,
      evaluate count or apply an aggregation function to selected aggregates
   2. When there is an equality filter on value x of another dimension (or
      values x and y of 2 dimensions respectively) within the same group as
      the grouping dimension, select all tuples (*, x) (or (*, x, y)) from
      the group, apply the reduction for each bitmap
   3. When the filter is in another group, iterate the bitmaps in the
      grouping dimension but intersect with the filter bitmap on the fly,
      apply the reduction for each nonempty result bitmap
4. To group by multiple dimensions in the same group, iterate over the
bitmaps for indexed tuples (*, *), apply filters as above, apply reduction for
nonempty bitmaps
5. Grouping by multiple dimensions in different groups requires on-the-fly
intersections over the cross product of the dimensions’ members, then a
reduction for each nonempty combination bitmap. The groups can be tuned based
on query patterns to avoid this.
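
The steps above can be sketched roughly as follows. This is only an illustration, not Pinot code: the sample rows, the names `pre_aggregate`, `build_index` and `group_by`, and the use of plain Python sets as stand-ins for compressed bitmaps are all assumptions made for the example. It covers a single group of 3 dimensions (steps 1 to 4); a filter from another group is modelled as an externally supplied bitmap, and grouping across groups (step 5) is not shown.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical raw rows: (country, device, browser, clicks).
RAW = [
    ("US", "phone", "chrome", 3),
    ("US", "phone", "chrome", 2),
    ("US", "desktop", "firefox", 5),
    ("DE", "phone", "chrome", 1),
    ("DE", "desktop", "chrome", 4),
    ("DE", "desktop", "chrome", 6),
]


def pre_aggregate(rows):
    """Step 1: collapse rows that share all dimension values."""
    totals = defaultdict(lambda: [0, 0])  # dimension key -> [count, sum]
    for *key, measure in rows:
        cell = totals[tuple(key)]
        cell[0] += 1
        cell[1] += measure
    return [(key, count, total) for key, (count, total) in totals.items()]


def build_index(aggregates, n_dims=3):
    """Step 2: one bitmap (here a set of aggregate ids) per 1-, 2-, 3-tuple."""
    index = defaultdict(set)
    for agg_id, (key, _, _) in enumerate(aggregates):
        for size in range(1, n_dims + 1):
            for positions in combinations(range(n_dims), size):
                values = tuple(key[p] for p in positions)
                index[(positions, values)].add(agg_id)
    return index


def group_by(index, aggregates, dims, filters=None, external=None):
    """Steps 3 and 4: sum the measure per group within one dimension group.

    dims:     positions of the grouping dimensions.
    filters:  {position: value} equality filters in the same group (3.2).
    external: bitmap of a filter from another group, intersected on the
              fly (3.3).
    """
    filters = filters or {}
    positions = tuple(sorted({*dims, *filters}))
    result = {}
    # A linear scan keeps the sketch short; a real index would seek directly
    # to the tuples matching the filter values.
    for (pos, values), bitmap in index.items():
        if pos != positions:
            continue
        bound = dict(zip(pos, values))
        if any(bound[p] != v for p, v in filters.items()):
            continue
        if external is not None:
            bitmap = bitmap & external
        if bitmap:  # reduce only nonempty bitmaps
            group = tuple(bound[d] for d in dims)
            result[group] = sum(aggregates[i][2] for i in bitmap)
    return result


aggregates = pre_aggregate(RAW)
index = build_index(aggregates)

by_country = group_by(index, aggregates, (0,))
by_country_chrome = group_by(index, aggregates, (0,), filters={2: "chrome"})
by_country_device = group_by(index, aggregates, (0, 1))
```

Because the bitmaps are built over the pre-aggregated rows, each reduction touches at most one entry per distinct dimension combination rather than one per raw row.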
This can all be done without pre-aggregation but the approach doesn’t work
well with high cardinalities or high row counts (especially for applying
aggregation functions other than count to nonempty groups).
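
To make the row-count point concrete, here is a small sketch under made-up assumptions (synthetic data, two dimensions, hypothetical helper names, Python sets standing in for bitmaps). It computes the same grouped sum from bitmaps over raw rows and from bitmaps over pre-aggregated rows, and counts how many records each variant has to visit.

```python
from collections import defaultdict

# Hypothetical data: 10,000 raw rows over two low-cardinality dimensions.
rows = [(i % 4, i % 6, float(i % 7)) for i in range(10_000)]


def bitmaps_by_first_dim(records):
    """One bitmap (a set of record ids) per value of the first dimension."""
    bitmaps = defaultdict(set)
    for record_id, record in enumerate(records):
        bitmaps[record[0]].add(record_id)
    return bitmaps


def grouped_sum(records, bitmaps):
    """Sum the last field per group and count how many records were visited."""
    visited = sum(len(b) for b in bitmaps.values())
    sums = {k: sum(records[i][-1] for i in b) for k, b in bitmaps.items()}
    return sums, visited


# Without pre-aggregation: every set bit is a raw row.
raw_sums, raw_visited = grouped_sum(rows, bitmaps_by_first_dim(rows))

# With pre-aggregation: rows sharing both dimension values collapse into one.
totals = defaultdict(float)
for a, b, measure in rows:
    totals[(a, b)] += measure
aggregates = [(a, b, total) for (a, b), total in totals.items()]
agg_sums, agg_visited = grouped_sum(aggregates, bitmaps_by_first_dim(aggregates))

print(raw_visited, agg_visited)  # 10000 12
```

Both variants produce the same sums, but the raw variant visits every row while the pre-aggregated one visits one record per distinct dimension combination. A count alone could be read off the bitmap cardinality, which is why the gap matters most for other aggregation functions.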
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]