gortiz commented on issue #18740: URL: https://github.com/apache/pinot/issues/18740#issuecomment-4780738502
> Do you know what the potential impact of this is? What rules currently use cardinality? If I disable stats collection, do I revert back to the old behaviour or are all of my stats permanently updated? Currently, there is no rule that uses cardinality, which is what https://github.com/apache/pinot/pull/18741 is adding. If it were merged and the stat collection was disabled, the CBO wouldn't be able to infer anything, and it should do nothing. > What's the difference between this and join re-ordering? Reordering means how to order 3 or more joins, while _this_ means just swapping the order of a single join. They are different Calcite rules. > Will the column stats from phase 2 be useful in Calcite? Yes. They are an optimization on the rules we have in https://github.com/apache/pinot/pull/18741. We can provide better estimations knowing that data is split in segments and segments have their own max and min. For example, imagine we have the following segments and a query with `where col = 6` 1. seg1: min = 0, max = 10, ndv = 10, count = X 2. seg2: min = 5, max = 10, ndv = 5, count = 10 * X 3. seg3: min = 7, max = 15, ndv = 7, count = 100 * X ``` Total rows = X + 10X + 100X = 111X. ┌──────┬────────┬─────┬───────┬────────────────┬────────────────────────────────┐ │ seg │ range │ ndv │ count │ can contain 6? │ est. matching rows (count/ndv) │ ├──────┼────────┼─────┼───────┼────────────────┼────────────────────────────────┤ │ seg1 │ [0,10] │ 10 │ X │ yes │ X/10 = 0.1X │ ├──────┼────────┼─────┼───────┼────────────────┼────────────────────────────────┤ │ seg2 │ [5,10] │ 5 │ 10X │ yes │ 10X/5 = 2X │ ├──────┼────────┼─────┼───────┼────────────────┼────────────────────────────────┤ │ seg3 │ [7,15] │ 7 │ 100X │ no (6 < 7) │ pruned → 0 │ └──────┴────────┴─────┴───────┴────────────────┴────────────────────────────────┘ ``` - Per-segment: ≈ 2.1X matching → selectivity ≈ 1.9% - What we have in Phase 1: merged ndv = MAX(ndv) = 10 → 1/ndv = 0.1 → 0.1 × 111X = 11.1X → 10% >> Per-segment stats collected from ZooKeeper metadata the broker already watches (row count, size, time boundaries — effectively free), > > how accurate are these? how often are they updated? They are perfectly accurate, at least for committed realtime segments and offline segments. Given they are immutable, once ZK knows about the new segment, the broker is notified and changes its stats. The counts, max, min, etc are already stored per segment. > I'd love a more detailed document describing how all the components fit together and more details. e.g. what will Calcite do and what will we have to do? With current AI usage, it is easier for me to create the code, run it and verify how it works (or how it could be improved) than to document everything from scratch. What I can do is create the PRs and then formalize a broader overview document for discussion. > Postgres has the extension pg_hint_plan that makes debugging / developing CBO much easier. It can be used to force join order and then observe the cost, override bad planner decisions, compare different options. I don't see any kind of knobs like that mentioned in the plan. Questions I might have as a user of join reordering: > a. why did the planner choose this ordering and how close are we to choosing a different ordering? (In PG I can answer this by trying out different join orderings with the Leading hint) > b. how can I override what the planner chose? That is a good feature and something we can add. I don't think we should aim for the perfect CBO with all the possible features we can imagine in the first version, but that is something that could probably be added. Right now in https://github.com/apache/pinot/pull/18741 any join hint will prevent the optiomization, so the order would be the same as the one we were using right now (basically, the one decided by the parser). Remember we can apply _a lot_ of optimizations with CBO, not just limited to join ordering. https://github.com/apache/pinot/pull/18741 adds join ordering because it is the most simple one to understand and therefore can be used as an example for more optimizations -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
