somandal opened a new pull request, #8738:
URL: https://github.com/apache/pinot/pull/8738
A vast majority of the code changes are in the tests.
The existing behavior for EXPLAIN PLAN has the following limitations:
- The plan query is sent to only one random server with a random segment.
  This can cause the following issues:
  - The segment may get pruned on the server side
  - The segment may produce an Empty filter during operator tree creation
  - The segment may produce a Match All filter during operator tree creation
  - For AND and OR operators, one or both predicate subtrees may degenerate
    into an Empty or Match All filter, so the AND or OR subtree may be
    collapsed into a leaf-level predicate. This makes the explain plan output
    confusing to users.
- The overall approach is not a good representation of the distributed query
  planning for each segment
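To illustrate why a single segment's plan can be misleading, here is a minimal Python sketch of how an AND/OR filter tree collapses once a child degenerates into an Empty or Match All filter. It is an illustration only, not Pinot's implementation: the names `Leaf`, `Node`, `simplify`, `EMPTY`, and `MATCH_ALL` are hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple, Union

# Hypothetical markers for the two degenerate filters described above.
EMPTY = "EMPTY"          # matches no documents in the segment
MATCH_ALL = "MATCH_ALL"  # matches every document in the segment


@dataclass(frozen=True)
class Leaf:
    name: str  # a leaf-level predicate, e.g. a full scan on one column


@dataclass(frozen=True)
class Node:
    op: str  # "AND" or "OR"
    children: Tuple["Filter", ...]


Filter = Union[str, Leaf, Node]


def simplify(f):
    """Collapse AND/OR nodes whose children are Empty or Match All."""
    if not isinstance(f, Node):
        return f
    kids = [simplify(c) for c in f.children]
    # AND: Empty absorbs everything, Match All is a no-op. OR is the mirror.
    if f.op == "AND":
        absorbing, identity = EMPTY, MATCH_ALL
    else:
        absorbing, identity = MATCH_ALL, EMPTY
    if absorbing in kids:
        return absorbing
    kids = [k for k in kids if k != identity]
    if not kids:
        return identity
    if len(kids) == 1:
        return kids[0]  # the AND/OR node disappears, leaving a leaf predicate
    return Node(f.op, tuple(kids))
```

In this sketch, an AND whose first child has become Match All for a given segment reduces to its second child alone, which is how one segment can report a bare leaf predicate while another reports the full AND.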
This PR addresses the above limitations with the following changes:
- Send the plan query to all segments on all servers (the ones chosen by the
  Broker after broker side pruning)
  - Each server returns a set of deduplicated plans along with the number of
    segments matching each plan
  - Each server returns the following data as part of the metadata:
    - Number of segments pruned by the server side
    - Number of segments with an empty filter tree
    - Number of segments with a match all filter tree
  - On the Broker side, the sets of plans returned by the servers are
    deduplicated again, and the number of segments matching each unique plan
    is updated accordingly
  - Each broker returns the number of segments pruned by the broker side
    as part of the BrokerResponse
- Adds a verbose explain plan option which returns all of the
  deduplicated plans.
  - If verbose is disabled (the default), a single explain plan is returned:
    the one with the deepest plan tree among the deduplicated plans. This is a
    better approximation than the current explain plan functionality due to
    deduplication across servers and segments.
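The two-level deduplication described above can be sketched roughly as follows. This is a simplified Python illustration, not the code in this PR: a plan is modeled as a hashable tuple of operator names, and `dedupe_on_server` and `merge_on_broker` are hypothetical names.

```python
from collections import Counter
from typing import Iterable, Tuple

# Assumption for this sketch: a plan is a hashable tuple of operator strings.
Plan = Tuple[str, ...]


def dedupe_on_server(segment_plans: Iterable[Plan]) -> Counter:
    """Server side: one entry per unique plan, with its segment count."""
    return Counter(segment_plans)


def merge_on_broker(server_results: Iterable[Counter]) -> Counter:
    """Broker side: deduplicate again and add up the per-plan segment counts."""
    merged = Counter()
    for result in server_results:
        merged.update(result)  # Counter.update adds counts for existing keys
    return merged
```

For example, if one server reports a plan for two segments and another server reports the same plan for one segment, the merged result holds that plan once with a count of three.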
Example query:
```
EXPLAIN PLAN FOR SELECT invertedIndexCol1, noIndexCol1 FROM testTable WHERE
startsWith (textIndexCol1, 'daff') AND noIndexCol4
```
Here's an example of the Explain Plan output with verbose mode enabled:
```
BROKER_REDUCE
COMBINE_SELECT
PLAN_START(numSegmentsForThisPlan:3)
SELECT(selectList:invertedIndexCol1, noIndexCol1)
TRANSFORM_PASSTHROUGH(invertedIndexCol1, noIndexCol1)
PROJECT(invertedIndexCol1, noIndexCol1)
DOC_ID_SET
FILTER_AND
FILTER_FULL_SCAN(operator:EQ,predicate:noIndexCol4 = 'true')
FILTER_EXPRESSION(operator:EQ,predicate:startswith(textIndexCol1,'daff') = 'true')
PLAN_START(numSegmentsForThisPlan:1)
SELECT(selectList:invertedIndexCol1, noIndexCol1)
TRANSFORM_PASSTHROUGH(invertedIndexCol1, noIndexCol1)
PROJECT(invertedIndexCol1, noIndexCol1)
DOC_ID_SET
FILTER_EXPRESSION(operator:EQ,predicate:startswith(textIndexCol1,'daff') = 'true')
```
With verbose mode disabled, only the first plan will be selected as it has
the deepest tree:
```
BROKER_REDUCE
COMBINE_SELECT
PLAN_START(numSegmentsForThisPlan:3)
SELECT(selectList:invertedIndexCol1, noIndexCol1)
TRANSFORM_PASSTHROUGH(invertedIndexCol1, noIndexCol1)
PROJECT(invertedIndexCol1, noIndexCol1)
DOC_ID_SET
FILTER_AND
FILTER_FULL_SCAN(operator:EQ,predicate:noIndexCol4 = 'true')
FILTER_EXPRESSION(operator:EQ,predicate:startswith(textIndexCol1,'daff') = 'true')
```
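The non-verbose selection can be pictured with the small Python sketch below. It assumes each plan is a nested `(operator, children)` tuple and that ties are broken by iteration order; the PR description does not specify tie-breaking, and `depth` and `pick_plan` are hypothetical names rather than Pinot functions.

```python
from typing import Dict, Tuple

# Assumption for this sketch: a plan node is (operator_name, tuple_of_children).
PlanNode = Tuple[str, tuple]


def depth(node: PlanNode) -> int:
    """Number of operators on the longest root-to-leaf path."""
    _, children = node
    return 1 + max((depth(child) for child in children), default=0)


def pick_plan(plans: Dict[PlanNode, int]) -> Tuple[PlanNode, int]:
    """Return the (plan, segment_count) pair whose tree is deepest.

    On a tie, max() keeps the first plan encountered, which is an
    assumption of this sketch and not documented behavior.
    """
    return max(plans.items(), key=lambda item: depth(item[0]))
```

Applied to the two plans in the verbose output, the plan containing `FILTER_AND` has one more level than the plan with only `FILTER_EXPRESSION`, so it is the one returned.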
cc @siddharthteotia