cryptoe commented on PR #14574:
URL: https://github.com/apache/druid/pull/14574#issuecomment-1632352786
@LakshSingla Thanks for the review.
> Instead of the controller determining that no boundary received means an
> empty partition, I think it would be more appropriate if this logic is built
> into the worker. If the output row is none, it would report
> ClusterByStatisticsSnapshot.empty(), then it should report the empty partition.
> Seems cleaner to me because, in that way, we are verifying that the output
> rows are 0 on the worker, WDYT
I thought about this. The trickiness lies in how we calculate
`CompleteKeyStatisticsInformation`. If a worker does not have any rows, it
might return an empty snapshot, but in sequential merge we still need a
time bucket <-> worker mapping to transition the worker to the next stage.
We would end up maintaining a custom "worker map" in the complete key
statistics information for such snapshots, which is no better than what I am
doing. Hence I decided against that approach.
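To illustrate the problem, here is a minimal sketch. All names in it
(`TimeBucketTracker`, `reportSnapshot`, and so on) are hypothetical and are not
the actual Druid classes; it only assumes that the bookkeeping is keyed by time
bucket:

```java
import java.util.Collection;
import java.util.Set;
import java.util.SortedMap;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical sketch, not Druid code: worker bookkeeping keyed by time bucket.
public class TimeBucketTracker {
    // time bucket -> workers that reported keys in that bucket
    private final SortedMap<Long, Set<Integer>> bucketToWorkers = new TreeMap<>();
    // every worker that has reported a snapshot, empty or not
    private final Set<Integer> reportedWorkers = new TreeSet<>();

    // A worker reports the time buckets present in its snapshot.
    // An empty collection models an empty snapshot.
    public void reportSnapshot(int worker, Collection<Long> timeBuckets) {
        reportedWorkers.add(worker);
        for (long bucket : timeBuckets) {
            bucketToWorkers.computeIfAbsent(bucket, k -> new TreeSet<>()).add(worker);
        }
    }

    // Workers reachable through the time bucket map.
    public Set<Integer> workersWithBuckets() {
        Set<Integer> result = new TreeSet<>();
        bucketToWorkers.values().forEach(result::addAll);
        return result;
    }

    // Workers that reported but appear under no time bucket.
    // These are the ones that would need a separate "worker map".
    public Set<Integer> workersWithoutBuckets() {
        Set<Integer> result = new TreeSet<>(reportedWorkers);
        result.removeAll(workersWithBuckets());
        return result;
    }
}
```

In this sketch, a worker that reports an empty snapshot never shows up under
any time bucket, so it has to be tracked through a second structure
(`reportedWorkers` here) before it can be moved to the next stage.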
> Can we add a test in MSQSelectTests that verify that this is working as
> expected?
Since we do not gather stats in SELECT queries, I cannot do this via
MSQSelectTests.
For MSQInsertTest, the test case framework currently supports only one
worker, so I cannot simulate a test case where one worker has data and the
other does not without running into the "insert cannot be empty" error.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]