zachjsh commented on PR #15035:
URL: https://github.com/apache/druid/pull/15035#issuecomment-1737809462
> I am not entirely sure we need a config here either. Irrespective of what we name the config, it is going to cause confusion with the existing `timeoutMs`, especially since the `SamplerConfig` is used with batch too where this config would not be meaningful.
>
> I think we could just have a **fixed** maximum number of empty polls. When we poll the topic for records, and say we don't get anything for say 10 tries, we finish the sampling.
>
> Since this is only used for sampling, there is no real obligation to keep waiting until we have read the requested number of records. Compare this to batch where if there is no file at the specified path, we just fail or return empty and don't wait for data to show up.
>
> @zachjsh , @abhishekagarwal87 , what do you think?

Thanks @kfaraz. Just to be clear, this new config is not strictly needed for things to work properly, and most users will not have to touch it. However, using it enables an improved UX.

About your idea described above: the poll timeout is hardcoded to 100 milliseconds, so a fixed limit of 10 empty polls would end sampling after about 1 second without data. There could be cases where a user wants to allow longer than 1 second for the next record to be read.

As for confusion with `timeoutMs`, let me know if there is something I can add that helps differentiate them.
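
To make the trade-off concrete, here is a minimal sketch of the "maximum number of empty polls" idea. It is not Druid's sampler code: `EmptyPollSampler`, `sample`, `maxIdleWaitMs`, and the `poll` supplier are hypothetical names used only for illustration, and counting *consecutive* empty polls (resetting whenever data arrives) is an assumption, since the proposal does not say whether the count is consecutive or total.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

/** Hypothetical sketch of empty-poll-limited sampling; not Druid's actual code. */
final class EmptyPollSampler
{
  /**
   * Collects up to numRows records, giving up after maxEmptyPolls consecutive
   * polls that return nothing. The poll supplier stands in for a consumer poll
   * call that blocks for at most the fixed poll timeout (100 ms in the discussion).
   */
  static List<String> sample(Supplier<List<String>> poll, int numRows, int maxEmptyPolls)
  {
    List<String> rows = new ArrayList<>();
    int emptyPolls = 0;
    while (rows.size() < numRows && emptyPolls < maxEmptyPolls) {
      List<String> batch = poll.get();
      if (batch.isEmpty()) {
        emptyPolls++;
        continue;
      }
      // Assumption: the counter resets whenever any data arrives.
      emptyPolls = 0;
      for (String record : batch) {
        if (rows.size() >= numRows) {
          break;
        }
        rows.add(record);
      }
    }
    return rows;
  }

  /** Longest time spent waiting without data before sampling stops. */
  static long maxIdleWaitMs(long pollTimeoutMs, int maxEmptyPolls)
  {
    return pollTimeoutMs * maxEmptyPolls;
  }
}
```

With a 100 ms poll timeout, `maxIdleWaitMs(100, 10)` gives 1000 ms, which is the 1 second ceiling mentioned above. A fixed limit hardcodes that ceiling, whereas a config would let a user raise it for slow topics.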
