MartijnVisser commented on pull request #18151: URL: https://github.com/apache/flink/pull/18151#issuecomment-997963353
I am wondering whether LIMIT is the best option for the Datagen connector, given its special circumstances: it generates random rows every time. I think LIMIT implies that a user wants to limit the available (or, in this case, generated) results. If I apply a LIMIT in combination with a specific reading position for Kafka, I get the same results over and over; that's not the case for Datagen, because it generates random values.

I'd rather keep LIMIT consistent across all connectors and retain the current `number-of-rows` option to switch the connector from unbounded to bounded, so a user doesn't have to read the documentation to learn that LIMIT behaves differently for Datagen than for other connectors.
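For reference, this is roughly what the current approach looks like (a sketch; the table and column names are hypothetical, but `number-of-rows` is the existing Datagen option being discussed):

```sql
-- Hypothetical bounded datagen table: 'number-of-rows' makes the
-- otherwise unbounded random source stop after N generated rows.
CREATE TABLE orders_sample (
  order_id BIGINT,
  amount DOUBLE
) WITH (
  'connector' = 'datagen',
  'number-of-rows' = '10'
);

-- Each run still produces different random values, unlike re-reading
-- a fixed Kafka offset range, which is why LIMIT semantics would not
-- be reproducible here.
SELECT * FROM orders_sample;
```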
