metesynnada commented on PR #8021: URL: https://github.com/apache/arrow-datafusion/pull/8021#issuecomment-1791104489
Thanks @tustvold, I think the proof of concept looks good. My suggestion: instead of creating a new `TableProvider` for FIFO, we could use a `StreamingTableProvider` together with its factory. The correct factory could be selected via the `UNBOUNDED` keyword in SQL (this could be changed to something like `STREAM`). Additionally, we could leverage a new `StreamSinkExec` and suitable `DataSink` implementations to handle different use cases such as FIFO or Kafka. I think this approach could scale well: FIFO comes in different formats, and a potential Kafka sink may likewise implement different kinds of `BatchSerializer`s. I have had this in mind for a while but haven't found the time to code it. Does this seem like a reasonable plan?
