SemyonSinchenko commented on issue #957: URL: https://github.com/apache/datafusion-comet/issues/957#issuecomment-2450508099
@andygrove This task is proving very hard for me, and I need more guidance.

To avoid handling all the Python-related configuration in Comet, and to reuse as much as possible from Spark, I need to create a class similar to `BaseArrowPythonRunner` that takes a `CometExecIterator` instead of an `Iterator[InternalRow]`:

That requires me to implement a trait similar to `BasicPythonArrowInput`, which implements `PythonArrowInput`. It looks like I need to override only this method:

I also need to use `ArrowStreamWriter`, but I could not figure out how to do that with a `CometExecIterator` without copying the data. It seems to me that this functionality should already be implemented in `org.apache.comet.vector.NativeUtils`, but I failed to find it.

Can you please point me in the right direction? Or is there a part of Comet's code that I could use as inspiration? There is also a good chance that I am simply heading in the wrong direction.

Thanks in advance!

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
