andygrove commented on issue #1035: URL: https://github.com/apache/datafusion-comet/issues/1035#issuecomment-2435505709
I do think that some valid concerns have been raised here. The questions at the top of my mind right now are:

1. What additional tests could we add that would have caught this issue? Even if `take` was meeting the original assumptions today, it could have changed in the future, so how do we ensure that our use of the unsafe CometBuffer is safe and that we don't have regressions if the behavior in arrow-rs changes in the future?
2. What would an integration with the arrow-rs / datafusion Parquet reader/exec look like? We need to integrate with Spark to read the raw bytes from Spark data sources (could be HDFS, S3, and many others, including custom readers) and would then need to hand those bytes off to native code for decoding.

I would certainly like to be able to take advantage of the new Utf8View support.
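On question 1, one possible shape for such a regression test is sketched below. It is only an illustration built on the Rust standard library: `Column`, `take_copying`, `take_zero_copy`, and `output_survives_reuse` are hypothetical stand-ins, not Comet or arrow-rs APIs, and the zero-copy fast path is an assumed behavior chosen to show what the test would catch. The idea is to run a kernel, overwrite the input buffer the way a reader might when it reuses the buffer for the next batch, and then check that the kernel's output is still intact.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Hypothetical stand-in for an array backed by a reusable, mutable buffer.
#[derive(Clone)]
struct Column {
    data: Rc<RefCell<Vec<i32>>>,
}

impl Column {
    fn new(values: Vec<i32>) -> Self {
        Column { data: Rc::new(RefCell::new(values)) }
    }

    /// Snapshot of the current buffer contents.
    fn values(&self) -> Vec<i32> {
        self.data.borrow().clone()
    }

    /// True if both columns point at the same underlying allocation.
    fn shares_buffer_with(&self, other: &Column) -> bool {
        Rc::ptr_eq(&self.data, &other.data)
    }
}

/// Hypothetical kernel that always materializes a fresh buffer.
fn take_copying(input: &Column, indices: &[usize]) -> Column {
    let src = input.values();
    let picked: Vec<i32> = indices.iter().map(|&i| src[i]).collect();
    Column::new(picked)
}

/// Hypothetical kernel with an assumed zero-copy fast path:
/// identity indices hand back the input buffer itself.
fn take_zero_copy(input: &Column, indices: &[usize]) -> Column {
    let is_identity = indices.len() == input.data.borrow().len()
        && indices.iter().enumerate().all(|(pos, &i)| pos == i);
    if is_identity {
        input.clone() // shares the allocation with the input
    } else {
        take_copying(input, indices)
    }
}

/// Returns true if the kernel output is still correct after the input
/// buffer has been overwritten, as it would be on buffer reuse.
fn output_survives_reuse(kernel: fn(&Column, &[usize]) -> Column, indices: &[usize]) -> bool {
    let input = Column::new(vec![10, 20, 30, 40]);
    let snapshot = input.values();
    let expected: Vec<i32> = indices.iter().map(|&i| snapshot[i]).collect();

    let output = kernel(&input, indices);

    // Simulate the reader reusing the buffer for the next batch.
    for v in input.data.borrow_mut().iter_mut() {
        *v = -1;
    }

    !output.shares_buffer_with(&input) && output.values() == expected
}

#[test]
fn copying_kernel_is_safe_under_reuse() {
    assert!(output_survives_reuse(take_copying, &[0, 1, 2, 3]));
}

#[test]
fn zero_copy_fast_path_is_detected() {
    // The check fails for the aliasing kernel, which is what a regression test should flag.
    assert!(!output_survives_reuse(take_zero_copy, &[0, 1, 2, 3]));
}
```

A real test would presumably do the same thing against actual kernels and the real buffer type, and cover each kernel that is fed data from a reused buffer, so that a change in copy behavior upstream shows up as a test failure rather than as corrupted results.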