alamb commented on issue #14078: URL: https://github.com/apache/datafusion/issues/14078#issuecomment-2810394103
> The tricky part to implement is array encoding like REE or bit-packing for integer arrays. Maybe we can find some reusable code in the Arrow Parquet writer implementation or use something like https://github.com/spiraldb/vortex. But it's okay to start without those encodings.

I think we might get pretty far by simply compressing the Arrow stream with zstd or snappy (which would be far simpler to implement, as the Arrow writer already supports this).
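
For readers unfamiliar with the REE encoding mentioned in the quote, here is a minimal sketch of run-end encoding for an integer array in plain Rust, with no Arrow dependency. The names `ree_encode` and `ree_decode` are hypothetical and this is not the Arrow, Parquet, or Vortex implementation; it only illustrates the idea, assuming run ends are stored as exclusive end indexes as in Arrow's run-end encoded layout.

```rust
/// Run-end encode a slice of integers.
/// Returns `(run_ends, values)`, where `run_ends[i]` is the exclusive end
/// index of the i-th run and `values[i]` is the value repeated in that run.
/// (Hypothetical helper for illustration only.)
fn ree_encode(input: &[i64]) -> (Vec<usize>, Vec<i64>) {
    let mut run_ends: Vec<usize> = Vec::new();
    let mut values: Vec<i64> = Vec::new();
    for (i, &v) in input.iter().enumerate() {
        if values.last() == Some(&v) {
            // Same value as the current run: move the run's end forward.
            *run_ends.last_mut().unwrap() = i + 1;
        } else {
            // New value: start a new run that ends right after this element.
            values.push(v);
            run_ends.push(i + 1);
        }
    }
    (run_ends, values)
}

/// Expand `(run_ends, values)` back into the original sequence.
/// (Hypothetical helper for illustration only.)
fn ree_decode(run_ends: &[usize], values: &[i64]) -> Vec<i64> {
    let mut out = Vec::with_capacity(run_ends.last().copied().unwrap_or(0));
    let mut start = 0;
    for (&end, &v) in run_ends.iter().zip(values) {
        // Each run covers the logical indexes start..end.
        out.extend(std::iter::repeat(v).take(end - start));
        start = end;
    }
    out
}
```

With this sketch, the input `[7, 7, 7, 3, 3, 9]` encodes to run ends `[3, 5, 6]` and values `[7, 3, 9]`. The encoding only pays off when the data has long runs, which is one reason a general-purpose compressor over the stream may be a reasonable first step.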