rok commented on issue #47254: URL: https://github.com/apache/arrow/issues/47254#issuecomment-3157915796
Hey @lnkuiper, thank you for the insight!

Parquet encryption implementations could do more integration testing, but some already exists. For example, encrypted tables from the [parquet-testing](https://github.com/apache/parquet-testing/tree/master/data) repo are read by [PyArrow](https://github.com/apache/arrow/blob/6ee27d48257ec6f997d900f8be572227bcdcaef5/cpp/src/parquet/encryption/read_configurations_test.cc#L287) and [arrow-rs](https://github.com/apache/arrow-rs/blob/5dd34630c742f3cf78f539245a6fbfdd92dde891/parquet/tests/encryption/encryption.rs#L185) tests. I imagine other implementations do something similar.

As @adamreeve points out, fixing compatibility later might be awkward if your users expect that you already have a by-the-spec implementation. In part that is because you might need to keep supporting both the current implementation and the spec-compliant one. But in larger part it is because a by-the-spec implementation implies you can move encrypted files between implementations without any additional work.

As an aside, a uniformly encrypted test file can be generated with this [C++ gist](https://gist.github.com/rok/8a68066c51801458a3746772a2c736c5).
