Hello,

On Mon, 27 May 2024 22:46:45 -0700
Micah Kornfield <emkornfi...@gmail.com> wrote:
>
> 2. Is anybody interested in looking more deeply into developing
> integration tests between the different Parquet implementations and major
> down-stream consumers of Parquet? I believe Apache arrow has a pretty good
> model [3][4] in a lot of respects with cross-language integration tests,
> and nightly (via crossbow) integration tests with other consumers, but
> there are a wide variety of things that would improve the current state.
> One other possible concern is the amount of CI resources this might
> consume, and if we will need contributions to fund it.

Caveat: Arrow has a lot fewer parameters to test for. The variability is
mostly one-dimensional and falls under the data type rubric. As a matter
of fact, other Arrow features such as compression or delta dictionaries
are less well-tested.

Testing Parquet interoperability could easily get into a combinatorial
explosion of optional features, encodings, etc. I'm not saying that it
shouldn't be done, but it may require a different approach than Arrow's,
which builds and tests all implementations against each other in a
single CI job.

Regards

Antoine.