Hello,

On Mon, 27 May 2024 22:46:45 -0700
Micah Kornfield <emkornfi...@gmail.com>
wrote:
> 
> 2.  Is anybody interested in looking more deeply into developing
> integration tests between the different Parquet implementations and major
> down-stream consumers of Parquet?  I believe Apache arrow has a pretty good
> model [3][4] in a lot of respects with cross-language integration tests,
> and nightly (via crossbow) integration tests with other consumers, but
> there are a wide variety of things that would improve the current state.
> One other possible concern is the amount of CI resources this might
> consume, and if we will need contributions to fund it.

Caveat: Arrow has far fewer parameters to test for. The variability is
mostly one-dimensional and falls under the data type rubric. As a
matter of fact, other Arrow features such as compression or delta
dictionaries are less well-tested.

Testing Parquet interoperability could easily get into a combinatorial
explosion of optional features, encodings, etc.
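
To give a rough sense of scale, here is a minimal sketch that counts
the cases in a naive full cross-product. The option lists are
illustrative and far from exhaustive, the implementation names are
hypothetical, and the count is only an upper bound since some
combinations are not valid (for example, delta encodings do not apply
to every physical type).

```python
from itertools import product

# Illustrative, non-exhaustive option lists (assumptions for this sketch,
# not a complete inventory of what Parquet allows).
DIMENSIONS = {
    "encoding": ["PLAIN", "RLE_DICTIONARY", "DELTA_BINARY_PACKED", "BYTE_STREAM_SPLIT"],
    "compression": ["UNCOMPRESSED", "SNAPPY", "GZIP", "ZSTD"],
    "data_page_version": ["V1", "V2"],
    "physical_type": ["INT32", "INT64", "DOUBLE", "BYTE_ARRAY"],
}

# Hypothetical implementation names, used only for counting.
IMPLEMENTATIONS = ["impl_a", "impl_b", "impl_c"]


def feature_matrix(dimensions):
    """Return every combination of the given options as a list of dicts."""
    keys = list(dimensions)
    return [dict(zip(keys, combo)) for combo in product(*dimensions.values())]


def count_test_cases(dimensions, implementations):
    """Count cases when every implementation writes each combination
    and every implementation (itself included) reads it back."""
    writer_reader_pairs = len(implementations) ** 2
    return len(feature_matrix(dimensions)) * writer_reader_pairs


cases = feature_matrix(DIMENSIONS)
total = count_test_cases(DIMENSIONS, IMPLEMENTATIONS)
print(f"{len(cases)} feature combinations, {total} writer/reader test cases")
```

Even with only four small dimensions and three implementations, the
count is already above a thousand, and each additional optional
feature multiplies it again.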

I'm not saying that it shouldn't be done, but it may require a different
approach than Arrow's approach of building and testing all
implementations against each other in a single CI job.

Regards

Antoine.

