The test TestByteBitPacking512VectorLE.unpackValuesUsingVectorBitWidth is flaky in the Parquet GitHub PR testing environment [1].
I gave the error to Codex (the OpenAI coding agent) and asked it to fix the test. However, since I don't have enough confidence in my own understanding of the problem or the fix, I have not opened a PR. The fix is on my fork here: <https://github.com/dossett/parquet-java/commit/7635c8599524aadee1164fc2168801c51390b118>

Codex's summary of the problem and the fix:

We addressed CI OOMs in TestByteBitPacking512VectorLE (parquet-encoding-vector) by bounding the test input size while keeping the same correctness coverage. The original getRangeData could allocate arrays on the order of hundreds of millions of ints per bit width, which can consume tens of GB of heap and fail in constrained CI environments. The updated test generates a single bounded dataset (min 64, max 2^20 values) that spans the full legal value range for each bit width (including the full signed int range for 32-bit). The vector and scalar pack/unpack paths are still compared for equality across bit widths, but without the unbounded memory stress that was causing the flakiness.

I would appreciate any feedback on that approach, or suggestions for other ways to address the flaky test; the flakiness was very frustrating recently when I had several PRs open.
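For concreteness, here is a minimal sketch of the kind of bounded data generation described above. It is illustrative only, not the code from the commit; the class and method names and the exact clamping logic are my assumptions.

import java.util.Random;

// Illustrative sketch only; the real change lives in the commit linked above.
public final class BoundedBitPackingData {

  private static final int MIN_VALUES = 64;       // lower bound on dataset size
  private static final int MAX_VALUES = 1 << 20;  // upper bound on dataset size (2^20)

  static int[] generateBoundedData(int bitWidth, long seed) {
    // Clamp the number of values so heap usage stays bounded in CI,
    // instead of growing with the size of the legal value range.
    int count = Math.min(MAX_VALUES, Math.max(MIN_VALUES, 1 << Math.min(bitWidth, 20)));
    int[] values = new int[count];
    Random random = new Random(seed);
    for (int i = 0; i < count; i++) {
      if (bitWidth == 0) {
        values[i] = 0;  // zero is the only legal value at bit width 0
      } else {
        // Keep only the low bitWidth bits; for bitWidth == 32 this is the
        // full signed int range.
        values[i] = random.nextInt() >>> (32 - bitWidth);
      }
    }
    if (bitWidth > 0 && bitWidth < 32) {
      // Always exercise the extremes of the legal range [0, 2^bitWidth - 1].
      values[0] = 0;
      values[count - 1] = (1 << bitWidth) - 1;
    }
    return values;
  }
}

The point is just that the dataset size is capped independently of the bit width, while the extremes of each bit width's range are still covered, so the vector/scalar comparison keeps its coverage without the multi-GB allocations.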
Cheers,
Aaron

[1] Example failure: https://github.com/apache/parquet-java/actions/runs/20671204311/job/59352228516?pr=3385

--
Aaron Niskode-Dossett, Data Engineering -- Etsy