k8ika0s commented on PR #48205: URL: https://github.com/apache/arrow/pull/48205#issuecomment-3568405152
@Vishwanatha-HD Working through this one, I’m reminded how many odd little corners show up when Arrow’s layout meets Parquet’s expectations — especially around levels, decimals, and the legacy INT96 bits. Looking at the pieces that overlap with the work I’ve been doing, the overall direction makes sense. A few notes from what I’ve seen on real s390x hardware: **• BIT_PACKED level headers** Your patch keeps the `data_size - 4` bound under `ARROW_LITTLE_ENDIAN`, whereas my tree leans on accepting the full BIT_PACKED buffer and logging failures rather than early-bounding. Neither approach is wrong, but on BE machines I’ve found that the “minus 4” guard sometimes rejects buffers that are actually fine, depending on how many values the upstream encoder produced. **• Decimal serialization** This is one of the trickier spots. Parquet expects decimals in a big-endian 128-bit payload, but Arrow materializes them in little-endian limbs even on BE hardware. In my implementation I reverse the Arrow words (`[low, high]` → `[high, low]`) before handing them to the writer so the final byte stream matches the canonical Parquet format. Your patch uses `ToBigEndian` on each limb directly in host order, which works for many cases but can produce a differently ordered representation when Arrow’s in-memory layout doesn’t match the 128-bit big-endian wire format. Just sharing that in case you’ve seen similar behavior when mixing different decimal widths. **• Half-floats in FLBA** The BE path you added with `ToLittleEndian(values[i])` aligns with the intent. I ended up staging the FLBA structs and the 2-byte payloads together in one scratch buffer, mostly because some downstream consumers treat the pointer lifetime very strictly. Either way, normalizing those 2-byte halves before page assembly helps avoid the cross-architecture drift I’ve run into. **• Paging / DoInBatches** Your rewrite to enforce `max_rows_per_page` is a meaningful cleanup. My patches didn’t touch this area, so no conflicts there — but just to mention it, keeping the paging logic predictable on BE made debugging the level stream quite a bit easier for me. **• INT96 (Impala timestamp)** Your implementation writes host-order limbs on BE and memcpy on LE. In my case I leaned heavily on always emitting LE limbs so the decode path doesn’t have to branch on architecture. Both approaches work as long as the corresponding reader expects the same convention. None of this is blocking — just trying to pass along the details I’ve seen crop up when running the full parquet-encode → parquet-decode cycle on big-endian hardware. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
