zeroshade commented on PR #854: URL: https://github.com/apache/arrow-go/pull/854#issuecomment-4754440862
Re-reviewed at `ae93a6ad`: all the actionable items from above are resolved and verified. In particular, the `vector_length` field id is now **11** (matching the `pitrou:vector-repetition` draft, alongside `VECTOR = 3`), the `12→11` change is consistent across the struct tag, read/write dispatch, IDL, and the round-trip test, and field 11 is genuinely free in parquet-format master. The draft's primitive-leaf shape matches this PR, so the earlier group-vs-leaf question is moot for now. `file` / `schema` / `thrift` and the full `pqarrow` suite (with `PARQUET_TEST_DATA`) pass; the gen file matches its template.

Remaining before this graduates from experimental (none blocking the proposal):

1. **Read-side guard for repetition-type-3.** Still documentation-only: a pre-VECTOR reader silently misreads a VECTOR column as a flat required column with the wrong row count. This wants a loud rejection / feature gate on read before VECTOR is non-experimental.
2. **Fallback round-trip tests.** Add write→read assertions that the ineligible cases (nullable, zero-length, nested, non-primitive-element `FixedSizeList`) transparently fall back to `LIST` and round-trip, plus multiple VECTOR columns, a VECTOR column mixed with normal columns, and an explicit row-count check across more than one row group.

Everything else (the DataPageV1 no-offset-index double scan, element-level statistics, and leaf-vs-group representation) is fine to leave as documented proposal decisions.
