zeroshade opened a new issue, #826:
URL: https://github.com/apache/arrow-go/issues/826

   ### Describe the bug, including details regarding any error messages, 
version, and platform.
   
   **Discovered while reviewing #824 (which fixes the related panic in 
`file.NewParquetWriter`).**
   
   Across all 8 generated `*ColumnChunkWriter` types in 
[`parquet/file/column_writer_types.gen.go`](https://github.com/apache/arrow-go/blob/main/parquet/file/column_writer_types.gen.go),
 `WriteBatch` and `WriteBatchSpaced` are an asymmetric pair:
   
   - `WriteBatch` returns `(valueOffset int64, err error)` and opens with a 
`defer recover()` that converts any panic into a returned `error` via 
`utils.FormatRecoveredError`.
   - `WriteBatchSpaced` returns nothing (`void` signature), has no `defer 
recover()`, and silently discards the `error` returned by 
`commitWriteAndCheckPageLimit`.
   
   This is the same class of "panic escapes the public API" bug as #820 (fixed 
in #824) but on a different code path.
   
   #### Code
   
   The asymmetry is most visible in `Int32ColumnChunkWriter` 
([`column_writer_types.gen.go:66-143`](https://github.com/apache/arrow-go/blob/main/parquet/file/column_writer_types.gen.go#L66-L143)):
   
   `WriteBatch` (line 66) — recovers panics, surfaces as `err`:
   
   ```go
   func (w *Int32ColumnChunkWriter) WriteBatch(values []int32, defLevels, 
repLevels []int16) (valueOffset int64, err error) {
       defer func() {
           if r := recover(); r != nil {
               err = utils.FormatRecoveredError("unknown error type", r)
           }
       }()
       ...
       w.doBatches(n, repLevels, func(offset, batch int64) {
           ...
           if err := w.commitWriteAndCheckPageLimit(batch, toWrite); err != nil 
{
               panic(err) // caught above
           }
           ...
       })
       return
   }
   ```
   
   `WriteBatchSpaced` (line 118) — no recover, no error return, silently 
swallows the `commitWriteAndCheckPageLimit` error:
   
   ```go
   func (w *Int32ColumnChunkWriter) WriteBatchSpaced(values []int32, defLevels, 
repLevels []int16, validBits []byte, validBitsOffset int64) {
       ...
       doBatches(int64(length), w.props.WriteBatchSize(), func(offset, batch 
int64) {
           ...
           w.commitWriteAndCheckPageLimit(batch, info.numSpaced()) // <-- error 
ignored
           ...
       })
   }
   ```
   
   `commitWriteAndCheckPageLimit` is defined in 
[`parquet/file/column_writer.go:264`](https://github.com/apache/arrow-go/blob/main/parquet/file/column_writer.go#L264)
 as `func (w *columnWriter) commitWriteAndCheckPageLimit(numLevels, numValues 
int64) error` — the return value is meaningful.
   
   The pattern is identical for the other 7 types — `Int64`, `Int96`, 
`Float32`, `Float64`, `Boolean`, `ByteArray`, `FixedLenByteArray`. The third 
method on each writer, `WriteDictIndices`, mirrors `WriteBatch`: it both 
returns `error` and uses `defer recover()`. So `WriteBatchSpaced` is the lone 
outlier.
   
   #### Why this matters in practice
   
   `WriteBatchSpaced` is reachable through public API surface — for example, 
`(*pqarrow.FileWriter).WriteBuffered` and the `pqarrow` array-encoder paths 
drive it for nullable columns. Any of the panic sources reachable from the 
inner closure (`maybeCalculateValidityBits`, `writeLevelsSpaced`, 
`writeValuesSpaced`, `commitWriteAndCheckPageLimit` returning a non-nil error 
in a context where the caller chose to panic, dictionary-fallback paths that 
`panic(err)` on bookkeeping failures) escapes the goroutine instead of 
surfacing as an `error`.
   
   Concrete failure modes already in the package:
   
   - `commitWriteAndCheckPageLimit` returns an error → ignored entirely (the 
simpler bug)
   - Any helper called from the closure that panics on its own internal 
invariant (e.g. dictionary fallback's `panic(err)` calls in `FallbackToPlain`, 
or the `panic(err)` paths inside 
`WriteDictionaryPage`/`drainBufferedDataPages`) → propagates out of 
`WriteBatchSpaced`, past any caller that expected `WriteBatch`-style error 
semantics, and crashes the goroutine.
   
   For consumers writing nullable columns through `pqarrow` to a 
network-attached sink (the same scenario described in #820 — GCS, S3 multipart, 
HDFS, etc.), this means a transient downstream failure on a spaced-write path 
crashes the worker process rather than returning an error.
   
   #### Source of truth
   
   Both methods are generated from 
[`parquet/file/column_writer_types.gen.go.tmpl`](https://github.com/apache/arrow-go/blob/main/parquet/file/column_writer_types.gen.go.tmpl):
   
   - `WriteBatch` template body: lines 71–172 (returns `(valueOffset int64, err 
error)`, has `defer recover()`).
   - `WriteBatchSpaced` template body: lines 189 onward (no error return, no 
`defer recover()`).
   
   A fix lands in one place and regenerates 8 method bodies plus the equivalent 
dictionary-write paths.
   
   #### Expected behavior
   
   `WriteBatchSpaced` should match `WriteBatch`:
   
   1. Signature `(valueOffset int64, err error)` (or at minimum `error` — 
matching `WriteBatch` is the cleanest API).
   2. `defer func() { if r := recover(); r != nil { err = 
utils.FormatRecoveredError("unknown error type", r) } }()` at the top.
   3. Check the `error` returned by `commitWriteAndCheckPageLimit` and 
`panic(err)` to let the deferred recover handle it (matching the `WriteBatch` 
pattern in the same file).
   
   Adding a return value to a public method is a breaking API change, so a 
minimal-breakage path is also viable:
   
   - Introduce a sibling `WriteBatchSpacedWithError(...) (int64, error)`
   - Keep `WriteBatchSpaced(...)` as a thin wrapper that calls the new method 
and panics on error with a documented panic-string format
   
   — exactly the pattern #824 chose for `NewParquetWriter` / 
`NewParquetWriterWithError`. That preserves source compatibility while giving 
callers a clean error path.
   
   The breaking-API option is also defensible if maintainers prefer fixing the 
surface properly given that `WriteBatchSpaced` returning nothing is itself a Go 
API smell (it cannot signal failure at all today).
   
   ### Component(s)
   
   Parquet
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to