zeroshade opened a new pull request, #824:
URL: https://github.com/apache/arrow-go/pull/824

   ### Rationale for this change
   
   Closes #820.
   
   `(*file.Writer).startFile()` panics with the literal string `"failed to 
write magic number"` when the underlying sink's first `Write` call returns an 
error or short-writes. Because `file.NewParquetWriter` calls `startFile` 
synchronously inside the constructor and has no `error` return, callers had no 
idiomatic Go recovery path. The higher-level `pqarrow.NewFileWriter` already 
returns `(*FileWriter, error)`, but the panic propagated through it unchanged, 
so its `error` return was never reached for this failure mode.
   
   In production this manifests when the sink is a network-attached writer 
(`*storage.Writer` from GCS, an S3 multipart upload, an HDFS client, etc.). A 
transient network blip on the first 4-byte magic-header write reliably crashes 
the worker process, orphans any in-flight upload session, and forces a 
container restart.
   
   ### What changes are included in this PR?
   
   1. **`(*Writer).startFile()` now returns `error`** instead of panicking on 
sink-write failures. The configuration-validation panic for "encrypted column 
not found in file schema" is preserved — that path is a programmer error, not 
an I/O failure.
   2. **New public constructor `file.NewParquetWriterWithError(w, sc, opts...) 
(*Writer, error)`** is the preferred entry point for callers whose sink may 
transiently fail.
   3. **`file.NewParquetWriter` is now a thin wrapper** over 
`NewParquetWriterWithError` that panics with the *exact same string* (`"failed 
to write magic number"`) on init failure. Any consumer with a `recover()` block 
that string-matches the panic value continues to work unchanged.
   4. **`pqarrow.NewFileWriter` switches to `NewParquetWriterWithError`** 
internally. Its public signature is unchanged; its `error` return now 
meaningfully covers transient sink failures, with no API change for any of its 
callers.
   
   ### Are these changes tested?
   
   Yes. Six new tests in `parquet/file/file_writer_test.go` exercise the new 
behavior using a `flakyMagicSink` test helper modeled on the existing 
`errCloseWriter` pattern in the same file:
   
   - `TestNewParquetWriterWithError_Success` — happy path on a `bytes.Buffer`.
   - `TestNewParquetWriterWithError_FirstWriteFails` — first `Write` returns 
`io.ErrUnexpectedEOF`; constructor returns a wrapped error and does not panic.
   - `TestNewParquetWriterWithError_FirstWriteShortWrites` — first `Write` 
returns `n=len(p)-1, nil`; constructor returns an error mentioning the short 
write.
   - `TestNewParquetWriter_PreservesPanicMessage` — the legacy constructor 
still panics, the panic value is still a `string`, and the message is still the 
literal `"failed to write magic number"` (back-compat guard).
   - `TestPqarrowNewFileWriter_PropagatesInitError` — `pqarrow.NewFileWriter` 
returns the wrapped sink error rather than panicking.
   - `TestPqarrowNewFileWriter_PropagatesShortWrite` — same for the short-write 
case.
   
   The existing `TestCloseError`, full `parquet/file/...` and 
`parquet/pqarrow/...` test suites all pass (with `PARQUET_TEST_DATA` pointed at 
`parquet-testing/data`).
   
   ### Are there any user-facing changes?
   
   **No breaking changes.**
   
   - `file.NewParquetWriter(w, sc, opts...) *Writer` — signature unchanged, 
still panics on first-write failure with the identical literal string `"failed 
to write magic number"`. Existing `recover()` workarounds keep working.
   - `file.NewParquetWriterWithError(w, sc, opts...) (*Writer, error)` — new 
exported symbol. Adding an exported function is non-breaking in Go.
   - `pqarrow.NewFileWriter(...) (*FileWriter, error)` — signature unchanged. 
Its `error` return now also covers the magic-header-write failure that 
previously escaped as a panic. Callers who already check `err` get strictly 
better behavior.
   
   The new constructor's docstring is the recommended path for users on 
cloud-storage upload sinks. The legacy constructor's docstring now documents 
the panic behavior the original issue called out as undocumented.
   
   ### Out of scope (separate follow-up)
   
   While investigating, I noticed `WriteBatchSpaced` in 
`parquet/file/column_writer_types.gen.go` lacks the `defer recover()` that 
`WriteBatch` has, so an I/O error during a spaced write panics out of the 
calling goroutine instead of returning an error. Same class of bug as #820 but 
a different code path; happy to file a separate issue.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to