laskoviymishka commented on code in PR #1241:
URL: https://github.com/apache/iceberg-go/pull/1241#discussion_r3449053878


##########
table/internal/parquet_files.go:
##########
@@ -312,11 +312,13 @@ func (p parquetFormat) WriteDataFile(ctx context.Context, 
fs iceio.WriteFileIO,
                return nil, err
        }
 
+       return writeDataFileBatches(w, batches)
+}
+
+func writeDataFileBatches(w FileWriter, batches []arrow.RecordBatch) 
(iceberg.DataFile, error) {
        for _, batch := range batches {
                if err := w.Write(batch); err != nil {
-                       w.Close()
-
-                       return nil, err
+                       return nil, errors.Join(err, w.Abort())

Review Comment:
   Minor, take or leave. On the common path — write fails, abort succeeds — 
`errors.Join(err, nil)` doesn't return `err`, it wraps it in a `*joinError` 
with a different concrete type (and a trailing newline in `.Error()`). 
`errors.Is` still finds the write error so anything matching on it is fine, but 
a `==` sentinel check or type assertion on the returned error silently stops 
matching, and that's the expected case here, not an edge one.
   
   I'd keep the raw error when abort succeeds:
   
   ```go
   if abortErr := w.Abort(); abortErr != nil {
       return nil, errors.Join(err, abortErr)
   }
   return nil, err
   ```



##########
table/internal/parquet_files_internal_test.go:
##########
@@ -0,0 +1,73 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+package internal
+
+import (
+       "errors"
+       "testing"
+
+       "github.com/apache/arrow-go/v18/arrow"
+       "github.com/apache/iceberg-go"
+       "github.com/stretchr/testify/assert"
+       "github.com/stretchr/testify/require"
+)
+
+type failingWriteFileWriter struct {
+       writeErr error
+       abortErr error
+
+       abortCalled bool
+       closeCalled bool
+}
+
+func (w *failingWriteFileWriter) Write(arrow.RecordBatch) error {
+       return w.writeErr
+}
+
+func (*failingWriteFileWriter) BytesWritten() int64 {
+       return 0
+}
+
+func (w *failingWriteFileWriter) Close() (iceberg.DataFile, error) {
+       w.closeCalled = true
+       panic("Close should not be called after write failure")
+}
+
+func (w *failingWriteFileWriter) Abort() error {
+       w.abortCalled = true
+
+       return w.abortErr
+}
+
+func TestWriteDataFileBatchesAbortsOnWriteError(t *testing.T) {
+       writeErr := errors.New("write failed")
+       abortErr := errors.New("abort failed")
+       writer := &failingWriteFileWriter{
+               writeErr: writeErr,
+               abortErr: abortErr,
+       }
+
+       df, err := writeDataFileBatches(writer, []arrow.RecordBatch{nil})

Review Comment:
   Nit. This only covers the both-errors case. The common path — write fails, 
abort succeeds (`abortErr == nil`) — is the one that exercises the 
type-wrapping I flagged in `parquet_files.go`, and it's untested. I'd make this 
table-driven with two subcases (abort succeeds / abort fails) and assert in the 
abort-succeeds case that the returned error is the raw `writeErr`, not a join.
   
   While you're here: `[]arrow.RecordBatch{nil}` works only because the fake's 
`Write` ignores its arg — a one-line comment saying so would save the next 
reader from thinking this validates nil-batch handling.



##########
table/internal/parquet_files_internal_test.go:
##########
@@ -0,0 +1,73 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+package internal
+
+import (
+       "errors"
+       "testing"
+
+       "github.com/apache/arrow-go/v18/arrow"
+       "github.com/apache/iceberg-go"
+       "github.com/stretchr/testify/assert"
+       "github.com/stretchr/testify/require"
+)
+
+type failingWriteFileWriter struct {
+       writeErr error
+       abortErr error
+
+       abortCalled bool
+       closeCalled bool
+}
+
+func (w *failingWriteFileWriter) Write(arrow.RecordBatch) error {
+       return w.writeErr
+}
+
+func (*failingWriteFileWriter) BytesWritten() int64 {
+       return 0
+}
+
+func (w *failingWriteFileWriter) Close() (iceberg.DataFile, error) {
+       w.closeCalled = true
+       panic("Close should not be called after write failure")

Review Comment:
   This is the one I'd cover before merge. The panic here crashes the whole 
test binary if `Close` is ever called — the rest of the package's tests don't 
run, and you get a stack trace instead of a readable failure. The `closeCalled` 
set just above it is unreachable for assertion purposes, and the 
`assert.False(t, writer.closeCalled)` at the call site never runs if the panic 
fires.
   
   Since we already track `closeCalled`, I'd drop the panic and let 
`assert.False(closeCalled)` do the work — it's the idiomatic way to say "this 
must not be called."



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to