This is an automated email from the ASF dual-hosted git repository.
mengw15 pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/texera.git
The following commit(s) were added to refs/heads/main by this push:
new 9422def3a8 fix: preserve original error in IcebergIterator._seek_to_usable_file (#5092)
9422def3a8 is described below
commit 9422def3a8c9ef5b8ee128afc4095d07104f3e2f
Author: Meng Wang <[email protected]>
AuthorDate: Mon May 18 14:40:56 2026 -0700
fix: preserve original error in IcebergIterator._seek_to_usable_file (#5092)
### What changes were proposed in this PR?
`IcebergIterator._seek_to_usable_file` previously swallowed every error
during file-scan setup:
```python
except Exception:
    print("Could not read iceberg table:\n")
    raise Exception
```
The `raise Exception` statement (no args, no `from`) constructs a fresh
`Exception` with an empty `str()` and no `__cause__`. Callers that do
`except Exception as e: log.error(str(e))` log only an empty string:
the original error type and message no longer appear on the exception
they catch, and the original error is reachable only through the
implicit `__context__` chain. The `print` also bypasses the project
logger.
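The behaviour can be reproduced in isolation. The sketch below is a minimal stand-in, not the Texera code: `seek_old_style` is a hypothetical function that mimics the old handler, and the error message is borrowed from the regression test in this PR.

```python
def seek_old_style():
    """Hypothetical stand-in for the old handler: catch, then raise a fresh Exception."""
    try:
        raise RuntimeError("Catalog auth failure: token expired")
    except Exception:
        raise Exception  # new, argument-less exception replaces the original


try:
    seek_old_style()
except Exception as e:
    old_error = e  # keep a reference; `e` is unbound when the block ends

old_message = str(old_error)            # what `log.error(str(e))` would record
old_type = type(old_error).__name__     # class the caller actually catches
old_cause = old_error.__cause__         # only set by `raise ... from ...`
old_context = old_error.__context__     # implicit chaining still records the original

print(repr(old_message), old_type, old_cause, repr(old_context))
```

A caller that logs only `str(e)` records nothing useful here, even though the original `RuntimeError` is still attached as `__context__`.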
This PR replaces `raise Exception` with a bare `raise`, which re-raises
the original exception, and routes the diagnostic message through
`loguru`, matching the existing
`except Exception as err: logger.exception(err)` pattern
used in `data_processor.py:125`, `main_loop.py:422`, and
`input_port_materialization_reader_runnable.py:169`:
```python
except Exception as err:
    logger.exception(err)
    raise
```
Callers now see the actual underlying exception (catalog auth failure,
S3 IO error, manifest corruption, etc.) with its full class name,
message, and traceback.
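The log-then-re-raise pattern can be sketched in isolation as follows. This is an illustration under assumptions, not the project code: the standard-library `logging` module stands in for `loguru` so the snippet has no third-party dependency, and `seek_new_style` is a hypothetical function name.

```python
import logging

# Assumption: stdlib logging is used here as a stand-in for loguru's `logger`.
logging.basicConfig(level=logging.ERROR)
logger = logging.getLogger("iceberg_sketch")


def seek_new_style():
    """Hypothetical stand-in for the new handler: log, then re-raise unchanged."""
    try:
        raise RuntimeError("Catalog auth failure: token expired")
    except Exception as err:
        logger.exception(err)  # logs the message together with the traceback
        raise                  # bare raise: re-raises the exception being handled


try:
    seek_new_style()
except Exception as e:
    new_error = e  # keep a reference; `e` is unbound when the block ends

new_type = type(new_error).__name__
new_message = str(new_error)
print(new_type, new_message)
```

Because the bare `raise` propagates the same exception object, the caller catches a `RuntimeError` carrying its original message and traceback.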
### Any related issues, documentation, discussions?
Closes #5091.
### How was this PR tested?
Added
`amber/src/test/python/core/storage/iceberg/test_iceberg_iterator_error_paths.py`,
a slim mocked regression test: it patches `load_table_metadata` to
return a `Mock` whose `refresh()` raises `RuntimeError("Catalog auth
failure: token expired")`, drives `next(IcebergIterator(...))`, and
asserts the caller observes the original `RuntimeError` (via
`pytest.raises(RuntimeError, match=...)`). This locks in the contract
that the except clause must not swallow the underlying exception's
type or message.
Run locally:
```
python -m pytest amber/src/test/python/core/storage/iceberg/test_iceberg_iterator_error_paths.py -v
```
Result: `1 passed`.
### Was this PR authored or co-authored using generative AI tooling?
Generated-by: Claude Code (claude-opus-4-7)
Co-authored-by: Claude Opus 4.7 (1M context) <[email protected]>
---
.../core/storage/iceberg/iceberg_document.py | 7 ++--
.../iceberg/test_iceberg_iterator_error_paths.py | 37 ++++++++++++++++++++++
2 files changed, 41 insertions(+), 3 deletions(-)
diff --git a/amber/src/main/python/core/storage/iceberg/iceberg_document.py b/amber/src/main/python/core/storage/iceberg/iceberg_document.py
index 997ab9b5b7..7a5beda916 100644
--- a/amber/src/main/python/core/storage/iceberg/iceberg_document.py
+++ b/amber/src/main/python/core/storage/iceberg/iceberg_document.py
@@ -17,6 +17,7 @@
import pyarrow as pa
from itertools import islice
+from loguru import logger
from pyiceberg.catalog import Catalog
from pyiceberg.schema import Schema
from pyiceberg.table import Table, FileScanTask
@@ -211,9 +212,9 @@ class IcebergIterator(Iterator[T]):
self.num_of_skipped_records += record_count
continue
yield task
- except Exception:
- print("Could not read iceberg table:\n")
- raise Exception
+ except Exception as err:
+ logger.exception(err)
+ raise
else:
return iter([])
diff --git a/amber/src/test/python/core/storage/iceberg/test_iceberg_iterator_error_paths.py b/amber/src/test/python/core/storage/iceberg/test_iceberg_iterator_error_paths.py
new file mode 100644
index 0000000000..d724ac31d5
--- /dev/null
+++ b/amber/src/test/python/core/storage/iceberg/test_iceberg_iterator_error_paths.py
@@ -0,0 +1,37 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied. See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from unittest.mock import Mock, patch
+
+import pytest
+
+from core.storage.iceberg import iceberg_document
+from core.storage.iceberg.iceberg_document import IcebergIterator
+
+
+def test_seek_to_usable_file_preserves_original_error():
+    failing_table = Mock()
+    failing_table.refresh.side_effect = RuntimeError(
+        "Catalog auth failure: token expired"
+    )
+
+    with patch.object(
+        iceberg_document, "load_table_metadata", return_value=failing_table
+    ):
+        it = IcebergIterator(0, None, None, "ns", "tbl", None, None)
+        with pytest.raises(RuntimeError, match="Catalog auth failure"):
+            next(it)