XiaoHongbo-Hope commented on PR #7891:
URL: https://github.com/apache/paimon/pull/7891#issuecomment-4532324066
> ## Review: [python] Support row-level Blob access (Revised)
> This revision removes the `to_blob_iterator()` approach in favor of
propagating `file_io` / `blob_field_indices` through the reader chain, which is
a much cleaner design. Comments below:
>
> ### 2. `Blob.from_bytes()` — dead branch in condition
> ```python
> if not allow_blob_data and not is_descriptor:
> raise ValueError(...) # ← exits here
>
> if is_descriptor or not allow_blob_data: # ← "not allow_blob_data" is
dead code
> ...
> ```
>
> If `not allow_blob_data` is `True`, the only way to reach the second `if`
is when `is_descriptor` is `True` (otherwise we already raised). So the
condition simplifies to:
>
> ```python
> if is_descriptor:
> ...
> return BlobData(data)
> ```
>
> ### 4. `OffsetRow` post-construction mutation
> `set_file_io()` / `set_blob_field_indices()` mutate state after
`__init__`, making `OffsetRow` harder to reason about. These values are known
at construction time (from the reader that creates the iterator). Consider
passing them through the constructor, or at least through
`InternalRowWrapperIterator.__init__` → `OffsetRow.__init__`, rather than
setter methods.
>
> This also interacts with the `_reused_row` pattern — since the same
`OffsetRow` is reused across rows, the blob context is set once and remains
valid. But if someone constructs an `OffsetRow` directly without calling the
setters (e.g., in tests or custom code), `get_blob()` will raise a confusing
`TypeError("not a BLOB field")` when the real issue is missing setup.
>
> ### 5. PR description is empty
> Please fill in the Purpose section — what motivated the change, and what
user-facing API is being added. This helps reviewers and future readers of git
history.
Thanks, Jingsong, updated
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]