This is an automated email from the ASF dual-hosted git repository.
JingsongLi pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/paimon.git
The following commit(s) were added to refs/heads/master by this push:
new 6c4e73896d [doc] Update 'blob-as-descriptor' setting in blob.md
6c4e73896d is described below
commit 6c4e73896d47af9a327a4ccce33c8681d2612e95
Author: JingsongLi <[email protected]>
AuthorDate: Mon May 25 18:07:35 2026 +0800
[doc] Update 'blob-as-descriptor' setting in blob.md
---
docs/docs/pypaimon/blob.md | 19 +++++++++----------
1 file changed, 9 insertions(+), 10 deletions(-)
diff --git a/docs/docs/pypaimon/blob.md b/docs/docs/pypaimon/blob.md
index 916d893571..dee0b9d624 100644
--- a/docs/docs/pypaimon/blob.md
+++ b/docs/docs/pypaimon/blob.md
@@ -109,18 +109,17 @@ genuinely lazy depends on how the table is configured:
This mirrors Java's `BlobFormatReader` semantics.
For genuine on-demand streaming of large blobs (videos, model weights),
-configure `blob-as-descriptor=true` before reading:
+use `table.copy` to set `blob-as-descriptor=true` before reading:
```python
-schema = Schema.from_pyarrow_schema(
-    pa_schema,
-    options={
-        'row-tracking.enabled': 'true',
-        'data-evolution.enabled': 'true',
-        'blob-as-descriptor': 'true',
-    },
-)
-# Reads of this table return BlobRef whose new_input_stream() is lazy.
+table = catalog.get_table('my_db.image_table')
+table = table.copy({'blob-as-descriptor': 'true'})
+
+read_builder = table.new_read_builder()
+splits = read_builder.new_scan().plan().splits()
+read = read_builder.new_read()
+
+# Reads now return BlobRef whose new_input_stream() is lazy.
for row in read.to_iterator(splits):
    with row.get_blob(2).new_input_stream() as stream:
        chunk = stream.read(1024)