Re: [PR] Client doc website [fluss-rust]

2026-02-14 Thread via GitHub


luoyuxia merged PR #300:
URL: https://github.com/apache/fluss-rust/pull/300


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] Client doc website [fluss-rust]

2026-02-14 Thread via GitHub


luoyuxia commented on code in PR #300:
URL: https://github.com/apache/fluss-rust/pull/300#discussion_r2808220207


##
.github/workflows/release_python.yml:
##
@@ -78,6 +81,9 @@ jobs:
 steps:
   - uses: actions/checkout@v4
 
+  - name: Generate Python README
+    run: python3 bindings/python/generate_readme.py

Review Comment:
   Thank you very much for the explanation!






Re: [PR] Client doc website [fluss-rust]

2026-02-14 Thread via GitHub


fresh-borzoni commented on PR #300:
URL: https://github.com/apache/fluss-rust/pull/300#issuecomment-3902934353

   No comments, we'd better merge





Re: [PR] Client doc website [fluss-rust]

2026-02-14 Thread via GitHub


leekeiabstraction commented on PR #300:
URL: https://github.com/apache/fluss-rust/pull/300#issuecomment-3902775446

   Thank you both for the diligent and tireless reviews 🙏! I've addressed most 
comments. Let's get this merged and we can address further changes in smaller 
PRs.





Re: [PR] Client doc website [fluss-rust]

2026-02-14 Thread via GitHub


leekeiabstraction commented on code in PR #300:
URL: https://github.com/apache/fluss-rust/pull/300#discussion_r2808097334


##
website/docs/user-guide/python/example/log-tables.md:
##
@@ -0,0 +1,122 @@
+---
+sidebar_position: 4
+---
+# Log Tables
+
+Log tables are append-only tables without primary keys, suitable for event streaming.
+
+## Creating a Log Table
+
+```python
+import pyarrow as pa
+
+schema = fluss.Schema(pa.schema([
+    pa.field("id", pa.int32()),
+    pa.field("name", pa.string()),
+    pa.field("score", pa.float32()),
+]))
+
+table_path = fluss.TablePath("fluss", "events")
+await admin.create_table(table_path, fluss.TableDescriptor(schema), ignore_if_exists=True)
+```
+
+## Writing
+
+Rows can be appended as dicts, lists, or tuples. For bulk writes, use `write_arrow()`, `write_arrow_batch()`, or `write_pandas()`.
+
+Write methods like `append()` and `write_arrow_batch()` return a `WriteResultHandle`. You can ignore it for fire-and-forget semantics (flush at the end), or `await handle.wait()` to block until the server acknowledges that specific write.
+
+```python
+table = await conn.get_table(table_path)
+writer = table.new_append().create_writer()
+
+# Fire-and-forget: queue writes, flush at the end
+writer.append({"id": 1, "name": "Alice", "score": 95.5})
+writer.append([2, "Bob", 87.0])
+await writer.flush()
+
+# Per-record acknowledgment
+handle = writer.append({"id": 3, "name": "Charlie", "score": 91.0})
+await handle.wait()
+
+# Bulk writes
+writer.write_arrow(pa_table)            # PyArrow Table
+writer.write_arrow_batch(record_batch)  # PyArrow RecordBatch
+writer.write_pandas(df)                 # Pandas DataFrame
+await writer.flush()
+```
+
+## Reading
+
+There are two scanner types:
+- **Batch scanner** (`create_record_batch_log_scanner()`): returns Arrow Tables or DataFrames, best for analytics
+- **Record scanner** (`create_log_scanner()`): returns individual records with metadata (offset, timestamp, change type), best for streaming
+
+And two reading modes:
+- **`to_arrow()` / `to_pandas()`**: reads all data from subscribed buckets up to the current latest offset, then returns. Best for one-shot batch reads.
+- **`poll_arrow()` / `poll()` / `poll_record_batch()`**: returns whatever data is available within the timeout. Call in a loop for continuous streaming.
+
+### Batch Read (One-Shot)
+
+```python
+num_buckets = (await admin.get_table_info(table_path)).num_buckets
+
+scanner = await table.new_scan().create_record_batch_log_scanner()
+scanner.subscribe_buckets({i: fluss.EARLIEST_OFFSET for i in range(num_buckets)})
+
+# Reads everything up to current latest offset, then returns
+arrow_table = scanner.to_arrow()
+df = scanner.to_pandas()
+```
+
+### Continuous Polling
+
+Use `poll_arrow()` or `poll()` in a loop for streaming consumption:
+
+```python
+# Batch scanner: poll as Arrow Tables
+scanner = await table.new_scan().create_record_batch_log_scanner()
+scanner.subscribe(bucket_id=0, start_offset=fluss.EARLIEST_OFFSET)
+
+while True:
+    result = scanner.poll_arrow(timeout_ms=5000)
+    if result.num_rows > 0:
+        print(result.to_pandas())
+
+# Record scanner: poll individual records with metadata
+scanner = await table.new_scan().create_log_scanner()
+scanner.subscribe_buckets({i: fluss.EARLIEST_OFFSET for i in range(num_buckets)})
+
+while True:
+    for record in scanner.poll(timeout_ms=5000):
+        print(f"offset={record.offset}, change={record.change_type.short_string()}, row={record.row}")
+```
+
+### Unsubscribing
+
+To stop consuming from a bucket, use `unsubscribe()`:
+
+```python
+scanner.unsubscribe(bucket_id=0)
+```
+
+### Subscribe from Latest Offset

Review Comment:
   I'll leave it to Anton to update this, as his PR deals with making Offset types consistent.






Re: [PR] Client doc website [fluss-rust]

2026-02-14 Thread via GitHub


leekeiabstraction commented on code in PR #300:
URL: https://github.com/apache/fluss-rust/pull/300#discussion_r2808096554


##
website/docs/index.md:
##
@@ -0,0 +1,33 @@
+---
+slug: /
+sidebar_position: 1
+title: Introduction
+---
+
+# Introduction
+
+[Apache Fluss](https://fluss.apache.org/) (incubating) is a streaming storage system built for real-time analytics, serving as the real-time data layer for Lakehouse architectures.
+
+This documentation covers the **Fluss client libraries** for Rust, Python, and C++, which are developed in the [fluss-rust](https://github.com/apache/fluss-rust) repository. These clients allow you to:
+
+- **Create and manage** databases, tables, and partitions
+- **Write** data to log tables (append-only) and primary key tables (upsert/delete)
+- **Read** data via log scanning and key lookups
+- **Integrate** with the broader Fluss ecosystem including lakehouse snapshots
+
+## Client Overview
+
+| | Rust | Python | C++ |
+|---|---|---|---|
+| **Package** | [fluss-rs](https://crates.io/crates/fluss-rs) on crates.io | Build from source (PyO3) | Build from source (CMake) |
+| **Async runtime** | Tokio | asyncio | Synchronous (Tokio runtime managed internally) |
+| **Data format** | Arrow RecordBatch / GenericRow | PyArrow / Pandas / dict | Arrow RecordBatch / GenericRow |
+| **Log tables** | Read + Write | Read + Write | Read + Write |
+| **Primary key tables** | Upsert + Delete + Lookup | Upsert + Delete + Lookup | Upsert + Delete + Lookup |
+| **Partitioned tables** | Full support | Write support | Full support |

Review Comment:
   You're right, updated






Re: [PR] Client doc website [fluss-rust]

2026-02-14 Thread via GitHub


leekeiabstraction commented on code in PR #300:
URL: https://github.com/apache/fluss-rust/pull/300#discussion_r2808096392


##
website/docs/developer-guide/release.md:
##
@@ -0,0 +1,181 @@
+# Release

Review Comment:
   Moved to release section for now. We can adjust further in later PRs






Re: [PR] Client doc website [fluss-rust]

2026-02-14 Thread via GitHub


leekeiabstraction commented on code in PR #300:
URL: https://github.com/apache/fluss-rust/pull/300#discussion_r2808096101


##
.github/workflows/release_python.yml:
##
@@ -78,6 +81,9 @@ jobs:
 steps:
   - uses: actions/checkout@v4
 
+  - name: Generate Python README
+    run: python3 bindings/python/generate_readme.py

Review Comment:
   I'm thinking of removing the READMEs for Rust and C++ and pointing to the docs instead.
   
   Generating the Python README from the docs is necessary because we publish the Python artifact with a single README file as its description, so this script generates the README by concatenating the Python docs.
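   For illustration, here is a minimal sketch of what such a concatenation step could look like. The function name, doc ordering, and frontmatter handling are assumptions for this sketch, not the actual contents of `generate_readme.py`:

   ```python
   from pathlib import Path

   def build_readme(doc_dir: Path, order: list[str]) -> str:
       """Concatenate markdown docs into one README body (hypothetical sketch)."""
       parts = []
       for name in order:
           text = (doc_dir / name).read_text(encoding="utf-8")
           # Drop a leading Docusaurus frontmatter block ("---" ... "---") if
           # present, since a PyPI description should not show sidebar metadata.
           if text.startswith("---"):
               end = text.find("---", 3)
               if end != -1:
                   text = text[end + 3:].lstrip("\n")
           parts.append(text.rstrip())
       return "\n\n".join(parts) + "\n"
   ```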






Re: [PR] Client doc website [fluss-rust]

2026-02-14 Thread via GitHub


leekeiabstraction commented on code in PR #300:
URL: https://github.com/apache/fluss-rust/pull/300#discussion_r2808095531


##
bindings/cpp/README.md:
##
@@ -1,21 +1,3 @@
-

Re: [PR] Client doc website [fluss-rust]

2026-02-14 Thread via GitHub


leekeiabstraction commented on code in PR #300:
URL: https://github.com/apache/fluss-rust/pull/300#discussion_r2808094939


##
website/docs/user-guide/cpp/api-reference.md:
##
@@ -0,0 +1,494 @@
+---
+sidebar_position: 2
+---
+# API Reference
+
+Complete API reference for the Fluss C++ client.
+
+## `Result`
+
+| Field / Method  | Type          | Description |
+|-----------------|---------------|-------------|
+| `error_code`    | `int32_t`     | 0 for success, non-zero for errors |
+| `error_message` | `std::string` | Human-readable error description |
+| `Ok()`          | `bool`        | Returns `true` if operation succeeded (`error_code == 0`) |
+
+## `Configuration`
+
+| Field | Type | Default | Description |
+|---|---|---|---|
+| `bootstrap_servers` | `std::string` | `"127.0.0.1:9123"` | Coordinator server address |
+| `writer_request_max_size` | `int32_t` | `10485760` (10 MB) | Maximum request size in bytes |
+| `writer_acks` | `std::string` | `"all"` | Acknowledgment setting (`"all"`, `"0"`, `"1"`, or `"-1"`) |
+| `writer_retries` | `int32_t` | `INT32_MAX` | Number of retries on failure |
+| `writer_batch_size` | `int32_t` | `2097152` (2 MB) | Batch size for writes in bytes |
+| `scanner_remote_log_prefetch_num` | `size_t` | `4` | Number of remote log segments to prefetch |
+| `remote_file_download_thread_num` | `size_t` | `3` | Number of threads for remote log downloads |
+
+## `Connection`
+
+| Method | Description |
+|---|---|
+| `static Create(const Configuration& config, Connection& out) -> Result` | Create a connection to a Fluss cluster |
+| `GetAdmin(Admin& out) -> Result` | Get the admin interface |
+| `GetTable(const TablePath& table_path, Table& out) -> Result` | Get a table for read/write operations |
+| `Available() -> bool` | Check if the connection is valid and initialized |
+
+## `Admin`
+
+### Database Operations
+
+| Method | Description |
+|---|---|
+| `CreateDatabase(const std::string& database_name, const DatabaseDescriptor& descriptor, bool ignore_if_exists) -> Result` | Create a database |
+| `DropDatabase(const std::string& name, bool ignore_if_not_exists, bool cascade) -> Result` | Drop a database |
+| `ListDatabases(std::vector& out) -> Result` | List all databases |
+| `DatabaseExists(const std::string& name, bool& out) -> Result` | Check if a database exists |
+| `GetDatabaseInfo(const std::string& name, DatabaseInfo& out) -> Result` | Get database metadata |
+
+### Table Operations
+
+| Method | Description |
+|---|---|
+| `CreateTable(const TablePath& path, const TableDescriptor& descriptor, bool ignore_if_exists) -> Result` | Create a table |
+| `DropTable(const TablePath& path, bool ignore_if_not_exists) -> Result` | Drop a table |
+| `GetTableInfo(const TablePath& path, TableInfo& out) -> Result` | Get table metadata |
+| `ListTables(const std::string& database_name, std::vector& out) -> Result` | List tables in a database |
+| `TableExists(const TablePath& path, boo

Re: [PR] Client doc website [fluss-rust]

2026-02-14 Thread via GitHub


leekeiabstraction commented on code in PR #300:
URL: https://github.com/apache/fluss-rust/pull/300#discussion_r2808094278


##
website/docs/user-guide/cpp/api-reference.md:
##
@@ -0,0 +1,494 @@
+---
+sidebar_position: 2
+---
+# API Reference
+
+Complete API reference for the Fluss C++ client.
+
+## `Result`
+
+| Field / Method  | Type          | Description |
+|-----------------|---------------|-------------|
+| `error_code`    | `int32_t`     | 0 for success, non-zero for errors |
+| `error_message` | `std::string` | Human-readable error description |
+| `Ok()`          | `bool`        | Returns `true` if operation succeeded (`error_code == 0`) |
+
+## `Configuration`
+
+| Field | Type | Default | Description |
+|---|---|---|---|
+| `bootstrap_servers` | `std::string` | `"127.0.0.1:9123"` | Coordinator server address |
+| `writer_request_max_size` | `int32_t` | `10485760` (10 MB) | Maximum request size in bytes |
+| `writer_acks` | `std::string` | `"all"` | Acknowledgment setting (`"all"`, `"0"`, `"1"`, or `"-1"`) |
+| `writer_retries` | `int32_t` | `INT32_MAX` | Number of retries on failure |
+| `writer_batch_size` | `int32_t` | `2097152` (2 MB) | Batch size for writes in bytes |
+| `scanner_remote_log_prefetch_num` | `size_t` | `4` | Number of remote log segments to prefetch |
+| `remote_file_download_thread_num` | `size_t` | `3` | Number of threads for remote log downloads |
+
+## `Connection`
+
+| Method | Description |
+|---|---|
+| `static Create(const Configuration& config, Connection& out) -> Result` | Create a connection to a Fluss cluster |
+| `GetAdmin(Admin& out) -> Result` | Get the admin interface |
+| `GetTable(const TablePath& table_path, Table& out) -> Result` | Get a table for read/write operations |
+| `Available() -> bool` | Check if the connection is valid and initialized |
+
+## `Admin`
+
+### Database Operations
+
+| Method | Description |
+|---|---|
+| `CreateDatabase(const std::string& database_name, const DatabaseDescriptor& descriptor, bool ignore_if_exists) -> Result` | Create a database |
+| `DropDatabase(const std::string& name, bool ignore_if_not_exists, bool cascade) -> Result` | Drop a database |
+| `ListDatabases(std::vector& out) -> Result` | List all databases |
+| `DatabaseExists(const std::string& name, bool& out) -> Result` | Check if a database exists |
+| `GetDatabaseInfo(const std::string& name, DatabaseInfo& out) -> Result` | Get database metadata |
+
+### Table Operations
+
+| Method | Description |
+|---|---|
+| `CreateTable(const TablePath& path, const TableDescriptor& descriptor, bool ignore_if_exists) -> Result` | Create a table |
+| `DropTable(const TablePath& path, bool ignore_if_not_exists) -> Result` | Drop a table |
+| `GetTableInfo(const TablePath& path, TableInfo& out) -> Result` | Get table metadata |
+| `ListTables(const std::string& database_name, std::vector& out) -> Result` | List tables in a database |
+| `TableExists(const TablePath& path, boo

Re: [PR] Client doc website [fluss-rust]

2026-02-14 Thread via GitHub


leekeiabstraction commented on code in PR #300:
URL: https://github.com/apache/fluss-rust/pull/300#discussion_r2808092580


##
website/docs/user-guide/python/api-reference.md:
##
@@ -0,0 +1,281 @@
+---
+sidebar_position: 2
+---
+# API Reference
+
+Complete API reference for the Fluss Python client.
+
+## `Config`
+
+| Method / Property | Description |
+|---|---|

Review Comment:
   Updated API Reference section for python






Re: [PR] Client doc website [fluss-rust]

2026-02-14 Thread via GitHub


leekeiabstraction commented on code in PR #300:
URL: https://github.com/apache/fluss-rust/pull/300#discussion_r2808092363


##
website/docs/user-guide/python/error-handling.md:
##
@@ -0,0 +1,19 @@
+---
+sidebar_position: 4
+---
+# Error Handling
+
+The client raises `fluss.FlussError` for Fluss-specific errors:
+
+```python

Review Comment:
   Updated error handling section for Python






Re: [PR] Client doc website [fluss-rust]

2026-02-13 Thread via GitHub


luoyuxia commented on code in PR #300:
URL: https://github.com/apache/fluss-rust/pull/300#discussion_r2806734522


##
website/docs/user-guide/python/installation.md:
##
@@ -0,0 +1,41 @@
+---
+sidebar_position: 1
+---
+# Installation
+
+```bash
+pip install pyfluss
+```
+
+To build from source instead:

Review Comment:
   Make it a heading to highlight that the following content is about building from source instead.



##
website/docs/user-guide/cpp/example/configuration.md:
##
@@ -0,0 +1,35 @@
+---
+sidebar_position: 2
+---
+# Configuration

Review Comment:
   Maybe something like:
   # Fluss Connection
   
   ## Connection Setup
   
   ## Configuration Options for Setting Up a Connection
   
   ?
   
   `Configuration` is just a class used to create a connection, so it may not be suitable as the main title. WDYT?



##
website/docs/user-guide/cpp/example/configuration.md:
##
@@ -0,0 +1,35 @@
+---
+sidebar_position: 2
+---
+# Configuration

Review Comment:
   And same for other language clients.



##
website/docs/user-guide/python/example/log-tables.md:
##
@@ -0,0 +1,122 @@
+---
+sidebar_position: 4
+---
+# Log Tables
+
+Log tables are append-only tables without primary keys, suitable for event streaming.
+
+## Creating a Log Table
+
+```python
+import pyarrow as pa
+
+schema = fluss.Schema(pa.schema([
+    pa.field("id", pa.int32()),
+    pa.field("name", pa.string()),
+    pa.field("score", pa.float32()),
+]))
+
+table_path = fluss.TablePath("fluss", "events")
+await admin.create_table(table_path, fluss.TableDescriptor(schema), ignore_if_exists=True)
+```
+
+## Writing
+
+Rows can be appended as dicts, lists, or tuples. For bulk writes, use `write_arrow()`, `write_arrow_batch()`, or `write_pandas()`.
+
+Write methods like `append()` and `write_arrow_batch()` return a `WriteResultHandle`. You can ignore it for fire-and-forget semantics (flush at the end), or `await handle.wait()` to block until the server acknowledges that specific write.
+
+```python
+table = await conn.get_table(table_path)
+writer = table.new_append().create_writer()
+
+# Fire-and-forget: queue writes, flush at the end
+writer.append({"id": 1, "name": "Alice", "score": 95.5})
+writer.append([2, "Bob", 87.0])
+await writer.flush()
+
+# Per-record acknowledgment
+handle = writer.append({"id": 3, "name": "Charlie", "score": 91.0})
+await handle.wait()
+
+# Bulk writes
+writer.write_arrow(pa_table)            # PyArrow Table
+writer.write_arrow_batch(record_batch)  # PyArrow RecordBatch
+writer.write_pandas(df)                 # Pandas DataFrame
+await writer.flush()
+```
+
+## Reading
+
+There are two scanner types:
+- **Batch scanner** (`create_record_batch_log_scanner()`): returns Arrow Tables or DataFrames, best for analytics
+- **Record scanner** (`create_log_scanner()`): returns individual records with metadata (offset, timestamp, change type), best for streaming
+
+And two reading modes:
+- **`to_arrow()` / `to_pandas()`**: reads all data from subscribed buckets up to the current latest offset, then returns. Best for one-shot batch reads.
+- **`poll_arrow()` / `poll()` / `poll_record_batch()`**: returns whatever data is available within the timeout. Call in a loop for continuous streaming.
+
+### Batch Read (One-Shot)
+
+```python
+num_buckets = (await admin.get_table_info(table_path)).num_buckets
+
+scanner = await table.new_scan().create_record_batch_log_scanner()
+scanner.subscribe_buckets({i: fluss.EARLIEST_OFFSET for i in range(num_buckets)})
+
+# Reads everything up to current latest offset, then returns
+arrow_table = scanner.to_arrow()
+df = scanner.to_pandas()
+```
+
+### Continuous Polling
+
+Use `poll_arrow()` or `poll()` in a loop for streaming consumption:
+
+```python
+# Batch scanner: poll as Arrow Tables
+scanner = await table.new_scan().create_record_batch_log_scanner()
+scanner.subscribe(bucket_id=0, start_offset=fluss.EARLIEST_OFFSET)
+
+while True:
+    result = scanner.poll_arrow(timeout_ms=5000)
+    if result.num_rows > 0:
+        print(result.to_pandas())
+
+# Record scanner: poll individual records with metadata
+scanner = await table.new_scan().create_log_scanner()
+scanner.subscribe_buckets({i: fluss.EARLIEST_OFFSET for i in range(num_buckets)})
+
+while True:
+    for record in scanner.poll(timeout_ms=5000):
+        print(f"offset={record.offset}, change={record.change_type.short_string()}, row={record.row}")
+```
+
+### Unsubscribing
+
+To stop consuming from a bucket, use `unsubscribe()`:
+
+```python
+scanner.unsubscribe(bucket_id=0)
+```
+
+### Subscribe from Latest Offset

Review Comment:
   Seems it's not valid, right? We can remove it.



##
website/docs/index.md:
##
@@ -0,0 +1,33 @@
+---
+slug: /
+sidebar_position: 1
+title: Introduction
+---
+
+# Introduction
+
+[Apache Fluss](https://fluss.apache.org/) (incubating) is a streaming storage 
system built for real-time analytics, serving as the real-time data layer

Re: [PR] Client doc website [fluss-rust]

2026-02-13 Thread via GitHub


luoyuxia commented on code in PR #300:
URL: https://github.com/apache/fluss-rust/pull/300#discussion_r2806723681


##
website/docs/developer-guide/release.md:
##
@@ -0,0 +1,181 @@
+# Release

Review Comment:
   Maybe make "how to verify a release" a doc on the website.






Re: [PR] Client doc website [fluss-rust]

2026-02-13 Thread via GitHub


fresh-borzoni commented on code in PR #300:
URL: https://github.com/apache/fluss-rust/pull/300#discussion_r2806713920


##
website/docs/user-guide/python/error-handling.md:
##
@@ -0,0 +1,19 @@
+---
+sidebar_position: 4
+---
+# Error Handling
+
+The client raises `fluss.FlussError` for Fluss-specific errors:
+
+```python

Review Comment:
   Yeah, up to @leekeiabstraction; we can create an umbrella issue and collect the leftovers there.






Re: [PR] Client doc website [fluss-rust]

2026-02-13 Thread via GitHub


luoyuxia commented on code in PR #300:
URL: https://github.com/apache/fluss-rust/pull/300#discussion_r2806615474


##
.github/workflows/release_python.yml:
##
@@ -78,6 +81,9 @@ jobs:
 steps:
   - uses: actions/checkout@v4
 
+  - name: Generate Python README
+    run: python3 bindings/python/generate_readme.py

Review Comment:
   Seems it only generates the README for Python. How about Rust and C++?



##
bindings/cpp/README.md:
##
@@ -1,21 +1,3 @@
-

Re: [PR] Client doc website [fluss-rust]

2026-02-13 Thread via GitHub


luoyuxia commented on code in PR #300:
URL: https://github.com/apache/fluss-rust/pull/300#discussion_r2806657543


##
website/docs/user-guide/python/error-handling.md:
##
@@ -0,0 +1,19 @@
+---
+sidebar_position: 4
+---
+# Error Handling
+
+The client raises `fluss.FlussError` for Fluss-specific errors:
+
+```python

Review Comment:
   Maybe we can do this in another PR since the current one is big :)






Re: [PR] Client doc website [fluss-rust]

2026-02-13 Thread via GitHub


fresh-borzoni commented on PR #300:
URL: https://github.com/apache/fluss-rust/pull/300#issuecomment-3898974477

   @leekeiabstraction I created https://github.com/apache/fluss-rust/pull/313 to address inconsistencies found while reviewing the doc.





Re: [PR] Client doc website [fluss-rust]

2026-02-13 Thread via GitHub


fresh-borzoni commented on PR #300:
URL: https://github.com/apache/fluss-rust/pull/300#issuecomment-3898248183

   Also, we may wish to specify the default precision for Timestamps (it's 6). It might be important.
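   As a point of reference (not from the PR itself): precision 6 means six fractional digits, i.e. microseconds, which matches the native resolution of Python's `datetime`:

   ```python
   from datetime import datetime

   # Precision 6 = six fractional digits = microsecond resolution.
   # Python's datetime carries exactly microsecond precision natively,
   # so such values round-trip without truncation at this default.
   ts = datetime(2024, 1, 15, 12, 30, 45, 123456)
   assert ts.microsecond == 123456
   formatted = ts.isoformat()  # six fractional digits
   ```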





Re: [PR] Client doc website [fluss-rust]

2026-02-13 Thread via GitHub


fresh-borzoni commented on code in PR #300:
URL: https://github.com/apache/fluss-rust/pull/300#discussion_r2804930425


##
website/docs/user-guide/python/api-reference.md:
##
@@ -0,0 +1,281 @@
+---
+sidebar_position: 2
+---
+# API Reference
+
+Complete API reference for the Fluss Python client.
+
+## `Config`
+
+| Method / Property | Description |
+|---|---|

Review Comment:
   it doesn't look complete



##
website/docs/user-guide/python/example/configuration.md:
##
@@ -0,0 +1,34 @@
+---
+sidebar_position: 2
+---
+# Configuration
+
+```python
+import fluss
+
+config = fluss.Config({"bootstrap.servers": "127.0.0.1:9123"})
+conn = await fluss.FlussConnection.create(config)
+```
+
+The connection also supports context managers:
+
+```python
+with await fluss.FlussConnection.create(config) as conn:
+    ...
+```
+
+## Configuration Options
+
+| Key | Description | Default |
+|---|---|---|
+| `bootstrap.servers` | Coordinator server address | `127.0.0.1:9123` |
+| `request.max.size` | Maximum request size in bytes | `10485760` (10 MB) |
+| `writer.acks` | Acknowledgment setting (`all` waits for all replicas) | `all` |
+| `writer.retries` | Number of retries on failure | `2147483647` |
+| `writer.batch.size` | Batch size for writes in bytes | `2097152` (2 MB) |

Review Comment:
   I think we have different dict keys in the config, with '-', but that was my editor playing a funny auto-completion trick, so I'll fix them to match this doc.



##
website/docs/user-guide/python/error-handling.md:
##
@@ -0,0 +1,19 @@
+---
+sidebar_position: 4
+---
+# Error Handling
+
+The client raises `fluss.FlussError` for Fluss-specific errors:
+
+```python

Review Comment:
   Shall we add more details to error handling? The Python section looks surprisingly thin :)
   There's no `error_code` explanation or anything.
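   To make the suggestion concrete, one possible shape for an expanded example. `FlussError` is the only name taken from the doc; the stand-in exception class and helper function here are purely illustrative so the snippet is self-contained:

   ```python
   import asyncio

   # Stand-in for fluss.FlussError so this sketch runs without the client
   # installed; the real class lives in the fluss package.
   class FlussError(Exception):
       pass

   async def create_table_safely(admin, table_path, descriptor):
       """Illustrative pattern: handle Fluss-specific errors separately."""
       try:
           await admin.create_table(table_path, descriptor, ignore_if_exists=False)
       except FlussError as exc:
           # An expanded doc section could enumerate error categories here.
           return f"Fluss error: {exc}"
       return "ok"
   ```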



##
website/docs/user-guide/rust/example/partitioned-tables.md:
##
@@ -0,0 +1,215 @@
+---
+sidebar_position: 6
+---
+# Partitioned Tables
+
+Partitioned tables distribute data across partitions based on partition column values, enabling efficient data organization and querying. Both log tables and primary key tables support partitioning.
+
+## Partitioned Log Tables
+
+### Creating a Partitioned Log Table
+
+```rust
+use fluss::metadata::{DataTypes, LogFormat, Schema, TableDescriptor, TablePath};
+
+let table_descriptor = TableDescriptor::builder()
+    .schema(
+        Schema::builder()
+            .column("event_id", DataTypes::int())
+            .column("event_type", DataTypes::string())
+            .column("dt", DataTypes::string())
+            .column("region", DataTypes::string())
+            .build()?,
+    )
+    .partitioned_by(vec!["dt", "region"])
+    .log_format(LogFormat::ARROW)
+    .build()?;
+
+let table_path = TablePath::new("fluss", "partitioned_events");
+admin.create_table(&table_path, &table_descriptor, true).await?;
+```
+
+### Writing to Partitioned Log Tables
+
+**Partitions must exist before writing data; otherwise the client will by default retry indefinitely.** Include partition column values in each row; the client routes records to the correct partition automatically.
+
+```rust
+use fluss::metadata::PartitionSpec;
+use std::collections::HashMap;
+
+let table = conn.get_table(&table_path).await?;
+
+// Create the partition before writing
+let mut partition_values = HashMap::new();
+partition_values.insert("dt", "2024-01-15");
+partition_values.insert("region", "US");
+admin.create_partition(&table_path, &PartitionSpec::new(partition_values), true).await?;
+
+let append_writer = table.new_append()?.create_writer()?;
+
+let mut row = GenericRow::new(4);
+row.set_field(0, 1);            // event_id
+row.set_field(1, "user_login"); // event_type
+row.set_field(2, "2024-01-15"); // dt (partition column)
+row.set_field(3, "US");         // region (partition column)
+
+append_writer.append(&row)?;
+append_writer.flush().await?;
+```
+
+### Reading from Partitioned Log Tables
+
+For partitioned tables, use partition-aware subscribe methods.
+
+```rust
+use std::time::Duration;
+
+let table = conn.get_table(&table_path).await?;
+let admin = conn.get_admin().await?;
+let partitions = admin.list_partition_infos(&table_path).await?;
+
+let log_scanner = table.new_scan().create_log_scanner()?;
+
+// Subscribe to each partition's buckets
+for partition_info in &partitions {
+    let partition_id = partition_info.get_partition_id();
+    let num_buckets = table.get_table_info().get_num_buckets();
+    for bucket_id in 0..num_buckets {
+        log_s

Re: [PR] Client doc website [fluss-rust]

2026-02-13 Thread via GitHub


leekeiabstraction commented on code in PR #300:
URL: https://github.com/apache/fluss-rust/pull/300#discussion_r2804193927


##
website/sidebars.ts:
##
@@ -0,0 +1,24 @@
+import type {SidebarsConfig} from '@docusaurus/plugin-content-docs';
+
+const sidebars: SidebarsConfig = {

Review Comment:
   Excluded website from ASF license header check just like main fluss repo.






Re: [PR] Client doc website [fluss-rust]

2026-02-13 Thread via GitHub


leekeiabstraction commented on code in PR #300:
URL: https://github.com/apache/fluss-rust/pull/300#discussion_r2804192714


##
website/docs/index.md:
##
@@ -0,0 +1,33 @@
+---
+slug: /
+sidebar_position: 1
+title: Introduction
+---
+

Review Comment:
   Excluded website from ASF license header check just like main fluss repo.






Re: [PR] Client doc website [fluss-rust]

2026-02-13 Thread via GitHub


leekeiabstraction commented on PR #300:
URL: https://github.com/apache/fluss-rust/pull/300#issuecomment-3897207819

   Updated doc to include changes from unsubscribe APIs as well.





Re: [PR] Client doc website [fluss-rust]

2026-02-12 Thread via GitHub


leekeiabstraction commented on PR #300:
URL: https://github.com/apache/fluss-rust/pull/300#issuecomment-3893824598

   Rebased and updated doc according to latest API changes, appreciate a review 
here @luoyuxia @fresh-borzoni 





Re: [PR] Client doc website [fluss-rust]

2026-02-12 Thread via GitHub


leekeiabstraction commented on PR #300:
URL: https://github.com/apache/fluss-rust/pull/300#issuecomment-3889604927

   I will update this again after #302 is merged





Re: [PR] Client doc website [fluss-rust]

2026-02-11 Thread via GitHub


leekeiabstraction commented on PR #300:
URL: https://github.com/apache/fluss-rust/pull/300#issuecomment-3887429790

   @luoyuxia @fresh-borzoni Appreciate your reviews here





Re: [PR] Client doc website [fluss-rust]

2026-02-11 Thread via GitHub


luoyuxia commented on PR #300:
URL: https://github.com/apache/fluss-rust/pull/300#issuecomment-3882931659

   @leekeiabstraction Thanks for the great work. It looks good!





Re: [PR] Client doc website [fluss-rust]

2026-02-10 Thread via GitHub


Copilot commented on code in PR #300:
URL: https://github.com/apache/fluss-rust/pull/300#discussion_r2791083129


##
website/sidebars.ts:
##
@@ -0,0 +1,24 @@
+import type {SidebarsConfig} from '@docusaurus/plugin-content-docs';
+
+const sidebars: SidebarsConfig = {

Review Comment:
   This sidebar config is missing the standard ASF license header comment at 
the top. The repository enforces license headers via CI; please add the header 
to this file.



##
website/docs/index.md:
##
@@ -0,0 +1,33 @@
+---
+slug: /
+sidebar_position: 1
+title: Introduction
+---
+

Review Comment:
   This documentation page is missing the ASF license header (the repo’s other 
Markdown docs start with the ASF header as an HTML comment, and CI checks 
headers). Please add the header here (and consistently across the other new 
`website/docs/**.md` pages).



##
website/docs/user-guide/python/primary-key-tables.md:
##
@@ -0,0 +1,88 @@
+---
+sidebar_position: 6
+---
+# Primary Key Tables
+
+Primary key tables (KV tables) support upsert, delete, and lookup operations.
+
+## Creating a Primary Key Table
+
+```python
+import pyarrow as pa
+
+pk_schema = pa.schema([
+pa.field("id", pa.int32()),
+pa.field("name", pa.string()),
+pa.field("age", pa.int64()),
+])
+
+fluss_schema = fluss.Schema(pk_schema, primary_keys=["id"])
+table_descriptor = fluss.TableDescriptor(fluss_schema, bucket_count=3)
+
+table_path = fluss.TablePath("fluss", "users")
+await admin.create_table(table_path, table_descriptor, ignore_if_exists=True)
+```
+
+## Upserting Records
+
+```python
+table = await conn.get_table(table_path)
+upsert_writer = table.new_upsert()
+
+await upsert_writer.upsert({"id": 1, "name": "Alice", "age": 25})
+await upsert_writer.upsert({"id": 2, "name": "Bob", "age": 30})
+await upsert_writer.upsert({"id": 3, "name": "Charlie", "age": 35})
+await upsert_writer.flush()
+```
+
+## Updating Records
+
+Upsert with the same primary key to update an existing record.
+
+```python
+await upsert_writer.upsert({
+"id": 1,
+"name": "Alice Updated",
+"age": 26,
+})
+await upsert_writer.flush()
+```
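(Editorial sketch, not the Fluss API: upsert-on-primary-key behaves like a map keyed by the primary key, so a second upsert with the same key replaces the whole row. The toy model below illustrates that semantic.)

```python
state = {}  # primary key -> latest row

def toy_upsert(row, pk="id"):
    # Same key: the new row fully replaces the old one.
    state[row[pk]] = row

toy_upsert({"id": 1, "name": "Alice", "age": 25})
toy_upsert({"id": 1, "name": "Alice Updated", "age": 26})
print(state[1])  # {'id': 1, 'name': 'Alice Updated', 'age': 26}
```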
+
+## Deleting Records
+
+```python
+await upsert_writer.delete({"id": 2})
+await upsert_writer.flush()
+```
+
+## Partial Updates
+
+Update only specific columns while preserving others.
+
+```python
+# By column names
+partial_writer = table.new_upsert(columns=["id", "name", "age"])
+
+await partial_writer.upsert({
+"id": 1,   # primary key required
+"name": "Alice Partial",
+"age": 27,
+})
+await partial_writer.flush()
+
+# By column indices
+partial_writer_idx = table.new_upsert(column_indices=[0, 1, 3])

Review Comment:
   `partial_writer_idx = table.new_upsert(column_indices=[0, 1, 3])` uses an 
out-of-range index (this example schema only has 3 columns: indices 0..2). 
Update the indices to match the schema.
   ```suggestion
   partial_writer_idx = table.new_upsert(column_indices=[0, 1, 2])
   ```
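(Editorial note on the semantics documented here: a partial upsert writes only the listed columns and preserves the rest of the stored row. A plain-Python model of that merge, using illustrative names rather than the Fluss API:)

```python
def partial_update(existing, update, columns):
    merged = dict(existing)           # start from the stored row
    for col in columns:
        merged[col] = update[col]     # overwrite only the listed columns
    return merged

row = {"id": 1, "name": "Alice", "age": 25}
out = partial_update(row, {"id": 1, "name": "Alice Partial"}, ["id", "name"])
print(out)  # {'id': 1, 'name': 'Alice Partial', 'age': 25}
```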



##
website/package.json:
##
@@ -0,0 +1,43 @@
+{
+  "name": "fluss-clients-website",
+  "version": "0.0.0",
+  "private": true,
+  "scripts": {
+"docusaurus": "docusaurus",
+"start": "docusaurus start",

Review Comment:
   This repo enforces ASF license headers via CI (SkyWalking Eyes). 
`package.json` is JSON (no comments), so it can’t carry a header inline; as-is 
it’s likely to fail the header check. Consider adding an appropriate 
`paths-ignore` entry for this file in `.licenserc.yaml`, or switching the 
tooling/config to exclude `*.json` files.



##
website/docs/user-guide/python/log-tables.md:
##
@@ -0,0 +1,125 @@
+---
+sidebar_position: 5
+---
+# Log Tables
+
+Log tables are append-only tables without primary keys, suitable for event 
streaming.
+
+## Creating a Log Table
+
+```python
+import pyarrow as pa
+
+schema = pa.schema([
+pa.field("event_id", pa.int32()),
+pa.field("event_type", pa.string()),
+pa.field("timestamp", pa.int64()),
+])
+
+fluss_schema = fluss.Schema(schema)
+table_descriptor = fluss.TableDescriptor(fluss_schema)
+
+table_path = fluss.TablePath("fluss", "events")
+await admin.create_table(table_path, table_descriptor, ignore_if_exists=True)
+```
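(Editorial sketch: a log table is append-only — each appended record receives the next offset and nothing is updated in place. A toy stdlib model of that contract, not the Fluss API:)

```python
log = []

def toy_append(record):
    log.append(record)
    return len(log) - 1  # offset assigned to this record

assert toy_append({"event_id": 1, "event_type": "user_login"}) == 0
assert toy_append({"event_id": 2, "event_type": "page_view"}) == 1
print(len(log))  # 2
```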
+
+## Writing to Log Tables
+
+```python
+table = await conn.get_table(table_path)
+append_writer = await table.new_append_writer()
+```
+
+**Write a PyArrow Table:**
+
+```python
+pa_table = pa.Table.from_arrays(
+[
+pa.array([1, 2, 3], type=pa.int32()),
+pa.array(["user_login", "page_view", "checkout"], type=pa.string()),
+pa.array([1704067200000, 1704067201000, 1704067202000], type=pa.int64()),
+],
+schema=schema,
+)
+append_writer.write_arrow(pa_table)
+```
+
+**Write a PyArrow RecordBatch:**
+
+```python
+batch = pa.RecordBatch.from_arrays(
+[
+pa.array([4, 5], type=pa.int32()),
+pa.array(["signup", "logout"], type=pa.string()),
+pa.array([1704067203000, 1704067204000], type=pa

Re: [PR] Client doc website [fluss-rust]

2026-02-10 Thread via GitHub


leekeiabstraction commented on PR #300:
URL: https://github.com/apache/fluss-rust/pull/300#issuecomment-3881098505

   @luoyuxia This is a first draft, can you check if this structuring is OK 
before I move on to polishing the content?

