binary-signal commented on code in PR #59:
URL: https://github.com/apache/fluss-rust/pull/59#discussion_r2557915600
##########
Cargo.toml:
##########
@@ -34,5 +34,5 @@ members = ["crates/fluss", "crates/examples",
"bindings/python"]
fluss = { version = "0.1.0", path = "./crates/fluss" }
tokio = { version = "1.44.2", features = ["full"] }
clap = { version = "4.5.37", features = ["derive"] }
-arrow = "57.0.0"
+arrow = { version = "57.0.0", features = ["ipc_compression"] }
Review Comment:
@luoyuxia
1 → Yes, Arrow automatically decompresses data when reading into a
`RecordBatch`.
From the Arrow IPC docs: https://arrow.apache.org/rust/arrow_ipc/index.html
```
The Arrow IPC format defines how to read/write RecordBatches to/from files
or byte streams. It handles serialization and deserialization.
```
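For example, here's a minimal reader-side sketch (assuming an IPC stream that was written with LZ4/ZSTD compression, and `arrow` built with the `ipc_compression` feature): there is no explicit decompression step.

```rust
use std::fs::File;

use arrow::ipc::reader::StreamReader;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Hypothetical path to an Arrow IPC stream written with ZSTD compression.
    let file = File::open("compressed_batches.arrows")?;

    // With the `ipc_compression` feature enabled, the reader detects the codec
    // from the IPC message header and decompresses buffers transparently.
    let reader = StreamReader::try_new(file, None)?;
    for batch in reader {
        let batch = batch?; // already a fully decompressed RecordBatch
        println!("rows: {}, cols: {}", batch.num_rows(), batch.num_columns());
    }
    Ok(())
}
```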
2 → Arrow IPC supports **LZ4** and **ZSTD**.
The Fluss docs list the same two codecs:
`table.log.arrow.compression.type` can be `NONE`, `LZ4_FRAME`, or `ZSTD`
(default: `ZSTD`).
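For context, here's a writer-side sketch of how those codecs map onto the arrow-rs IPC write options (the schema and data are placeholders, and `ipc_compression` must be enabled):

```rust
use std::sync::Arc;

use arrow::array::{ArrayRef, Int64Array};
use arrow::datatypes::{DataType, Field, Schema};
use arrow::ipc::writer::{IpcWriteOptions, StreamWriter};
use arrow::ipc::CompressionType;
use arrow::record_batch::RecordBatch;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Placeholder schema and batch, just to have something to serialize.
    let schema = Arc::new(Schema::new(vec![Field::new("id", DataType::Int64, false)]));
    let batch = RecordBatch::try_new(
        schema.clone(),
        vec![Arc::new(Int64Array::from(vec![1_i64, 2, 3])) as ArrayRef],
    )?;

    // ZSTD mirrors Fluss's default `table.log.arrow.compression.type`;
    // `CompressionType::LZ4_FRAME` would correspond to the LZ4_FRAME option.
    let options = IpcWriteOptions::default()
        .try_with_compression(Some(CompressionType::ZSTD))?;

    let mut buf = Vec::new();
    {
        let mut writer = StreamWriter::try_new_with_options(&mut buf, &schema, options)?;
        writer.write(&batch)?;
        writer.finish()?;
    }
    println!("compressed stream: {} bytes", buf.len());
    Ok(())
}
```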
Decompression happens at a layer below the Fluss logical types and is transparent for the most part, so it covers every type Arrow supports. For non-standard types such as **ltz** (I assume you mean `TIMESTAMP_LTZ`), the mapping has to be handled in the Rust binding code, because you need to parse the metadata stored in the Arrow timestamp type, which carries the precision and the time zone.
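To illustrate what parsing that metadata could look like, here's a rough sketch (the `LogicalTimestamp` enum and `from_arrow_field` helper are hypothetical, not the actual fluss-rust API):

```rust
use arrow::datatypes::{DataType, Field, TimeUnit};

/// Hypothetical Fluss-style logical timestamp; names are illustrative only.
#[derive(Debug)]
enum LogicalTimestamp {
    /// TIMESTAMP (no time zone), with sub-second precision in decimal digits.
    Ntz { precision: u8 },
    /// TIMESTAMP_LTZ (local time zone / UTC-adjusted), with precision.
    Ltz { precision: u8 },
}

fn from_arrow_field(field: &Field) -> Option<LogicalTimestamp> {
    match field.data_type() {
        // The Arrow type carries the precision (as a TimeUnit) and an optional
        // time zone string; the presence of a zone (e.g. "+00:00"/"UTC") is
        // what distinguishes an LTZ-style timestamp from a plain one.
        DataType::Timestamp(unit, tz) => {
            let precision = match unit {
                TimeUnit::Second => 0,
                TimeUnit::Millisecond => 3,
                TimeUnit::Microsecond => 6,
                TimeUnit::Nanosecond => 9,
            };
            Some(match tz {
                Some(_) => LogicalTimestamp::Ltz { precision },
                None => LogicalTimestamp::Ntz { precision },
            })
        }
        _ => None,
    }
}
```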
I already have another PR (not yet submitted) adding experimental support for `ltz` timestamps. But I also found PR #53, which implements *all* types, including the ltz timestamp. Since that PR is more holistic than what I was planning to submit, it's worth reviewing and merging it instead to add support for ltz.