fresh-borzoni opened a new pull request, #430:
URL: https://github.com/apache/fluss-rust/pull/430
## Summary
Closes #429
- Pre-size Arrow builders to `DEFAULT_MAX_RECORD` (256) capacity to
eliminate reallocations
- Document jemalloc as recommended allocator for Linux deployments
- Add jemalloc setup to examples
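
To illustrate why pre-sizing helps, here is a minimal sketch that uses a plain `Vec<i64>` as a stand-in for an Arrow builder's backing buffer. It is not the code from this PR. The helper `count_regrowths` is hypothetical, and the constant only mirrors the `DEFAULT_MAX_RECORD` value of 256 mentioned above. Builders in arrow-rs offer analogous `with_capacity` constructors, but the sketch sticks to the standard library so it runs without extra dependencies.

```rust
/// Mirrors the value named in the summary; redefined here for illustration only.
const DEFAULT_MAX_RECORD: usize = 256;

/// Hypothetical helper: pushes `n` values into `buf` and returns how many
/// times the capacity changed, used as a proxy for reallocations.
fn count_regrowths(mut buf: Vec<i64>, n: usize) -> usize {
    let mut regrowths = 0;
    let mut last_cap = buf.capacity();
    for i in 0..n {
        buf.push(i as i64);
        if buf.capacity() != last_cap {
            regrowths += 1;
            last_cap = buf.capacity();
        }
    }
    regrowths
}

fn main() {
    // Default construction starts empty and grows as values arrive.
    let grown = count_regrowths(Vec::new(), DEFAULT_MAX_RECORD);
    // Pre-sizing reserves room for a full batch up front.
    let presized = count_regrowths(
        Vec::with_capacity(DEFAULT_MAX_RECORD),
        DEFAULT_MAX_RECORD,
    );
    println!("default: {grown} regrowths, pre-sized: {presized} regrowths");
}
```

The pre-sized buffer never regrows while filling one batch, whereas the default one regrows several times. The exact count depends on the standard library's growth policy.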
## Benchmark results (Linux, glibc, 8 threads, 100K cycles, 13-column schema)
| Variant | Throughput | RSS delta |
|---------|-----------|-----------|
| Default capacity raw Arrow | baseline | baseline |
| Pre-sized raw Arrow | **+30%** | **-15%** |
| Pre-sized fluss builder | **+4%** | **-12%** |
| jemalloc + pre-sized | **+45%** vs default | lowest RSS |
I will address these findings during benchmarks in
## Follow-ups
- Switch `DEFAULT_MAX_RECORD` hard cap to byte-size-based `is_full()`
(like Java's `ArrowWriter`)
- Replace `FieldGetter`/`Datum` dispatch with typed column writers to
close the ~2.5x gap vs raw Arrow builders
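
As a rough sketch of the first follow-up, a byte-size-based `is_full()` could look like the following. Everything here is an assumption for illustration: `ByteBoundedBatch`, `try_append`, and the per-row byte estimate are hypothetical names, not the fluss-rust API, and the sketch does not claim to match how Java's `ArrowWriter` computes its size.

```rust
/// Hypothetical batch that fills up by estimated bytes rather than by a
/// fixed record count.
struct ByteBoundedBatch {
    max_bytes: usize,
    estimated_bytes: usize,
    record_count: usize,
}

impl ByteBoundedBatch {
    fn new(max_bytes: usize) -> Self {
        Self { max_bytes, estimated_bytes: 0, record_count: 0 }
    }

    /// The batch is full once the running byte estimate reaches the budget.
    fn is_full(&self) -> bool {
        self.estimated_bytes >= self.max_bytes
    }

    /// Appends a row of `row_bytes` estimated size unless the batch is
    /// already full; returns whether the row was accepted.
    fn try_append(&mut self, row_bytes: usize) -> bool {
        if self.is_full() {
            return false;
        }
        self.estimated_bytes += row_bytes;
        self.record_count += 1;
        true
    }
}

fn main() {
    // 1 KiB budget with rows estimated at 100 bytes each.
    let mut batch = ByteBoundedBatch::new(1024);
    while batch.try_append(100) {}
    println!(
        "{} records, {} estimated bytes",
        batch.record_count, batch.estimated_bytes
    );
}
```

With this shape, the number of records per batch adapts to row width: narrow rows pack more records into a batch and wide rows fewer, which a fixed record cap cannot do.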