fresh-borzoni commented on code in PR #404:
URL: https://github.com/apache/fluss-rust/pull/404#discussion_r2898653814
##########
crates/fluss/src/client/write/accumulator.rs:
##########
@@ -15,21 +15,142 @@
// specific language governing permissions and limitations
// under the License.
+use crate::client::broadcast;
+use crate::client::write::IdempotenceManager;
use crate::client::write::batch::WriteBatch::{ArrowLog, Kv};
use crate::client::write::batch::{ArrowLogWriteBatch, KvWriteBatch, WriteBatch};
use crate::client::{LogWriteRecord, Record, ResultHandle, WriteRecord};
use crate::cluster::{BucketLocation, Cluster, ServerNode};
use crate::config::Config;
-use crate::error::Result;
+use crate::error::{Error, Result};
use crate::metadata::{PhysicalTablePath, TableBucket};
+use crate::record::{NO_BATCH_SEQUENCE, NO_WRITER_ID};
use crate::util::current_time_ms;
use crate::{BucketId, PartitionId, TableId};
use dashmap::DashMap;
-use parking_lot::Mutex;
-use parking_lot::RwLock;
+use parking_lot::{Condvar, Mutex, RwLock};
use std::collections::{HashMap, HashSet, VecDeque};
use std::sync::Arc;
-use std::sync::atomic::{AtomicI32, AtomicI64, Ordering};
+use std::sync::atomic::{AtomicBool, AtomicI32, AtomicI64, AtomicUsize, Ordering};
+use std::time::{Duration, Instant};
+
+/// Byte-counting semaphore that blocks producers when total buffered memory
+/// exceeds the configured limit. Matches Java's `LazyMemorySegmentPool` behavior.
+///
+/// TODO: Replace `notify_all()` with per-waiter FIFO signaling (Java uses per-request
+/// Condition objects in a Deque) to avoid thundering herd under high contention.
+///
+/// TODO: Track actual batch memory usage instead of reserving a fixed `writer_batch_size`
+/// per batch. This over-counts when batches don't fill completely, reducing effective
+/// throughput. Requires tighter coupling with batch internals.
+pub(crate) struct MemoryLimiter {
+    state: Mutex<usize>,
+    cond: Condvar,
+    max_memory: usize,
+    wait_timeout: Duration,
+    closed: AtomicBool,
+    waiting_count: AtomicUsize,
+}
+
+impl MemoryLimiter {
+    pub fn new(max_memory: usize, wait_timeout: Duration) -> Self {
+        Self {
+            state: Mutex::new(0),
+            cond: Condvar::new(),
+            max_memory,
+            wait_timeout,
+            closed: AtomicBool::new(false),
+            waiting_count: AtomicUsize::new(0),
+        }
+    }
+
+    /// Try to acquire `size` bytes. Blocks until memory is available,
+    /// the timeout expires, or the limiter is closed.
+    /// Returns a `MemoryPermit` on success.
+    pub fn acquire(self: &Arc<Self>, size: usize) -> Result<MemoryPermit> {
Review Comment:
Yes, `MemoryLimiter` is a global pool shared across all buckets, the same as
Java's `LazyMemorySegmentPool`, so a hot bucket can exhaust the pool and block
writes to other buckets. However, when any producer is blocked on memory,
`ready()` marks all batches as immediately sendable, so partial batches are
drained and memory frees up faster.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]