gnuhpc opened a new issue, #2346:
URL: https://github.com/apache/fluss/issues/2346
## Bug Description
An `IndexOutOfBoundsException` occurs in `ArrowVarCharWriter.doWrite()` when
writing VARCHAR fields to the Arrow WAL under moderate concurrency (8 threads).
This bug causes **silent write failures** on the tablet server side, leading to
potential data loss in production environments.
### Stack Trace
```
ERROR org.apache.fluss.server.replica.ReplicaManager - Error put records to local kv on replica TableBucket{tableId=18, bucket=0}
java.lang.IndexOutOfBoundsException: null
    at java.nio.ByteBuffer.wrap(Unknown Source)
    at org.apache.fluss.memory.MemorySegment.wrapInternal(MemorySegment.java:258)
    at org.apache.fluss.memory.MemorySegment.wrap(MemorySegment.java:252)
    at org.apache.fluss.row.BinarySection.wrapByteBuffer(BinarySection.java:112)
    at org.apache.fluss.row.arrow.writers.ArrowVarCharWriter.doWrite(ArrowVarCharWriter.java:37)
    at org.apache.fluss.row.arrow.writers.ArrowFieldWriter.write(ArrowFieldWriter.java:59)
    at org.apache.fluss.row.arrow.ArrowWriter.writeRow(ArrowWriter.java:206)
    at org.apache.fluss.record.MemoryLogRecordsArrowBuilder.append(MemoryLogRecordsArrowBuilder.java:183)
    at org.apache.fluss.server.kv.wal.ArrowWalBuilder.append(ArrowWalBuilder.java:48)
    at org.apache.fluss.server.kv.KvTablet.applyUpdate(KvTablet.java:519)
```
---
## Reproduction
### Environment
- **Fluss Version**: 0.9-SNAPSHOT (latest main branch as of 2026-01-11)
- **Cluster**: 1 coordinator + 1 tablet server (Docker)
- **JDK**: OpenJDK 11+
### Reproduction Program
I created a standalone Java program that reliably reproduces this bug. See
the attached `ArrowWalRepro.java` file.
**Key characteristics:**
- Creates a table with `INT` primary key + `VARCHAR` payload column
- Spawns multiple concurrent writer threads
- Each thread performs UPSERT operations with VARCHAR data
- Triggers buffer reuse/memory segment pooling scenarios
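The attached program is not reproduced inline, so the following is only a rough sketch of the general shape of such a harness, not the contents of `ArrowWalRepro.java`. The `upsert` callback is a hypothetical stand-in for whatever Fluss client call performs the write; no Fluss API is used or implied here.

```java
import java.util.Arrays;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.BiConsumer;

class ConcurrentWriterSketch {
    /**
     * Runs `threads` writers, each issuing `recordsPerThread` upserts with a
     * fixed-size string payload. Returns how many calls threw on the client side.
     * `upsert` is a placeholder for the real client write (an assumption).
     */
    static long run(int threads, int recordsPerThread, int payloadSize,
                    BiConsumer<Integer, String> upsert) {
        char[] chars = new char[payloadSize];
        Arrays.fill(chars, 'x');
        String payload = new String(chars);

        ExecutorService pool = Executors.newFixedThreadPool(threads);
        CountDownLatch done = new CountDownLatch(threads);
        AtomicLong failures = new AtomicLong();
        for (int t = 0; t < threads; t++) {
            final int firstKey = t * recordsPerThread; // disjoint key range per thread
            pool.execute(() -> {
                try {
                    for (int i = 0; i < recordsPerThread; i++) {
                        try {
                            upsert.accept(firstKey + i, payload);
                        } catch (RuntimeException e) {
                            failures.incrementAndGet();
                        }
                    }
                } finally {
                    done.countDown();
                }
            });
        }
        try {
            done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for writers", e);
        } finally {
            pool.shutdown();
        }
        return failures.get();
    }

    public static void main(String[] args) {
        AtomicLong calls = new AtomicLong();
        // No-op writer that only counts calls; swap in a real client write to exercise a cluster.
        long failures = run(8, 20_000, 256, (key, value) -> calls.incrementAndGet());
        System.out.println(calls.get() + " upserts, " + failures + " client-side failures");
    }
}
```

Since the errors reported here surface only in the tablet server log, a client-side failure counter like this one would probably stay at zero even when the bug fires; the server log remains the place to check.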
### Reproduction Steps
1. **Compile the reproduction program:**
```bash
# Assuming Fluss source is built at /path/to/fluss
javac -cp "$(find /path/to/fluss -name '*.jar' | tr '\n' ':')" \
  ArrowWalRepro.java
```
2. **Run the reproduction:**
```bash
java -cp "$(find /path/to/fluss -name '*.jar' | tr '\n' ':'):." \
ArrowWalRepro \
--bootstrap localhost:9123 \
--threads 8 \
--records 20000 \
--payload-size 256 \
--buffer-memory 8mb \
--buffer-page 64kb
```
3. **Check tablet server logs:**
```bash
# You should see IndexOutOfBoundsException errors
grep -A 20 "IndexOutOfBoundsException" <tablet-server-log-file>
```
### Reproduction Results
With the configuration above:
- **Bug triggered**: 2 times during ~1 minute run
- **Total writes**: 160,000 records (8 threads × 20,000 records)
- **Failure rate**: ~0.00125% (2 failures / 160,000 writes)
- **Error timestamps**:
- 2026-01-11 07:07:10,270
- 2026-01-11 07:07:53,803 (43 seconds apart)
---
## Root Cause Analysis
### The Bug Mechanism
The issue lies in an **offset semantic mismatch** between `BinarySection`
and `ByteBuffer.wrap()`:
1. **`BinarySection`** stores:
- `heapMemory`: byte array reference (may be a subsection of a larger
buffer)
- `offset`: **logical offset relative to the original large buffer**
2. **`ArrowVarCharWriter.doWrite():37`** calls:
```java
ByteBuffer buffer = binarySection.wrapByteBuffer();
```
3. **`BinarySection.wrapByteBuffer():112`** does:
```java
return MemorySegment.wrap(heapMemory, offset, length);
```
4. **`MemorySegment.wrap():252`** eventually calls:
```java
ByteBuffer.wrap(heapMemory, offset, length)
```
5. **Bug trigger**: When `heapMemory` is already a subsection:
- `heapMemory.length` = small (e.g., 1024 bytes)
- `offset` = large (e.g., 5000 - logical offset from original buffer)
- **Condition**: `offset > heapMemory.length` (more generally, `offset + length > heapMemory.length`)
- **Result**: `IndexOutOfBoundsException` in `ByteBuffer.wrap()`
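The failing call in step 5 can be shown with the JDK alone. This is a minimal sketch that uses the illustrative sizes above (a 1024-byte array and an offset of 5000); `WrapOffsetDemo` and `wrapFails` are names made up for the example and involve no Fluss classes.

```java
import java.nio.ByteBuffer;

class WrapOffsetDemo {
    /** Returns true if ByteBuffer.wrap rejects this (offset, length) pair for the array. */
    static boolean wrapFails(byte[] array, int offset, int length) {
        try {
            ByteBuffer.wrap(array, offset, length);
            return false;
        } catch (IndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        byte[] subsection = new byte[1024]; // stands in for a small heapMemory

        // Offset still expressed against a larger original buffer: rejected.
        System.out.println(wrapFails(subsection, 5000, 256));

        // Offset expressed against the array itself: accepted.
        System.out.println(wrapFails(subsection, 0, 256));

        // Offset in range, but offset + length runs past the end: also rejected.
        System.out.println(wrapFails(subsection, 900, 256));
    }
}
```

The third case is why the check has to consider `offset + length` and not only `offset`.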
### Why It's Intermittent
The bug depends on memory segment reuse patterns:
- Under high concurrency, the memory pool allocates/reuses segments
frequently
- When a `BinarySection` is created from a reused/subsectioned memory
segment, the offset may exceed the subsection's bounds
- This is a **timing-dependent** bug that manifests under buffer pressure
---
## Impact Assessment
| Impact Factor | Severity | Details |
|---------------|----------|---------|
| **Data Loss** | 🔴 **HIGH** | Writes fail silently on tablet server; clients may receive false success |
| **Consistency** | 🔴 **HIGH** | Failed writes not visible to client; no retry mechanism |
| **Frequency** | 🟡 **MEDIUM** | Reproducible with moderate concurrency (8 threads) |
| **Detectability** | 🔴 **LOW** | Only visible in tablet server ERROR logs; no client-side indication |
### Production Risk
- ❌ **Blocks high-concurrency workloads** (e.g., YCSB benchmarks, Redis-like
use cases)
- ❌ **Silent data loss**: Clients believe writes succeeded, but data not
persisted
- ❌ **No automatic recovery**: Failed writes require application-level retry
- ⚠️ **Hard to diagnose**: Appears as random data loss without obvious
pattern
---
## Proposed Solution
### Short-Term Fix (Immediate)
Change `ArrowVarCharWriter.doWrite():37` from:
```java
ByteBuffer buffer = binarySection.wrapByteBuffer(); // BROKEN: offset mismatch
```
To:
```java
byte[] bytes = binarySection.toBytes(); // SAFE: creates new copy with correct bounds
ByteBuffer buffer = ByteBuffer.wrap(bytes);
```
**Trade-off**: Introduces memory copy overhead, but guarantees correctness.
### Long-Term Fix (Architectural)
Redesign offset semantics in `BinarySection` / `MemorySegment`:
1. Ensure `heapMemory` always represents the **exact byte range** [offset,
offset+length)
2. Or: Store both "absolute offset" and "relative offset" explicitly
3. Or: Change `wrapByteBuffer()` to adjust offset to be relative to
`heapMemory` base
This requires careful analysis of memory management architecture.
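As a rough illustration of option 3, the sketch below translates a logical offset into one relative to the backing array before wrapping. `RelativeSection` is a hypothetical stand-in, not the Fluss `BinarySection`, and the `baseOffset` field (where the array starts within the original buffer) is an assumption; whether Fluss tracks or could track such a value is part of the analysis mentioned above.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Hypothetical stand-in for a section over a sub-array; not the Fluss class. */
final class RelativeSection {
    private final byte[] heapMemory;  // backing array, possibly a slice of a larger buffer
    private final int baseOffset;     // assumed: position of heapMemory[0] in the original buffer
    private final int logicalOffset;  // offset expressed against the original buffer
    private final int length;

    RelativeSection(byte[] heapMemory, int baseOffset, int logicalOffset, int length) {
        this.heapMemory = heapMemory;
        this.baseOffset = baseOffset;
        this.logicalOffset = logicalOffset;
        this.length = length;
    }

    /** Wraps the section after converting the logical offset to an array-relative one. */
    ByteBuffer wrapByteBuffer() {
        int relative = logicalOffset - baseOffset;
        if (relative < 0 || length < 0 || relative > heapMemory.length - length) {
            throw new IllegalStateException(
                "section [" + relative + ", " + (relative + length)
                    + ") is outside backing array of length " + heapMemory.length);
        }
        return ByteBuffer.wrap(heapMemory, relative, length);
    }

    /** Places "hello" at logical offset 5000 of a slice that starts at 4096, then reads it back. */
    static String demo() {
        byte[] slice = new byte[1024];
        byte[] text = "hello".getBytes(StandardCharsets.UTF_8);
        System.arraycopy(text, 0, slice, 5000 - 4096, text.length);
        RelativeSection section = new RelativeSection(slice, 4096, 5000, text.length);
        return StandardCharsets.UTF_8.decode(section.wrapByteBuffer()).toString();
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The explicit bounds check turns a bare `IndexOutOfBoundsException: null` into a message that names the offending range, which would also help with diagnosis if the mismatch can arise elsewhere.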
---
## Verification Checklist
After applying a fix, verify with:
1. ✅ Run reproduction program with same parameters → **0 exceptions**
2. ✅ Increase concurrency (16-32 threads) → **0 exceptions**
3. ✅ Extended test (1M+ records) → **0 exceptions**
4. ✅ YCSB benchmark (real-world workload) → **0 exceptions**
5. ✅ Check performance impact (memory copies if short-term fix used)
---
## Attached Files
1. **`ArrowWalRepro.java`**: Complete standalone reproduction program
2. **`tablet-server-error-logs.txt`**: Full stack traces from tablet server
---
## Additional Context
- Discovered during Fluss 0.9 compatibility testing for CAPE
(Cache-Augmented Persistent Engine) integration
- Bug was first observed in YCSB Redis benchmark workloads with high write
concurrency
- This standalone reproduction was created to isolate the issue from the
larger testing framework
---
## Request for Maintainers
This is a **critical data integrity bug** that should be addressed before
the 0.9 release. We are happy to:
- Provide additional debugging information if needed
- Test proposed fixes with our reproduction program
- Contribute a PR if guidance on preferred approach is provided
Thank you for your attention to this issue!