[
https://issues.apache.org/jira/browse/PHOENIX-7846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tanuj Khurana resolved PHOENIX-7846.
------------------------------------
Resolution: Fixed
> Bound rotation replay cost for large commit batches
> ---------------------------------------------------
>
> Key: PHOENIX-7846
> URL: https://issues.apache.org/jira/browse/PHOENIX-7846
> Project: Phoenix
> Issue Type: Sub-task
> Reporter: Tanuj Khurana
> Assignee: Tanuj Khurana
> Priority: Major
>
> Problem:
> ReplicationLog maintains a currentBatch which accumulates every successful
> append and clears only on an explicit sync() call. On writer rotation
> mid-batch, replayCurrentBatch() re-appends every record in the batch onto the
> new writer. For workloads with many appends between explicit syncs, the
> replay cost scales linearly with batch size.
> There is a pre-existing implicit durability point:
> LogFileFormatWriter.append() checks the in-memory block size after each
> append and, when the block hits maxBlockSize (default 1 MB), triggers an
> internal sync() that flushes the block to HDFS. Records up to that point are
> durable. However, this information does not propagate back to
> ReplicationLog.append(), so currentBatch keeps growing past these durability
> points.
> For example, with a 10k-record batch (1 KB records, 1 MB block size): blocks
> fill every ~1000 records, but currentBatch grows to 10,000. Rotation at
> record 9,500 replays all 9,500 records, even though records 1–9,000 are
> already durable in nine completed blocks on the old writer's file.
> Solution:
> Change LogFile.Writer.append() to return a boolean indicating whether a
> block-full sync occurred. Propagate this signal through LogFileFormatWriter →
> LogFileWriter → ReplicationLog.append(). When the signal is true, clear
> currentBatch — all records up to this point are durable and do not need
> replay.
> After this change, replay on rotation is proportional to the last partial
> block (bounded by maxBlockSize), not the full inter-sync window. Using the
> same example: rotation at record 9,500 replays only ~500 records instead of
> 9,500.
> No change to durability semantics — this only leverages an existing
> durability point that was previously not propagated.
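
The effect described above can be illustrated with a minimal, self-contained sketch. It is not the Phoenix implementation: BlockWriter, Log, and replayCountAfter are hypothetical stand-ins for LogFileFormatWriter, ReplicationLog, and a test driver, and it assumes decimal units (1,000-byte records, 1,000,000-byte blocks) so that a block fills after exactly 1,000 records, as in the example.

```java
import java.util.ArrayList;
import java.util.List;

public class ReplaySketch {

    // Hypothetical stand-in for the block-based writer: accumulates bytes in
    // an in-memory block and reports whether an append filled the block.
    static final class BlockWriter {
        private final long maxBlockSize;
        private long blockBytes = 0;

        BlockWriter(long maxBlockSize) {
            this.maxBlockSize = maxBlockSize;
        }

        // Returns true if this append filled the block and triggered a sync.
        boolean append(byte[] record) {
            blockBytes += record.length;
            if (blockBytes >= maxBlockSize) {
                blockBytes = 0; // stands in for flushing the block to HDFS
                return true;
            }
            return false;
        }
    }

    // Hypothetical stand-in for the log: tracks records that would have to be
    // replayed onto a new writer if a rotation happened right now.
    static final class Log {
        private final BlockWriter writer;
        private final boolean useSignal;
        private final List<byte[]> currentBatch = new ArrayList<>();

        Log(BlockWriter writer, boolean useSignal) {
            this.writer = writer;
            this.useSignal = useSignal;
        }

        void append(byte[] record) {
            boolean blockSynced = writer.append(record);
            currentBatch.add(record);
            // With the signal propagated, everything appended so far,
            // including this record, is durable and needs no replay.
            if (useSignal && blockSynced) {
                currentBatch.clear();
            }
        }

        // Explicit sync: the batch is durable, so nothing is left to replay.
        void sync() {
            currentBatch.clear();
        }

        int pendingReplayCount() {
            return currentBatch.size();
        }
    }

    // Appends the given number of 1,000-byte records without an explicit
    // sync and returns how many records a rotation would replay.
    static int replayCountAfter(int appends, boolean useSignal) {
        Log log = new Log(new BlockWriter(1_000_000L), useSignal);
        byte[] record = new byte[1_000];
        for (int i = 0; i < appends; i++) {
            log.append(record);
        }
        return log.pendingReplayCount();
    }

    public static void main(String[] args) {
        System.out.println(replayCountAfter(9_500, false)); // prints 9500
        System.out.println(replayCountAfter(9_500, true));  // prints 500
    }
}
```

In this sketch the record that fills the block is added to the batch before the batch is cleared, so it is treated as durable along with the rest of the block. Whether the real code clears before or after tracking that record is an assumption here.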