[ 
https://issues.apache.org/jira/browse/PHOENIX-7846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tanuj Khurana resolved PHOENIX-7846.
------------------------------------
    Resolution: Fixed

> Bound rotation replay cost for large commit batches
> ---------------------------------------------------
>
>                 Key: PHOENIX-7846
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-7846
>             Project: Phoenix
>          Issue Type: Sub-task
>            Reporter: Tanuj Khurana
>            Assignee: Tanuj Khurana
>            Priority: Major
>
> Problem:
> ReplicationLog maintains a currentBatch which accumulates every successful 
> append and clears only on an explicit sync() call. On writer rotation 
> mid-batch, replayCurrentBatch() re-appends every record in the batch onto the 
> new writer. For workloads with many appends between explicit syncs, the 
> replay cost scales linearly with batch size. 
> There is a pre-existing implicit durability point: 
> LogFileFormatWriter.append() checks the in-memory block size after each 
> append and, when the block hits maxBlockSize (default 1 MB), triggers an 
> internal sync() that flushes the block to HDFS. Records up to that point are 
> durable. However, this information does not propagate back to 
> ReplicationLog.append(), so currentBatch keeps growing past these durability 
> points. 
> For example, with a 10k-record batch (1 KB records, 1 MB block size): blocks 
> fill every ~1,000 records, but currentBatch grows to 10,000. Rotation at 
> record 9,500 replays all 9,500 records, even though records 1–9,000 are 
> already durable in completed blocks on the old writer's file.
> Solution:
> Change LogFile.Writer.append() to return a boolean indicating whether a 
> block-full sync occurred. Propagate this signal through LogFileFormatWriter → 
> LogFileWriter → ReplicationLog.append(). When the signal is true, clear 
> currentBatch: all records up to this point are durable and do not need 
> replay.
> After this change, replay on rotation is proportional to the last partial 
> block (bounded by maxBlockSize), not the full inter-sync window. Using the 
> same example: rotation at record 9,500 replays only ~500 records instead of 
> 9,500.
> This does not change durability semantics; it only leverages an existing 
> durability point that was previously not propagated. 
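
The effect of the proposed change can be illustrated with a small, self-contained sketch. This is not the Phoenix implementation: the classes SketchBlockWriter, SketchReplicationLog and SketchDemo are hypothetical stand-ins, the "flush" is simulated by resetting a byte counter, and the record and block sizes are rounded to 1,000 bytes and 1,000,000 bytes so that the numbers match the approximate figures in the description.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the block writer; not the real LogFile.Writer.
class SketchBlockWriter {
    private final int maxBlockSize;
    private int blockBytes = 0;

    SketchBlockWriter(int maxBlockSize) {
        this.maxBlockSize = maxBlockSize;
    }

    // Returns true when this append filled the block and forced a flush.
    boolean append(byte[] record) {
        blockBytes += record.length;
        if (blockBytes >= maxBlockSize) {
            blockBytes = 0; // a real writer would flush the block to HDFS here
            return true;
        }
        return false;
    }

    void sync() {
        blockBytes = 0; // explicit sync also flushes the partial block
    }
}

// Hypothetical stand-in for ReplicationLog, reduced to batch bookkeeping.
class SketchReplicationLog {
    private final int maxBlockSize;
    private final boolean clearOnBlockSync; // false models the old behavior
    private final List<byte[]> currentBatch = new ArrayList<>();
    private SketchBlockWriter writer;

    SketchReplicationLog(int maxBlockSize, boolean clearOnBlockSync) {
        this.maxBlockSize = maxBlockSize;
        this.clearOnBlockSync = clearOnBlockSync;
        this.writer = new SketchBlockWriter(maxBlockSize);
    }

    void append(byte[] record) {
        boolean blockSynced = writer.append(record);
        currentBatch.add(record);
        if (blockSynced && clearOnBlockSync) {
            // Everything appended so far is durable, so nothing needs replay.
            currentBatch.clear();
        }
    }

    void sync() {
        writer.sync();
        currentBatch.clear();
    }

    // Rotates to a fresh writer and returns how many records were replayed.
    int rotate() {
        writer = new SketchBlockWriter(maxBlockSize);
        int replayed = 0;
        for (byte[] record : currentBatch) {
            writer.append(record);
            replayed++;
        }
        return replayed;
    }
}

class SketchDemo {
    // Appends `appends` records of 1,000 bytes, then rotates the writer.
    static int replayedAfter(int appends, boolean clearOnBlockSync) {
        SketchReplicationLog log = new SketchReplicationLog(1_000_000, clearOnBlockSync);
        for (int i = 0; i < appends; i++) {
            log.append(new byte[1_000]);
        }
        return log.rotate();
    }
}
```

Under these assumptions, SketchDemo.replayedAfter(9500, false) models the behavior before the change and replays the whole batch, while SketchDemo.replayedAfter(9500, true) replays only the records appended since the last full block.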



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
