[ 
https://issues.apache.org/jira/browse/PHOENIX-7945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ujjawal Kumar updated PHOENIX-7945:
-----------------------------------
    Fix Version/s: 5.3.2

>  Retain orphaned delete markers (without puts) during Phoenix compaction
> ------------------------------------------------------------------------
>
>                 Key: PHOENIX-7945
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-7945
>             Project: Phoenix
>          Issue Type: Improvement
>            Reporter: Ujjawal Kumar
>            Assignee: Ujjawal Kumar
>            Priority: Minor
>             Fix For: 5.3.2
>
>
> Orphaned delete markers (DeleteFamily markers without corresponding puts) are 
> dropped during major compaction before Phoenix CompactionScanner can process 
> them.
> h2. Issue
> HBase {{DropDeletesCompactionScanQueryMatcher.tryDropDelete()}} drops delete 
> markers whose timestamp < {{earliestPutTs}} when {{KeepDeletedCells=TTL}} and 
> {{timeToPurgeDeletes}} is 0. Since {{earliestPutTs}} is a *global minimum 
> across ALL HFiles* being compacted, a put for any other row, in any HFile, 
> can cause orphaned markers to be dropped before Phoenix CompactionScanner 
> ever sees them.
> h2. Fix
> Set {{timeToPurgeDeletes = Long.MAX_VALUE}} in 
> {{setScanOptionsForFlushesAndCompactions()}}. This short-circuits 
> {{tryDropDelete()}} so HBase never purges delete markers – Phoenix 
> CompactionScanner then applies its standard max-lookback logic.
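
The effect of the fix can be illustrated with a simplified model of the drop decision. This is a sketch based only on the description above, not the HBase source: the class and method names are hypothetical, and the guard on a positive {{timeToPurgeDeletes}} is an assumption about how the short-circuit works.

```java
/** Simplified, hypothetical model of the drop decision; not HBase code. */
final class DropDeleteModel {

    /**
     * Returns true if the modelled compaction would drop the delete marker
     * before Phoenix CompactionScanner gets to see it.
     */
    static boolean wouldDrop(long markerTs, long earliestPutTs,
                             long timeToPurgeDeletes, long now) {
        // Assumed guard: with a positive purge delay, markers younger than
        // the delay are always kept. Long.MAX_VALUE makes this always true.
        if (timeToPurgeDeletes > 0 && now - markerTs <= timeToPurgeDeletes) {
            return false;
        }
        // Otherwise a marker older than every put being compacted is dropped.
        return markerTs < earliestPutTs;
    }

    public static void main(String[] args) {
        long now = 1_000_000L;
        long markerTs = 100L;       // orphaned DeleteFamily marker
        long earliestPutTs = 500L;  // put for an unrelated row in another HFile

        // Before the fix (timeToPurgeDeletes = 0): the marker is dropped.
        System.out.println(wouldDrop(markerTs, earliestPutTs, 0L, now));
        // After the fix (Long.MAX_VALUE): the marker survives for Phoenix.
        System.out.println(wouldDrop(markerTs, earliestPutTs, Long.MAX_VALUE, now));
    }
}
```

In this model the marker's fate no longer depends on {{earliestPutTs}} once the purge delay is set to {{Long.MAX_VALUE}}, which is the behaviour the fix relies on.
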
> h2. Orphaned Delete Marker Lifecycle with This Fix
> Orphaned markers follow the same lifecycle as normal deleted rows:
> ||Marker age||Behavior||
> |Within max-lookback|Retained|
> |Outside max-lookback (but within TTL)|*Purged*|
> |Outside TTL|Purged|
> Users who need markers retained beyond max-lookback (for example, to cover 
> replication lag) can use the per-table max-lookback override to extend it 
> up to TTL.
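
The lifecycle table can be read as a single comparison against the effective max-lookback window. The sketch below is illustrative only: the class, enum, and parameter names are hypothetical, the inclusive boundary is an assumption, and capping the window at TTL is an interpretation of "up to TTL" rather than confirmed Phoenix behaviour.

```java
/** Hypothetical model of the lifecycle table; not Phoenix code. */
final class OrphanMarkerLifecycle {

    enum Behavior { RETAINED, PURGED }

    /**
     * Classifies an orphaned delete marker by its age.
     *
     * @param markerAgeMs   age of the marker at compaction time
     * @param maxLookbackMs max-lookback window (global or per-table override)
     * @param ttlMs         table TTL
     */
    static Behavior classify(long markerAgeMs, long maxLookbackMs, long ttlMs) {
        // Assumption: the lookback window cannot extend past TTL.
        long effectiveLookbackMs = Math.min(maxLookbackMs, ttlMs);
        // Within the window the marker is retained; outside it the marker is
        // purged, whether or not it is still within TTL.
        return markerAgeMs <= effectiveLookbackMs
                ? Behavior.RETAINED
                : Behavior.PURGED;
    }

    public static void main(String[] args) {
        long hour = 3_600_000L;
        long ttl = 24 * hour;

        System.out.println(classify(1 * hour, 2 * hour, ttl));   // within max-lookback
        System.out.println(classify(5 * hour, 2 * hour, ttl));   // outside max-lookback, within TTL
        System.out.println(classify(30 * hour, 2 * hour, ttl));  // outside TTL
        // Per-table override raised to TTL keeps the 5-hour-old marker.
        System.out.println(classify(5 * hour, ttl, ttl));
    }
}
```
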



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
