lhotari opened a new pull request, #26016:
URL: https://github.com/apache/pulsar/pull/26016

   Fixes #24148
   
   Related: #22736
   
   ### Motivation
   
   Forced topic/namespace deletion can deadlock with an in-flight compaction, 
as analyzed in #24148: 
`PersistentTopic.asyncDeleteCursorWithCleanCompactionLedger()` waits for 
`currentCompaction` while the compactor cannot make progress on the topic being 
deleted.
   
   PR #24366 addressed the infinite wait: the compactor's `RawReader` is 
created with `retryOnRecoverableErrors=false`, so when the deletion fences the 
topic, the reader's reconnect fails with `ServiceNotReady` 
(`TopicFencedException` maps to it), the reader closes, and `currentCompaction` 
completes exceptionally. However, the deletion still fails every time after 
that:
   
   - `asyncDeleteCursorWithCleanCompactionLedger()` chains the compaction 
cursor deletion with `currentCompaction.thenCompose(...)`. When 
`currentCompaction` completed exceptionally, the cursor deletion is skipped and 
the unsubscribe fails with the compaction's exception instead.
   - The failed `currentCompaction` future is never reset (`triggerCompaction` 
is blocked while the topic is deleting), so **every subsequent deletion attempt 
fails immediately** until the topic instance is reloaded. 
`deleteNamespaceWithRetry` in tests retries into this poisoned state until it 
times out, producing the flaky-test failures tracked in #22736.
   
   #24366 effectively turned the deadlock into a permanent failure loop. Its 
regression test did not catch this because 
`testConcurrentCompactionAndTopicDelete` only asserts 
`deleteTopicFuture.isDone()`, which is also true when the deletion fails.
   
   ### Modifications
   
   - `PersistentTopic.asyncDeleteCursorWithCleanCompactionLedger()`: proceed 
with the compaction cursor deletion once `currentCompaction` has finished, 
regardless of whether it completed normally or exceptionally. Only an in-flight 
compaction is awaited. A compaction terminated by the deletion's fencing (per 
#24366) no longer fails the deletion.
   - Strengthened `testConcurrentCompactionAndTopicDelete` to assert that the 
topic deletion succeeded (`deleteTopicFuture.get(...)`) instead of only 
asserting completion. Without the fix, this now fails with the terminated 
compaction's `CancellationException`.
   - Added `CompactionTest.testForcedDeleteSucceedsAfterFailedCompaction`: 
deterministically fails a compaction (injection after the phase-two seek), then 
verifies a forced topic deletion succeeds. Without the fix, the deletion fails 
with the previous compaction's exception.
   - Added `CompactionTest.testForcedNamespaceDeleteWithInflightCompaction`: 
end-to-end reproduction of #24148 — forced namespace deletion while a 
compaction is in-flight (kept in-flight by 2000 one-message-at-a-time reads via 
`dispatcherMaxReadBatchSize=1`). Without the fix, it fails with the 
`ConditionTimeoutException` in `deleteNamespaceWithRetry` reported in #22736. 
Enabled `forceDeleteNamespaceAllowed` in the test config for this.
   
   ### Verifying this change
   
   - [x] Make sure that the change passes the CI checks.
   
   This change added tests and can be verified as follows:
   
   - All three tests above were verified red without the fix and green with the 
fix.
   - The full `CompactionTest` class passes locally.
   
   ### Does this pull request potentially affect one of the following parts:
   
   *If the box was checked, please highlight the changes*
   
   - [ ] Dependencies (add or upgrade a dependency)
   - [ ] The public API
   - [ ] The schema
   - [ ] The default values of configurations
   - [ ] The threading model
   - [ ] The binary protocol
   - [ ] The REST endpoints
   - [ ] The admin CLI options
   - [ ] The metrics
   - [ ] Anything that affects deployment
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to