[
https://issues.apache.org/jira/browse/RANGER-5476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18055626#comment-18055626
]
Vyom Mani Tiwari commented on RANGER-5476:
------------------------------------------
The {{PolicyRefresher.stopRefresher()}} method can deadlock because it
interrupts the refresher thread and then waits for it to complete using
{{join()}}. However, if the {{RangerRESTClient}} is currently in its retry
loop due to server errors (such as 503 Service Unavailable), it catches the
{{InterruptedException}} during its {{Thread.sleep()}} interval but fails to
propagate the interruption signal or break the retry logic. By swallowing this
exception, the client clears the thread's interrupted status and continues the
{{for}} loop, preventing the thread from ever terminating. This leaves the
calling thread blocked indefinitely on {{join()}}.
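
The failure mode described above can be reproduced in isolation. The following is a minimal sketch, not the actual Ranger code: the class {{SwallowedInterruptDemo}} and the method {{retryIgnoringInterrupt()}} are hypothetical names chosen for illustration. It relies only on the documented behavior that {{Thread.sleep()}} clears the interrupted status when it throws {{InterruptedException}}.

```java
// Hypothetical stand-in for a retry loop that swallows interrupts.
// None of these names come from RangerRESTClient.
class SwallowedInterruptDemo {

    // Runs up to maxAttempts "retries", sleeping between them.
    // The InterruptedException is caught and ignored, so an interrupt
    // neither stops the loop nor stays visible to the caller.
    static int retryIgnoringInterrupt(int maxAttempts, long sleepMillis) {
        int attempts = 0;
        for (int i = 0; i < maxAttempts; i++) {
            attempts++; // pretend the request failed with a server error
            try {
                Thread.sleep(sleepMillis);
            } catch (InterruptedException e) {
                // Swallowed: sleep() has already cleared the interrupted
                // status, and nothing restores it or exits the loop.
            }
        }
        return attempts;
    }

    public static void main(String[] args) {
        // Simulate a shutdown request arriving before the retry loop runs.
        Thread.currentThread().interrupt();
        int attempts = retryIgnoringInterrupt(3, 10L);
        System.out.println("attempts=" + attempts
                + " interrupted=" + Thread.currentThread().isInterrupted());
    }
}
```

In this sketch the loop runs all of its attempts even though the thread was interrupted, and the interrupted status is gone afterwards. With a long or unbounded retry sequence, a caller waiting in {{join()}} would keep waiting.
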
The fix involves updating the {{shouldRetry()}} method in {{RangerRESTClient}}
to properly handle the {{InterruptedException}} by restoring the thread's
interrupted status via {{Thread.currentThread().interrupt()}} and returning
{{false}}. This ensures that the retry loop is aborted immediately when a
shutdown is requested, allowing the thread to exit gracefully and unblocking
the {{stopRefresher()}} call without negatively impacting normal failover logic.
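
The pattern behind the fix can be sketched as follows. This is an illustration under stated assumptions, not the actual patch: the real signature of {{shouldRetry()}} is not shown in this thread, so {{sleepBeforeRetry()}}, {{runWithRetries()}}, and the class {{InterruptAwareRetry}} are hypothetical names.

```java
// Hypothetical sketch of an interrupt-aware retry loop.
// None of these names come from RangerRESTClient.
class InterruptAwareRetry {

    // Sleeps before the next retry. Returns false if the thread was
    // interrupted, after restoring the interrupted status so that
    // callers further up the stack can still observe it.
    static boolean sleepBeforeRetry(long millis) {
        try {
            Thread.sleep(millis);
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // restore the flag
            return false;                       // tell the caller to stop retrying
        }
    }

    // Simulates a request that always fails; returns the attempts made.
    static int runWithRetries(int maxAttempts, long sleepMillis) {
        int attempts = 0;
        for (int i = 0; i < maxAttempts; i++) {
            attempts++; // pretend the request failed with a server error
            if (!sleepBeforeRetry(sleepMillis)) {
                break; // shutdown requested: abort the retry loop
            }
        }
        return attempts;
    }

    public static void main(String[] args) throws InterruptedException {
        // A worker stuck in a long retry sequence, like a refresher thread.
        Thread worker = new Thread(() -> runWithRetries(1000, 1000L));
        worker.start();
        worker.interrupt();  // same order as described for stopRefresher()
        worker.join(5000L);  // bounded wait, only so the demo cannot hang
        System.out.println("worker alive after join: " + worker.isAlive());
    }
}
```

Restoring the flag rather than only returning {{false}} matters because code outside the retry loop may also check the interrupted status before deciding whether to continue.
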
> PolicyRefresher.stopRefresher() can deadlock when retrying HTTP request
> -----------------------------------------------------------------------
>
> Key: RANGER-5476
> URL: https://issues.apache.org/jira/browse/RANGER-5476
> Project: Ranger
> Issue Type: Bug
> Components: Ranger
> Affects Versions: 2.7.0
> Reporter: Naoki Takezoe
> Priority: Major
>
> PolicyRefresher.stopRefresher() can deadlock when it's called while
> RangerRESTClient is retrying after a server error, because it interrupts
> itself and waits for the completion of the thread:
> https://github.com/apache/ranger/blob/fe379d0a40aa4ae93c978a2c4d3a77fc9df2fbbb/agents-common/src/main/java/org/apache/ranger/plugin/util/PolicyRefresher.java#L167-L175
> But this interruption is caught and ignored in RangerRESTClient while it's
> retrying after a server error, so PolicyRefresher will never get control back.
> https://github.com/apache/ranger/blob/fe379d0a40aa4ae93c978a2c4d3a77fc9df2fbbb/agents-common/src/main/java/org/apache/ranger/plugin/util/RangerRESTClient.java#L665-L669
> It looks like RangerRESTClient shouldn't ignore InterruptedException, but I
> wonder whether changing that would affect existing use cases. Rather, it
> might be better to
> provide a way to stop RangerAdminClient safely and call it from
> PolicyRefresher.stopRefresher().
--
This message was sent by Atlassian Jira
(v8.20.10#820010)