[ 
https://issues.apache.org/jira/browse/RANGER-5476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18055626#comment-18055626
 ] 

Vyom Mani Tiwari commented on RANGER-5476:
------------------------------------------

The {{PolicyRefresher.stopRefresher()}} method can deadlock because it 
interrupts the refresher thread and then waits for it to complete using 
{{join()}}. However, if the {{RangerRESTClient}} is currently in its retry 
loop due to server errors (such as 503 Service Unavailable), it catches the 
{{InterruptedException}} during its {{Thread.sleep()}} interval but fails to 
propagate the interruption signal or break the retry logic. By swallowing this 
exception, the client clears the thread's interrupted status and continues the 
{{for}} loop, preventing the thread from ever terminating. This leaves the 
calling thread blocked indefinitely on {{join()}}.
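
The hang described above can be reproduced in isolation. The following is a 
minimal sketch, not the actual Ranger code: the class and method names are 
hypothetical, the retry loop is simulated with plain sleeps, and the join uses 
a timeout so the demo itself terminates.

```java
final class SwallowedInterruptDemo {
    // Hypothetical stand-in for a retry loop that swallows InterruptedException.
    // Returns the thread's interrupted status once the loop has finished.
    static boolean interruptStatusAfterRetries(int maxAttempts, long sleepMs) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            try {
                Thread.sleep(sleepMs); // simulated wait between retries
            } catch (InterruptedException e) {
                // Swallowed: sleep() has already cleared the interrupted status,
                // so the loop simply carries on with the next attempt.
            }
        }
        return Thread.currentThread().isInterrupted();
    }

    // Mimics the interrupt-then-join pattern, with a bounded join.
    // Returns true if the worker is still running after the interrupt.
    static boolean workerStillAliveAfterInterrupt() {
        Thread worker = new Thread(() -> interruptStatusAfterRetries(20, 100));
        worker.setDaemon(true); // do not keep the JVM alive for the demo
        worker.start();
        try {
            Thread.sleep(50);   // give the worker time to enter its sleep
            worker.interrupt(); // request shutdown
            worker.join(300);   // an unbounded join() would wait for all retries
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return worker.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("worker alive after interrupt: "
                + workerStillAliveAfterInterrupt());
    }
}
```

In this sketch the interrupt only cuts the current sleep short. The loop keeps 
going, which is why a caller using an unbounded {{join()}} would stay blocked 
for as long as the retries continue.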

The fix involves updating the {{shouldRetry()}} method in {{RangerRESTClient}} 
to properly handle the {{InterruptedException}} by restoring the thread's 
interrupted status via {{Thread.currentThread().interrupt()}} and returning 
{{false}}. This ensures that the retry loop is aborted immediately when a 
shutdown is requested, allowing the thread to exit gracefully and unblocking 
the {{stopRefresher()}} call without negatively impacting normal failover logic.
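
As an illustration of the approach, here is a minimal sketch of an 
interrupt-aware retry check. It is not the actual patch: the class, the 
{{sleepBeforeRetry()}} helper and its signature are hypothetical, and the 
failing request is only simulated.

```java
final class InterruptAwareRetrySketch {
    // Hypothetical helper: waits before the next retry and reports whether
    // retrying should continue. On interruption it restores the interrupted
    // status and tells the caller to stop.
    static boolean sleepBeforeRetry(long sleepMs) {
        try {
            Thread.sleep(sleepMs);
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the signal for callers
            return false;                       // abort the retry loop
        }
    }

    // Simulated retry loop in which every request "fails" with a server error.
    // Returns the number of attempts made before the loop ended.
    static int attemptsBeforeStop(int maxAttempts, long sleepMs) {
        int attempts = 0;
        while (attempts < maxAttempts) {
            attempts++; // pretend the request failed
            if (!sleepBeforeRetry(sleepMs)) {
                break;  // interrupted: stop retrying
            }
        }
        return attempts;
    }

    // Interrupt-then-join against the interrupt-aware loop.
    // Returns true if the worker has exited by the time join() returns.
    static boolean workerExitsAfterInterrupt() {
        Thread worker = new Thread(() -> attemptsBeforeStop(20, 100));
        worker.setDaemon(true);
        worker.start();
        try {
            Thread.sleep(50);
            worker.interrupt();
            worker.join(1000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return !worker.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("worker exited after interrupt: "
                + workerExitsAfterInterrupt());
    }
}
```

When no interrupt arrives, {{sleepBeforeRetry()}} returns {{true}} after every 
wait and the loop runs through all of its attempts, so in this sketch the 
ordinary retry path is unchanged.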

> PolicyRefresher.stopRefresher() can deadlock when retrying HTTP request
> -----------------------------------------------------------------------
>
>                 Key: RANGER-5476
>                 URL: https://issues.apache.org/jira/browse/RANGER-5476
>             Project: Ranger
>          Issue Type: Bug
>          Components: Ranger
>    Affects Versions: 2.7.0
>            Reporter: Naoki Takezoe
>            Priority: Major
>
> PolicyRefresher.stopRefresher() can deadlock when it's called while 
> RangerRESTClient is retrying after a server error, because it interrupts 
> itself and waits for the thread to complete:
> https://github.com/apache/ranger/blob/fe379d0a40aa4ae93c978a2c4d3a77fc9df2fbbb/agents-common/src/main/java/org/apache/ranger/plugin/util/PolicyRefresher.java#L167-L175
> But this interruption is caught and ignored in RangerRESTClient when it's 
> retrying server error, so PolicyRefresher will never get control back.
> https://github.com/apache/ranger/blob/fe379d0a40aa4ae93c978a2c4d3a77fc9df2fbbb/agents-common/src/main/java/org/apache/ranger/plugin/util/RangerRESTClient.java#L665-L669
> It looks like RangerRESTClient shouldn't ignore InterruptedException, but I 
> wonder whether changing that would affect existing use cases. Rather, it 
> might be better to provide a way to stop RangerAdminClient safely and call 
> it from PolicyRefresher.stopRefresher().



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
