[ 
https://issues.apache.org/jira/browse/BEAM-9399?focusedWorklogId=484191&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-484191
 ]

ASF GitHub Bot logged work on BEAM-9399:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 14/Sep/20 19:48
            Start Date: 14/Sep/20 19:48
    Worklog Time Spent: 10m 
      Work Description: scwhittle commented on pull request #12825:
URL: https://github.com/apache/beam/pull/12825#issuecomment-692275145


   @lukecwik I think to have a single lock, we would need to use the 
PrintStream lock itself, as that is what is synchronized on by 
Throwable.printStackTrace and which can be synchronize outside and inside other 
synchronization blocks if we choose another lock to guard the buffer, 
ErrorManager, handler.
   However using the PrintStream as a lock within the 
DataflowWorkerLoggingHandler seemed perhaps error prone as that extends a 
Handler which might have it's own synchronized methods (on itself not the 
PrintStream) or callers that synchronize on the Handler (and not on the 
PrintStream).
   Additionally it seemed a bit gross since there is not always a PrintStream 
for a handler, though we could just allow injecting a lock Object into the 
handler to hide that.
   If you think that approach is still preferrable, I can put that together.
   
   Regarding on why to synchronize on buffer, that allows us to keep the 
preconditioncheck to sanity check our implementation.  But that seems overkill, 
so I will just remove it.
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


Issue Time Tracking
-------------------

    Worklog Id:     (was: 484191)
    Time Spent: 8h 40m  (was: 8.5h)

> Possible deadlock between DataflowWorkerLoggingHandler and overridden 
> System.err PrintStream
> --------------------------------------------------------------------------------------------
>
>                 Key: BEAM-9399
>                 URL: https://issues.apache.org/jira/browse/BEAM-9399
>             Project: Beam
>          Issue Type: Bug
>          Components: runner-dataflow
>            Reporter: Sam Whittle
>            Assignee: Sam Whittle
>            Priority: P3
>             Fix For: 2.21.0
>
>          Time Spent: 8h 40m
>  Remaining Estimate: 0h
>
> When an exception is encountered in DataflowWorkerLoggingHandler the 
> ErrorManager is used to log the exception.  ErrorManager uses System.err 
> which is overridden to be a PrintStream that writes back into 
> DataflowWorkerLoggingHandler.
> This has the lock ordering DataflowWorkerLoggingHandler -> PrintStream.
> Other logging of System.err has the inverse lock ordering 
> PrintStream->DataflowWorkerLoggingHandler so there is potential for deadlock.
> This is one known cause of the inversion, but any other System.err logs from 
> inside DataflowWorkerLoggingHandler could cause the same issue.
> Proposed fix is to address low-hanging fruit of having ErrorManager output to 
> the original System.err.  A full fix would be to improve our override of 
> System.err to a PrintStream that can detect the locking inversion or possibly 
> we could use the PrintStream mutex in both cases.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to