[ 
https://issues.apache.org/jira/browse/KAFKA-4039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15847520#comment-15847520
 ] 

ASF GitHub Bot commented on KAFKA-4039:
---------------------------------------

GitHub user ijuma opened a pull request:

    https://github.com/apache/kafka/pull/2474

    KAFKA-4039: Fix deadlock during shutdown due to log truncation not allowed

    

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/ijuma/kafka 
kafka-4039-deadlock-during-shutdown

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/kafka/pull/2474.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2474
    
----
commit d902c2bc81c4799ad1eb7bc7ec50e67d04d1c9c2
Author: Maysam Yabandeh <myaban...@dropbox.com>
Date:   2016-08-17T18:21:07Z

    KAFKA-4039: delay invocation of System.exit via FatalExitException

commit 548ddec8c8715125f25f3da3263fe4ead39deca2
Author: Maysam Yabandeh <myaban...@dropbox.com>
Date:   2016-08-17T22:39:12Z

    KAFKA-4039 change FatalExitException to Error

commit 08992e10eacb661d24506bff1f617f42cce02ffb
Author: Maysam Yabandeh <myaban...@dropbox.com>
Date:   2016-08-18T00:33:43Z

    KAFKA-4039 fix the default value of test mode in FatalExitError

commit bcf9e9d605fef9bc2cf46380767f3881eeee6313
Author: Maysam Yabandeh <myaban...@dropbox.com>
Date:   2016-08-18T14:26:20Z

    KAFKA-4039 mock System.exit

commit 98acc3674a610ac329935c2d3a1bbdd032685d25
Author: Maysam Yabandeh <myaban...@dropbox.com>
Date:   2016-08-19T14:38:33Z

    KAFKA-4039 exit inside an anonymous thread, add test for deadlock

commit 39a667eda0817bc42beaaedb962e5797b6f106e1
Author: Ismael Juma <ism...@juma.me.uk>
Date:   2017-01-30T14:48:24Z

    Merge remote-tracking branch 'apache/trunk' into 
kafka-4039-deadlock-during-shutdown
    
    * apache/trunk: (552 commits)
      MINOR: JavaDoc markup cleanup
      KAFKA-4679: Remove unstable markers from Connect APIs
      KAFKA-4450; Add upgrade tests for 0.10.1 and rename TRUNK to DEV_BRANCH 
to reduce confusion
      KAFKA-4635; Client Compatibility follow-ups
      MINOR: update JavaDocs for Kafka Streams DSL helpers
      KAFKA-4557; Handle Producer.send correctly in expiry callbacks
      MINOR: Update copyright year in the NOTICE file.
      KAFKA-4664; Update docs/protocol.html with KIP-97 information
      KAFKA-4704; Coordinator cache loading fails if groupId is reused for 
offset storage after group is removed
      MINOR: Replace for within for each; replace if-elseif with match
      KAFKA-4644: Improve test coverage of StreamsPartitionAssignor
      MINOR: Update KTable JavaDoc
      MINOR: Include more detail in `ConfigDef.parseType` exception message
      KAFKA-4578; Upgrade notes for 0.10.2.0
      MINOR: Streams API JavaDoc improvements
      MINOR: Add Streams system test for broker backwards compatibility
      MINOR: Close create topics policy during shutdown and more tests
      MINOR: Update JavaDoc for DSL PAPI-API
      KAFKA-4636; Per listener security settings overrides (KIP-103)
      MINOR: Change logging level for ignored maybeAddMetric from debug to trace
      ...

commit 4d130ef7e990ca8215684adc36865f08caa79557
Author: Ismael Juma <ism...@juma.me.uk>
Date:   2017-01-31T21:07:37Z

    Try a slightly different approach to solving the shutdown deadlock
    
    * Open the latch in `ShutdownableThread` before calling `exit`
    * Introduce `Exit` classes that perform exit/halt and can be changed in 
tests
    * Invoke `exit` from the calling thread instead of spawning a new thread
    * Updated the tests
    * A few clean-ups

commit 1c8032b7419874ef2c56e7cf321d39a3f64e6833
Author: Ismael Juma <ism...@juma.me.uk>
Date:   2017-01-31T21:07:51Z

    A few style clean-ups

commit 50b6c571ae51d2c9f5b7288faace349acc156bc8
Author: Ismael Juma <ism...@juma.me.uk>
Date:   2017-01-31T21:08:18Z

    Merge remote-tracking branch 'apache/trunk' into 
kafka-4039-deadlock-during-shutdown
    
    * apache/trunk:
      MINOR: Logging improvements in consumer internals
      KAFKA-2955; Add a simple ">" prompt to console producer
      KAFKA-4613: Follow-up to fix JavaDocs
      KAFKA-4613: Treat null-key records the same way for joins and aggreations

----


> Exit Strategy: using exceptions instead of inline invocation of exit/halt
> -------------------------------------------------------------------------
>
>                 Key: KAFKA-4039
>                 URL: https://issues.apache.org/jira/browse/KAFKA-4039
>             Project: Kafka
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 0.10.0.0
>            Reporter: Maysam Yabandeh
>            Priority: Critical
>              Labels: reliability
>         Attachments: deadlock-stack2
>
>
> The current practice is to directly invoke halt/exit right after the line 
> that intends to terminate the execution. In the case of System.exit this 
> could cause deadlocks if the thread invoking System.exit is holding  a lock 
> that will be requested by the shutdown hook threads that will be started by 
> System.exit. An example is reported by [~aozeritsky] in KAFKA-3924. This 
> would also makes testing more difficult as it would require mocking static 
> methods of System and Runtime classes, which is not natively supported in 
> Java.
> One alternative suggested 
> [here|https://issues.apache.org/jira/browse/KAFKA-3924?focusedCommentId=15420269&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15420269]
>  would be to throw some dedicated exceptions that will eventually invoke 
> exit/halt:
> {quote} it would be great to move away from executing `System.exit` inline in 
> favour of throwing an exception (two examples, but maybe we can find better 
> names: FatalExitException and FatalHaltException) that is caught by some 
> central code that then does the `System.exit` or `Runtime.getRuntime.halt`. 
> This helps in a couple of ways:
> (1) Avoids issues with locks being held as in this issue
> (2) It makes it possible to abstract the action, which is very useful in 
> tests. At the moment, we can't easily test for these conditions as they cause 
> the whole test harness to exit. Worse, these conditions are sometimes 
> triggered in the tests and it's unclear why.
> (3) We can have more consistent logging around these actions and possibly 
> extended logging for tests
> {quote}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

Reply via email to