[ https://issues.apache.org/jira/browse/SAMZA-1455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jagadish updated SAMZA-1455:
----------------------------
Description:
Currently, we do not cleanly close the producer and consumer in the JobRunner.
This means that any exception in the JobRunner simply exits the main thread
without tearing down the producers/consumers. For producers and consumers that
spawn non-daemon threads (for example, a KafkaConsumer), this prevents the JVM
from shutting down cleanly.
In our production clusters, we have observed that JVM processes (corresponding
to the JobRunner) do not shut down. Often, these processes hold on to deleted
file handles, leading to multiple resource leaks.
was:
Currently, we do not cleanly close the producer and consumer in the JobRunner.
This means that any exception in the JobRunner simply exits the main thread
without tearing down the producers/consumers. For producers and consumers that
spawn non-daemon threads (for example, a KafkaConsumer), this prevents the JVM
from shutting down cleanly.
We have observed that on production clusters, JVM processes corresponding to
the JobRunner do not shut down and hold on to deleted file handles, leading to
site up issues.
> Shutdown coordinator stream producers and consumers cleanly in JobRunner
> ------------------------------------------------------------------------
>
> Key: SAMZA-1455
> URL: https://issues.apache.org/jira/browse/SAMZA-1455
> Project: Samza
> Issue Type: Bug
> Reporter: Jagadish
>
> Currently, we do not cleanly close the producer and consumer in the
> JobRunner. This means that any exception in the JobRunner simply exits the
> main thread without tearing down the producers/consumers. For producers and
> consumers that spawn non-daemon threads (for example, a KafkaConsumer), this
> prevents the JVM from shutting down cleanly.
> In our production clusters, we have observed that JVM processes
> (corresponding to the JobRunner) do not shut down. Often, these processes
> hold on to deleted file handles, leading to multiple resource leaks.
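
To illustrate the failure mode described above, here is a minimal sketch of
closing clients in a finally block so that their non-daemon threads stop even
when the job fails. It is not the actual JobRunner code: the class
CleanShutdownSketch, the StreamClient type, and the runWithCleanup method are
hypothetical names invented for this example, and the background thread only
simulates what a real producer or consumer might do.

```java
import java.util.concurrent.CountDownLatch;

public class CleanShutdownSketch {

    /** Hypothetical stand-in for a coordinator stream producer or consumer. */
    static final class StreamClient implements AutoCloseable {
        private final CountDownLatch stop = new CountDownLatch(1);
        private final Thread worker;

        StreamClient(String name) {
            worker = new Thread(() -> {
                try {
                    stop.await(); // simulates a background I/O loop
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }, name);
            // Non-daemon: the JVM cannot exit while this thread is alive.
            worker.setDaemon(false);
            worker.start();
        }

        boolean isRunning() {
            return worker.isAlive();
        }

        @Override
        public void close() {
            stop.countDown(); // signal the worker to finish
            try {
                worker.join(5000); // wait up to 5 seconds for it to exit
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }

    /**
     * Runs a (simulated) job and always closes both clients.
     * Returns true if any client thread is still alive afterwards.
     */
    static boolean runWithCleanup(boolean failJob) {
        StreamClient producer = new StreamClient("producer");
        StreamClient consumer = new StreamClient("consumer");
        try {
            if (failJob) {
                throw new IllegalStateException("simulated JobRunner failure");
            }
        } catch (IllegalStateException e) {
            System.err.println("job failed: " + e.getMessage());
        } finally {
            // Runs on both the success and the failure path.
            consumer.close();
            producer.close();
        }
        return producer.isRunning() || consumer.isRunning();
    }

    public static void main(String[] args) {
        System.out.println("threads still running: " + runWithCleanup(true));
    }
}
```

Without the finally block, the exception path would leave both worker threads
alive, and the process would keep running after the main thread exits.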
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)