[ 
https://issues.apache.org/jira/browse/SAMZA-2721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17489906#comment-17489906
 ] 

Bharath Kumarasubramanian commented on SAMZA-2721:
--------------------------------------------------

Here is a sample case where the container hit the code path that swallows 
the exception and only logs it:
{code:java}
2022-02-03 00:15:53.624 [crm-sync-entity-resolution-pipeline-i001-auditor] 
KafkaProducer [INFO] [Producer 
clientId=crm-sync-entity-resolution-pipeline-i001-auditor] Closing the Kafka 
producer with timeoutMillis = 9223370393007422183 ms.
2022-02-03 00:15:53.625 [main] LiKafkaConsumerImpl [INFO] Shutdown complete in 
11 millis
2022-02-03 00:15:53.626 [main] ContainerLaunchUtil [ERROR] Container stopped 
with Exception. 
2022-02-03 00:15:53.626 [main] CoordinatorStreamStore [INFO] Stopping the 
coordinator stream system consumer.
2022-02-03 00:15:53.626 [main] LiKafkaConsumerImpl [INFO] Shutting down ...
2022-02-03 00:15:53.627 [main] AbstractAuditor [INFO] Closing auditor with 
timeout 9223372036854775807 MILLI
{code}
 

https://github.com/apache/samza/blob/2e7c2fe6c095d05b1b75fc0c2768ad0e9a81a085/samza-core/src/main/java/org/apache/samza/runtime/ContainerLaunchUtil.java#L185

> Container should exit with non-zero status code in case of errors during 
> launch
> -------------------------------------------------------------------------------
>
>                 Key: SAMZA-2721
>                 URL: https://issues.apache.org/jira/browse/SAMZA-2721
>             Project: Samza
>          Issue Type: Bug
>            Reporter: Bharath Kumarasubramanian
>            Priority: Major
>
> {*}Problem{*}:
> During its launch sequence, ContainerLaunchUtil swallows exceptions and 
> proceeds to shut down with a 0 status code. As a result, the AM does not 
> restart the container.
> {*}Description{*}:
> In the run method, the launch sequence performs various initialization 
> steps before kicking off the container. If an exception is thrown during 
> these steps, the run method catches all errors but only logs them and 
> proceeds to shut down as usual. 
> Because the exit looks normal, the AM treats the container as having 
> completed successfully and hence doesn't restart it, so the failed 
> container remains failed.
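
For illustration only, the behaviour the issue title asks for could look roughly like the sketch below. This is not the actual ContainerLaunchUtil code; the class LaunchExitCodeSketch, the method runAndGetExitStatus and the EXIT_* constants are hypothetical names made up for this example. The idea is simply that a failure caught during launch is still logged, but is also turned into a non-zero status instead of falling through to a normal exit.

```java
public class LaunchExitCodeSketch {
    // Hypothetical status codes for this sketch; not taken from Samza.
    static final int EXIT_OK = 0;
    static final int EXIT_FAILURE = 1;

    /**
     * Runs the given launch sequence and returns the status the process
     * should exit with. Failures are logged, as before, but are no longer
     * swallowed: they are reported as a non-zero status.
     */
    static int runAndGetExitStatus(Runnable launchSequence) {
        try {
            launchSequence.run();
            return EXIT_OK;
        } catch (Throwable t) {
            // Same kind of log line as today, but the failure is remembered.
            System.err.println("Container stopped with Exception. " + t);
            return EXIT_FAILURE;
        }
    }

    public static void main(String[] args) {
        // Simulated initialization failure (assumption for the demo).
        int status = runAndGetExitStatus(() -> {
            throw new IllegalStateException("simulated failure during initialization");
        });
        // A real launcher would pass this value to System.exit(status) so
        // that the process exit code reflects the failure.
        System.out.println("exit status would be: " + status);
    }
}
```

With a non-zero process exit code, the AM would presumably no longer see the container as successfully completed, which is the condition the description identifies as preventing a restart.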



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
