[
https://issues.apache.org/jira/browse/SAMZA-1371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16103479#comment-16103479
]
Yi Pan (Data Infrastructure) commented on SAMZA-1371:
-----------------------------------------------------
[~akkaul], thanks for reporting this issue. It seems that the consumer instance
that reads the changelog topic keeps dying while restoring the KV store. The
code is written to retry connecting to the Kafka broker if it is disconnected
and the consumer's initial polling tells the container that there are more
messages at the broker. Hence, I would recommend digging into the issue that
causes the Kafka consumer to die.
Second, are you asking to fail the container after a maximum number of retries
when an exception occurs while restoring the KV store?
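To make the max-retry question concrete, a bounded retry around the restore
step could look roughly like the sketch below. It is illustrative Python, not
Samza's actual restore code: `restore_once`, `RestoreFailed`, `max_retries`,
and `backoff_seconds` are hypothetical names, and `ConnectionError` stands in
for whatever exception the consumer raises on a broker disconnect.

```python
import time


class RestoreFailed(Exception):
    """Raised when the store could not be restored within the retry budget."""


def restore_with_max_retries(restore_once, max_retries=3, backoff_seconds=0.0):
    """Call restore_once() until it succeeds or max_retries attempts fail.

    restore_once is a hypothetical callable standing in for one attempt to
    read the changelog; it is assumed to raise ConnectionError on disconnect.
    """
    last_error = None
    for attempt in range(1, max_retries + 1):
        try:
            return restore_once()
        except ConnectionError as err:
            last_error = err
            # Simple linear backoff before the next attempt.
            time.sleep(backoff_seconds * attempt)
    # Retry budget exhausted: fail loudly instead of retrying forever.
    raise RestoreFailed(f"gave up after {max_retries} attempts") from last_error
```

With a policy like this, the container would exit with an error once the
budget is spent, rather than reconnecting indefinitely.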
> Some Samza Containers get stuck at "Starting BrokerProxy for
> hostname:portnum" while others seem to be fine
> -----------------------------------------------------------------------------------------------------------
>
> Key: SAMZA-1371
> URL: https://issues.apache.org/jira/browse/SAMZA-1371
> Project: Samza
> Issue Type: Bug
> Components: container
> Affects Versions: 0.11.0
> Environment: Samza version: 0.11
> Kafka version: 0.11.0.0
> Reporter: Ak Ka
> Priority: Blocker
>
> We have multiple Samza apps using local store that have this issue. Some
> containers get stuck on "Starting BrokerProxy for hostname:portnum" while
> others seem to work as expected.
> Here is the log:
> stuck:
> ```
> [...]
> 2017-07-25 17:11:26.546 [main] org.apache.samza.system.kafka.BrokerProxy
> [INFO] Creating new SimpleConsumer for host hostname:portnum for system kafka
> 2017-07-25 17:11:26.547 [main] org.apache.samza.system.kafka.GetOffset [INFO]
> Validating offset 0 for topic and partition
> [prod.localStateChangeLog.prod.AlertsOrganizerInstant_matcherValidation,2]
> 2017-07-25 17:11:26.648 [main] org.apache.samza.system.kafka.GetOffset [INFO]
> Able to successfully read from offset 0 for topic and partition
> [prod.localStateChangeLog.prod.AlertsOrganizerInstant_matcherValidation,2].
> Using it to instantiate consumer.
> 2017-07-25 17:11:26.649 [main] org.apache.samza.system.kafka.BrokerProxy
> [INFO] Starting BrokerProxy for hostname:portnum
> // it's dead, Jim
> ```
> healthy:
> ```
> [...]
> 2017-07-25 17:11:26.920 [main] org.apache.samza.system.kafka.BrokerProxy
> [INFO] Creating new SimpleConsumer for host hostname:portnum for system kafka
> 2017-07-25 17:11:26.921 [main] org.apache.samza.system.kafka.GetOffset [INFO]
> Validating offset 0 for topic and partition
> [prod.localStateChangeLog.prod.AlertsOrganizerInstant_matcherValidation,1]
> 2017-07-25 17:11:27.023 [main] org.apache.samza.system.kafka.GetOffset [INFO]
> Able to successfully read from offset 0 for topic and partition
> [prod.localStateChangeLog.prod.AlertsOrganizerInstant_matcherValidation,1].
> Using it to instantiate consumer.
> 2017-07-25 17:11:27.023 [main] org.apache.samza.system.kafka.BrokerProxy
> [INFO] Starting BrokerProxy for hostname:portnum
> 2017-07-25 17:11:29.194 [main] org.apache.samza.system.kafka.BrokerProxy
> [INFO] Shutting down BrokerProxy for hostname:portnum
> 2017-07-25 17:11:29.194 [main] org.apache.samza.system.kafka.BrokerProxy
> [INFO] closing simple consumer...
> 2017-07-25 17:11:29.239 [SAMZA-BROKER-PROXY-BrokerProxy thread pointed at
> hostname:portnum for client samza_consumer-prod_AlertsOrganizerInstant-1]
> org.apache.samza.system.kafka.DefaultFetchSimpleConsumer [INFO] Reconnect due
> to socket error: java.nio.channels.ClosedChannelException
> 2017-07-25 17:11:29.244 [SAMZA-BROKER-PROXY-BrokerProxy thread pointed at
> hostname:portnum for client samza_consumer-prod_AlertsOrganizerInstant-1]
> org.apache.samza.system.kafka.BrokerProxy [WARN] Restarting consumer due to
> java.nio.channels.ClosedChannelException. Releasing ownership of all
> partitions, and restarting consumer. Turn on debugging to get a full stack
> trace.
> 2017-07-25 17:11:29.247 [SAMZA-BROKER-PROXY-BrokerProxy thread pointed at
> hostname:portnum for client samza_consumer-prod_AlertsOrganizerInstant-1]
> org.apache.samza.system.kafka.KafkaSystemConsumer [INFO] Abdicating for
> [prod.localStateChangeLog.prod.AlertsOrganizerInstant_alertSetting,1]
> 2017-07-25 17:11:29.247 [SAMZA-BROKER-PROXY-BrokerProxy thread pointed at
> hostname:portnum for client samza_consumer-prod_AlertsOrganizerInstant-1]
> org.apache.samza.system.kafka.KafkaSystemConsumer [INFO] Refreshing brokers
> for:
> Map([prod.localStateChangeLog.prod.AlertsOrganizerInstant_alertSetting,1] ->
> 13572)
> 2017-07-25 17:11:29.247 [SAMZA-BROKER-PROXY-BrokerProxy thread pointed at
> hostname:portnum for client samza_consumer-prod_AlertsOrganizerInstant-1]
> org.apache.samza.system.kafka.BrokerProxy [INFO] Shutting down due to
> interrupt.
> 2017-07-25 17:11:29.247 [main] org.apache.samza.system.kafka.BrokerProxy
> [INFO] Shutting down BrokerProxy for hostname:portnum
> 2017-07-25 17:11:29.248 [main] org.apache.samza.system.kafka.BrokerProxy
> [INFO] closing simple consumer...
> 2017-07-25 17:11:29.265 [SAMZA-BROKER-PROXY-BrokerProxy thread pointed at
> hostname:portnum for client samza_consumer-prod_AlertsOrganizerInstant-1]
> org.apache.samza.system.kafka.BrokerProxy [INFO] Shutting down due to
> interrupt.
> 2017-07-25 17:11:29.265 [main] org.apache.samza.system.kafka.BrokerProxy
> [INFO] Shutting down BrokerProxy for hostname:portnum
> 2017-07-25 17:11:29.265 [main] org.apache.samza.system.kafka.BrokerProxy
> [INFO] closing simple consumer...
> 2017-07-25 17:11:29.523 [SAMZA-BROKER-PROXY-BrokerProxy thread pointed at
> hostname:portnum for client samza_consumer-prod_AlertsOrganizerInstant-1]
> org.apache.samza.system.kafka.BrokerProxy [INFO] Shutting down due to
> interrupt.
> 2017-07-25 17:11:29.524 [main] org.apache.samza.system.kafka.BrokerProxy
> [INFO] Shutting down BrokerProxy for hostname:portnum
> 2017-07-25 17:11:29.524 [main] org.apache.samza.system.kafka.BrokerProxy
> [INFO] closing simple consumer...
> 2017-07-25 17:11:29.601 [SAMZA-BROKER-PROXY-BrokerProxy thread pointed at
> hostname:portnum for client samza_consumer-prod_AlertsOrganizerInstant-1]
> org.apache.samza.system.kafka.BrokerProxy [INFO] Shutting down due to
> interrupt.
> 2017-07-25 17:11:29.602 [main] org.apache.samza.system.kafka.BrokerProxy
> [INFO] Shutting down BrokerProxy for hostname:portnum
> 2017-07-25 17:11:29.602 [main] org.apache.samza.system.kafka.BrokerProxy
> [INFO] closing simple consumer...
> 2017-07-25 17:11:29.663 [SAMZA-BROKER-PROXY-BrokerProxy thread pointed at
> hostname:portnum for client samza_consumer-prod_AlertsOrganizerInstant-1]
> org.apache.samza.system.kafka.BrokerProxy [INFO] Shutting down due to
> interrupt.
> 2017-07-25 17:11:29.668 [main] org.apache.samza.container.SamzaContainer
> [INFO] Starting host statistics monitor
> 2017-07-25 17:11:29.670 [main] org.apache.samza.container.SamzaContainer
> [INFO] Registering task instances with producers.
> 2017-07-25 17:11:29.674 [main] org.apache.samza.container.SamzaContainer
> [INFO] Starting producer multiplexer.
> 2017-07-25 17:11:29.675 [main] org.apache.samza.container.SamzaContainer
> [INFO] Initializing stream tasks.
> 2017-07-25 17:11:29.676 [main]
> com.company.samza.app.companyStreamingAppWrapper [INFO] Initializing instance
> of streaming application
> 2017-07-25 17:11:29.681 [main]
> com.company.samza.app.companyStreamingAppWrapper [INFO] First initialization.
> Setting up Guice container with configuration
> companyStreamingAppWrapperConfiguration{company.app.name=AlertsOrganizerInstant,
> company.appgroup=aws, company.env=prod,
> company.guice.module=com.company.notifications.Alerts.organizer..AlertsOrganizerModule}
> 2017-07-25 17:11:30.118 [main] com.company.config.guice.configModule [INFO]
> configModule loaded requested override file
> '/storage/data/secure/config/AnalyticsServiceClient.cfg'
> 2017-07-25 17:11:30.480 [main]
> com.company.samza.dataService.SamzaSessionFactoriesModule [INFO] Loading prod
> dbConfig from /data/config/prod.database.properties
> // Hibernate stuff (i.e. our code is hit)
> ```