[ 
https://issues.apache.org/jira/browse/SAMZA-1371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16149805#comment-16149805
 ] 

Yi Pan (Data Infrastructure) commented on SAMZA-1371:
-----------------------------------------------------

[~hao.song], from the stack trace, it looks like the restoration process is 
stuck at SystemStreamPartitionIterator#refresh, which invokes the 
KafkaSystemConsumer#poll method. At that point, it has determined that there 
are still messages in the partition at the broker, but it was not able to get 
any from the queue, which should be populated by the BrokerProxy thread that 
keeps fetching messages from the broker and populating the corresponding 
queue. Could you post the full log from the stuck container as well? I want to 
see whether there were any log lines from the BrokerProxy thread before this 
happened.
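
To illustrate the suspected failure mode, here is a minimal, self-contained 
sketch of the pattern described above: a fetcher thread fills a queue, and the 
restoring thread polls that queue until it reaches the broker's high 
watermark. This is not Samza's actual code; the class, the method names, and 
the maxEmptyPolls bound are all made up for the example. If the fetcher dies 
early, every further poll comes back empty even though messages remain.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration only; none of these names exist in Samza.
public class StuckRestoreSketch {

    // Waits up to timeoutMs for one message, then drains whatever else is queued.
    static List<String> poll(BlockingQueue<String> queue, long timeoutMs)
            throws InterruptedException {
        List<String> out = new ArrayList<>();
        String first = queue.poll(timeoutMs, TimeUnit.MILLISECONDS);
        if (first != null) {
            out.add(first);
            queue.drainTo(out);
        }
        return out;
    }

    // Polls until highWatermark messages were read, or gives up after
    // maxEmptyPolls consecutive empty polls. Returns the number read.
    // The give-up bound is an assumption added so the sketch terminates.
    static int restore(BlockingQueue<String> queue, int highWatermark, int maxEmptyPolls)
            throws InterruptedException {
        int read = 0;
        int emptyPolls = 0;
        while (read < highWatermark && emptyPolls < maxEmptyPolls) {
            List<String> batch = poll(queue, 50);
            if (batch.isEmpty()) {
                emptyPolls++;
            } else {
                emptyPolls = 0;
                read += batch.size();
            }
        }
        return read;
    }

    // Starts a fetcher thread that enqueues `count` messages and then exits,
    // standing in for a fetcher that dies before the partition is fully read.
    static Thread startFetcher(BlockingQueue<String> queue, int count) {
        Thread t = new Thread(() -> {
            for (int i = 0; i < count; i++) {
                queue.add("msg-" + i);
            }
        }, "fetcher-sketch");
        t.setDaemon(true);
        t.start();
        return t;
    }

    public static void main(String[] args) throws Exception {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        Thread fetcher = startFetcher(queue, 3); // only 3 of 5 messages arrive
        fetcher.join();
        int read = restore(queue, 5, 3);
        System.out.println("read " + read + " of 5; fetcher alive: " + fetcher.isAlive());
    }
}
```

Without the give-up bound, the loop in restore would never exit, which would 
look from the outside like a container that is stuck during restoration.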

P.S. We have not tested against Kafka broker 0.11. Hence, it could be that the 
broker responds differently between 0.10.1 and 0.11, which caused the 
BrokerProxy thread to die.
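
One way to confirm whether the fetcher thread is still alive is to list the 
live threads by name. The sketch below is a hypothetical helper, not part of 
Samza; it assumes it runs inside the container JVM and that the thread names 
start with the "SAMZA-BROKER-PROXY" prefix that appears in the healthy log 
below. A thread dump taken with jstack would show the same information 
without any code changes.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper for illustration; not part of Samza.
public class BrokerProxyThreadCheck {

    // Returns the sorted names of live threads whose name starts with the prefix.
    static List<String> liveThreadsWithPrefix(String prefix) {
        return Thread.getAllStackTraces().keySet().stream()
                .filter(Thread::isAlive)
                .map(Thread::getName)
                .filter(name -> name.startsWith(prefix))
                .sorted()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Prefix taken from the thread names in the healthy container's log.
        List<String> names = liveThreadsWithPrefix("SAMZA-BROKER-PROXY");
        if (names.isEmpty()) {
            System.out.println("No live BrokerProxy thread found");
        } else {
            names.forEach(System.out::println);
        }
    }
}
```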

> Some Samza Containers get stuck at "Starting BrokerProxy for 
> hostname:portnum" while others seem to be fine
> -----------------------------------------------------------------------------------------------------------
>
>                 Key: SAMZA-1371
>                 URL: https://issues.apache.org/jira/browse/SAMZA-1371
>             Project: Samza
>          Issue Type: Bug
>          Components: container
>    Affects Versions: 0.11.0, 0.12.0
>         Environment: Samza version: 0.11, 0.12
> Kafka version: 0.11.0.0
>            Reporter: Ak Ka
>            Priority: Blocker
>
> We have multiple Samza apps using local store that have this issue. Some 
> containers get stuck on "Starting BrokerProxy for hostname:portnum" while 
> others seem to work as expected.  
> Here is the log:
> stuck:
> ```
> [...]
> 2017-07-25 17:11:26.546 [main] org.apache.samza.system.kafka.BrokerProxy 
> [INFO] Creating new SimpleConsumer for host hostname:portnum for system kafka
> 2017-07-25 17:11:26.547 [main] org.apache.samza.system.kafka.GetOffset [INFO] 
> Validating offset 0 for topic and partition 
> [prod.localStateChangeLog.prod.AlertsOrganizerInstant_matcherValidation,2]
> 2017-07-25 17:11:26.648 [main] org.apache.samza.system.kafka.GetOffset [INFO] 
> Able to successfully read from offset 0 for topic and partition 
> [prod.localStateChangeLog.prod.AlertsOrganizerInstant_matcherValidation,2]. 
> Using it to instantiate consumer.
> 2017-07-25 17:11:26.649 [main] org.apache.samza.system.kafka.BrokerProxy 
> [INFO] Starting BrokerProxy for hostname:portnum
> // it's dead, Jim
> ```
> healthy:
> ```
> [...]
> 2017-07-25 17:11:26.920 [main] org.apache.samza.system.kafka.BrokerProxy 
> [INFO] Creating new SimpleConsumer for host hostname:portnum for system kafka
> 2017-07-25 17:11:26.921 [main] org.apache.samza.system.kafka.GetOffset [INFO] 
> Validating offset 0 for topic and partition 
> [prod.localStateChangeLog.prod.AlertsOrganizerInstant_matcherValidation,1]
> 2017-07-25 17:11:27.023 [main] org.apache.samza.system.kafka.GetOffset [INFO] 
> Able to successfully read from offset 0 for topic and partition 
> [prod.localStateChangeLog.prod.AlertsOrganizerInstant_matcherValidation,1]. 
> Using it to instantiate consumer.
> 2017-07-25 17:11:27.023 [main] org.apache.samza.system.kafka.BrokerProxy 
> [INFO] Starting BrokerProxy for hostname:portnum
> 2017-07-25 17:11:29.194 [main] org.apache.samza.system.kafka.BrokerProxy 
> [INFO] Shutting down BrokerProxy for hostname:portnum
> 2017-07-25 17:11:29.194 [main] org.apache.samza.system.kafka.BrokerProxy 
> [INFO] closing simple consumer...
> 2017-07-25 17:11:29.239 [SAMZA-BROKER-PROXY-BrokerProxy thread pointed at 
> hostname:portnum for client samza_consumer-prod_AlertsOrganizerInstant-1] 
> org.apache.samza.system.kafka.DefaultFetchSimpleConsumer [INFO] Reconnect due 
> to socket error: java.nio.channels.ClosedChannelException
> 2017-07-25 17:11:29.244 [SAMZA-BROKER-PROXY-BrokerProxy thread pointed at 
> hostname:portnum for client samza_consumer-prod_AlertsOrganizerInstant-1] 
> org.apache.samza.system.kafka.BrokerProxy [WARN] Restarting consumer due to 
> java.nio.channels.ClosedChannelException. Releasing ownership of all 
> partitions, and restarting consumer. Turn on debugging to get a full stack 
> trace.
> 2017-07-25 17:11:29.247 [SAMZA-BROKER-PROXY-BrokerProxy thread pointed at 
> hostname:portnum for client samza_consumer-prod_AlertsOrganizerInstant-1] 
> org.apache.samza.system.kafka.KafkaSystemConsumer [INFO] Abdicating for 
> [prod.localStateChangeLog.prod.AlertsOrganizerInstant_alertSetting,1]
> 2017-07-25 17:11:29.247 [SAMZA-BROKER-PROXY-BrokerProxy thread pointed at 
> hostname:portnum for client samza_consumer-prod_AlertsOrganizerInstant-1] 
> org.apache.samza.system.kafka.KafkaSystemConsumer [INFO] Refreshing brokers 
> for: 
> Map([prod.localStateChangeLog.prod.AlertsOrganizerInstant_alertSetting,1] -> 
> 13572)
> 2017-07-25 17:11:29.247 [SAMZA-BROKER-PROXY-BrokerProxy thread pointed at 
> hostname:portnum for client samza_consumer-prod_AlertsOrganizerInstant-1] 
> org.apache.samza.system.kafka.BrokerProxy [INFO] Shutting down due to 
> interrupt.
> 2017-07-25 17:11:29.247 [main] org.apache.samza.system.kafka.BrokerProxy 
> [INFO] Shutting down BrokerProxy for hostname:portnum
> 2017-07-25 17:11:29.248 [main] org.apache.samza.system.kafka.BrokerProxy 
> [INFO] closing simple consumer...
> 2017-07-25 17:11:29.265 [SAMZA-BROKER-PROXY-BrokerProxy thread pointed at 
> hostname:portnum for client samza_consumer-prod_AlertsOrganizerInstant-1] 
> org.apache.samza.system.kafka.BrokerProxy [INFO] Shutting down due to 
> interrupt.
> 2017-07-25 17:11:29.265 [main] org.apache.samza.system.kafka.BrokerProxy 
> [INFO] Shutting down BrokerProxy for hostname:portnum
> 2017-07-25 17:11:29.265 [main] org.apache.samza.system.kafka.BrokerProxy 
> [INFO] closing simple consumer...
> 2017-07-25 17:11:29.523 [SAMZA-BROKER-PROXY-BrokerProxy thread pointed at 
> hostname:portnum for client samza_consumer-prod_AlertsOrganizerInstant-1] 
> org.apache.samza.system.kafka.BrokerProxy [INFO] Shutting down due to 
> interrupt.
> 2017-07-25 17:11:29.524 [main] org.apache.samza.system.kafka.BrokerProxy 
> [INFO] Shutting down BrokerProxy for hostname:portnum
> 2017-07-25 17:11:29.524 [main] org.apache.samza.system.kafka.BrokerProxy 
> [INFO] closing simple consumer...
> 2017-07-25 17:11:29.601 [SAMZA-BROKER-PROXY-BrokerProxy thread pointed at 
> hostname:portnum for client samza_consumer-prod_AlertsOrganizerInstant-1] 
> org.apache.samza.system.kafka.BrokerProxy [INFO] Shutting down due to 
> interrupt.
> 2017-07-25 17:11:29.602 [main] org.apache.samza.system.kafka.BrokerProxy 
> [INFO] Shutting down BrokerProxy for hostname:portnum
> 2017-07-25 17:11:29.602 [main] org.apache.samza.system.kafka.BrokerProxy 
> [INFO] closing simple consumer...
> 2017-07-25 17:11:29.663 [SAMZA-BROKER-PROXY-BrokerProxy thread pointed at 
> hostname:portnum for client samza_consumer-prod_AlertsOrganizerInstant-1] 
> org.apache.samza.system.kafka.BrokerProxy [INFO] Shutting down due to 
> interrupt.
> 2017-07-25 17:11:29.668 [main] org.apache.samza.container.SamzaContainer 
> [INFO] Starting host statistics monitor
> 2017-07-25 17:11:29.670 [main] org.apache.samza.container.SamzaContainer 
> [INFO] Registering task instances with producers.
> 2017-07-25 17:11:29.674 [main] org.apache.samza.container.SamzaContainer 
> [INFO] Starting producer multiplexer.
> 2017-07-25 17:11:29.675 [main] org.apache.samza.container.SamzaContainer 
> [INFO] Initializing stream tasks.
> 2017-07-25 17:11:29.676 [main] 
> com.company.samza.app.companyStreamingAppWrapper [INFO] Initializing instance 
> of streaming application
> 2017-07-25 17:11:29.681 [main] 
> com.company.samza.app.companyStreamingAppWrapper [INFO] First initialization. 
> Setting up Guice container with configuration 
> companyStreamingAppWrapperConfiguration{company.app.name=AlertsOrganizerInstant,
>  company.appgroup=aws, company.env=prod, 
> company.guice.module=com.company.notifications.Alerts.organizer..AlertsOrganizerModule}
> 2017-07-25 17:11:30.118 [main] com.company.config.guice.configModule [INFO] 
> configModule loaded requested override file 
> '/storage/data/secure/config/AnalyticsServiceClient.cfg'
> 2017-07-25 17:11:30.480 [main] 
> com.company.samza.dataService.SamzaSessionFactoriesModule [INFO] Loading prod 
> dbConfig from /data/config/prod.database.properties
> // Hibernate stuff (i.e. our code is hit)
> ```


