[ https://issues.apache.org/jira/browse/SAMZA-170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13921008#comment-13921008 ]

Martin Kleppmann commented on SAMZA-170:
----------------------------------------

I noticed this exception in the logs of the wikipedia-parser job. I'm not sure 
whether it's related:

{noformat}
2014-03-05 15:10:05 KafkaCheckpointManager [WARN] Got exception while trying to read validate topic __samza_checkpoint_wikipedia-parser_1. Retrying.
org.apache.samza.SamzaException: State topic validation failed for topic __samza_checkpoint_wikipedia-parser_1 because we got error code 5 from Kafka.
        at org.apache.samza.checkpoint.kafka.KafkaCheckpointManager.validateTopic(KafkaCheckpointManager.scala:267)
        at org.apache.samza.checkpoint.kafka.KafkaCheckpointManager.start(KafkaCheckpointManager.scala:208)
        at org.apache.samza.container.SamzaContainer.startCheckpoints(SamzaContainer.scala:561)
        at org.apache.samza.container.SamzaContainer.run(SamzaContainer.scala:494)
        at org.apache.samza.container.SamzaContainer$.main(SamzaContainer.scala:81)
        at org.apache.samza.container.SamzaContainer.main(SamzaContainer.scala)
{noformat}
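
For reference, error code 5 in Kafka's wire protocol is LeaderNotAvailable, the same condition that shows up as kafka.common.LeaderNotAvailableException in the wikipedia-stats log further down. The sketch below is only an illustration of how the code could be pulled out of such a log line and named. The table is a partial subset of Kafka's protocol error codes, and the helper names (extract_error_code, describe_error) are hypothetical, not part of Samza or Kafka.

```python
import re

# Partial subset of Kafka protocol error codes; not the full table.
KAFKA_ERROR_CODES = {
    0: "NoError",
    1: "OffsetOutOfRange",
    3: "UnknownTopicOrPartition",
    5: "LeaderNotAvailable",
    6: "NotLeaderForPartition",
    7: "RequestTimedOut",
}

# Matches the wording of the SamzaException message shown above.
ERROR_CODE_PATTERN = re.compile(r"error code (\d+) from Kafka")


def extract_error_code(log_line):
    """Return the numeric error code in a log line, or None if absent."""
    match = ERROR_CODE_PATTERN.search(log_line)
    return int(match.group(1)) if match else None


def describe_error(code):
    """Map a numeric code to its name (hypothetical helper)."""
    return KAFKA_ERROR_CODES.get(code, "unknown error code %d" % code)


if __name__ == "__main__":
    line = ("org.apache.samza.SamzaException: State topic validation failed "
            "for topic __samza_checkpoint_wikipedia-parser_1 because we got "
            "error code 5 from Kafka.")
    print(describe_error(extract_error_code(line)))
```

If that reading is right, the checkpoint topic had no elected leader yet when the container started, which would be consistent with the "Retrying" in the warning.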

There are also some warnings in the wikipedia-parser job's logs:

{noformat}
2014-03-05 15:10:15 TaskInstance [WARN] No checkpoint found for partition: Partition [partition=1]. This is allowed if this is your first time running the job, but if it's not, you've probably lost data.
2014-03-05 15:10:15 TaskInstance [WARN] No checkpoint found for partition: Partition [partition=0]. This is allowed if this is your first time running the job, but if it's not, you've probably lost data.
2014-03-05 15:10:15 BrokerProxy [WARN] It appears that we received an invalid or empty offset None for [wikipedia-raw,1]. Attempting to use Kafka's auto.offset.reset setting. This can result in data loss if processing continues.
2014-03-05 15:10:15 BrokerProxy [WARN] It appears that we received an invalid or empty offset None for [wikipedia-raw,0]. Attempting to use Kafka's auto.offset.reset setting. This can result in data loss if processing continues.
{noformat}

There are also some warnings and an error in the wikipedia-stats logs:

{noformat}
2014-03-05 15:10:16 BrokerProxy [WARN] It appears that we received an invalid or empty offset None for [wikipedia-edits,1]. Attempting to use Kafka's auto.offset.reset setting. This can result in data loss if processing continues.
2014-03-05 15:10:16 BrokerProxy [WARN] It appears that we received an invalid or empty offset None for [wikipedia-edits,0]. Attempting to use Kafka's auto.offset.reset setting. This can result in data loss if processing continues.
2014-03-05 15:10:16 BrokerPartitionInfo [WARN] Error while fetching metadata [{TopicMetadata for topic wikipedia-stats ->
No partition metadata for topic wikipedia-stats due to kafka.common.LeaderNotAvailableException}] for topic [wikipedia-stats]: class kafka.common.LeaderNotAvailableException
2014-03-05 15:10:16 BrokerPartitionInfo [WARN] Error while fetching metadata [{TopicMetadata for topic wikipedia-stats ->
No partition metadata for topic wikipedia-stats due to kafka.common.LeaderNotAvailableException}] for topic [wikipedia-stats]: class kafka.common.LeaderNotAvailableException
2014-03-05 15:10:16 DefaultEventHandler [ERROR] Failed to collate messages by topic, partition due to: Failed to fetch topic metadata for topic: wikipedia-stats
{noformat}


> hello-samza wikipedia-stats job only receives messages on one partition
> -----------------------------------------------------------------------
>
>                 Key: SAMZA-170
>                 URL: https://issues.apache.org/jira/browse/SAMZA-170
>             Project: Samza
>          Issue Type: Bug
>            Reporter: Martin Kleppmann
>            Assignee: Martin Kleppmann
>
> If you run the three hello-samza jobs, and inspect the output of the 
> wikipedia-stats topic, it looks like this:
> {noformat}
> {"is-bot-edit":3,"bytes-added":2695,"edits":24,"unique-titles":24,"is-new":1,"is-minor":7}
> {"bytes-added":0,"edits":0,"unique-titles":0}
> {"is-bot-edit":3,"is-talk":1,"bytes-added":3474,"edits":19,"unique-titles":19,"is-minor":6}
> {"bytes-added":0,"edits":0,"unique-titles":0}
> {"is-bot-edit":3,"bytes-added":1794,"edits":15,"unique-titles":15,"is-new":1,"is-minor":5}
> {"bytes-added":0,"edits":0,"unique-titles":0}
> {"is-bot-edit":3,"bytes-added":118,"edits":19,"unique-titles":19,"is-new":2,"is-minor":5}
> {"bytes-added":0,"edits":0,"unique-titles":0}
> {noformat}
> Every other message has 0 edits, and two messages appear every 10 seconds. 
> This suggests that one of the job's two tasks (Kafka's default partition 
> count is 2) is not receiving any messages. That might be because all messages 
> are going into one partition, or because half the messages are being lost; 
> I'm not sure. Either way, it doesn't seem right. (And I'm fairly sure that it 
> wasn't this way a few weeks ago.)
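
The pattern in the quoted output can be checked mechanically. The following is a minimal sketch, not part of hello-samza: it parses the eight sample messages from the issue description, counts how many report zero edits, and checks whether empty and non-empty messages strictly alternate. The function names (summarize, alternates) are made up for illustration, and it only describes the symptom; it cannot tell whether the cause is partitioning or lost messages.

```python
import json

# Sample output copied verbatim from the issue description.
SAMPLE = """\
{"is-bot-edit":3,"bytes-added":2695,"edits":24,"unique-titles":24,"is-new":1,"is-minor":7}
{"bytes-added":0,"edits":0,"unique-titles":0}
{"is-bot-edit":3,"is-talk":1,"bytes-added":3474,"edits":19,"unique-titles":19,"is-minor":6}
{"bytes-added":0,"edits":0,"unique-titles":0}
{"is-bot-edit":3,"bytes-added":1794,"edits":15,"unique-titles":15,"is-new":1,"is-minor":5}
{"bytes-added":0,"edits":0,"unique-titles":0}
{"is-bot-edit":3,"bytes-added":118,"edits":19,"unique-titles":19,"is-new":2,"is-minor":5}
{"bytes-added":0,"edits":0,"unique-titles":0}
"""


def parse(text):
    """Parse one JSON object per non-blank line."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def summarize(records):
    """Return (messages with edits, messages with zero edits)."""
    empty = sum(1 for record in records if record["edits"] == 0)
    return len(records) - empty, empty


def alternates(records):
    """True if empty and non-empty messages strictly alternate."""
    flags = [record["edits"] == 0 for record in records]
    return all(a != b for a, b in zip(flags, flags[1:]))


if __name__ == "__main__":
    records = parse(SAMPLE)
    print(summarize(records))   # (4, 4)
    print(alternates(records))  # True
```

An even split with strict alternation is what one would expect if two tasks each emit one message per window and only one of them ever sees input.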


