Re: Problems upgrading Job

Yi Pan Thu, 12 Nov 2015 23:09:07 -0800

Hi, Rick,

Yes, please open a JIRA ticket with your configuration, deployment setup and
sequence, and the logs from JobRunner.


Thanks a lot!

-Yi

On Thu, Nov 12, 2015 at 10:10 AM, Rick Mangi <[email protected]> wrote:

> Hi Yi,
>
> I pulled from master and built this morning.
>
> Yes, that’s the output from JobRunner. I also tried setting a job.id to
> see if this was an issue migrating from an old task checkpoint topic but I
> got the same result.
>
> Would you like me to open a jira ticket?
>
> Thanks,
>
> Rick
>
>
>
> > On Nov 12, 2015, at 12:59 PM, Yi Pan <[email protected]> wrote:
> >
> > Hi, Rick,
> >
> > Did the build you tested include the fix for SAMZA-723? And could you
> > confirm that the errors are from the JobRunner log?
> >
> > -Yi
> >
> > On Thu, Nov 12, 2015 at 8:48 AM, Rick Mangi <[email protected]> wrote:
> >
> >> Hi,
> >>
> >> I’m trying to migrate our Samza jobs to the 0.10.0 snapshot (built
> >> against the latest). Everything works fine running locally (although I
> >> had to make some changes to the local grid’s Kafka since the
> >> checkpointing seems to require replication_factor > 1), but when I
> >> deploy it against my production YARN cluster I get these errors.
> >>
> >> [yarnmaster01] out: 2015-11-12 10:40:53 ZkClient [INFO] zookeeper state
> >> changed (SyncConnected)
> >> [yarnmaster01] out: 2015-11-12 10:40:53 ZkEventThread [INFO] Terminate
> >> ZkClient event thread.
> >> [yarnmaster01] out: 2015-11-12 10:40:53 ZooKeeper [INFO] Session:
> >> 0x250233cdf57f2fa closed
> >> [yarnmaster01] out: 2015-11-12 10:40:53 ClientCnxn [INFO] EventThread
> >> shut down
> >> [yarnmaster01] out: 2015-11-12 10:40:53 KafkaSystemAdmin [INFO]
> >> Coordinator stream __samza_coordinator_metrics-reporter_1 already
> >> exists.
> >> [yarnmaster01] out: 2015-11-12 10:40:53 JobRunner [INFO] Storing config
> >> in coordinator stream.
> >> [yarnmaster01] out: 2015-11-12 10:40:53 CoordinatorStreamSystemProducer
> >> [INFO] Starting coordinator stream producer.
> >> [yarnmaster01] out: 2015-11-12 10:40:53 KafkaSystemProducer [INFO]
> >> Creating a new producer for system mykafka.
> >> [yarnmaster01] out: 2015-11-12 10:40:53 ProducerConfig [INFO]
> >> ProducerConfig values:
> >> [yarnmaster01] out:     value.serializer = class
> >> org.apache.kafka.common.serialization.ByteArraySerializer
> >> [yarnmaster01] out:     key.serializer = class
> >> org.apache.kafka.common.serialization.ByteArraySerializer
> >> [yarnmaster01] out:     block.on.buffer.full = true
> >> [yarnmaster01] out:     retry.backoff.ms = 100
> >> [yarnmaster01] out:     buffer.memory = 33554432
> >> [yarnmaster01] out:     batch.size = 16384
> >> [yarnmaster01] out:     metrics.sample.window.ms = 30000
> >> [yarnmaster01] out:     metadata.max.age.ms = 300000
> >> [yarnmaster01] out:     receive.buffer.bytes = 32768
> >> [yarnmaster01] out:     timeout.ms = 30000
> >> [yarnmaster01] out:     max.in.flight.requests.per.connection = 1
> >> [yarnmaster01] out:     bootstrap.servers = [
> >> devstream01.chartbeat.net:9092]
> >> [yarnmaster01] out:     metric.reporters = []
> >> [yarnmaster01] out:     client.id =
> >> samza_producer-metrics_reporter-1-1447342853273-4
> >> [yarnmaster01] out:     compression.type = none
> >> [yarnmaster01] out:     retries = 2147483647
> >> [yarnmaster01] out:     max.request.size = 1048576
> >> [yarnmaster01] out:     send.buffer.bytes = 131072
> >> [yarnmaster01] out:     acks = 1
> >> [yarnmaster01] out:     reconnect.backoff.ms = 10
> >> [yarnmaster01] out:     linger.ms = 0
> >> [yarnmaster01] out:     metrics.num.samples = 2
> >> [yarnmaster01] out:     metadata.fetch.timeout.ms = 60000
> >> [yarnmaster01] out:
> >> [yarnmaster01] out: 2015-11-12 10:40:53 ProducerConfig [WARN] The
> >> configuration batch.num.messages = null was supplied but isn't a known
> >> config.
> >> [yarnmaster01] out: 2015-11-12 10:40:53 ProducerConfig [WARN] The
> >> configuration producer.type = null was supplied but isn't a known
> >> config.
> >> [yarnmaster01] out: Exception in thread "main"
> >> org.apache.samza.SamzaException:
> >> org.apache.kafka.common.errors.TimeoutException: Failed to update
> >> metadata after 60000 ms.
> >> [yarnmaster01] out:     at
> >> org.apache.samza.coordinator.stream.CoordinatorStreamSystemProducer.send(CoordinatorStreamSystemProducer.java:115)
> >> [yarnmaster01] out:     at
> >> org.apache.samza.coordinator.stream.CoordinatorStreamSystemProducer.writeConfig(CoordinatorStreamSystemProducer.java:132)
> >> [yarnmaster01] out:     at
> >> org.apache.samza.job.JobRunner.run(JobRunner.scala:85)
> >> [yarnmaster01] out:     at
> >> org.apache.samza.job.JobRunner$.main(JobRunner.scala:43)
> >> [yarnmaster01] out:     at
> >> org.apache.samza.job.JobRunner.main(JobRunner.scala)
> >> [yarnmaster01] out: Caused by:
> >> org.apache.kafka.common.errors.TimeoutException: Failed to update
> >> metadata after 60000 ms.
> >> [yarnmaster01] out:
> >>
> >>
> >> Warning: run() received nonzero return code 1 while executing
> >> './bin/run-job.sh
> >> --config-factory=org.apache.samza.config.factories.PropertiesConfigFactory
> >> --config-path=file://$PWD/conf/metrics_reporter.properties'!
> >>
> >>
> >> This looks similar to https://issues.apache.org/jira/browse/SAMZA-560
> >> but I’m not using a StreamAppender in log4j.
> >>
> >> Any ideas? My first thought is that I might have to delete the existing
> >> checkpoint topics, but that would mean we can’t migrate completely until
> >> the 0.10.0 release unless we want to run snapshot code in production.
> >>
> >> Thanks!
> >>
> >> Rick
> >>
> >>
> >>
>
>
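
The "Failed to update metadata after 60000 ms" TimeoutException in the quoted
JobRunner output is raised by the Kafka producer when it cannot obtain topic
metadata in time. One possible cause, among others, is that the host running
run-job.sh cannot reach the brokers listed in bootstrap.servers. The sketch
below is one way to rule that out before filing the ticket. It is only a TCP
reachability check, not a Kafka client: the function names are hypothetical,
the broker address is copied from the log above, and a successful connection
does not prove that metadata requests will succeed (for example, the hostname
a broker advertises may still be unresolvable from that machine).

```python
import socket


def parse_bootstrap_servers(value):
    """Split a bootstrap.servers string such as 'h1:9092,h2:9092'
    into a list of (host, port) tuples. Hypothetical helper."""
    pairs = []
    for entry in value.split(","):
        entry = entry.strip()
        if not entry:
            continue
        # rpartition keeps everything before the last ':' as the host.
        host, _, port = entry.rpartition(":")
        pairs.append((host, int(port)))
    return pairs


def broker_reachable(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port opens within
    `timeout` seconds, False on refusal, timeout, or DNS failure."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, socket.timeout and socket.gaierror.
        return False


if __name__ == "__main__":
    # Address taken from the ProducerConfig dump in the quoted log.
    servers = "devstream01.chartbeat.net:9092"
    for host, port in parse_bootstrap_servers(servers):
        status = "reachable" if broker_reachable(host, port) else "NOT reachable"
        print("%s:%d is %s" % (host, port, status))
```

Running this on the machine that submits the job (yarnmaster01 in the log)
rather than on a workstation matters, since the local run already works.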
