Hi Garrett, 1. "systems.kafka.streams.metrics-samza.samza.msg.serde=metrics" seems a little problematic for me. I did not see where the metrics part is configured. such as
"# Define a metrics reporter called "snapshot", which publishes metrics # every 60 seconds. metrics.reporters=snapshot metrics.reporter.snapshot.class=org.apache.samza.metrics.reporter.MetricsSnapshotReporterFactory # Tell the snapshot reporter to publish to a topic called "metrics" # in the "kafka" system. metrics.reporter.snapshot.stream=kafka.metrics # Encode metrics data as JSON. serializers.registry.metrics.class=org.apache.samza.serializers.MetricsSnapshotSerdeFactory systems.kafka.streams.metrics.samza.msg.serde=metrics" 2. "When I switch it back to the other topics I get the hung behavior with nothing erroring out in the logs." It is weird that can not see errors in the "samza-container-log", "stdout". What is the last part of the "samza-container-log"? Thanks, Fang, Yan yanfang...@gmail.com On Wed, Jun 3, 2015 at 9:06 AM, Garrett Barton <garrett.bar...@gmail.com> wrote: > It must have been junk data. I started using a new topic for everything > (metrics/source/dest) and data is flowing fine now. When I switch it back > to the other topics I get the hung behavior with nothing erroring out in > the logs. > > On Wed, Jun 3, 2015 at 10:10 AM, Garrett Barton <garrett.bar...@gmail.com> > wrote: > > > Thanks all for the responses. > > As you can probably tell from the job name I am doing a validation > process > > which is taking in json and testing for a ton of things. This is why I > am > > using String serde's instead of Samza's built in json support, the json > > could be malformed and I need to still operate on it. My serde issues > were > > from the malformed json being sent around, switching to StringSerde > solved > > this. I have not seen that behavior in SAMZA-608 yet. > > > > here is my current config, I started with the hello-samza wiki stat > config: > > > > # Job > > job.factory.class=org.apache.samza.job.yarn.YarnJobFactoryjob.name > =validate-records > > > > # YARN > > yarn.package.path=hdfs://p934/user/bartong/realtime-validator.tar.gz > > > > # Task > > task.class=samza.RealtimeValidator > > task.inputs=kafka.sample-raw-streamtask.window.ms=60000 > > > task.checkpoint.factory=org.apache.samza.checkpoint.kafka.KafkaCheckpointManagerFactory > > task.checkpoint.system=kafka > > # Normally, this would be 3, but we have only one broker. > > task.checkpoint.replication.factor=1 > > > > # Serializers > > > serializers.registry.json.class=org.apache.samza.serializers.JsonSerdeFactory > > > serializers.registry.string.class=org.apache.samza.serializers.StringSerdeFactory > > > serializers.registry.long.class=org.apache.samza.serializers.LongSerdeFactory > > > serializers.registry.integer.class=org.apache.samza.serializers.IntegerSerdeFactory > > > > # Systems > > > systems.kafka.samza.factory=org.apache.samza.system.kafka.KafkaSystemFactory > > systems.kafka.samza.key.serde=string > > systems.kafka.samza.msg.serde=string > > systems.kafka.streams.metrics-samza.samza.msg.serde=metrics > > systems.kafka.consumer.zookeeper.connect=zk1:2181/ > > systems.kafka.consumer.auto.offset.reset=smallest > > systems.kafka.producer.bootstrap.servers=ka1:6667 > > > > # Key-value storage > > > stores.validate-records.factory=org.apache.samza.storage.kv.RocksDbKeyValueStorageEngineFactory > > stores.validate-records.changelog=kafka.validate-records-changelog > > stores.validate-records.key.serde=string > > stores.validate-records.msg.serde=long > > > > # Normally, we'd leave this alone, but we have only one broker. > > stores.validate-records.changelog.replication.factor=1 > > > > # Normally, we'd set this much higher, but we want things to look snappy > in the demo. > > stores.validate-records.write.batch.size=0 > > stores.validate-records.object.cache.size=0 > > > > > > > > Some things I've noticed while testing: > > setting auto.offset.reset=smallest is not taking me back to the start of > > the streams. Its like its being ignored. I also don't see consumer > > properties outputting to the logs, I do see producer props though. > > > > On Tue, Jun 2, 2015 at 8:24 PM, Guozhang Wang <wangg...@gmail.com> > wrote: > > > >> Hi Garret, > >> > >> Regarding the serde issues that Yan mentioned, you can check > >> https://issues.apache.org/jira/browse/SAMZA-608 and see if its > >> description > >> matches what you saw. > >> > >> Guozhang > >> > >> On Tue, Jun 2, 2015 at 4:56 PM, Yan Fang <yanfang...@gmail.com> wrote: > >> > >> > Hi Garrett, > >> > > >> > I guess you run into the serde issues as you mentioned. If you can > check > >> > the Samza log and show us, we will be more helpful. Also, maybe > pasting > >> the > >> > config here (if you dont mind), we can help to see if you miss > >> something. > >> > > >> > Thanks, > >> > > >> > Fang, Yan > >> > yanfang...@gmail.com > >> > > >> > On Tue, Jun 2, 2015 at 3:01 PM, Garrett Barton < > >> garrett.bar...@gmail.com> > >> > wrote: > >> > > >> > > Greetings all, > >> > > > >> > > > >> > > I am trying to translate an existing workflow from MR into Samza. > >> Thus > >> > far > >> > > everything is coded and kinks with deploying have been worked out. > My > >> > task > >> > > deploys into yarn (2.6.0), consumes records from Kafka (0.8.2.1) > fine, > >> > but > >> > > no data from metrics and my output streams are showing up in > Kafka. I > >> > see > >> > > the metrics topic created in Kafka, but its empty (I have a counter > >> > > counting records seen). > >> > > > >> > > I have debug prints showing me that I am calling collector.send() > >> which > >> > is > >> > > also wrapped in a catch for Throwable. Nothing at all shows in the > >> logs. > >> > > > >> > > I do see the checkpoint topic being used, and incremented > >> appropriately. > >> > > So between that and consuming records in the first place I think the > >> > > system.kafka is configured correctly. I ran into serde issues with > >> > > consumption and sending and fixed those too. > >> > > > >> > > Has anyone run into this kind of behavior? Am hoping its a dumb > >> config > >> > > issue. > >> > > > >> > > V/R, > >> > > ~Garrett > >> > > > >> > > >> > >> > >> > >> -- > >> -- Guozhang > >> > > > > >