Re: Kafka topic naming conventions

2015-03-18 Thread Roger Hoover
Thanks, guys. I was also playing around with including partition count and even the partition key in the topic name. My thought was that topics may have the same data and number of partitions but only differ by partition key. After a while, the naming does get crazy (too long and ugly). We rea

Re: Kafka topic naming conventions

2015-03-18 Thread Chinmay Soman
Yeah! It does seem a bit hackish - but I think this approach promises fewer config/operation errors. Although I think some of these checks can be built within Samza - assuming Kafka has a metadata store in the near future - the Samza container can validate the #topics against this store. On Wed,

Re: Kafka topic naming conventions

2015-03-18 Thread Chris Riccomini
Hey Chinmay, Cool, this is good feedback. I didn't think I was *that* crazy. :) Cheers, Chris On Wed, Mar 18, 2015 at 6:10 PM, Chinmay Soman wrote: > That's what we're doing as well - appending partition count to the Kafka > topic name. This actually helps keep track of the #partitions for each

Re: Kafka topic naming conventions

2015-03-18 Thread Chinmay Soman
That's what we're doing as well - appending partition count to the Kafka topic name. This actually helps keep track of the #partitions for each topic (since Kafka doesn't have a Metadata store yet). In case of topic expansion - we actually just resort to creating a new topic. Although that is an ov

Re: Kafka topic naming conventions

2015-03-18 Thread Chris Riccomini
Hey Jakob, > Yeah, but then if you change the partition count later on, you've got > incorrect information forever. You're right. But IMO this further reinforces that you *can't* change partition counts on a topic that you're using for a JOIN. This completely breaks the operation. Agree that it

Re: Kafka topic naming conventions

2015-03-18 Thread Jakob Homan
On 18 March 2015 at 17:48, Chris Riccomini wrote: > One thing I haven't seen, but might be relevant, is including partition > counts in the topic. Yeah, but then if you change the partition count later on, you've got incorrect information forever. Or you need to create a new stream, which might b

Re: Review Request 32188: Disable WAL in RocksDB KV store

2015-03-18 Thread Yi Pan (Data Infrastructure)
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/32188/#review76989 --- Ship it! Ship It! - Yi Pan (Data Infrastructure) On March 18, 20

Re: Kafka topic naming conventions

2015-03-18 Thread Chris Riccomini
Hey Roger, We haven't thought about this in great detail. People do all kinds of wacky things in practice. We have some that are like, "AdViewsByMemberId". There are various permutations of that. One thing I haven't seen, but might be relevant, is including partition counts in the topic. If you'r

Kafka topic naming conventions

2015-03-18 Thread Roger Hoover
Hi, Wondering what naming conventions people are using for topics in Kafka. When there's re-partitioning involved, you can end up with multiple topics that have the exact same data but are partitioned differently. How do you name them? Thanks, Roger
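One convention raised in the replies to this thread is to append the partition count, and possibly the partition key, to the topic name. Below is a minimal Python sketch of that idea. The `<name>-by-<key>-p<count>` format and the helper names (`build_topic_name`, `parse_topic_name`, `can_join`) are assumptions made for illustration; nobody in the thread specifies an exact format, and none of this is a Kafka or Samza API.

```python
import re

# Hypothetical format (an assumption, not a Kafka or Samza convention):
#   <name>-by-<partition key>-p<partition count>
# Hyphens are excluded from <name> and <key> so the separators stay unambiguous.
_TOPIC_RE = re.compile(
    r"^(?P<name>[A-Za-z0-9_.]+)-by-(?P<key>[A-Za-z0-9_.]+)-p(?P<count>\d+)$"
)


def build_topic_name(name, partition_key, partition_count):
    """Build a topic name that records the partition key and partition count."""
    if partition_count < 1:
        raise ValueError("partition_count must be at least 1")
    topic = f"{name}-by-{partition_key}-p{partition_count}"
    if _TOPIC_RE.match(topic) is None:
        raise ValueError(f"name or key contains unsupported characters: {topic!r}")
    return topic


def parse_topic_name(topic):
    """Split a topic name back into (name, partition_key, partition_count)."""
    match = _TOPIC_RE.match(topic)
    if match is None:
        raise ValueError(f"topic does not follow the naming sketch: {topic!r}")
    return match["name"], match["key"], int(match["count"])


def can_join(topic_a, topic_b):
    """Return True if both names claim the same partition key and count."""
    _, key_a, count_a = parse_topic_name(topic_a)
    _, key_b, count_b = parse_topic_name(topic_b)
    return key_a == key_b and count_a == count_b


if __name__ == "__main__":
    views = build_topic_name("AdViews", "MemberId", 8)
    clicks = build_topic_name("AdClicks", "MemberId", 8)
    print(views, clicks, can_join(views, clicks))
```

A check like `can_join` only compares what the names claim. As noted in the replies, the name goes stale if the partition count is changed later, so this is a guard against configuration mistakes rather than a source of truth.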

Re: Example Samza job using Confluent Platform

2015-03-18 Thread Roger Hoover
Thanks, Chris. On Wed, Mar 18, 2015 at 11:29 AM, Chris Riccomini wrote: > Hey Roger, > > This is awesome! I've added it to the ecosystem wiki: > > https://cwiki.apache.org/confluence/display/SAMZA/Ecosystem > > Cheers, > Chris > > On Wed, Mar 18, 2015 at 10:59 AM, Roger Hoover > wrote: > > >

Re: Review Request 32155: SAMZA-458: Close in KafkaSystemProducer should flush all source buffers

2015-03-18 Thread Yan Fang
> On March 18, 2015, 9:01 p.m., Navina Ramesh wrote: > > samza-kafka/src/main/scala/org/apache/samza/system/kafka/KafkaSystemProducer.scala, > > line 49 > > > > > > Can you explain the purpose of "failedSources"? It i

Re: Review Request 32188: Disable WAL in RocksDB KV store

2015-03-18 Thread Navina Ramesh
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/32188/ --- (Updated March 18, 2015, 11:01 p.m.) Review request for samza. Repository: sa

Re: Review Request 32188: Disable WAL in RocksDB KV store

2015-03-18 Thread Navina Ramesh
> On March 18, 2015, 9:56 p.m., Yi Pan (Data Infrastructure) wrote: > > samza-test/src/main/scala/org/apache/samza/test/performance/TestKeyValuePerformance.scala, > > line 135 > > > > > > nit: do we need multiple inst

Re: Review Request 32188: Disable WAL in RocksDB KV store

2015-03-18 Thread Yi Pan (Data Infrastructure)
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/32188/#review76954 --- Overall looks good to me. samza-test/src/main/scala/org/apache/sam

Re: Review Request 32188: Disable WAL in RocksDB KV store

2015-03-18 Thread Navina Ramesh
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/32188/ --- (Updated March 18, 2015, 9:48 p.m.) Review request for samza. Repository: sam

Re: Review Request 32155: SAMZA-458: Close in KafkaSystemProducer should flush all source buffers

2015-03-18 Thread Navina Ramesh
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/32155/#review76944 --- samza-kafka/src/main/scala/org/apache/samza/system/kafka/KafkaSyste

Review Request 32217: SAMZA 567

2015-03-18 Thread Naveen Somasundaram
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/32217/ --- Review request for samza. Repository: samza Description --- Moved the pr

Re: Example Samza job using Confluent Platform

2015-03-18 Thread Chris Riccomini
Hey Roger, This is awesome! I've added it to the ecosystem wiki: https://cwiki.apache.org/confluence/display/SAMZA/Ecosystem Cheers, Chris On Wed, Mar 18, 2015 at 10:59 AM, Roger Hoover wrote: > Hi all, > > In case others find this useful I created a simple Samza job that uses Avro > + the

Example Samza job using Confluent Platform

2015-03-18 Thread Roger Hoover
Hi all, In case others find this useful I created a simple Samza job that uses Avro + the schema registry from the Confluent Platform. It's using the pending Pull Request to decode to Avro SpecificRecords. Suggestions for improvement are also welcome. https://github.com/theduderog/hello-samza-c

Re: SamzaException: no job factory class defined

2015-03-18 Thread Tommy Becker
Yes, Jetty was repackaged to org.eclipse.jetty in version 7. On 03/18/2015 12:39 PM, Jordi Blasi Uribarri wrote: That was it, and some more configurations that were missing: task.class=samzafroga.job1 job.factory.class=org.apache.samza.job.local.ThreadJobFactory job.name=samzafroga.job1 systems

RE: SamzaException: no job factory class defined

2015-03-18 Thread Jordi Blasi Uribarri
That was it, and some more configurations that were missing: task.class=samzafroga.job1 job.factory.class=org.apache.samza.job.local.ThreadJobFactory job.name=samzafroga.job1 systems.kafka.producer.bootstrap.servers=broker01:9092 Now I am getting this exception: Exception in thread "main" java.l

Re: SamzaException: no job factory class defined

2015-03-18 Thread Roger Hoover
Hi Jordi, I think you need to add the "job.factory.class" property. http://samza.apache.org/learn/documentation/0.8/jobs/configuration-table.html #An example job.factory.class=org.apache.samza.job.local.ThreadJobFactory Cheers, Roger On Wed, Mar 18, 2015 at 8:45 AM, Jordi Blasi Uribarri wrote

SamzaException: no job factory class defined

2015-03-18 Thread Jordi Blasi Uribarri
Hello, I am trying to run my first job (publish what it receives) in Samza and I think that all the dependencies were added by configuring the Maven repositories (solved in a recent question to the list). I am getting another exception on the Job runner: #/opt/jobs# bin/run-job.sh --config-fact

Review Request 32202: SAMZA-456 - Add samza-yarn jar to multi-node tutorial.

2015-03-18 Thread Tommy Becker
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/32202/ --- Review request for samza. Repository: samza Description --- Fix for SAMZ