Re: Kafka topic naming conventions
Renato,

Thanks for the link. Some interesting suggestions there as well.

On Thu, Mar 19, 2015 at 2:28 AM, Renato Marroquín Mogrovejo <renatoj.marroq...@gmail.com> wrote:
> There was an interesting discussion over in the Kafka mailing list that
> might give you more ideas, Roger.
> Although they don't mention anything about the number of partitions when
> doing so, anyways maybe it helps.
>
> Renato M.
>
> [1] https://www.mail-archive.com/users@kafka.apache.org/msg11976.html
Re: Kafka topic naming conventions
There was an interesting discussion over in the Kafka mailing list that might give you more ideas, Roger. Although they don't mention anything about the number of partitions when doing so, anyways maybe it helps.

Renato M.

[1] https://www.mail-archive.com/users@kafka.apache.org/msg11976.html

2015-03-19 5:43 GMT+01:00 Roger Hoover:
> Thanks, guys. I was also playing around with including partition count and
> even the partition key in the topic name. My thought was that topics may
> have the same data and number of partitions but only differ by partition
> key. After a while, the naming does get crazy (too long and ugly). We
> really need a topic metadata store.
Re: Kafka topic naming conventions
Thanks, guys. I was also playing around with including partition count and even the partition key in the topic name. My thought was that topics may have the same data and number of partitions but only differ by partition key. After a while, the naming does get crazy (too long and ugly). We really need a topic metadata store.

On Wed, Mar 18, 2015 at 6:21 PM, Chinmay Soman wrote:
> Yeah! It does seem a bit hackish - but I think this approach promises fewer
> config/operation errors.
>
> Although I think some of these checks can be built within Samza - assuming
> Kafka has a metadata store in the near future - the Samza container can
> validate the #topics against this store.
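A minimal sketch of what putting both the partition key and the partition count in the topic name could look like. The `<dataset>.by-<key>.p<count>` layout and the names `TopicName`, `build_topic_name` and `parse_topic_name` are assumptions made up for illustration; nobody in the thread proposed this exact format.

```python
import re
from typing import NamedTuple


class TopicName(NamedTuple):
    dataset: str        # what the data is, e.g. "AdViews"
    partition_key: str  # field the topic is partitioned by
    partitions: int     # partition count baked into the name


# Hypothetical layout: <dataset>.by-<partition_key>.p<partitions>
# Only letters, digits, '_' and '-' are allowed inside each part, so the
# dots stay unambiguous as separators.
_PATTERN = re.compile(
    r"^(?P<dataset>[A-Za-z0-9_-]+)"
    r"\.by-(?P<key>[A-Za-z0-9_-]+)"
    r"\.p(?P<count>[1-9][0-9]*)$"
)


def build_topic_name(dataset: str, partition_key: str, partitions: int) -> str:
    """Compose a topic name and reject anything that would not parse back."""
    name = f"{dataset}.by-{partition_key}.p{partitions}"
    if _PATTERN.match(name) is None:
        raise ValueError(f"cannot build a valid topic name from {name!r}")
    return name


def parse_topic_name(name: str) -> TopicName:
    """Recover dataset, partition key and partition count from a name."""
    match = _PATTERN.match(name)
    if match is None:
        raise ValueError(f"{name!r} does not follow the naming layout")
    return TopicName(
        match.group("dataset"),
        match.group("key"),
        int(match.group("count")),
    )
```

With this layout, `AdViews.by-MemberId.p8` and `AdViews.by-AdId.p8` read as the same data partitioned by different keys, which is the case described above. It also shows the cost: every extra fact makes the name longer.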
Re: Kafka topic naming conventions
Yeah! It does seem a bit hackish - but I think this approach promises fewer config/operation errors.

Although I think some of these checks can be built within Samza - assuming Kafka has a metadata store in the near future - the Samza container can validate the #topics against this store.

On Wed, Mar 18, 2015 at 6:16 PM, Chris Riccomini wrote:
> Hey Chinmay,
>
> Cool, this is good feedback. I didn't think I was *that* crazy. :)
>
> Cheers,
> Chris

--
Thanks and regards

Chinmay Soman
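One reading of the check mentioned above is to compare the partition count declared in each topic name with the count the cluster actually reports. The sketch below assumes a hypothetical `-p<count>` suffix and takes the real counts as a plain dict; how that dict is obtained (a metadata store, an admin client, a script) is left open, and `declared_partitions` and `find_mismatches` are illustrative names, not Samza or Kafka APIs.

```python
import re
from typing import Dict, Mapping, Optional, Tuple

# Hypothetical convention: topic names end in "-p<partition count>".
_SUFFIX = re.compile(r"-p([1-9][0-9]*)$")


def declared_partitions(topic: str) -> Optional[int]:
    """Return the count encoded in the name, or None if there is no suffix."""
    match = _SUFFIX.search(topic)
    return int(match.group(1)) if match else None


def find_mismatches(actual: Mapping[str, int]) -> Dict[str, Tuple[int, int]]:
    """Map each offending topic to (declared count, actual count).

    `actual` is assumed to come from whatever metadata source is available.
    Topics without a suffix are skipped rather than reported.
    """
    mismatches: Dict[str, Tuple[int, int]] = {}
    for topic, count in actual.items():
        declared = declared_partitions(topic)
        if declared is not None and declared != count:
            mismatches[topic] = (declared, count)
    return mismatches
```

A container could run a check like this at startup and refuse to start when the result is non-empty.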
Re: Kafka topic naming conventions
Hey Chinmay,

Cool, this is good feedback. I didn't think I was *that* crazy. :)

Cheers,
Chris

On Wed, Mar 18, 2015 at 6:10 PM, Chinmay Soman wrote:
> That's what we're doing as well - appending partition count to the Kafka
> topic name. This actually helps keep track of the #partitions for each
> topic (since Kafka doesn't have a Metadata store yet).
>
> In case of topic expansion - we actually just resort to creating a new
> topic. Although that is an overhead - the thought process is that this will
> minimize operational errors. Also, this is necessary to do in case we're
> doing some kind of joins.
Re: Kafka topic naming conventions
That's what we're doing as well - appending partition count to the Kafka topic name. This actually helps keep track of the #partitions for each topic (since Kafka doesn't have a Metadata store yet).

In case of topic expansion - we actually just resort to creating a new topic. Although that is an overhead - the thought process is that this will minimize operational errors. Also, this is necessary to do in case we're doing some kind of joins.

On Wed, Mar 18, 2015 at 5:59 PM, Jakob Homan wrote:
> On 18 March 2015 at 17:48, Chris Riccomini wrote:
> > One thing I haven't seen, but might be relevant, is including partition
> > counts in the topic.
>
> Yeah, but then if you change the partition count later on, you've got
> incorrect information forever. Or you need to create a new stream,
> which might be a nice forcing function to make sure your join isn't
> screwed up. There'd need to be something somewhere to enforce that
> though.

--
Thanks and regards

Chinmay Soman
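The "expansion means a new topic" rule can be sketched as a helper that derives the new topic's name instead of altering the old topic. The `-p<count>` suffix and the function name `expanded_topic_name` are assumptions for illustration; creating the topic and migrating producers and consumers are outside this sketch.

```python
import re

# Hypothetical convention: "<base>-p<partition count>".
_NAME = re.compile(r"^(?P<base>.+)-p(?P<count>[1-9][0-9]*)$")


def expanded_topic_name(current: str, new_partitions: int) -> str:
    """Return the name of the new topic that replaces `current`.

    The existing topic is left alone, so its name never becomes stale.
    """
    match = _NAME.match(current)
    if match is None:
        raise ValueError(f"{current!r} has no '-p<count>' suffix")
    old_partitions = int(match.group("count"))
    if new_partitions <= old_partitions:
        raise ValueError(
            f"expansion must increase the partition count "
            f"({old_partitions} -> {new_partitions})"
        )
    return f"{match.group('base')}-p{new_partitions}"
```

For example, expanding `ad-views-p8` to 16 partitions yields `ad-views-p16`, and the old topic keeps a name that still matches its real partition count.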
Re: Kafka topic naming conventions
Hey Jakob,

> Yeah, but then if you change the partition count later on, you've got
> incorrect information forever.

You're right. But IMO this further reinforces that you *can't* change partition counts on a topic that you're using for a JOIN. Doing so completely breaks the join.

Agree that it's just best effort, and kind of hacky. Was just a thought. I haven't seen anyone actually do this.

Cheers,
Chris

On Wed, Mar 18, 2015 at 5:59 PM, Jakob Homan wrote:
> On 18 March 2015 at 17:48, Chris Riccomini wrote:
> > One thing I haven't seen, but might be relevant, is including partition
> > counts in the topic.
>
> Yeah, but then if you change the partition count later on, you've got
> incorrect information forever. Or you need to create a new stream,
> which might be a nice forcing function to make sure your join isn't
> screwed up. There'd need to be something somewhere to enforce that
> though.
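A small sketch of why a partition-count change breaks a join. It assumes integer keys routed by a plain modulo, which is only a stand-in for whatever hash partitioner is really in use; `partition_for` and `misrouted_keys` are hypothetical names.

```python
from typing import Iterable, List


def partition_for(member_id: int, partitions: int) -> int:
    """Simplified stand-in for a hash partitioner: key modulo partition count."""
    return member_id % partitions


def misrouted_keys(
    keys: Iterable[int], left_partitions: int, right_partitions: int
) -> List[int]:
    """Keys that land in different partition numbers in the two topics.

    A task that joins partition i of the left topic with partition i of the
    right topic never sees both sides of these keys.
    """
    return [
        key
        for key in keys
        if partition_for(key, left_partitions)
        != partition_for(key, right_partitions)
    ]


# Same partition count on both sides: every key is co-located.
print(misrouted_keys(range(8), 4, 4))  # → []

# Right topic expanded from 4 to 8 partitions: half of these keys move.
print(misrouted_keys(range(8), 4, 8))  # → [4, 5, 6, 7]
```

Under this assumption, keeping both inputs at the same partition count (and partitioned by the same key) is what lets matching records meet in the same task.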
Re: Kafka topic naming conventions
On 18 March 2015 at 17:48, Chris Riccomini wrote:
> One thing I haven't seen, but might be relevant, is including partition
> counts in the topic.

Yeah, but then if you change the partition count later on, you've got incorrect information forever. Or you need to create a new stream, which might be a nice forcing function to make sure your join isn't screwed up. There'd need to be something somewhere to enforce that though.
Re: Kafka topic naming conventions
Hey Roger,

We haven't thought about this in great detail. People do all kinds of wacky things in practice. We have some that are like, "AdViewsByMemberId". There are various permutations of that.

One thing I haven't seen, but might be relevant, is including partition counts in the topic. If you're doing joins, you kind of care about both the join key and partition count.

Sorry I don't have better guidance. :/

Cheers,
Chris

On Wed, Mar 18, 2015 at 5:23 PM, Roger Hoover wrote:
> Hi,
>
> Wondering what naming conventions people are using for topics in Kafka.
> When there's re-partitioning involved, you can end up with multiple topics
> that have the exact same data but are partitioned differently. How do you
> name them?
>
> Thanks,
>
> Roger