I absolutely disagree with #2, Neha. That will break a lot of
infrastructure within LinkedIn. That said, removing "." might break other
people as well, but I think we should have a clearer idea of how much usage
there is on either side.
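
As a rough way to get at that usage question, here is a minimal Scala sketch.
It is not from the Kafka codebase; `TopicNameSurvey` and its method names are
hypothetical. Given a list of topic names, it counts how many use "." versus
"_" and reports which names would collide, assuming metric names are derived
by replacing "." with "_" as described further down this thread.

```scala
object TopicNameSurvey {
  // Assumption: the metric name is the topic name with every '.' replaced
  // by '_', per the behavior described in this thread.
  def metricName(topic: String): String = topic.replace('.', '_')

  // Groups of distinct topic names that map to the same metric name.
  // Only groups with more than one member are returned.
  def collisions(topics: Seq[String]): Map[String, Seq[String]] =
    topics.distinct
      .groupBy(metricName)
      .filter { case (_, names) => names.size > 1 }

  // (number of topics containing '.', number of topics containing '_').
  // A topic containing both characters is counted on both sides.
  def separatorUsage(topics: Seq[String]): (Int, Int) =
    (topics.count(_.exists(_ == '.')), topics.count(_.exists(_ == '_')))
}
```

Running `separatorUsage` and `collisions` over the topic list from each
cluster would show how many topics each option affects and whether anyone
actually has a colliding pair today.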

-Todd


On Fri, Jul 10, 2015 at 2:08 PM, Neha Narkhede <n...@confluent.io> wrote:

> "." seems natural for grouping topic names. +1 for 2) going forward only
> without breaking previously created topics with "_" though that might
> require us to patch the code somewhat awkwardly till we phase it out a
> couple (purposely left vague to stay out of Ewen's wrath :-)) versions
> later.
>
> On Fri, Jul 10, 2015 at 2:02 PM, Gwen Shapira <gshap...@cloudera.com>
> wrote:
>
> > I don't think we should break existing topics. Just disallow new
> > topics going forward.
> >
> > Agree that having both is horrible, but we should have a solution that
> > fails when you run "kafka-topics.sh --create", not when you configure
> > Ganglia.
> >
> > Gwen
> >
> > On Fri, Jul 10, 2015 at 1:53 PM, Jay Kreps <j...@confluent.io> wrote:
> > > Unfortunately '.' is pretty common too. I agree that it is perverse,
> but
> > > people seem to do it. Breaking all the topics with '.' in the name
> seems
> > > like it could be worse than combining metrics for people who have a
> > > 'foo_bar' AND 'foo.bar' (and after all, having both is DEEPLY perverse,
> > > no?).
> > >
> > > Where is our Dean of Compatibility, Ewen, on this?
> > >
> > > -Jay
> > >
> > > On Fri, Jul 10, 2015 at 1:32 PM, Todd Palino <tpal...@gmail.com>
> wrote:
> > >
> > >> My selfish point of view is that we do #1, as we use "_" extensively
> in
> > >> topic names here :) I also happen to think it's the right choice,
> > >> specifically because "." has more special meanings, as you noted.
> > >>
> > >> -Todd
> > >>
> > >>
> > >> On Fri, Jul 10, 2015 at 1:30 PM, Gwen Shapira <gshap...@cloudera.com>
> > >> wrote:
> > >>
> > >> > Unintentional side effect from allowing IP addresses in consumer
> > client
> > >> > IDs :)
> > >> >
> > >> > So the question is, what do we do now?
> > >> >
> > >> > 1) disallow "."
> > >> > 2) disallow "_"
> > >> > 3) find a reversible way to encode "." and "_" that won't break
> > existing
> > >> > metrics
> > >> > 4) all of the above?
> > >> >
> > >> > btw. it looks like "." and ".." are currently valid. Topic names are
> > >> > used for directories, right? this sounds like fun :)
> > >> >
> > >> > I vote for option #1, although if someone has a good idea for #3 it
> > >> > will be even better.
> > >> >
> > >> > Gwen
> > >> >
> > >> >
> > >> >
> > >> > On Fri, Jul 10, 2015 at 1:22 PM, Grant Henke <ghe...@cloudera.com>
> > >> wrote:
> > >> > > Found it was added here:
> > >> https://issues.apache.org/jira/browse/KAFKA-697
> > >> > >
> > >> > > On Fri, Jul 10, 2015 at 3:18 PM, Todd Palino <tpal...@gmail.com>
> > >> wrote:
> > >> > >
> > >> > >> This was definitely changed at some point after KAFKA-495. The
> > >> question
> > >> > is
> > >> > >> when and why.
> > >> > >>
> > >> > >> Here's the relevant code from that patch:
> > >> > >>
> > >> > >>
> ===================================================================
> > >> > >> --- core/src/main/scala/kafka/utils/Topic.scala (revision
> 1390178)
> > >> > >> +++ core/src/main/scala/kafka/utils/Topic.scala (working copy)
> > >> > >> @@ -21,24 +21,21 @@
> > >> > >>  import util.matching.Regex
> > >> > >>
> > >> > >>  object Topic {
> > >> > >> +  val legalChars = "[a-zA-Z0-9_-]"
> > >> > >>
> > >> > >>
> > >> > >>
> > >> > >> -Todd
> > >> > >>
> > >> > >>
> > >> > >> On Fri, Jul 10, 2015 at 1:02 PM, Grant Henke <
> ghe...@cloudera.com>
> > >> > wrote:
> > >> > >>
> > >> > >> > kafka.common.Topic shows that currently period is a valid
> > character
> > >> > and I
> > >> > >> > have verified I can use kafka-topics.sh to create a new topic
> > with a
> > >> > >> > period.
> > >> > >> >
> > >> > >> >
> > >> > >> > AdminUtils.createOrUpdateTopicPartitionAssignmentPathInZK
> > currently
> > >> > uses
> > >> > >> > Topic.validate before writing to Zookeeper.
> > >> > >> >
> > >> > >> > Should period character support be removed? I was under the
> same
> > >> > >> impression
> > >> > >> > as Gwen, that a period was used by many as a way to "group"
> > topics.
> > >> > >> >
> > >> > >> > The code is pasted below since it's small:
> > >> > >> >
> > >> > >> > object Topic {
> > >> > >> >   val legalChars = "[a-zA-Z0-9\\._\\-]"
> > >> > >> >   private val maxNameLength = 255
> > >> > >> >   private val rgx = new Regex(legalChars + "+")
> > >> > >> >
> > >> > >> >   val InternalTopics = Set(OffsetManager.OffsetsTopicName)
> > >> > >> >
> > >> > >> >   def validate(topic: String) {
> > >> > >> >     if (topic.length <= 0)
> > >> > >> >       throw new InvalidTopicException("topic name is illegal,
> > can't
> > >> be
> > >> > >> > empty")
> > >> > >> >     else if (topic.equals(".") || topic.equals(".."))
> > >> > >> >       throw new InvalidTopicException("topic name cannot be
> > \".\" or
> > >> > >> > \"..\"")
> > >> > >> >     else if (topic.length > maxNameLength)
> > >> > >> >       throw new InvalidTopicException("topic name is illegal,
> > can't
> > >> be
> > >> > >> > longer than " + maxNameLength + " characters")
> > >> > >> >
> > >> > >> >     rgx.findFirstIn(topic) match {
> > >> > >> >       case Some(t) =>
> > >> > >> >         if (!t.equals(topic))
> > >> > >> >           throw new InvalidTopicException("topic name " + topic
> > + "
> > >> is
> > >> > >> > illegal, contains a character other than ASCII alphanumerics,
> > '.',
> > >> '_'
> > >> > >> and
> > >> > >> > '-'")
> > >> > >> >       case None => throw new InvalidTopicException("topic name
> "
> > +
> > >> > topic
> > >> > >> +
> > >> > >> > " is illegal,  contains a character other than ASCII
> > alphanumerics,
> > >> > '.',
> > >> > >> > '_' and '-'")
> > >> > >> >     }
> > >> > >> >   }
> > >> > >> > }
> > >> > >> >
> > >> > >> > On Fri, Jul 10, 2015 at 2:50 PM, Todd Palino <
> tpal...@gmail.com>
> > >> > wrote:
> > >> > >> >
> > >> > >> > > I had to go look this one up again to make sure -
> > >> > >> > > https://issues.apache.org/jira/browse/KAFKA-495
> > >> > >> > >
> > >> > >> > > The only valid characters for topic names are alphanumeric,
> > >> > underscore,
> > >> > >> > and
> > >> > >> > > dash. A period is not supposed to be a valid character to
> use.
> > If
> > >> > >> you're
> > >> > >> > > seeing them, then one of two things has happened:
> > >> > >> > >
> > >> > >> > > 1) You have topic names that are grandfathered in from before
> > that
> > >> > >> patch
> > >> > >> > > 2) The patch is not working properly and there is somewhere
> in
> > the
> > >> > >> broker
> > >> > >> > > that the standard is not being enforced.
> > >> > >> > >
> > >> > >> > > -Todd
> > >> > >> > >
> > >> > >> > >
> > >> > >> > > On Fri, Jul 10, 2015 at 12:13 PM, Brock Noland <
> > br...@apache.org>
> > >> > >> wrote:
> > >> > >> > >
> > >> > >> > > > On Fri, Jul 10, 2015 at 11:34 AM, Gwen Shapira <
> > >> > >> gshap...@cloudera.com>
> > >> > >> > > > wrote:
> > >> > >> > > > > Hi Kafka Fans,
> > >> > >> > > > >
> > >> > >> > > > > If you have one topic named "kafka_lab_2" and the other
> > named
> > >> > >> > > > > "kafka.lab.2", the topic level metrics will be named
> > >> kafka_lab_2
> > >> > >> for
> > >> > >> > > > > both, effectively making it impossible to monitor them
> > >> properly.
> > >> > >> > > > >
> > >> > >> > > > > The reason this happens is that using "." in topic names
> is
> > >> > pretty
> > >> > >> > > > > common, especially as a way to group topics into data
> > centers,
> > >> > >> > > > > relevant apps, etc - basically a work-around to our
> current
> > >> > lack of
> > >> > >> > > > > name spaces. However, most metric monitoring systems
> use
> > "."
> > >> > to
> > >> > >> > > > > annotate hierarchy, so to avoid issues around metric
> names,
> > >> > Kafka
> > >> > >> > > > > replaces the "." in the name with an underscore.
> > >> > >> > > > >
> > >> > >> > > > > This generates good metric names, but creates the problem
> > with
> > >> > name
> > >> > >> > > > collisions.
> > >> > >> > > > >
> > >> > >> > > > > I'm wondering if it makes sense to simply limit the range
> > of
> > >> > >> > > > > characters permitted in a topic name and disallow "_"?
> > >> Obviously
> > >> > >> > > > > existing topics will need to remain as is, which is a bit
> > >> > awkward.
> > >> > >> > > >
> > >> > >> > > > Interesting problem! Many if not most users I personally am
> > >> aware
> > >> > of
> > >> > >> > > > use "_" as a separator in topic names. I am sure that many
> > users
> > >> > >> would
> > >> > >> > > > be quite surprised by this limitation. With that said, I am
> > sure
> > >> > >> > > > they'd transition accordingly.
> > >> > >> > > >
> > >> > >> > > > >
> > >> > >> > > > > If anyone has better backward-compatible solutions to
> this,
> > >> I'm
> > >> > all
> > >> > >> > > ears
> > >> > >> > > > :)
> > >> > >> > > > >
> > >> > >> > > > > Gwen
> > >> > >> > > >
> > >> > >> > >
> > >> > >> >
> > >> > >> >
> > >> > >> >
> > >> > >> > --
> > >> > >> > Grant Henke
> > >> > >> > Solutions Consultant | Cloudera
> > >> > >> > ghe...@cloudera.com | twitter.com/gchenke |
> > >> > linkedin.com/in/granthenke
> > >> > >> >
> > >> > >>
> > >> > >
> > >> > >
> > >> > >
> > >> > > --
> > >> > > Grant Henke
> > >> > > Solutions Consultant | Cloudera
> > >> > > ghe...@cloudera.com | twitter.com/gchenke |
> > linkedin.com/in/granthenke
> > >> >
> > >>
> >
>
>
>
> --
> Thanks,
> Neha
>
