Perhaps we can discuss this during the next KIP hangout?

I do see that a pluggable rack locator can be useful, but I have a few concerns:

- The RackLocator (as described in the document) implies that it can
  discover rack information for any node in the cluster. How does it deal
  with rack location changes? For example, if I moved broker id 1 from
  rack X to Y, I only have to start that broker with a newer rack config.
  If RackLocator discovers broker -> rack information at startup time, any
  change to a broker will require bouncing the entire cluster, since
  createTopic requests can be sent to any node in the cluster. For this
  reason it may be simpler to have each node be aware of its own rack and
  persist it in ZK at startup time.
- A pluggable RackLocator relies on an external service being available to
  serve rack information.

Out of curiosity, I looked up how a couple of other systems deal with
zone/rack awareness. For Cassandra, some interesting modes are:

(Property File configuration)
http://docs.datastax.com/en/cassandra/2.0/cassandra/architecture/architectureSnitchPFSnitch_t.html
(Dynamic inference)
http://docs.datastax.com/en/cassandra/2.0/cassandra/architecture/architectureSnitchRackInf_c.html

Voldemort does a static node -> zone assignment based on configuration.

Aditya

On Wed, Sep 30, 2015 at 10:05 AM, Allen Wang <allenxw...@gmail.com> wrote:

> I would like to see if we can do both:
>
> - Make RackLocator pluggable to facilitate migration with existing
> broker-rack mapping
>
> - Make rack an optional property for broker. If rack is available from
> broker, treat it as source of truth. For users with existing broker-rack
> mapping somewhere else, they can use the pluggable way or they can transfer
> the mapping to the broker rack property.
>
> One thing I am not sure is what happens at rolling upgrade when we have
> rack as a broker property. For brokers with older version of Kafka, will it
> cause problem for them? If so, is there any workaround? I also think it
> would be better not to have rack in the controller wire protocol but not
> sure if it is achievable.
>
> Thanks,
> Allen
>
> On Mon, Sep 28, 2015 at 4:55 PM, Todd Palino <tpal...@gmail.com> wrote:
>
> > I tend to like the idea of a pluggable locator. For example, we already
> > have an interface for discovering information about the physical location
> > of servers. I don't relish the idea of having to maintain data in multiple
> > places.
> >
> > -Todd
> >
> > On Mon, Sep 28, 2015 at 4:48 PM, Aditya Auradkar <
> > aaurad...@linkedin.com.invalid> wrote:
> >
> > > Thanks for starting this KIP Allen.
> > >
> > > I agree with Gwen that having a RackLocator class that is pluggable seems
> > > to be too complex. The KIP refers to potentially non-ZK storage for the
> > > rack info which I don't think is necessary.
> > >
> > > Perhaps we can persist this info in zk under /brokers/ids/<broker_id>
> > > similar to other broker properties and add a config in KafkaConfig called
> > > "rack".
> > > {"jmx_port":-1,"endpoints":[...],"host":"xxx","port":yyy, "rack": "abc"}
> > >
> > > Aditya
> > >
> > > On Mon, Sep 28, 2015 at 2:30 PM, Gwen Shapira <g...@confluent.io> wrote:
> > >
> > > > Hi,
> > > >
> > > > First, thanks for putting out a KIP for this. This is super important for
> > > > production deployments of Kafka.
> > > >
> > > > Few questions:
> > > >
> > > > 1) Are we sure we want "as many racks as possible"? I'd want to balance
> > > > between safety (more racks) and network utilization (traffic within a rack
> > > > uses the high-bandwidth TOR switch). One replica on a different rack and
> > > > the rest on same rack (if possible) sounds better to me.
> > > >
> > > > 2) Rack-locator class seems overly complex compared to adding a rack.number
> > > > property to the broker properties file. Why do we want that?
> > > >
> > > > Gwen
> > > >
> > > > On Mon, Sep 28, 2015 at 12:15 PM, Allen Wang <allenxw...@gmail.com> wrote:
> > > >
> > > > > Hello Kafka Developers,
> > > > >
> > > > > I just created KIP-36 for rack aware replica assignment.
> > > > >
> > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-36+Rack+aware+replica+assignment
> > > > >
> > > > > The goal is to utilize the isolation provided by the racks in data center
> > > > > and distribute replicas to racks to provide fault tolerance.
> > > > >
> > > > > Comments are welcome.
> > > > >
> > > > > Thanks,
> > > > > Allen
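
To make the goal stated in the original announcement concrete (distributing replicas across racks), here is a minimal sketch in Java. It is not the algorithm from KIP-36 or from Kafka itself. The class and method names (RackAwareAssignmentSketch, interleaveByRack, assignReplicas) are hypothetical, and the broker -> rack map is assumed to be already known, whether it comes from a per-broker "rack" property or from a locator. The sketch implements the "as many racks as possible" policy, which is the one questioned in point 1) above; a policy that keeps most replicas on one rack would need a different selection step.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical illustration only; not Kafka's or KIP-36's implementation.
final class RackAwareAssignmentSketch {

    // Orders brokers rack by rack: the first broker of every rack,
    // then the second broker of every rack, and so on.
    static List<Integer> interleaveByRack(Map<Integer, String> brokerToRack) {
        // TreeMaps keep the ordering deterministic across runs.
        Map<String, List<Integer>> byRack = new TreeMap<>();
        for (Map.Entry<Integer, String> e : new TreeMap<>(brokerToRack).entrySet()) {
            byRack.computeIfAbsent(e.getValue(), r -> new ArrayList<>()).add(e.getKey());
        }
        List<Integer> ordered = new ArrayList<>();
        for (int i = 0; ordered.size() < brokerToRack.size(); i++) {
            for (List<Integer> brokers : byRack.values()) {
                if (i < brokers.size()) {
                    ordered.add(brokers.get(i));
                }
            }
        }
        return ordered;
    }

    // Picks replicas for one partition so that they cover
    // min(replicationFactor, number of racks) distinct racks.
    static List<Integer> assignReplicas(Map<Integer, String> brokerToRack,
                                        int partition, int replicationFactor) {
        if (replicationFactor > brokerToRack.size()) {
            throw new IllegalArgumentException("replication factor exceeds broker count");
        }
        List<Integer> ordered = interleaveByRack(brokerToRack);
        int n = ordered.size();
        List<Integer> replicas = new ArrayList<>();
        Set<String> usedRacks = new HashSet<>();

        // Pass 1: starting at an offset derived from the partition id,
        // take only brokers whose rack has not been used yet.
        for (int i = 0; i < n && replicas.size() < replicationFactor; i++) {
            int broker = ordered.get(Math.floorMod(partition + i, n));
            if (usedRacks.add(brokerToRack.get(broker))) {
                replicas.add(broker);
            }
        }
        // Pass 2: every rack is covered; fill the rest with unused brokers.
        for (int i = 0; i < n && replicas.size() < replicationFactor; i++) {
            int broker = ordered.get(Math.floorMod(partition + i, n));
            if (!replicas.contains(broker)) {
                replicas.add(broker);
            }
        }
        return replicas;
    }
}
```

With brokers 0 and 1 on rack "a", 2 and 3 on rack "b", and 4 on rack "c", calling assignReplicas(map, 0, 3) selects one broker from each of the three racks. Because the map is passed in on every call, a broker that re-registers with a new rack is picked up by the next assignment without restarting anything else, which is the property the per-broker approach is meant to preserve.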