[jira] [Commented] (KAFKA-1215) Rack-Aware replica assignment option
[ https://issues.apache.org/jira/browse/KAFKA-1215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15201709#comment-15201709 ]

Jun Rao commented on KAFKA-1215:

[~allenxwang], could you also update the changes to ZK structure in https://cwiki.apache.org/confluence/display/KAFKA/Kafka+data+structures+in+Zookeeper ?

> Rack-Aware replica assignment option
>
> Key: KAFKA-1215
> URL: https://issues.apache.org/jira/browse/KAFKA-1215
> Project: Kafka
> Issue Type: Improvement
> Components: replication
> Affects Versions: 0.8.0
> Reporter: Joris Van Remoortere
> Assignee: Allen Wang
> Priority: Critical
> Fix For: 0.10.0.0
>
> Attachments: rack_aware_replica_assignment_v1.patch, rack_aware_replica_assignment_v2.patch
>
> Adding a rack-id to kafka config. This rack-id can be used during replica assignment by using the max-rack-replication argument in the admin scripts (create topic, etc.). By default the original replication assignment algorithm is used because max-rack-replication defaults to -1. max-rack-replication > -1 is not honored if you are doing manual replica assignment (preferred).
> If this looks good I can add some test cases specific to the rack-aware assignment.
> I can also port this to trunk. We are currently running 0.8.0 in production and need this, so I wrote the patch against that.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1215) Rack-Aware replica assignment option
[ https://issues.apache.org/jira/browse/KAFKA-1215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15202369#comment-15202369 ]

Jun Rao commented on KAFKA-1215:

Great, thanks Allen.
[jira] [Commented] (KAFKA-1215) Rack-Aware replica assignment option
[ https://issues.apache.org/jira/browse/KAFKA-1215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15202362#comment-15202362 ]

Allen Wang commented on KAFKA-1215:

[~junrao] Updated.
[jira] [Commented] (KAFKA-1215) Rack-Aware replica assignment option
[ https://issues.apache.org/jira/browse/KAFKA-1215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15195674#comment-15195674 ]

ASF GitHub Bot commented on KAFKA-1215:

Github user asfgit closed the pull request at:

    https://github.com/apache/kafka/pull/132
[jira] [Commented] (KAFKA-1215) Rack-Aware replica assignment option
[ https://issues.apache.org/jira/browse/KAFKA-1215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15169592#comment-15169592 ]

Allen Wang commented on KAFKA-1215:

[~aauradkar] Yes it is ready for review.
[jira] [Commented] (KAFKA-1215) Rack-Aware replica assignment option
[ https://issues.apache.org/jira/browse/KAFKA-1215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159491#comment-15159491 ]

Aditya Auradkar commented on KAFKA-1215:

[~allenxwang] - Is this patch ready for review? I noticed you added several commits recently but I'm not sure if you are done.
[jira] [Commented] (KAFKA-1215) Rack-Aware replica assignment option
[ https://issues.apache.org/jira/browse/KAFKA-1215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14993035#comment-14993035 ]

Vidhya Arvind commented on KAFKA-1215:

Is there any way this patch can be part of 0.9.0.0?

> Key: KAFKA-1215
> Assignee: Jun Rao
> Fix For: 0.10.0.0
[jira] [Commented] (KAFKA-1215) Rack-Aware replica assignment option
[ https://issues.apache.org/jira/browse/KAFKA-1215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14908460#comment-14908460 ]

Aditya Auradkar commented on KAFKA-1215:

[~allenxwang] - bump.
[jira] [Commented] (KAFKA-1215) Rack-Aware replica assignment option
[ https://issues.apache.org/jira/browse/KAFKA-1215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14804835#comment-14804835 ]

Joel Koshy commented on KAFKA-1215:

[~allenxwang] you should have access now.
[jira] [Commented] (KAFKA-1215) Rack-Aware replica assignment option
[ https://issues.apache.org/jira/browse/KAFKA-1215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14745803#comment-14745803 ]

Aditya Auradkar commented on KAFKA-1215:

[~allenxwang] One of the committers can provide you write access once you provide your confluence apache id. Please let me know if you need any help with the KIP/reviews etc. Thanks!
[jira] [Commented] (KAFKA-1215) Rack-Aware replica assignment option
[ https://issues.apache.org/jira/browse/KAFKA-1215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14746264#comment-14746264 ]

Allen Wang commented on KAFKA-1215:

[~aauradkar] [~junrao] [~jkreps] My apache confluence id is allenxwang, same as my JIRA id. Please let me know when write access is granted. Thanks.
[jira] [Commented] (KAFKA-1215) Rack-Aware replica assignment option
[ https://issues.apache.org/jira/browse/KAFKA-1215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14744465#comment-14744465 ]

Allen Wang commented on KAFKA-1215:

[~aauradkar] Sure I can create a KIP. However, after I signed up for Apache wiki, I don't seem to have write permission as I don't see "create" on the page header. Anything I need to do?
[jira] [Commented] (KAFKA-1215) Rack-Aware replica assignment option
[ https://issues.apache.org/jira/browse/KAFKA-1215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14737918#comment-14737918 ]

Aditya Auradkar commented on KAFKA-1215:

[~allenxwang] Hi Allen. Thanks for the patch. Can you create a KIP to discuss the changes being proposed (since this patch adds configs and ZK structures)?
https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Improvement+Proposals
We are hoping to leverage this patch within LinkedIn as well.

> Key: KAFKA-1215
> Assignee: Jun Rao
> Fix For: 0.9.0
[jira] [Commented] (KAFKA-1215) Rack-Aware replica assignment option
[ https://issues.apache.org/jira/browse/KAFKA-1215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14696026#comment-14696026 ]

Allen Wang commented on KAFKA-1215:

[~junrao] Can you review the GitHub pull request or have someone take a look?
[jira] [Commented] (KAFKA-1215) Rack-Aware replica assignment option
[ https://issues.apache.org/jira/browse/KAFKA-1215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14682185#comment-14682185 ]

ASF GitHub Bot commented on KAFKA-1215:

GitHub user allenxwang opened a pull request:

    https://github.com/apache/kafka/pull/132

    KAFKA-1215: Rack-Aware replica assignment option

The PR tries to achieve the following:
- Make rack-aware assignment and the rack data structure optional, as opposed to being part of the core data structure/protocol, to ease the migration. The implementation that returns the map of broker to rack is pluggable. The user needs to pass the implementation class as a Kafka runtime configuration or command line argument.
- The rack-aware replica assignment is best effort when distributing the replicas to racks. When there are more replicas than racks, it ensures each rack has at least one replica. When there are more racks than replicas, it ensures each rack has at most one replica. It also tries to keep the even distribution of replicas among brokers and racks when possible.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/allenxwang/kafka KAFKA-1215

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/kafka/pull/132.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #132

commit 35db23ee7987a1811d630f14de66a99ce638
Author: Allen Wang aw...@netflix.com
Date: 2015-08-11T17:52:37Z

    KAFKA-1215: Rack-Aware replica assignment option
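The two best-effort guarantees in the pull request description can be written down as a small check on a single partition's replica list. This is only a sketch in Python for illustration (the patch itself is not shown in this thread); the function name `check_rack_spread` and the sample broker-to-rack map are hypothetical.

```python
from collections import Counter


def check_rack_spread(replicas, broker_to_rack):
    """Return True if one partition's replicas meet the stated rack guarantees.

    replicas       -- list of broker IDs holding the partition (hypothetical input)
    broker_to_rack -- dict mapping broker ID to rack ID
    """
    num_racks = len(set(broker_to_rack.values()))
    per_rack = Counter(broker_to_rack[b] for b in replicas)
    if len(replicas) >= num_racks:
        # As many replicas as racks, or more: every rack holds at least one.
        return len(per_rack) == num_racks
    # More racks than replicas: no rack holds more than one replica.
    return all(count <= 1 for count in per_rack.values())


# Illustrative cluster: 6 brokers spread over 3 racks.
racks = {0: "rack1", 1: "rack1", 2: "rack2", 3: "rack2", 4: "rack3", 5: "rack3"}
print(check_rack_spread([0, 2, 4], racks))  # one replica per rack
print(check_rack_spread([0, 1, 2], racks))  # rack3 has no replica
```

A check like this says nothing about how evenly replicas are spread across brokers; it only covers the per-rack guarantees.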
[jira] [Commented] (KAFKA-1215) Rack-Aware replica assignment option
[ https://issues.apache.org/jira/browse/KAFKA-1215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14606043#comment-14606043 ]

Allen Wang commented on KAFKA-1215:

[~junrao] AWS region (for example us-east-1) can be modeled as a DC. Each region has one or more zones (us-east-1c, us-east-1d, us-east-1e, etc). We model the zone as a rack. Our Kafka cluster spans across zones, but not regions.
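The zone-as-rack modeling described above could look roughly like the following. This is a sketch under assumptions, not the deployment's actual code: the function name `racks_from_zones` is hypothetical, and the region is derived by dropping the final letter of the zone name, which matches the zone names quoted in the comment but is not guaranteed for every zone.

```python
def racks_from_zones(broker_to_zone):
    """Use each broker's availability zone name as its rack ID (a plain string).

    Raises ValueError if the brokers span more than one region, since the
    cluster described above spans zones but not regions.
    """
    # Assumption: zone name = region name + one trailing letter (us-east-1c -> us-east-1).
    regions = {zone[:-1] for zone in broker_to_zone.values()}
    if len(regions) > 1:
        raise ValueError("brokers span multiple regions: %s" % sorted(regions))
    return dict(broker_to_zone)


# Illustrative broker-to-zone map; the broker IDs are made up.
print(racks_from_zones({0: "us-east-1c", 1: "us-east-1d", 2: "us-east-1e"}))
```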
[jira] [Commented] (KAFKA-1215) Rack-Aware replica assignment option
[ https://issues.apache.org/jira/browse/KAFKA-1215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14604154#comment-14604154 ]

Jun Rao commented on KAFKA-1215:

[~allenxwang], thanks for the update. How do you model this in Netflix's deployment in AWS? Do you just model each DC as a rack?
[jira] [Commented] (KAFKA-1215) Rack-Aware replica assignment option
[ https://issues.apache.org/jira/browse/KAFKA-1215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14598643#comment-14598643 ]

Jay Kreps commented on KAFKA-1215:

This is great!
[jira] [Commented] (KAFKA-1215) Rack-Aware replica assignment option
[ https://issues.apache.org/jira/browse/KAFKA-1215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14598615#comment-14598615 ]

Allen Wang commented on KAFKA-1215:

We have a working solution now for rack-aware assignment. It is based on the current patch for this JIRA, with some improvements. The key ideas of the solution are:
- Rack ID is a String instead of an integer.
- For replica assignment, add an extra parameter of Map[Int, String] to the assignReplicasToBrokers() method, which maps broker ID to rack ID.
- Before doing the rack-aware assignment, sort the broker list such that brokers are interlaced according to rack. In other words, adjacent brokers should not be in the same rack if possible. For example, assuming 6 brokers mapping to 3 racks:
  0 -> rack1, 1 -> rack1, 2 -> rack2, 3 -> rack2, 4 -> rack3, 5 -> rack3
  The sorted broker list could be (0, 2, 4, 1, 3, 5).
- Apply the same assignment algorithm to assign replicas, with the addition of skipping a broker if its rack is already used for the same partition (similar to what has been done in the current patch).

The benefit of this approach is that replica distribution is kept as even as possible across all the racks and brokers.

With regard to KAFKA-1792, an easy solution is to restrict replica movement to within the same rack, which I think should work in most practical cases. It has the added benefit that replicas usually move faster within a rack. So basically we can apply the same algorithm described in KAFKA-1792 for each rack. For example, if there are three racks, then apply the algorithm three times, each time with the broker list and assignment for that specific rack. Again, we assume the broker to rack mapping will be available in the method signature.

The open question is how to obtain the broker to rack mapping. The information can be supplied when Kafka registers the broker with ZooKeeper, which means some information has to be added to ZooKeeper. However, it could be that the rack information is already available in a deployment independent way. For example, for some deployments, the rack information may be available in a database. What we can do is abstract the API required to obtain rack information into an interface and allow the user to supply an implementation on the command line or at broker start up (to handle auto topic creation).
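The interlacing and rack-skipping steps described in the comment above can be sketched as follows. This is an illustrative Python sketch, not the patch: the real method is the Scala assignReplicasToBrokers(), and the plain round-robin walk used here is a simplified stand-in for Kafka's existing assignment algorithm. The names `interlace_by_rack` and `assign_replicas` are hypothetical.

```python
from collections import defaultdict
from itertools import zip_longest


def interlace_by_rack(broker_to_rack):
    """Order brokers so that neighbours are in different racks where possible."""
    by_rack = defaultdict(list)
    for broker in sorted(broker_to_rack):
        by_rack[broker_to_rack[broker]].append(broker)
    ordered = []
    # Take one broker from each rack per round (round-robin across racks).
    for one_round in zip_longest(*(by_rack[r] for r in sorted(by_rack))):
        ordered.extend(b for b in one_round if b is not None)
    return ordered


def assign_replicas(broker_to_rack, num_partitions, replication_factor):
    """Assign replicas by walking the interlaced list (simplified stand-in)."""
    brokers = interlace_by_rack(broker_to_rack)
    n = len(brokers)
    num_racks = len(set(broker_to_rack.values()))
    if replication_factor > n:
        raise ValueError("replication factor exceeds number of brokers")
    assignment = {}
    for partition in range(num_partitions):
        pos = partition % n
        replicas = [brokers[pos]]
        used_racks = {broker_to_rack[brokers[pos]]}
        while len(replicas) < replication_factor:
            pos = (pos + 1) % n
            candidate = brokers[pos]
            if candidate in replicas:
                continue
            rack = broker_to_rack[candidate]
            # Skip a broker whose rack is already used for this partition,
            # unless every rack already holds a replica.
            if rack in used_racks and len(used_racks) < num_racks:
                continue
            replicas.append(candidate)
            used_racks.add(rack)
        assignment[partition] = replicas
    return assignment


# The 6-broker, 3-rack example from the comment above.
racks = {0: "rack1", 1: "rack1", 2: "rack2", 3: "rack2", 4: "rack3", 5: "rack3"}
print(interlace_by_rack(racks))  # [0, 2, 4, 1, 3, 5]
print(assign_replicas(racks, num_partitions=6, replication_factor=3))
```

With the interlaced order, a replication factor equal to the number of racks places one replica per rack without any skipping; the skip only matters when racks hold unequal numbers of brokers or when the walk wraps around.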
[jira] [Commented] (KAFKA-1215) Rack-Aware replica assignment option
[ https://issues.apache.org/jira/browse/KAFKA-1215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14386037#comment-14386037 ]

Neha Narkhede commented on KAFKA-1215:

[~allenxwang] This was inactive for a while, but I think it will be good to wait until KAFKA-1792 is done to propose a solution for rack-awareness.
[jira] [Commented] (KAFKA-1215) Rack-Aware replica assignment option
[ https://issues.apache.org/jira/browse/KAFKA-1215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14372058#comment-14372058 ]

Allen Wang commented on KAFKA-1215:

What's the status of this JIRA? I have two questions:
- Can we simply use a string for rack ID? This will make it much easier to use in AWS, where the zone ID is a string. Otherwise there will be unnecessary code to convert them back and forth.
- Why is max-rack-replication necessary? In most use cases you want an even distribution of replicas across racks without having to consider max replication per rack.
[jira] [Commented] (KAFKA-1215) Rack-Aware replica assignment option
[ https://issues.apache.org/jira/browse/KAFKA-1215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14122028#comment-14122028 ]

Guozhang Wang commented on KAFKA-1215:

Moving out of 0.8.2 for now.

Rack-Aware replica assignment option

Key: KAFKA-1215
URL: https://issues.apache.org/jira/browse/KAFKA-1215
Project: Kafka
Issue Type: Improvement
Components: replication
Affects Versions: 0.8.0
Reporter: Joris Van Remoortere
Assignee: Jun Rao
Fix For: 0.9.0
Attachments: rack_aware_replica_assignment_v1.patch, rack_aware_replica_assignment_v2.patch
[jira] [Commented] (KAFKA-1215) Rack-Aware replica assignment option
[ https://issues.apache.org/jira/browse/KAFKA-1215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14030981#comment-14030981 ]

Joris Van Remoortere commented on KAFKA-1215:

[~jjkoshy] I am out till the end of the month. I was going to take a deeper look at how to integrate this into the auto-rebalance, as that has since been released (post initial patch).

Rack-Aware replica assignment option

Key: KAFKA-1215
URL: https://issues.apache.org/jira/browse/KAFKA-1215
Project: Kafka
Issue Type: Improvement
Components: replication
Affects Versions: 0.8.0
Reporter: Joris Van Remoortere
Assignee: Jun Rao
Fix For: 0.8.2
Attachments: rack_aware_replica_assignment_v1.patch, rack_aware_replica_assignment_v2.patch
[jira] [Commented] (KAFKA-1215) Rack-Aware replica assignment option
[ https://issues.apache.org/jira/browse/KAFKA-1215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14013151#comment-14013151 ]

Jorge Ortiz commented on KAFKA-1215:

Any update on this? We're deploying Kafka on AWS and rack-awareness would be lovely.

Rack-Aware replica assignment option

Key: KAFKA-1215
URL: https://issues.apache.org/jira/browse/KAFKA-1215
Project: Kafka
Issue Type: Improvement
Components: replication
Affects Versions: 0.8.0
Reporter: Joris Van Remoortere
Assignee: Jun Rao
Fix For: 0.8.2
Attachments: rack_aware_replica_assignment_v1.patch, rack_aware_replica_assignment_v2.patch
[jira] [Commented] (KAFKA-1215) Rack-Aware replica assignment option
[ https://issues.apache.org/jira/browse/KAFKA-1215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13958242#comment-13958242 ]

Joris Van Remoortere commented on KAFKA-1215:

Since there is further interest in this (#1357), I will try to look at this soon.

Rack-Aware replica assignment option

Key: KAFKA-1215
URL: https://issues.apache.org/jira/browse/KAFKA-1215
Project: Kafka
Issue Type: Improvement
Components: replication
Affects Versions: 0.8.0
Reporter: Joris Van Remoortere
Assignee: Jun Rao
Fix For: 0.8.2
Attachments: rack_aware_replica_assignment_v1.patch, rack_aware_replica_assignment_v2.patch
[jira] [Commented] (KAFKA-1215) Rack-Aware replica assignment option
[ https://issues.apache.org/jira/browse/KAFKA-1215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13893535#comment-13893535 ]

Joris Van Remoortere commented on KAFKA-1215:

[~junrao] Could you please look at this? Review Request #17248. Thanks!

Rack-Aware replica assignment option

Key: KAFKA-1215
URL: https://issues.apache.org/jira/browse/KAFKA-1215
Project: Kafka
Issue Type: Improvement
Components: replication
Affects Versions: 0.8.0, 0.8.1
Reporter: Joris Van Remoortere
Assignee: Neha Narkhede
Fix For: 0.8.0, 0.8.1
Attachments: rack_aware_replica_assignment_v1.patch, rack_aware_replica_assignment_v2.patch
Re: [jira] [Commented] (KAFKA-1215) Rack-Aware replica assignment option
Replica assignments can change during topic creation, when adding partitions, and during partition reassignment. So, it would be good to integrate rack-aware assignment in all of those cases.

For compatibility: during a rolling upgrade of a Kafka cluster, at any given point in time some brokers could be registered in the old format and others in the new format. Both the consumer and the controller ( https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Controller+Internals ) read the broker registration in ZK. So, we need to make sure that the new code can read the old format and that the old code can read the new format too. For example, in Broker.createBroker(), if the code looks for a rackid field but can't find it, we need to be able to provide a reasonable default.

Also, take a look at the format of the broker registration in ZK in https://cwiki.apache.org/confluence/display/KAFKA/Kafka+data+structures+in+Zookeeper . Since we are changing the format, we should probably change the version from 1 to 2. However, we can make this change backward compatible.

Thanks,
Jun
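The compatibility requirement in this message (a new reader must accept an old registration, and an old reader must tolerate a new one) can be sketched roughly as follows. This is not the Scala code in Broker.createBroker(); the JSON field names, including `rackid`, and the choice of `None` as the default are assumptions made for illustration.

```python
import json
from dataclasses import dataclass
from typing import Optional

# Assumed default for brokers registered without a rack (old format).
DEFAULT_RACK = None


@dataclass(frozen=True)
class Broker:
    broker_id: int
    host: str
    port: int
    rack: Optional[str]


def parse_broker(broker_id: int, registration: str) -> Broker:
    """Parse a broker registration string stored in ZK (hypothetical layout).

    Only the fields this reader knows about are looked up, so extra fields
    written by a newer broker are ignored, and a missing "rackid" falls back
    to DEFAULT_RACK instead of failing.
    """
    info = json.loads(registration)
    return Broker(
        broker_id=broker_id,
        host=info["host"],
        port=info["port"],
        rack=info.get("rackid", DEFAULT_RACK),
    )
```

Reading only known keys is what lets an old reader survive a new registration; the `.get(...)` with a default is what lets a new reader accept an old one.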
Re: [jira] [Commented] (KAFKA-1215) Rack-Aware replica assignment option
Thanks Jun,

As you said, rack-aware assignment needs to be supported during topic creation, adding of partitions, and partition re-assignment. To reduce the risk of using different values of max-rack-replication in different calls to those functions, I'd like to persist that value as part of the topic config. This way the option is mandatory during topic creation and optional during partition re-assignment.

Here is my proposal:

Topic Create:
- max-rack-replication is mandatory
- Store max-rack-replication as a topic config

Partition Add:
- Use the stored value of max-rack-replication, so that the new partition uses the same value as the rest of the partitions in the topic

Partition Re-assign:
- If max-rack-replication is provided, treat this as an altered config. This forces a re-validation of all replica assignments of all partitions for the topic. If this succeeds, save the updated topic config.
- If max-rack-replication is not provided, use the saved value in the topic config.

Alter Topic:
- If someone overrides the max-rack-replication value, we re-validate all replica assignments, as with partition re-assignment.

I think this strategy will be less error prone, since it makes it impossible for partitions in the same topic to use different max-rack-replication values.

Let me know if this sounds ok to you,
Joris
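The re-assign rule in Joris's proposal (an explicit value is treated as an altered config and re-validated, otherwise the stored value is used) might look something like the sketch below. The config key `max.rack.replication` and both function names are hypothetical; the thread does not say how the value would actually be stored.

```python
from collections import Counter


def find_violations(assignment, broker_racks, max_rack_replication):
    """Return {partition: {rack: replica_count}} for racks over the cap.

    A negative cap (the -1 default from the issue description) disables the check.
    """
    if max_rack_replication < 0:
        return {}
    result = {}
    for partition, replicas in assignment.items():
        counts = Counter(broker_racks[b] for b in replicas)
        over = {rack: n for rack, n in counts.items() if n > max_rack_replication}
        if over:
            result[partition] = over
    return result


def reassign(topic_config, full_assignment, broker_racks, max_rack_replication=None):
    """Validate a full topic assignment and return the topic config to save."""
    if max_rack_replication is None:
        # Not provided: fall back to the value stored at topic creation.
        effective = topic_config["max.rack.replication"]
    else:
        # Provided: treat as an altered config and re-validate everything.
        effective = max_rack_replication
    bad = find_violations(full_assignment, broker_racks, effective)
    if bad:
        raise ValueError(f"max-rack-replication={effective} violated: {bad}")
    updated = dict(topic_config)
    updated["max.rack.replication"] = effective
    return updated
```

Because the config is only returned after validation succeeds, a failed re-assignment leaves the stored value untouched, which matches the "if this succeeds, save the updated topic config" step.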
Re: [jira] [Commented] (KAFKA-1215) Rack-Aware replica assignment option
Thanks for the update. This proposal looks reasonable to me.

One issue that needs a bit more thought is what happens when a broker is moved from one rack to another. After the move, the constraints on max-rack-replication could be violated. The question is what we should do when this happens. A less intrusive approach is to let the broker start and simply log a warning about the violation. I'm not sure this is the best approach, though.

Jun
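As a rough sketch of the "let the broker start and simply log a warning" option from Jun's reply: after a broker changes rack, existing assignments are re-checked against the cap and violations are logged rather than rejected. The function name, logger name, and data shapes are assumptions, and nothing here reflects what Kafka actually does at broker startup.

```python
import logging
from collections import Counter

log = logging.getLogger("rack_check")


def warn_on_rack_violations(assignment, broker_racks, max_rack_replication):
    """Log a warning for each partition over the cap and return those partitions.

    assignment maps partition -> list of broker ids; broker_racks maps
    broker id -> current rack. A negative cap disables the check.
    """
    violating = []
    if max_rack_replication < 0:
        return violating
    for partition in sorted(assignment):
        counts = Counter(broker_racks[b] for b in assignment[partition])
        if not counts:
            continue  # partition without replicas: nothing to check
        rack, count = counts.most_common(1)[0]
        if count > max_rack_replication:
            log.warning(
                "partition %s has %d replicas in rack %s (max-rack-replication=%d)",
                partition, count, rack, max_rack_replication,
            )
            violating.append(partition)
    return violating
```

The broker is never blocked here; whether a warning alone is enough is exactly the open question raised in the message.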
[jira] [Commented] (KAFKA-1215) Rack-Aware replica assignment option
[ https://issues.apache.org/jira/browse/KAFKA-1215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13881193#comment-13881193 ]

Jun Rao commented on KAFKA-1215:

Thanks for the patch. Looks good overall. Some comments.

1. KafkaConfig:
1.1 We need a config for the default max-rack-replication for auto topic creation.
1.2 rackId: We probably don't want to make this a required property. So, perhaps we can default it to 0?

2. AdminUtils.assignReplicasToBrokers():
2.1 Could you add some comments on the rack-aware assignment algorithm?
2.2 It's a bit weird for this method to take zkclient as an input. We can probably pass in a list of Broker objects instead.

3. Unit tests: I suggest that we leave most existing tests intact by keeping the rackId default, and add a new test for rack-aware assignment.

4. Compatibility test: It seems that the changes to the broker format in ZK are backward compatible. Could you double check? For example, an old reader (controller, consumer, etc.) should be able to parse a broker registered in the new format, and a new reader should be able to parse a broker registered in the old format. Also, we should probably increase the version in the ZK registration for the broker.

5. Could you rebase to trunk?

Rack-Aware replica assignment option

Key: KAFKA-1215
URL: https://issues.apache.org/jira/browse/KAFKA-1215
Project: Kafka
Issue Type: Improvement
Components: replication
Affects Versions: 0.8.0, 0.8.1
Reporter: Joris Van Remoortere
Assignee: Neha Narkhede
Fix For: 0.8.0, 0.8.1
Attachments: rack_aware_replica_assignment_v1.patch
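Review comments 1.2 and 2.2 suggest an assignment function that takes the brokers directly (rather than a ZK client) and treats the rack as optional. A minimal sketch of a capped assignment along those lines is shown below. It is not the algorithm from the attached patches, which are not reproduced in this thread; the signature, the (broker_id, rack_id) tuples, and the simple ring walk are all assumptions.

```python
def assign_replicas_to_brokers(brokers, num_partitions, replication_factor,
                               max_rack_replication=-1):
    """Assign replicas by walking the broker list as a ring.

    brokers is a list of (broker_id, rack_id) pairs. With the default
    max_rack_replication of -1 the rack is ignored; otherwise a broker is
    skipped when its rack already holds that many replicas of the partition.
    """
    if replication_factor < 1:
        raise ValueError("replication factor must be at least 1")
    n = len(brokers)
    assignment = {}
    for p in range(num_partitions):
        replicas, per_rack = [], {}
        for step in range(n):
            broker_id, rack = brokers[(p + step) % n]
            if 0 <= max_rack_replication <= per_rack.get(rack, 0):
                continue  # this rack is already at its cap for partition p
            replicas.append(broker_id)
            per_rack[rack] = per_rack.get(rack, 0) + 1
            if len(replicas) == replication_factor:
                break
        if len(replicas) < replication_factor:
            raise ValueError(f"cannot place {replication_factor} replicas for partition {p}")
        assignment[p] = replicas
    return assignment
```

Leaving existing callers on the -1 default keeps their results unchanged, which is in the spirit of comment 3 about keeping most existing tests intact.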
Re: [jira] [Commented] (KAFKA-1215) Rack-Aware replica assignment option
Working on the above.

Since trunk currently allows topic configs and altering topics, does it make sense to you to make this a topic config, so that it is kept in sync between topic alterations and partition adds?

For compatibility: I can make the ZooKeeper state parsing backwards / forwards compatible. I don't think I can make the readFrom / writeTo ByteBuffer compatible, since chained calls of these during API parsing are size dependent. Can you clarify exactly what you expect to be compatible?

I'm also not very familiar with the consequences of changing the broker version in the ZooKeeper registration. Can you please provide a reference to documentation for this?

Thanks,
Joris