[jira] [Commented] (KAFKA-1215) Rack-Aware replica assignment option

2016-03-19 Thread Jun Rao (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15201709#comment-15201709
 ] 

Jun Rao commented on KAFKA-1215:


[~allenxwang], could you also update 
https://cwiki.apache.org/confluence/display/KAFKA/Kafka+data+structures+in+Zookeeper
 with the changes to the ZK structure?

> Rack-Aware replica assignment option
> 
>
> Key: KAFKA-1215
> URL: https://issues.apache.org/jira/browse/KAFKA-1215
> Project: Kafka
> Issue Type: Improvement
> Components: replication
> Affects Versions: 0.8.0
> Reporter: Joris Van Remoortere
> Assignee: Allen Wang
> Priority: Critical
> Fix For: 0.10.0.0
>
> Attachments: rack_aware_replica_assignment_v1.patch, 
> rack_aware_replica_assignment_v2.patch
>
>
> Adding a rack-id to the Kafka config. This rack-id can be used during replica 
> assignment by passing the max-rack-replication argument to the admin scripts 
> (create topic, etc.). By default the original replication assignment 
> algorithm is used because max-rack-replication defaults to -1. 
> max-rack-replication > -1 is not honored if you are doing manual replica 
> assignment (preferred).
> If this looks good I can add some test cases specific to the rack-aware 
> assignment.
> I can also port this to trunk. We are currently running 0.8.0 in production 
> and need this, so I wrote the patch against that.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1215) Rack-Aware replica assignment option

2016-03-18 Thread Jun Rao (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15202369#comment-15202369
 ] 

Jun Rao commented on KAFKA-1215:


Great, thanks Allen.



[jira] [Commented] (KAFKA-1215) Rack-Aware replica assignment option

2016-03-18 Thread Allen Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15202362#comment-15202362
 ] 

Allen Wang commented on KAFKA-1215:
---

[~junrao] Updated.




[jira] [Commented] (KAFKA-1215) Rack-Aware replica assignment option

2016-03-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15195674#comment-15195674
 ] 

ASF GitHub Bot commented on KAFKA-1215:
---

Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/132




[jira] [Commented] (KAFKA-1215) Rack-Aware replica assignment option

2016-02-26 Thread Allen Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15169592#comment-15169592
 ] 

Allen Wang commented on KAFKA-1215:
---

[~aauradkar] Yes it is ready for review.




[jira] [Commented] (KAFKA-1215) Rack-Aware replica assignment option

2016-02-23 Thread Aditya Auradkar (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15159491#comment-15159491
 ] 

Aditya Auradkar commented on KAFKA-1215:


[~allenxwang] - Is this patch ready for review? I noticed you added several 
commits recently, but I'm not sure if you are done.



[jira] [Commented] (KAFKA-1215) Rack-Aware replica assignment option

2015-11-05 Thread Vidhya Arvind (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14993035#comment-14993035
 ] 

Vidhya Arvind commented on KAFKA-1215:
--

Is there any way this patch can be part of 0.9.0.0?



[jira] [Commented] (KAFKA-1215) Rack-Aware replica assignment option

2015-09-25 Thread Aditya Auradkar (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14908460#comment-14908460
 ] 

Aditya Auradkar commented on KAFKA-1215:


[~allenxwang] - bump. 



[jira] [Commented] (KAFKA-1215) Rack-Aware replica assignment option

2015-09-17 Thread Joel Koshy (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14804835#comment-14804835
 ] 

Joel Koshy commented on KAFKA-1215:
---

[~allenxwang] you should have access now.



[jira] [Commented] (KAFKA-1215) Rack-Aware replica assignment option

2015-09-15 Thread Aditya Auradkar (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14745803#comment-14745803
 ] 

Aditya Auradkar commented on KAFKA-1215:


[~allenxwang] One of the committers can provide you write access once you 
provide your confluence apache id. Please let me know if you need any help with 
the KIP/reviews etc. Thanks!



[jira] [Commented] (KAFKA-1215) Rack-Aware replica assignment option

2015-09-15 Thread Allen Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14746264#comment-14746264
 ] 

Allen Wang commented on KAFKA-1215:
---

[~aauradkar] [~junrao] [~jkreps] My apache confluence id is allenxwang, same as 
my JIRA id. Please let me know when write access is granted. Thanks.



[jira] [Commented] (KAFKA-1215) Rack-Aware replica assignment option

2015-09-14 Thread Allen Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14744465#comment-14744465
 ] 

Allen Wang commented on KAFKA-1215:
---

[~aauradkar] Sure I can create a KIP. However, after I signed up for Apache 
wiki, I don't seem to have write permission as I don't see "create" on the page 
header. Anything I need to do?



[jira] [Commented] (KAFKA-1215) Rack-Aware replica assignment option

2015-09-09 Thread Aditya Auradkar (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14737918#comment-14737918
 ] 

Aditya Auradkar commented on KAFKA-1215:


[~allenxwang] Hi Allen. Thanks for the patch. Can you create a KIP to discuss 
the changes being proposed (since this patch adds configs and ZK structures)? 
https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Improvement+Proposals

We are hoping to leverage this patch within LinkedIn as well.



[jira] [Commented] (KAFKA-1215) Rack-Aware replica assignment option

2015-08-13 Thread Allen Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14696026#comment-14696026
 ] 

Allen Wang commented on KAFKA-1215:
---

[~junrao] Can you review the GitHub pull request or have someone take a look?




[jira] [Commented] (KAFKA-1215) Rack-Aware replica assignment option

2015-08-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14682185#comment-14682185
 ] 

ASF GitHub Bot commented on KAFKA-1215:
---

GitHub user allenxwang opened a pull request:

https://github.com/apache/kafka/pull/132

KAFKA-1215: Rack-Aware replica assignment option

The PR tries to achieve the following:

- Make rack-aware assignment and the rack data structure optional, as opposed to 
being part of the core data structure/protocol, to ease migration. The 
implementation that returns the broker-to-rack map is pluggable. The user 
needs to pass the implementation class as a Kafka runtime configuration or 
command line argument.

- The rack-aware replica assignment is best effort when distributing 
replicas to racks. When there are more replicas than racks, it ensures each 
rack has at least one replica. When there are more racks than replicas, it 
ensures each rack has at most one replica. It also tries to keep an even 
distribution of replicas among brokers and racks when possible.
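
The two guarantees in the second bullet can be written down as a small check on a single partition's replica list. This is only an illustrative sketch in Python (Kafka itself is Scala), not code from the PR; `check_rack_spread` and the broker-to-rack dictionary are hypothetical names.

```python
from collections import Counter


def check_rack_spread(replicas, broker_rack):
    """Return True if one partition's replicas satisfy the stated rack guarantees.

    replicas: list of broker IDs holding the partition.
    broker_rack: dict mapping every broker ID to its rack ID (assumed input).
    """
    racks = set(broker_rack.values())
    per_rack = Counter(broker_rack[b] for b in replicas)
    if len(replicas) >= len(racks):
        # At least as many replicas as racks: every rack must hold a replica.
        return all(per_rack[rack] >= 1 for rack in racks)
    # Fewer replicas than racks: no rack may hold more than one replica.
    return all(count <= 1 for count in per_rack.values())
```

With six brokers spread over three racks, `[0, 2, 4]` passes when those brokers sit on three different racks, while two replicas on the same rack fail the check.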


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/allenxwang/kafka KAFKA-1215

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/132.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #132


commit 35db23ee7987a1811d630f14de66a99ce638
Author: Allen Wang aw...@netflix.com
Date:   2015-08-11T17:52:37Z

KAFKA-1215: Rack-Aware replica assignment option






[jira] [Commented] (KAFKA-1215) Rack-Aware replica assignment option

2015-06-29 Thread Allen Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14606043#comment-14606043
 ] 

Allen Wang commented on KAFKA-1215:
---

[~junrao] An AWS region (for example us-east-1) can be modeled as a DC. Each 
region has one or more zones (us-east-1c, us-east-1d, us-east-1e, etc). We 
model each zone as a rack. Our Kafka cluster spans zones, but not regions.
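
As a rough illustration of this modeling, the sketch below uses each broker's availability zone as its rack ID and rejects a broker set that spans regions. The function names are hypothetical, and the region is derived on the assumption that a zone name is the region name plus one trailing letter (as in us-east-1c).

```python
def region_of(zone):
    """Derive the region from a zone name such as 'us-east-1c'.

    Assumes the zone name is the region name plus one trailing letter.
    """
    return zone[:-1]


def racks_from_zones(broker_zone):
    """Use each broker's availability zone as its rack ID.

    broker_zone: dict mapping broker ID to zone name (assumed input).
    Raises ValueError if the brokers span more than one region.
    """
    regions = {region_of(zone) for zone in broker_zone.values()}
    if len(regions) > 1:
        raise ValueError(f"cluster spans multiple regions: {sorted(regions)}")
    return dict(broker_zone)
```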



[jira] [Commented] (KAFKA-1215) Rack-Aware replica assignment option

2015-06-27 Thread Jun Rao (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14604154#comment-14604154
 ] 

Jun Rao commented on KAFKA-1215:


[~allenxwang], thanks for the update. How do you model this in Netflix's 
deployment in AWS? Do you just model each DC as a rack? 



[jira] [Commented] (KAFKA-1215) Rack-Aware replica assignment option

2015-06-23 Thread Jay Kreps (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14598643#comment-14598643
 ] 

Jay Kreps commented on KAFKA-1215:
--

This is great!



[jira] [Commented] (KAFKA-1215) Rack-Aware replica assignment option

2015-06-23 Thread Allen Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14598615#comment-14598615
 ] 

Allen Wang commented on KAFKA-1215:
---

We now have a working solution for rack-aware assignment. It is based on the 
current patch for this JIRA, with some improvements. The key ideas of the 
solution are:

- Rack ID is a String instead of an integer
- For replica assignment, add an extra parameter of type Map[Int, String] to the 
assignReplicasToBrokers() method, mapping broker ID to rack ID
- Before doing the rack-aware assignment, sort the broker list so that brokers 
are interlaced by rack. In other words, adjacent brokers should 
not be in the same rack if possible. For example, assuming 6 brokers mapped 
to 3 racks:

0 -> rack1, 1 -> rack1, 2 -> rack2, 3 -> rack2, 4 -> rack3, 5 -> rack3

The sorted broker list could be (0, 2, 4, 1, 3, 5)

- Apply the same assignment algorithm to assign replicas, with the addition of 
skipping a broker if its rack is already used for the same partition (similar 
to what is done in the current patch)

The benefit of this approach is that replica distribution is kept as even as 
possible across all racks and brokers.
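
The interlacing and rack-skipping steps above can be sketched as follows. This is a simplified Python illustration, not the patch itself: the function names are hypothetical, and a plain round-robin walk stands in for the existing assignment algorithm, whose other details are not reproduced here.

```python
from collections import defaultdict
from itertools import zip_longest


def interlace_by_rack(broker_rack):
    """Order brokers so adjacent entries are on different racks where possible."""
    by_rack = defaultdict(list)
    for broker in sorted(broker_rack):
        by_rack[broker_rack[broker]].append(broker)
    # Take one broker from each rack in turn.
    columns = [by_rack[rack] for rack in sorted(by_rack)]
    return [b for row in zip_longest(*columns) for b in row if b is not None]


def assign_replicas(broker_rack, n_partitions, replication_factor, start_index=0):
    """Round-robin assignment that skips brokers whose rack is already used."""
    brokers = interlace_by_rack(broker_rack)
    if replication_factor > len(brokers):
        raise ValueError("replication factor exceeds number of brokers")
    n_racks = len(set(broker_rack.values()))
    assignment = {}
    for partition in range(n_partitions):
        i = (start_index + partition) % len(brokers)
        replicas = [brokers[i]]
        racks_used = {broker_rack[brokers[i]]}
        while len(replicas) < replication_factor:
            i = (i + 1) % len(brokers)
            candidate = brokers[i]
            if candidate in replicas:
                continue
            rack = broker_rack[candidate]
            # Skip a used rack unless every rack already holds a replica.
            if rack in racks_used and len(racks_used) < n_racks:
                continue
            replicas.append(candidate)
            racks_used.add(rack)
        assignment[partition] = replicas
    return assignment
```

For the six-broker example above, `interlace_by_rack` yields the order (0, 2, 4, 1, 3, 5), and with a replication factor of 3 every partition ends up with one replica per rack.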

With regard to KAFKA-1792, an easy solution is to restrict replica movement 
to within the same rack, which I think should work in most practical cases. It 
also has the added benefit that replicas usually move faster within a rack. 
So basically we can apply the algorithm described in KAFKA-1792 to each 
rack separately. For example, if there are three racks, apply the algorithm three 
times, each time with the broker list and assignment for that specific rack. Again, 
we assume the broker-to-rack mapping will be available in the method signature.
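
A minimal sketch of the per-rack idea, in Python for illustration: the current assignment is split by rack, a rebalancing function is run once per rack, and the results are merged. The `rebalance` callable is a placeholder for the KAFKA-1792 algorithm, which is not described in this thread; all names here are hypothetical, and replica order (and so leader preference) is not preserved by this simple merge.

```python
from collections import defaultdict


def split_by_rack(assignment, broker_rack):
    """Return {rack: {partition: [replicas of that partition on the rack]}}."""
    per_rack = defaultdict(dict)
    for partition, replicas in assignment.items():
        for broker in replicas:
            per_rack[broker_rack[broker]].setdefault(partition, []).append(broker)
    return dict(per_rack)


def rebalance_per_rack(assignment, broker_rack, rebalance):
    """Run rebalance(brokers_in_rack, rack_assignment) once per rack and merge.

    Replicas never leave their rack because each call only sees one rack's
    brokers, provided `rebalance` returns brokers from the list it was given.
    """
    brokers_in_rack = defaultdict(list)
    for broker, rack in sorted(broker_rack.items()):
        brokers_in_rack[rack].append(broker)
    merged = defaultdict(list)
    for rack, rack_assignment in split_by_rack(assignment, broker_rack).items():
        updated = rebalance(brokers_in_rack[rack], rack_assignment)
        for partition, replicas in updated.items():
            merged[partition].extend(replicas)
    return dict(merged)
```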

The open question is how to obtain the broker-to-rack mapping. The information can 
be supplied when Kafka registers the broker with ZooKeeper, which means some 
information has to be added to ZooKeeper. However, it could be that the rack 
information is already available in a deployment independent way. For example, 
for some deployments, the rack information may be available in a database. What 
we can do is abstract the API required to obtain rack information into an 
interface and allow the user to supply an implementation on the command line or at 
broker startup (to handle auto topic creation).
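
One way to picture the proposed abstraction is an interface with a single method that returns the broker-to-rack map, plus a loader that instantiates a user-supplied class by name. The sketch below is in Python for illustration only; `RackLocator`, `StaticRackLocator`, and `load_rack_locator` are hypothetical names and do not correspond to any actual Kafka interface or configuration key.

```python
import importlib
from abc import ABC, abstractmethod


class RackLocator(ABC):
    """Hypothetical interface: how the rack information is obtained is up to the user."""

    @abstractmethod
    def broker_racks(self):
        """Return a dict mapping broker ID (int) to rack ID (str)."""


class StaticRackLocator(RackLocator):
    """Simplest possible implementation: a fixed mapping supplied up front."""

    def __init__(self, mapping):
        self._mapping = dict(mapping)

    def broker_racks(self):
        return dict(self._mapping)


def load_rack_locator(class_path, *args, **kwargs):
    """Instantiate a RackLocator from a dotted 'module.ClassName' string."""
    module_name, _, class_name = class_path.rpartition(".")
    cls = getattr(importlib.import_module(module_name), class_name)
    if not (isinstance(cls, type) and issubclass(cls, RackLocator)):
        raise TypeError(f"{class_path} does not implement RackLocator")
    return cls(*args, **kwargs)
```

A database-backed or ZooKeeper-backed implementation would subclass the same interface, so the assignment code only ever depends on `broker_racks()`.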





 



[jira] [Commented] (KAFKA-1215) Rack-Aware replica assignment option

2015-03-29 Thread Neha Narkhede (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14386037#comment-14386037
 ] 

Neha Narkhede commented on KAFKA-1215:
--

[~allenxwang] This was inactive for a while, but I think it will be good to 
wait until KAFKA-1792 is done to propose a solution for rack-awareness. 



[jira] [Commented] (KAFKA-1215) Rack-Aware replica assignment option

2015-03-20 Thread Allen Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14372058#comment-14372058
 ] 

Allen Wang commented on KAFKA-1215:
---

What's the status of this JIRA?

I have two questions:

- Can we simply use a string for the rack ID? This would make it much easier 
to use in AWS, where the zone ID is a string. Otherwise there will be 
unnecessary code to convert it back and forth. 
- Why is max-rack-replication necessary? In most use cases you want an even 
distribution of replicas across racks without having to consider a maximum 
replication per rack. 
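
To make the second question concrete, here is a rough sketch of spreading replicas evenly across racks identified by strings, with no per-rack maximum. It is not the algorithm from the attached patches; interleave_by_rack and assign are made-up names, and the spread is only guaranteed when every rack holds the same number of brokers.

```python
from collections import defaultdict
from itertools import zip_longest
from typing import Dict, List


def interleave_by_rack(broker_racks: Dict[int, str]) -> List[int]:
    """Order brokers so that neighbours sit on different racks where possible."""
    by_rack = defaultdict(list)
    for broker, rack in sorted(broker_racks.items()):
        by_rack[rack].append(broker)
    columns = [by_rack[rack] for rack in sorted(by_rack)]
    # Take one broker from each rack in turn; shorter racks run out first.
    return [b for row in zip_longest(*columns) for b in row if b is not None]


def assign(broker_racks: Dict[int, str], partitions: int,
           replication_factor: int) -> Dict[int, List[int]]:
    """Round-robin replicas over the rack-interleaved broker list."""
    order = interleave_by_rack(broker_racks)
    if replication_factor > len(order):
        raise ValueError("replication factor exceeds the number of brokers")
    return {
        p: [order[(p + i) % len(order)] for i in range(replication_factor)]
        for p in range(partitions)
    }
```

With six brokers spread over three AWS zones, two per zone, every partition with replication factor 3 ends up with one replica in each zone.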




[jira] [Commented] (KAFKA-1215) Rack-Aware replica assignment option

2014-09-04 Thread Guozhang Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14122028#comment-14122028
 ] 

Guozhang Wang commented on KAFKA-1215:
--

Moving out of 0.8.2 for now.



[jira] [Commented] (KAFKA-1215) Rack-Aware replica assignment option

2014-06-13 Thread Joris Van Remoortere (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14030981#comment-14030981
 ] 

Joris Van Remoortere commented on KAFKA-1215:
-

[~jjkoshy] I am out till the end of the month. I was going to take a deeper 
look at how to integrate this into the auto-rebalance as that has since been 
released (post initial patch).

 Rack-Aware replica assignment option
 

 Key: KAFKA-1215
 URL: https://issues.apache.org/jira/browse/KAFKA-1215
 Project: Kafka
  Issue Type: Improvement
  Components: replication
Affects Versions: 0.8.0
Reporter: Joris Van Remoortere
Assignee: Jun Rao
 Fix For: 0.8.2

 Attachments: rack_aware_replica_assignment_v1.patch, 
 rack_aware_replica_assignment_v2.patch




[jira] [Commented] (KAFKA-1215) Rack-Aware replica assignment option

2014-05-29 Thread Jorge Ortiz (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14013151#comment-14013151
 ] 

Jorge Ortiz commented on KAFKA-1215:


Any update on this? We're deploying Kafka on AWS and rack-awareness would be 
lovely.



[jira] [Commented] (KAFKA-1215) Rack-Aware replica assignment option

2014-04-02 Thread Joris Van Remoortere (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13958242#comment-13958242
 ] 

Joris Van Remoortere commented on KAFKA-1215:
-

Since there is further interest in this #1357 I will try to look at this soon.



[jira] [Commented] (KAFKA-1215) Rack-Aware replica assignment option

2014-02-06 Thread Joris Van Remoortere (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13893535#comment-13893535
 ] 

Joris Van Remoortere commented on KAFKA-1215:
-

[~junrao] could you please look at this? Review Request #17248

Thanks!

 Rack-Aware replica assignment option
 

 Key: KAFKA-1215
 URL: https://issues.apache.org/jira/browse/KAFKA-1215
 Project: Kafka
  Issue Type: Improvement
  Components: replication
Affects Versions: 0.8.0, 0.8.1
Reporter: Joris Van Remoortere
Assignee: Neha Narkhede
 Fix For: 0.8.0, 0.8.1

 Attachments: rack_aware_replica_assignment_v1.patch, 
 rack_aware_replica_assignment_v2.patch




Re: [jira] [Commented] (KAFKA-1215) Rack-Aware replica assignment option

2014-01-26 Thread Jun Rao
Replica assignments can change during topic creation, partition addition,
and partition reassignment. So, it would be good to integrate rack-aware
assignment in those cases.

For compatibility, a rolling upgrade of a Kafka cluster means that at
a given point in time, some brokers could be registered in the old format
and some others in the new format. Both the consumer and the controller
(https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Controller+Internals)
read the broker registration in ZK. So, we need to make sure that the new
code can read the old format and the old code can read the new format too.
For example, in Broker.createBroker(), if the code is looking for a field
rackid but can't find it, we need to be able to provide a reasonable
default.

Also, take a look at the format of broker registration in ZK:
https://cwiki.apache.org/confluence/display/KAFKA/Kafka+data+structures+in+Zookeeper
Since we are changing the format, we should probably change the version
from 1 to 2. However, we can make this change backward compatible.
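
As an illustration of the kind of tolerant reader described above, here is a small Python sketch. The JSON layout, the "rack" field name and the empty-string default are assumptions made for the example and are not taken from the patch or the wiki page.

```python
import json
from typing import NamedTuple

DEFAULT_RACK = ""  # assumption: empty string stands for "no rack configured"


class Broker(NamedTuple):
    id: int
    host: str
    port: int
    rack: str


def parse_broker(broker_id: int, registration: str) -> Broker:
    """Parse a broker registration, tolerating old and new formats."""
    info = json.loads(registration)
    # Only the known fields are read and unknown ones are ignored, so a
    # registration written by newer code does not break this reader.
    # A missing rack field (old format) falls back to the default.
    return Broker(broker_id, info["host"], info["port"],
                  info.get("rack", DEFAULT_RACK))
```

An old reader written the same way (picking out only the fields it knows) would keep working against a version 2 registration, which is what makes a rolling upgrade safe.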

Thanks,

Jun





Re: [jira] [Commented] (KAFKA-1215) Rack-Aware replica assignment option

2014-01-26 Thread Joris VanRemoortere
Thanks Jun,

As you said, rack-aware assignment needs to be supported during topic
creation, adding of partitions, and partition re-assignment.

In order to reduce the risk of using a different value of
max-rack-replication during different calls to those functions, I'd like to
carry / persist that information as part of the topic config. This way the
option is mandatory during topic creation, and optional during partition
reassignment.

Here is my proposal:

Topic Create:

   - max-rack-replication mandatory
   - store max-rack-replication as a topic config

Partition Add:

   - Use the stored value of max-rack-replication so that the new partition
   is assigned with the same max-rack-replication value as the rest of the
   partitions in the topic

Partition Re-assign:

   - If max-rack-replication is provided, treat this as an altered config.
   This forces a re-validation of all replica assignments of all partitions
   for the topic. If this succeeds, save the updated topic config.
   - If max-rack-replication is not provided, use the saved value in the
   topic config.

Alter Topic:

   - If someone overrides the max-rack-replication value, we re-validate
   all replica assignments as with partition re-assignment.
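
The re-validation step in the proposal above could look roughly like the following Python sketch. It assumes max-rack-replication caps the number of replicas of a single partition on one rack, and that a negative value disables the check, as the -1 default in the issue description suggests. The function name and data shapes are hypothetical.

```python
from collections import Counter
from typing import Dict, List


def violations(assignment: Dict[int, List[int]],
               broker_racks: Dict[int, str],
               max_rack_replication: int) -> Dict[int, Dict[str, int]]:
    """Return {partition: {rack: replica_count}} for racks over the cap."""
    if max_rack_replication < 0:
        # A negative value is treated as "rack-awareness disabled".
        return {}
    bad = {}
    for partition, replicas in assignment.items():
        counts = Counter(broker_racks[broker] for broker in replicas)
        over = {rack: n for rack, n in counts.items()
                if n > max_rack_replication}
        if over:
            bad[partition] = over
    return bad
```

An altered topic config would be saved only when this returns an empty result for every partition of the topic.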

I think this strategy will be less error prone, making it impossible to
have multiple partitions in a topic using different max-rack-replication
values. Let me know if this sounds ok to you,

Joris



Re: [jira] [Commented] (KAFKA-1215) Rack-Aware replica assignment option

2014-01-26 Thread Jun Rao
Thanks for the update. This proposal looks reasonable to me.

One issue that needs a bit more thinking is what happens when a broker is
moved from one rack to another. After the move, constraints on
max-rack-replication could be violated. The question is what we should do
when this happens. A less intrusive approach is to let the broker start and
simply log a warning about the violation. Not sure if this is the best
approach though.
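
A sketch of the "log a warning and carry on" option, in Python. It again assumes max-rack-replication limits the replicas of one partition per rack; warn_on_rack_move and the logger name are invented for the example.

```python
import logging
from typing import Dict, List

log = logging.getLogger("rack-check")


def warn_on_rack_move(broker_id: int, new_rack: str,
                      broker_racks: Dict[int, str],
                      assignment: Dict[int, List[int]],
                      max_rack_replication: int) -> List[int]:
    """Log a warning for each partition that violates the cap after the move.

    Returns the violated partitions; the broker is never prevented from
    starting.
    """
    updated = {**broker_racks, broker_id: new_rack}
    violated = []
    for partition, replicas in sorted(assignment.items()):
        if broker_id not in replicas:
            continue  # the move cannot change this partition's rack counts
        on_new_rack = sum(1 for b in replicas if updated[b] == new_rack)
        # Only the destination rack gains a replica, so only it is checked.
        if 0 <= max_rack_replication < on_new_rack:
            log.warning("partition %d has %d replicas on rack %s (max %d)",
                        partition, on_new_rack, new_rack,
                        max_rack_replication)
            violated.append(partition)
    return violated
```

The returned list could later feed a reassignment tool, which keeps the startup path non-intrusive while still surfacing the problem.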

Jun



[jira] [Commented] (KAFKA-1215) Rack-Aware replica assignment option

2014-01-24 Thread Jun Rao (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13881193#comment-13881193
 ] 

Jun Rao commented on KAFKA-1215:


Thanks for the patch. Looks good overall. Some comments.

1. KafkaConfig:
1.1 We need a config for default max-rack-replication for auto topic creation.
1.2 rackId: We probably don't want to make this a required property. So, 
perhaps we can default it to 0?

2. AdminUtils.assignReplicasToBrokers(): 
2.1 Could you add some comments on the rack-aware assignment algorithm?
2.2 It's a bit odd for this method to take zkclient as input. We can probably 
pass in a list of Broker objects instead.

3. Unit tests: I suggest that we leave most existing tests intact by keeping 
the rackId default and add a new test for rack-aware assignment.

4. Compatibility test: It seems that the changes to the broker format in ZK are 
backward compatible. Could you double check? For example, an old reader 
(controller, consumer, etc.) should be able to parse a broker registered in 
the new format and a new reader should be able to parse a broker registered in 
the old format. Also, we should probably increase the version in the ZK 
registration for the broker.

5. Could you rebase to trunk?
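
For point 2.2, a hedged Python sketch of an assignment function that takes the broker list directly instead of a ZooKeeper client, which also makes it easy to unit test as suggested in point 3. This is a simple greedy illustration under the assumption that max-rack-replication caps replicas of a partition per rack. It is not the algorithm in the attached patch, and assign_with_cap is a made-up name.

```python
from collections import Counter
from typing import Dict, List, Tuple


def assign_with_cap(brokers: List[Tuple[int, str]], partitions: int,
                    replication_factor: int,
                    max_rack_replication: int = -1) -> Dict[int, List[int]]:
    """Assign replicas from a list of (broker_id, rack) pairs.

    A negative max_rack_replication disables the rack check, leaving
    plain round-robin placement.
    """
    n = len(brokers)
    if replication_factor > n:
        raise ValueError("replication factor exceeds the number of brokers")
    result = {}
    for p in range(partitions):
        replicas, per_rack = [], Counter()
        for i in range(n):
            broker_id, rack = brokers[(p + i) % n]
            if 0 <= max_rack_replication <= per_rack[rack]:
                continue  # this rack already holds its share for partition p
            replicas.append(broker_id)
            per_rack[rack] += 1
            if len(replicas) == replication_factor:
                break
        if len(replicas) < replication_factor:
            raise ValueError("rack constraint cannot be met for partition %d" % p)
        result[p] = replicas
    return result
```

Because the function is pure, a test can pass in a hand-built broker list and compare the result with and without the rack constraint.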

 Rack-Aware replica assignment option
 

 Key: KAFKA-1215
 URL: https://issues.apache.org/jira/browse/KAFKA-1215
 Project: Kafka
  Issue Type: Improvement
  Components: replication
Affects Versions: 0.8.0, 0.8.1
Reporter: Joris Van Remoortere
Assignee: Neha Narkhede
 Fix For: 0.8.0, 0.8.1

 Attachments: rack_aware_replica_assignment_v1.patch




Re: [jira] [Commented] (KAFKA-1215) Rack-Aware replica assignment option

2014-01-24 Thread Joris VanRemoortere
Working on the above.

Since trunk currently allows Topic configs, and altering topics, does it
make sense to you to make this a topic config so that it is kept in sync
between topic alterations and partition adds?

For compatibility: I can make the ZooKeeper state parsing backwards /
forwards compatible. I don't think I can make the readFrom / writeTo
ByteBuffer compatible, since chained calls of these during API parsing are
size dependent. Can you clarify exactly what you expect to be compatible?

I'm also not very familiar with the consequences of changing the broker
version in the ZooKeeper registration. Can you please provide a reference to
documentation for this?

Thanks,

Joris

