Re: [openstack-dev] [Swift] (Non-)consistency of the Swift hash ring implementation
That's great to hear! I see now that Swift's implementation has some additional rebalancing logic that Ironic (and the code example from Gregory's blog) lacked.

Cheers,
Nejc

On 09/08/2014 05:39 AM, John Dickinson wrote:
> To test Swift directly, I used the CLI tools that Swift provides for
> managing rings. I wrote the following short script:
>
> $ cat remakerings
> #!/bin/bash
>
> swift-ring-builder object.builder create 16 3 0
> for zone in {1..4}; do
>   for server in {200..224}; do
>     for drive in {1..12}; do
>       swift-ring-builder object.builder add r1z${zone}-10.0.${zone}.${server}:6010/d${drive} 3000
>     done
>   done
> done
> swift-ring-builder object.builder rebalance
>
> This adds 1200 devices: 4 zones, each with 25 servers, each with 12
> drives (4*25*12=1200). The important thing is that instead of adding
> 1000 drives in one zone or on one server, I'm splaying across the
> placement hierarchy that Swift uses.
>
> After running the script, I added one drive to one server to see what
> the impact would be and rebalanced. The swift-ring-builder tool
> detected that less than 1% of the partitions would change and
> therefore didn't move anything (just to avoid unnecessary data
> movement).
>
> --John
>
> On Sep 7, 2014, at 11:20 AM, Nejc Saje wrote:
>> [...]
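For reference, here is a minimal sketch of the classic consistent-hashing scheme the thread compares against: each host is hashed onto the ring at many points, so adding a host only remaps the keys that land next to its new points (on average about 1/#nodes of them). This is illustrative Python, not Swift's or Ironic's actual code; the class name, point count, and key names are made up.

import bisect
import hashlib

class ConsistentHashRing(object):
    def __init__(self, nodes, points_per_node=100):
        self.points_per_node = points_per_node
        self.ring = []  # sorted list of (position, node) pairs
        for node in nodes:
            for i in range(points_per_node):
                self.ring.append((self._position('%s-%d' % (node, i)), node))
        self.ring.sort()

    def _position(self, key):
        # map an arbitrary string to a position on the ring
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def add_node(self, node):
        # only keys between the new points and their ring predecessors move
        for i in range(self.points_per_node):
            bisect.insort(self.ring, (self._position('%s-%d' % (node, i)), node))

    def get_node(self, key):
        # first ring point clockwise of the key's position, with wraparound
        idx = bisect.bisect_left(self.ring, (self._position(key), '')) % len(self.ring)
        return self.ring[idx][1]

ring = ConsistentHashRing(['host-%d' % i for i in range(1000)])
keys = ['sample-%d' % i for i in range(100000)]
before = {k: ring.get_node(k) for k in keys}
ring.add_node('host-1000')
moved = sum(1 for k in keys if ring.get_node(k) != before[k])
print('%.2f%% of keys moved' % (100.0 * moved / len(keys)))

Going from 1000 to 1001 hosts, this moves on the order of 0.1% of the keys, which is the "1/#nodes remapping" behavior the original mail refers to.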
Re: [openstack-dev] [Swift] (Non-)consistency of the Swift hash ring implementation
To test Swift directly, I used the CLI tools that Swift provides for managing rings. I wrote the following short script:

$ cat remakerings
#!/bin/bash

swift-ring-builder object.builder create 16 3 0
for zone in {1..4}; do
  for server in {200..224}; do
    for drive in {1..12}; do
      swift-ring-builder object.builder add r1z${zone}-10.0.${zone}.${server}:6010/d${drive} 3000
    done
  done
done
swift-ring-builder object.builder rebalance

This adds 1200 devices: 4 zones, each with 25 servers, each with 12 drives (4*25*12=1200). The important thing is that instead of adding 1000 drives in one zone or on one server, I'm splaying across the placement hierarchy that Swift uses.

After running the script, I added one drive to one server to see what the impact would be and rebalanced. The swift-ring-builder tool detected that less than 1% of the partitions would change and therefore didn't move anything (just to avoid unnecessary data movement).

--John

On Sep 7, 2014, at 11:20 AM, Nejc Saje wrote:
> Hey guys,
>
> in Ceilometer we're using consistent hash rings to do workload
> partitioning[1]. We've considered using Ironic's hash ring
> implementation, but found out it wasn't actually consistent (ML[2],
> patch[3]). The next thing I noticed is that the Ironic implementation
> is based on Swift's.
>
> The gist of it is: since you divide your ring into a number of
> equal-sized partitions, instead of hashing hosts onto the ring, when
> you add a new host, an unbounded number of keys get re-mapped to
> different hosts (instead of the 1/#nodes remapping guaranteed by a
> consistent hash ring).
>
> Swift's hash ring implementation is quite complex though, so I took
> the conceptually similar code from Gregory Holt's blog post[4] (which
> I'm guessing is based on Gregory's efforts on Swift's hash ring
> implementation) and tested that instead. With a simple test (paste[5])
> of first having 1000 nodes and then adding one, 99.91% of the data was
> moved.
>
> I have no way to test this in Swift directly, so I'm just throwing
> this out there, so you guys can figure out whether there actually is a
> problem or not.
>
> Cheers,
> Nejc
>
> [1] https://review.openstack.org/#/c/113549/
> [2] http://lists.openstack.org/pipermail/openstack-dev/2014-September/044566.html
> [3] https://review.openstack.org/#/c/118932/4
> [4] http://greg.brim.net/page/building_a_consistent_hashing_ring.html
> [5] http://paste.openstack.org/show/107782/
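As a back-of-the-envelope check of John's numbers: with a partition power of 16 and 3 replicas there are 196608 partition-replicas to place, and an ideally balanced rebalance would hand the one new drive about a 1/1201 share of them. A short sketch (this assumes equal weights and perfectly even placement; it is not swift-ring-builder's actual rebalance math):

part_power = 16
replica_count = 3
drive_count = 1200

# total partition-replicas to place across all drives
assignments = (2 ** part_power) * replica_count   # 196608
# ideal share of the single new drive after it joins
ideal_share = assignments / (drive_count + 1.0)   # ~163.7
print('fraction moved: %.3f%%' % (100.0 * ideal_share / assignments))
# ~0.083%, comfortably below the 1% change John observed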
[openstack-dev] [Swift] (Non-)consistency of the Swift hash ring implementation
Hey guys,

in Ceilometer we're using consistent hash rings to do workload partitioning[1]. We've considered using Ironic's hash ring implementation, but found out it wasn't actually consistent (ML[2], patch[3]). The next thing I noticed is that the Ironic implementation is based on Swift's.

The gist of it is: since you divide your ring into a number of equal-sized partitions, instead of hashing hosts onto the ring, when you add a new host, an unbounded number of keys get re-mapped to different hosts (instead of the 1/#nodes remapping guaranteed by a consistent hash ring).

Swift's hash ring implementation is quite complex though, so I took the conceptually similar code from Gregory Holt's blog post[4] (which I'm guessing is based on Gregory's efforts on Swift's hash ring implementation) and tested that instead. With a simple test (paste[5]) of first having 1000 nodes and then adding one, 99.91% of the data was moved.

I have no way to test this in Swift directly, so I'm just throwing this out there, so you guys can figure out whether there actually is a problem or not.

Cheers,
Nejc

[1] https://review.openstack.org/#/c/113549/
[2] http://lists.openstack.org/pipermail/openstack-dev/2014-September/044566.html
[3] https://review.openstack.org/#/c/118932/4
[4] http://greg.brim.net/page/building_a_consistent_hashing_ring.html
[5] http://paste.openstack.org/show/107782/
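For context, here is a rough reconstruction of the kind of test described in paste[5]; it is not the paste itself, and the partition count and round-robin assignment are assumptions for illustration. It divides the ring into 2^16 equal partitions, assigns them to nodes round-robin, then grows the cluster from 1000 to 1001 nodes:

import hashlib

PART_POWER = 16
PART_COUNT = 2 ** PART_POWER

def build_part2node(node_count):
    # partition -> node index, assigned round-robin
    return [part % node_count for part in range(PART_COUNT)]

def lookup(part2node, key):
    # take the top PART_POWER bits of the key's md5 as its partition
    digest = hashlib.md5(key.encode()).digest()
    part = int.from_bytes(digest[:4], 'big') >> (32 - PART_POWER)
    return part2node[part]

keys = ['object-%d' % i for i in range(100000)]
before = build_part2node(1000)
after = build_part2node(1001)   # one node added; partitions reassigned
moved = sum(1 for k in keys if lookup(before, k) != lookup(after, k))
print('%.2f%% of keys moved' % (100.0 * moved / len(keys)))

With this naive scheme, nearly all keys (around 98-99% here) change owners when a single node is added, in line with the 99.91% the mail reports for the blog-post code, whereas a truly consistent ring would move roughly 1/1001 of them.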