On May 1, 2014, at 10:32 AM, Shyam Prasad N <nspmangal...@gmail.com> wrote:

> Hi Chuck, 
> Thanks for the reply.
> 
> The reason for such a weight distribution seems to have to do with the ring 
> rebalance command. I've scripted the disk addition (and rebalance) process 
> for the ring using a wrapper command. When I trigger the rebalance after each 
> disk addition, only the first rebalance seems to take effect.
> 
> Is there any way to adjust the weights other than a rebalance? Or is there a 
> way to force a rebalance, even if the time between rebalances (as part of 
> disk addition) is under an hour (the min_part_hours value used at ring 
> creation)?

Rebalancing only moves one replica at a time to ensure that your data remains 
available, even if you have a hardware failure while you are adding capacity. 
This is why it may take multiple rebalances to get everything evenly balanced.
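
In practice, the capacity-add workflow ends up being iterative; roughly 
something like this (a sketch only; the builder name and timing are 
illustrative):

    swift-ring-builder object.builder rebalance
    # wait at least min_part_hours (ideally a full replication pass),
    # then rebalance again
    swift-ring-builder object.builder rebalance
    # repeat until the reported balance settles near 0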

The min_part_hours setting (perhaps poorly named) should match how long a 
replication pass takes in your cluster. This follows from what I said above: by 
ensuring that replication has completed before putting another partition "in 
flight", Swift keeps your data highly available.
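
If you are not sure how long a replication pass takes, one rough way to check 
(assuming the recon middleware is enabled on your object servers) is:

    swift-recon --replication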

For completeness, to answer your question: there is an (intentionally) 
undocumented option in swift-ring-builder called 
"pretend_min_part_hours_passed", but it should ALMOST NEVER be used in a 
production cluster unless you really, really know what you are doing. Using 
that option will very likely cause service interruptions for your users. The 
better option is to set the min_part_hours value to match your replication 
pass time (with set_min_part_hours), and then wait for Swift to move things 
around.
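
For example, if a replication pass takes roughly four hours in your cluster 
(the number here is only illustrative), that would look like:

    swift-ring-builder object.builder set_min_part_hours 4
    # ...then, once that interval (and a replication pass) has elapsed:
    swift-ring-builder object.builder rebalance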

Here's some more info on how and why to add capacity to a running Swift 
cluster: https://swiftstack.com/blog/2012/04/09/swift-capacity-management/

--John

> On May 1, 2014 9:00 PM, "Chuck Thier" <cth...@gmail.com> wrote:
> Hi Shyam,
> 
> If I am reading your ring output correctly, it looks like only the devices in 
> node .202 have a weight set, which is why all of your objects are going to 
> that one node. You can update the weights of the other devices and rebalance, 
> and things should get distributed correctly.
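> 
> For example (the device IDs and weights below are illustrative; a common 
> convention is to make the weight proportional to the disk's capacity):
> 
>     swift-ring-builder object.builder set_weight d3 100.0
>     swift-ring-builder object.builder rebalance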
> 
> --
> Chuck
> 
> 
> On Thu, May 1, 2014 at 5:28 AM, Shyam Prasad N <nspmangal...@gmail.com> wrote:
> Hi,
> 
> I created a swift cluster and configured the rings like this...
> 
> swift-ring-builder object.builder create 10 3 1
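> # (part power 10 => 2^10 = 1024 partitions, 3 replicas, min_part_hours = 1)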
> 
> ubuntu-202:/etc/swift$ swift-ring-builder object.builder 
> object.builder, build version 12
> 1024 partitions, 3.000000 replicas, 1 regions, 4 zones, 12 devices, 300.00 balance
> The minimum number of hours before a partition can be reassigned is 1
> Devices:    id  region  zone      ip address  port  replication ip  replication port  name  weight  partitions  balance  meta
>              0       1     1      10.3.0.202  6010      10.3.0.202              6010  xvdb    1.00        1024   300.00
>              1       1     1      10.3.0.202  6020      10.3.0.202              6020  xvdc    1.00        1024   300.00
>              2       1     1      10.3.0.202  6030      10.3.0.202              6030  xvde    1.00        1024   300.00
>              3       1     2      10.3.0.212  6010      10.3.0.212              6010  xvdb    1.00           0  -100.00
>              4       1     2      10.3.0.212  6020      10.3.0.212              6020  xvdc    1.00           0  -100.00
>              5       1     2      10.3.0.212  6030      10.3.0.212              6030  xvde    1.00           0  -100.00
>              6       1     3      10.3.0.222  6010      10.3.0.222              6010  xvdb    1.00           0  -100.00
>              7       1     3      10.3.0.222  6020      10.3.0.222              6020  xvdc    1.00           0  -100.00
>              8       1     3      10.3.0.222  6030      10.3.0.222              6030  xvde    1.00           0  -100.00
>              9       1     4      10.3.0.232  6010      10.3.0.232              6010  xvdb    1.00           0  -100.00
>             10       1     4      10.3.0.232  6020      10.3.0.232              6020  xvdc    1.00           0  -100.00
>             11       1     4      10.3.0.232  6030      10.3.0.232              6030  xvde    1.00           0  -100.00
> 
> Container and account rings have a similar configuration.
> Once the rings were created and all the disks were added to the rings as 
> above, I ran rebalance on each ring. (I ran rebalance after adding each of 
> the nodes above.)
> Then I immediately scp'd the rings to all the other nodes in the cluster.
> 
> I now observe that the objects are all going to 10.3.0.202; I don't see the 
> objects being replicated to the other nodes. So much so that .202 is 
> approaching 100% disk usage, while the other nodes are almost completely 
> empty.
> What am I doing wrong? Am I not supposed to run a rebalance after adding 
> each disk/node?
> 
> Thanks in advance for the help.
> 
> -- 
> -Shyam
> 

_______________________________________________
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
