Re: [openstack-dev] [Openstack] Bad perf on swift servers...

2014-05-30 Thread Shyam Prasad N
Hi Hugo,
Thanks for the reply, and sorry for the delayed response.

A couple of disks in one of the swift servers were accidentally wiped a
couple of days back, and swift has been working hard to restore the data to
those disks. It looks like this was definitely contributing to the CPU
load.
Does swift use rsync to perform this data restoration? Also, is there a way
to configure swift or rsync to lower the priority of these transfers? I
realize that since my replica count is 2, it makes sense for swift to try
hard to restore the data. But would the behaviour be any different if the
replica count were higher, say 3 or 4?
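In case it helps frame the question, this is roughly what I was planning to
experiment with in the [object-replicator] section of object-server.conf
(option names are taken from the sample config; I haven't verified which of
them the Swift version we run actually supports):

    [object-replicator]
    # run fewer rsync jobs in parallel
    concurrency = 1
    # throttle each rsync; passed through to rsync --bwlimit
    # (units follow your rsync's --bwlimit convention)
    rsync_bwlimit = 5120
    # wait longer between replication passes
    run_pause = 120

and possibly wrapping the rsync daemon in ionice/nice on the storage nodes.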

Regarding the troubleshooting of the account-server CPU usage, the cluster
is currently down due to some other issues. I will report back if the
problem persists after I reboot the setup.
As for the topology, I have 4 symmetric swift servers
(proxy+object+container+account), each with 4GB of RAM and a 10G Ethernet
card, communicating with each other and with clients through a 10G switch on
a private network.

Regards,
Shyam



On Fri, May 30, 2014 at 7:49 AM, Kuo Hugo tonyt...@gmail.com wrote:

 Hi,

 1. Correct! Once you add new devices and rebalance the rings, a portion of
 the partitions will be reassigned to the new devices. If those partitions
 hold objects, the object-replicator will move that data to the new devices.
 You should see object-replicator log entries for objects being transferred
 from one device to another by invoking rsync.
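 For example, something along these lines should confirm whether the
 replicator is what's churning (assuming the default syslog logging; adjust
 the log path for your distro, and swift-recon needs the recon middleware
 enabled on the storage nodes):

     grep object-replicator /var/log/syslog | tail -n 20
     swift-recon object --replication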

 2. Regarding the busy swift-account-server, that's pretty abnormal. Is
 there any log indicating the account-server is doing any work? One
 possibility is that the ring contains the wrong port numbers, so requests
 meant for other workers end up at the account-server. Perhaps you can paste
 all your ring layouts to http://paste.openstack.org/ . Running strace on the
 account-server process may help track what it is actually doing.
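 For example (assuming the builder files live in /etc/swift/, and <PID> is
 the PID of a busy account-server worker):

     swift-ring-builder /etc/swift/account.builder     # devices, IPs, ports
     swift-ring-builder /etc/swift/container.builder
     swift-ring-builder /etc/swift/object.builder
     strace -f -p <PID> -e trace=network               # what it is actually serving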

 3. In a deployment where the outward-facing interface shares the same
 network resources as the cluster-facing interface, the two will certainly
 contend for bandwidth. Hence the frontend traffic is now being impacted by
 the replication traffic.
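 A quick way to confirm this is to watch the interface while an upload and
 replication are running at the same time, e.g. (the interface name is just
 a placeholder):

     sar -n DEV 1        # per-interface throughput, needs sysstat
     iftop -i eth0       # which peers are consuming the bandwidth

 If your Swift version supports it, a dedicated replication network
 (separate replication IP/port per device in the ring) avoids this
 contention entirely.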

 4. A detailed network topology diagram would help.

 Hugo Kuo


 2014-05-29 1:06 GMT+08:00 Shyam Prasad N nspmangal...@gmail.com:

 Hi,

 I wasn't sure which mailing list is right for this question, so I'm
 including both openstack and openstack-dev in the CC list.

 I'm running a swift cluster with 4 nodes.
 All 4 nodes are symmetrical, i.e. proxy, object, container, and account
 servers run on each node with similar storage configurations and conf files.
 The I/O traffic to this cluster mainly consists of uploads of dynamic large
 objects (typically 1GB chunks (sub-objects), with around 5-6 chunks under
 each large object).
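 For reference, the uploads follow the standard DLO pattern, roughly like
 this (the account/container/object names are just placeholders):

     # 1GB segments go into a segments container
     PUT /v1/AUTH_test/backups_segments/myfile/00000001   <-- 1GB body
     PUT /v1/AUTH_test/backups_segments/myfile/00000002
     ...
     # zero-byte manifest object pointing at the segment prefix
     PUT /v1/AUTH_test/backups/myfile
         X-Object-Manifest: backups_segments/myfile/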

 The setup is running and serving data, but I've begun to see a few perf
 issues as the traffic increases. I want to understand the reasons behind
 these issues and make sure that there is nothing wrong with the setup
 configuration.

 1. High CPU utilization from rsync. I have set the replica count in each of
 the account, container, and object rings to 2. From what I've read, this
 assigns 2 devices to each partition in the storage cluster. For each PUT,
 the 2 replicas should be written synchronously, and for a GET, the I/O goes
 through one of the object servers. So nothing here should be asynchronous
 in nature. What, then, is causing the rsync traffic?

 I ran a ring rebalance command after adding a node recently. Could this be
 causing the issue?
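 If it helps with diagnosis, I can check which devices a given object maps
 to and how the rebalance left the rings, e.g. (account/container/object
 names are placeholders):

     swift-get-nodes /etc/swift/object.ring.gz AUTH_test backups myfile
     swift-ring-builder /etc/swift/object.builder      # partitions, balance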

 2. High CPU utilization from swift-account-server threads. All my
 frontend traffic uses 1 account and 1 container on the servers. There are
 hundreds of these large objects in that one container. I don't understand
 what's keeping the account servers busy.

 3. I've started noticing that the 1GB object transfers from the frontend
 traffic are taking significantly more time than they used to (more than
 double the time). Could this be because I'm using the same subnet for both
 the internal and the frontend traffic?

 4. Can someone give me some pointers/tips for improving perf with my
 cluster configuration? (I think I've given most of the details above; feel
 free to ask if you need more.)

 As always, thanks in advance for your replies. Appreciate the support. :)
 --
 -Shyam


-- 
-Shyam