Re: [openstack-dev] [magnum] problems for horizontal scale
hi Hua,

My comments are inline below, please check.

Thanks

Best Wishes,
Kai Qiang Wu (吴开强 Kennan)
IBM China System and Technology Lab, Beijing
E-mail: wk...@cn.ibm.com

From: 王华 <wanghua.hum...@gmail.com>
To: OpenStack Development Mailing List (not for usage questions) <openstack-dev@lists.openstack.org>
Date: 08/13/2015 03:32 PM
Subject: Re: [openstack-dev] [magnum] problems for horizontal scale

> Hi Kai Qiang Wu,
>
> I have some comments inline.
>
> On Thu, Aug 13, 2015 at 1:32 PM, Kai Qiang Wu <wk...@cn.ibm.com> wrote:
>
>> Hi Hua,
>>
>> I have some comments about this:
>>
>> A. Removing the heat poller can be a way, but its logic needs to keep working without becoming a performance burden.
>> 1) The old heat poller is a quick loop with a fixed interval, so that stack status changes are reflected in the bay status quickly.
>> 2) The periodic task looks like a dynamic loop with a long period. It was added so that when a stack creation times out and loop 1) exits, loop 2) can still update the stack; it also covers the conductor-crash case.
>
> It is not necessary to remove the heat poller, so we can keep it.
>
>> It would be ideal to loop over the stacks in one place, but for the periodic task we need to consider whether it really only needs to loop over stacks in IN_PROGRESS status, and what the loop interval should be (60s or shorter, for loop performance).
>
> It is necessary to loop over IN_PROGRESS stacks because of the conductor-crash case.
>
>> Does heat have other status transition paths, like DELETE_FAILED -> (status reset) -> OK, etc.?
>
> That needs to be confirmed.
>
>> B. On removing the db operation in the bay_update case: I did not understand your suggestion. bay_update includes update_stack and poll_and_check (which is in the heat poller); if you move the heat poller into the periodic task (as you said in your point 3), it still needs db operations.
>
> Race conditions occur in periodic tasks too. If we save stack params such as node_count in bay_update and a race condition occurs, then the node_count in the db is wrong while the status is UPDATE_COMPLETE, and there is no way to correct it. If we save stack params in the periodic task and a race condition occurs, the node_count in the db is still wrong and the status is UPDATE_COMPLETE, but we can correct it in the next run of the periodic task if no race condition occurs then. The solution I proposed cannot guarantee that the data in the db is always right.

Yes, it can help some. When you mentioned the periodic task I checked it; it currently filters on:

    filters = [bay_status.CREATE_IN_PROGRESS,
               bay_status.UPDATE_IN_PROGRESS,
               bay_status.DELETE_IN_PROGRESS]
    bays = objects.Bay.list_all(ctx, filters=filters)

If a bay is UPDATE_COMPLETE, I did not find that this task would sync it. Do you mean adding that status check to this periodic task?

>> C. Allowing an admin user to show stacks in another tenant seems OK. Have other projects tried this before? Is it a reasonable case for customers?
>
> Nova allows an admin user to show instances in another tenant. Neutron allows an admin user to show ports in another tenant; nova uses that to sync up network info for instances from neutron.

That would be OK, I think.
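For concreteness, here is a minimal sketch of what adding that status check to the periodic task could look like. It is only an illustration: the extended filter list, the get_admin_heat_client() helper, the number_of_minions parameter name, and the bay.save() usage are assumptions for the sketch, not existing Magnum code; bay_status and objects are the same modules as in the snippet quoted above.

    def sync_bay_status_and_params(ctx):
        # Sketch only: also pick up bays that already reached UPDATE_COMPLETE,
        # so a later run can repair a node_count that a race left stale.
        filters = [bay_status.CREATE_IN_PROGRESS,
                   bay_status.UPDATE_IN_PROGRESS,
                   bay_status.DELETE_IN_PROGRESS,
                   bay_status.UPDATE_COMPLETE]        # added status (assumption)
        bays = objects.Bay.list_all(ctx, filters=filters)

        heat_client = get_admin_heat_client()         # hypothetical admin-scoped client
        for bay in bays:
            stack = heat_client.stacks.get(bay.stack_id)
            node_count = stack.parameters.get('number_of_minions')  # parameter name assumed
            dirty = False
            if bay.status != stack.stack_status:
                bay.status = stack.stack_status
                dirty = True
            if node_count is not None and bay.node_count != int(node_count):
                bay.node_count = int(node_count)
                dirty = True
            if dirty:
                bay.save()

The trade-off raised above still applies: widening the filter means more stacks to poll per run, so the loop interval matters.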
Re: [openstack-dev] [magnum] problems for horizontal scale
Hi Kai Qiang Wu,

I have some comments inline.

On Thu, Aug 13, 2015 at 1:32 PM, Kai Qiang Wu <wk...@cn.ibm.com> wrote:

> Hi Hua,
>
> I have some comments about this:
>
> A. Removing the heat poller can be a way, but its logic needs to keep working without becoming a performance burden.
> 1) The old heat poller is a quick loop with a fixed interval, so that stack status changes are reflected in the bay status quickly.
> 2) The periodic task looks like a dynamic loop with a long period. It was added so that when a stack creation times out and loop 1) exits, loop 2) can still update the stack; it also covers the conductor-crash case.

It is not necessary to remove the heat poller, so we can keep it.

> It would be ideal to loop over the stacks in one place, but for the periodic task we need to consider whether it really only needs to loop over stacks in IN_PROGRESS status, and what the loop interval should be (60s or shorter, for loop performance).

It is necessary to loop over IN_PROGRESS stacks because of the conductor-crash case.

> Does heat have other status transition paths, like DELETE_FAILED -> (status reset) -> OK, etc.?

That needs to be confirmed.

> B. On removing the db operation in the bay_update case: I did not understand your suggestion. bay_update includes update_stack and poll_and_check (which is in the heat poller); if you move the heat poller into the periodic task (as you said in your point 3), it still needs db operations.

Race conditions occur in periodic tasks too. If we save stack params such as node_count in bay_update and a race condition occurs, then the node_count in the db is wrong while the status is UPDATE_COMPLETE, and there is no way to correct it. If we save stack params in the periodic task and a race condition occurs, the node_count in the db is still wrong and the status is UPDATE_COMPLETE, but we can correct it in the next run of the periodic task if no race condition occurs then. The solution I proposed cannot guarantee that the data in the db is always right.

> C. Allowing an admin user to show stacks in another tenant seems OK. Have other projects tried this before? Is it a reasonable case for customers?

Nova allows an admin user to show instances in another tenant. Neutron allows an admin user to show ports in another tenant; nova uses that to sync up network info for instances from neutron.
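On point C, a minimal sketch of how an admin-scoped sync query against Heat might look with python-heatclient. The global_tenant=True flag asks Heat to return stacks from all tenants; whether policy permits that for the admin role is exactly the question raised in this thread, and the endpoint/token handling below is illustrative only:

    from heatclient import client as heat_client

    def list_all_bay_stacks(heat_endpoint, admin_token):
        # Admin-scoped Heat client; how the endpoint and token are obtained
        # is deployment-specific and only illustrative here.
        hc = heat_client.Client('1', endpoint=heat_endpoint, token=admin_token)
        # global_tenant=True is passed through as a query parameter and
        # relies on Heat allowing a global stack listing for admins.
        return list(hc.stacks.list(global_tenant=True))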
Re: [openstack-dev] [magnum] problems for horizontal scale
Hi Kai,

In my solution, the stack needs to be synced in the periodic task even if the bay status is UPDATE_COMPLETE.

Thanks

Regards,
Wanghua

On Thu, Aug 13, 2015 at 3:42 PM, Kai Qiang Wu <wk...@cn.ibm.com> wrote:
Re: [openstack-dev] [magnum] problems for horizontal scale
any comments on this?

On Wed, Aug 12, 2015 at 2:50 PM, 王华 <wanghua.hum...@gmail.com> wrote:
Re: [openstack-dev] [magnum] problems for horizontal scale
Hi Hua,

I have some comments about this:

A. Removing the heat poller can be a way, but its logic needs to keep working without becoming a performance burden.
1) The old heat poller is a quick loop with a fixed interval, so that stack status changes are reflected in the bay status quickly.
2) The periodic task looks like a dynamic loop with a long period. It was added so that when a stack creation times out and loop 1) exits, loop 2) can still update the stack; it also covers the conductor-crash case.
It would be ideal to loop over the stacks in one place, but for the periodic task we need to consider whether it really only needs to loop over stacks in IN_PROGRESS status, and what the loop interval should be (60s or shorter, for loop performance).
Does heat have other status transition paths, like DELETE_FAILED -> (status reset) -> OK, etc.?

B. On removing the db operation in the bay_update case: I did not understand your suggestion. bay_update includes update_stack and poll_and_check (which is in the heat poller); if you move the heat poller into the periodic task (as you said in your point 3), it still needs db operations.

C. Allowing an admin user to show stacks in another tenant seems OK. Have other projects tried this before? Is it a reasonable case for customers?

Thanks

Best Wishes,
Kai Qiang Wu (吴开强 Kennan)
IBM China System and Technology Lab, Beijing
E-mail: wk...@cn.ibm.com

From: 王华 <wanghua.hum...@gmail.com>
To: openstack-dev@lists.openstack.org
Date: 08/13/2015 11:31 AM
Subject: Re: [openstack-dev] [magnum] problems for horizontal scale

> any comments on this?
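Point A's "quick loop with a fixed interval" roughly corresponds to a fixed-interval looping call around poll_and_check. A minimal sketch, using oslo.service's loopingcall purely for illustration; the heat_client and update_bay_status arguments stand in for whatever the conductor already holds:

    from oslo_service import loopingcall

    def start_stack_poller(heat_client, stack_id, update_bay_status, interval=1):
        def poll_and_check():
            stack = heat_client.stacks.get(stack_id)
            update_bay_status(stack.stack_status)
            if stack.stack_status.endswith('_COMPLETE') or \
                    stack.stack_status.endswith('_FAILED'):
                # Terminal state reached: stop this quick loop. The long-period
                # periodic task then covers creation timeouts and conductor crashes.
                raise loopingcall.LoopingCallDone()

        poller = loopingcall.FixedIntervalLoopingCall(poll_and_check)
        poller.start(interval=interval)
        return poller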
[openstack-dev] [magnum] problems for horizontal scale
Hi All,

In order to prevent race conditions due to multiple conductors, my solution is as below:

1. Remove the db operation in bay_update to prevent race conditions. The stack operation is atomic and the db operation is atomic, but the two operations together are not atomic, so the data in the db may be wrong.
2. Sync up the stack status and stack parameters (now only node_count) from heat by periodic tasks. bay_update can change stack parameters, so we need to sync them up.
3. Remove the heat poller, because we have periodic tasks.

To sync up stack parameters from heat, we need to show stacks using an admin context, but heat doesn't allow showing stacks in another tenant. If we want to show stacks in another tenant, we need to store the auth context for every bay, and that is a problem: even if we store the auth context, the token has a timeout. The best way, I think, is to let heat allow an admin user to show stacks in another tenant.

Do you have a better solution or any improvement for my solution?

Regards,
Wanghua
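A minimal sketch of point 1 above, with the db write deliberately left out of the update path; the function name and arguments are illustrative only, not the actual conductor API:

    def update_bay_stack(heat_client, bay, stack_template, stack_params):
        # Push the change to Heat only. node_count is intentionally NOT
        # written to the magnum db here; the periodic task from point 2
        # is the single writer that copies status and node_count back
        # from Heat, so a db write can no longer race with the stack update.
        heat_client.stacks.update(bay.stack_id,
                                  template=stack_template,
                                  parameters=stack_params)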