>It's not 100% true, in my case at last. We fixed this problem by
>network interface driver, it causes kernel panic and readonly issues
>under heavy networking workload actually.

Network traffic control could also help here. The point is to ensure that no
instance is starved of bandwidth while a bulk transfer is going on. Such
shaping can be done with tc, for example along the lines of the sketch below.
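To make that concrete, here is a minimal sketch (not VMThunder code; the
interface name and the numbers are just placeholders) of capping a node's
outbound traffic with a token bucket filter, so that one bulk image transfer
cannot monopolize the link:

    # Illustrative only -- not part of VMThunder. Needs root / CAP_NET_ADMIN.
    # "eth0" and "800mbit" are placeholder values; use the real storage NIC
    # and a rate that leaves headroom for the running instances.
    import subprocess

    def cap_interface(dev="eth0", rate="800mbit"):
        # Token Bucket Filter: a simple whole-interface rate cap.
        # "replace" installs the qdisc, or updates it if one already exists.
        subprocess.check_call([
            "tc", "qdisc", "replace", "dev", dev, "root",
            "tbf", "rate", rate, "burst", "500kb", "latency", "50ms",
        ])

    if __name__ == "__main__":
        cap_interface()

In practice you would probably want per-class shaping (e.g. htb classes) or
per-VM limits rather than a whole-interface cap, but the principle is the
same: no single transfer gets to starve the others.
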
>btw, we are doing some works to make Glance to integrate Cinder as a
>unified block storage backend.

That sounds interesting. Is there any more material we could read about it?


At 2014-04-18 06:05:23, "Zhi Yan Liu" <lzy....@gmail.com> wrote:
>Replied as inline comments.
>
>On Thu, Apr 17, 2014 at 9:33 PM, lihuiba <magazine.lihu...@163.com> wrote:
>>>IMO we'd better to use backend storage optimized approach to access
>>>remote image from compute node instead of using iSCSI only. And from
>>>my experience, I'm sure iSCSI is short of stability under heavy I/O
>>>workload in product environment, it could causes either VM filesystem
>>>to be marked as readonly or VM kernel panic.
>>
>> Yes, in this situation, the problem lies in the backend storage, so no other
>> protocol will perform better. However, P2P transferring will greatly reduce
>> workload on the backend storage, so as to increase responsiveness.
>
>It's not 100% true, in my case at last. We fixed this problem by
>network interface driver, it causes kernel panic and readonly issues
>under heavy networking workload actually.
>
>>>As I said currently Nova already has image caching mechanism, so in
>>>this case P2P is just an approach could be used for downloading or
>>>preheating for image caching.
>>
>> Nova's image caching is file level, while VMThunder's is block-level. And
>> VMThunder is for working in conjunction with Cinder, not Glance. VMThunder
>> currently uses facebook's flashcache to realize caching, and dm-cache,
>> bcache are also options in the future.
>
>Hm if you say bcache, dm-cache and flashcache, I'm just thinking if
>them could be leveraged by operation/best-practice level.
>
>btw, we are doing some works to make Glance to integrate Cinder as a
>unified block storage backend.
>
>>>I think P2P transferring/pre-caching sounds a good way to go, as I
>>>mentioned as well, but actually for the area I'd like to see something
>>>like zero-copy + CoR. On one hand we can leverage the capability of
>>>on-demand downloading image bits by zero-copy approach, on the other
>>>hand we can prevent to reading data from remote image every time by
>>>CoR.
>>
>> Yes, on-demand transferring is what you mean by "zero-copy", and caching
>> is something close to CoR. In fact, we are working on a kernel module called
>> foolcache that realize a true CoR. See
>> https://github.com/lihuiba/dm-foolcache.
>
>Yup. And it's really interesting to me, will take a look, thanks for sharing.
>
>> National Key Laboratory for Parallel and Distributed
>> Processing, College of Computer Science, National University of Defense
>> Technology, Changsha, Hunan Province, P.R. China
>> 410073
>>
>> At 2014-04-17 17:11:48, "Zhi Yan Liu" <lzy....@gmail.com> wrote:
>>>On Thu, Apr 17, 2014 at 4:41 PM, lihuiba <magazine.lihu...@163.com> wrote:
>>>>>IMHO, zero-copy approach is better
>>>> VMThunder's "on-demand transferring" is the same thing as your "zero-copy
>>>> approach".
>>>> VMThunder uses iSCSI as the transferring protocol, which is option #b
>>>> of yours.
>>>
>>>IMO we'd better to use backend storage optimized approach to access
>>>remote image from compute node instead of using iSCSI only. And from
>>>my experience, I'm sure iSCSI is short of stability under heavy I/O
>>>workload in product environment, it could causes either VM filesystem
>>>to be marked as readonly or VM kernel panic.
>>>
>>>>
>>>>>Under #b approach, my former experience from our previous similar
>>>>>Cloud deployment (not OpenStack) was that: under 2 PC server storage
>>>>>nodes (general *local SAS disk*, without any storage backend) +
>>>>>2-way/multi-path iSCSI + 1G network bandwidth, we can provisioning 500
>>>>>VMs in a minute.
>>>> suppose booting one instance requires reading 300MB of data, so 500 ones
>>>> require 150GB. Each of the storage server needs to send data at a rate
>>>> of 150GB/2/60 = 1.25GB/s on average. This is absolutely a heavy burden
>>>> even for high-end storage appliances. In production systems, this request
>>>> (booting 500 VMs in one shot) will significantly disturb other running
>>>> instances accessing the same storage nodes.
>
>btw, I believe the case/numbers is not true as well, since remote
>image bits could be loaded on-demand instead of load them all on boot
>stage.
>
>zhiyan
>
>>>> VMThunder eliminates this problem by P2P transferring and on-compute-node
>>>> caching. Even a pc server with one 1gb NIC (this is a true pc server!)
>>>> can boot 500 VMs in a minute with ease. For the first time, VMThunder
>>>> makes bulk provisioning of VMs practical for production cloud systems.
>>>> This is the essential value of VMThunder.
>>>
>>>As I said currently Nova already has image caching mechanism, so in
>>>this case P2P is just an approach could be used for downloading or
>>>preheating for image caching.
>>>
>>>I think P2P transferring/pre-caching sounds a good way to go, as I
>>>mentioned as well, but actually for the area I'd like to see something
>>>like zero-copy + CoR. On one hand we can leverage the capability of
>>>on-demand downloading image bits by zero-copy approach, on the other
>>>hand we can prevent to reading data from remote image every time by
>>>CoR.
>>>
>>>zhiyan
>>>
>>>> ===================================================
>>>> From: Zhi Yan Liu <lzy....@gmail.com>
>>>> Date: 2014-04-17 0:02 GMT+08:00
>>>> Subject: Re: [openstack-dev] [Nova][blueprint] Accelerate the booting
>>>> process of a number of vms via VMThunder
>>>> To: "OpenStack Development Mailing List (not for usage questions)"
>>>> <openstack-dev@lists.openstack.org>
>>>>
>>>> Hello Yongquan Fu,
>>>>
>>>> My thoughts:
>>>>
>>>> 1. Currently Nova has already supported image caching mechanism. It
>>>> could caches the image on compute host which VM had provisioning from
>>>> it before, and next provisioning (boot same image) doesn't need to
>>>> transfer it again only if cache-manger clear it up.
>>>> 2. P2P transferring and prefacing is something that still based on
>>>> copy mechanism, IMHO, zero-copy approach is better, even
>>>> transferring/prefacing could be optimized by such approach. (I have
>>>> not check "on-demand transferring" of VMThunder, but it is a kind of
>>>> transferring as well, at last from its literal meaning).
>>>> And btw, IMO, we have two ways can go follow zero-copy idea:
>>>> a. when Nova and Glance use same backend storage, we could use storage
>>>> special CoW/snapshot approach to prepare VM disk instead of
>>>> copy/transferring image bits (through HTTP/network or local copy).
>>>> b. without "unified" storage, we could attach volume/LUN to compute
>>>> node from backend storage as a base image, then do such CoW/snapshot
>>>> on it to prepare root/ephemeral disk of VM. This way just like
>>>> boot-from-volume but different is that we do CoW/snapshot on Nova side
>>>> instead of Cinder/storage side.
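
This #b way is basically what VMThunder does: attach the base image over
iSCSI, then do CoW on the compute node. For readers unfamiliar with the
mechanics, a rough sketch of the CoW step, assuming the base image is already
attached as a local block device (this is illustrative only, not VMThunder's
actual implementation; the device name and paths are made up):

    # Illustrative sketch of approach #b, not VMThunder code.
    # Assumption: the base image is already attached to the compute node
    # (e.g. via iSCSI) and shows up as a block device such as /dev/sdb.
    import subprocess

    def create_cow_disk(base_dev, disk_path):
        # qcow2 overlay backed by the attached raw device: only blocks the
        # guest writes land in disk_path, reads of untouched blocks fall
        # through to the base device, so nothing is copied up front.
        subprocess.check_call([
            "qemu-img", "create", "-f", "qcow2",
            "-o", "backing_file=%s,backing_fmt=raw" % base_dev,
            disk_path,
        ])

    # e.g. create_cow_disk("/dev/sdb", "/var/lib/nova/instances/<uuid>/disk")

A block-level cache (flashcache/dm-cache/bcache, or dm-foolcache for the CoR
part) can then sit between the attached device and the overlay, so repeated
reads of the same image blocks no longer hit the storage server.
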
>>>>
>>>> For option #a, we have already got some progress:
>>>> https://blueprints.launchpad.net/nova/+spec/image-multiple-location
>>>> https://blueprints.launchpad.net/nova/+spec/rbd-clone-image-handler
>>>> https://blueprints.launchpad.net/nova/+spec/vmware-clone-image-handler
>>>>
>>>> Under #b approach, my former experience from our previous similar
>>>> Cloud deployment (not OpenStack) was that: under 2 PC server storage
>>>> nodes (general *local SAS disk*, without any storage backend) +
>>>> 2-way/multi-path iSCSI + 1G network bandwidth, we can provisioning 500
>>>> VMs in a minute.
>>>>
>>>> For vmThunder topic I think it sounds a good idea, IMO P2P, prefacing
>>>> is one of optimized approach for image transferring valuably.
>>>>
>>>> zhiyan
>>>>
>>>> On Wed, Apr 16, 2014 at 9:14 PM, yongquan Fu <quanyo...@gmail.com> wrote:
>>>>>
>>>>> Dear all,
>>>>>
>>>>> We would like to present an extension to the vm-booting functionality
>>>>> of Nova when a number of homogeneous vms need to be launched at the
>>>>> same time.
>>>>>
>>>>> The motivation for our work is to increase the speed of provisioning
>>>>> vms for large-scale scientific computing and big data processing. In
>>>>> that case, we often need to boot tens and hundreds virtual machine
>>>>> instances at the same time.
>>>>>
>>>>> Currently, under the Openstack, we found that creating a large number
>>>>> of virtual machine instances is very time-consuming. The reason is the
>>>>> booting procedure is a centralized operation that involve performance
>>>>> bottlenecks. Before a virtual machine can be actually started,
>>>>> OpenStack either copy the image file (swift) or attach the image
>>>>> volume (cinder) from storage server to compute node via network.
>>>>> Booting a single VM need to read a large amount of image data from the
>>>>> image storage server. So creating a large number of virtual machine
>>>>> instances would cause a significant workload on the servers. The
>>>>> servers become quite busy even unavailable during the deployment
>>>>> phase. It would consume a very long time before the whole virtual
>>>>> machine cluster useable.
>>>>>
>>>>> Our extension is based on our work on vmThunder, a novel mechanism
>>>>> accelerating the deployment of large number virtual machine instances.
>>>>> It is written in Python, can be integrated with OpenStack easily.
>>>>> VMThunder addresses the problem described above by following
>>>>> improvements: on-demand transferring (network attached storage),
>>>>> compute node caching, P2P transferring and prefetching. VMThunder is a
>>>>> scalable and cost-effective accelerator for bulk provisioning of
>>>>> virtual machines.
>>>>>
>>>>> We hope to receive your feedbacks. Any comments are extremely welcome.
>>>>> Thanks in advance.
>>>>>
>>>>> PS:
>>>>>
>>>>> VMThunder enhanced nova blueprint:
>>>>> https://blueprints.launchpad.net/nova/+spec/thunderboost
>>>>> VMThunder standalone project: https://launchpad.net/vmthunder
>>>>> VMThunder prototype: https://github.com/lihuiba/VMThunder
>>>>> VMThunder etherpad: https://etherpad.openstack.org/p/vmThunder
>>>>> VMThunder portal: http://www.vmthunder.org/
>>>>> VMThunder paper:
>>>>> http://www.computer.org/csdl/trans/td/preprint/06719385.pdf
>>>>>
>>>>> Regards
>>>>>
>>>>> vmThunder development group
>>>>> PDL
>>>>> National University of Defense Technology
>>>>
>>>> --
>>>> Yongquan Fu
>>>> PhD, Assistant Professor,
>>>> National Key Laboratory for Parallel and Distributed
>>>> Processing, College of Computer Science, National University of Defense
>>>> Technology, Changsha, Hunan Province, P.R. China
>>>> 410073
_______________________________________________
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev