Re: [Lxc-users] Copy-on-write hard-link / hashify feature
On 06/10/2010 10:54 PM, Gordon Henderson wrote:

On Thu, 10 Jun 2010, John Drescher wrote:

BTW, a second option is lessfs. http://www.lessfs.com/wordpress/?page_id=50

What about the KSM kernel option? It's aimed at KVM I think and in the kernel from 2.6.32. See: http://lwn.net/Articles/306704/ and http://lwn.net/Articles/330589/

Not sure if that could be used to help here - it seems a bit of a retrospective way to find data duplications - assuming we could enable it for whole containers...

KSM is enabled on my Ubuntu 10.04. When I do a compilation, ksm takes more cpu than the compilation itself and is always eating 10-30% of my cpu (Intel(R) Core(TM)2 Duo CPU T9500 @ 2.60GHz). So I disabled it definitively ...

-- ThinkGeek and WIRED's GeekDad team up for the Ultimate GeekDad Father's Day Giveaway. ONE MASSIVE PRIZE to the lucky parental unit. See the prize list and enter to win: http://p.sf.net/sfu/thinkgeek-promo ___ Lxc-users mailing list Lxc-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/lxc-users
Re: [Lxc-users] Copy-on-write hard-link / hashify feature
On 06/11/2010 09:57 AM, Daniel Lezcano wrote:

On 06/10/2010 10:54 PM, Gordon Henderson wrote:

On Thu, 10 Jun 2010, John Drescher wrote:

BTW, a second option is lessfs. http://www.lessfs.com/wordpress/?page_id=50

What about the KSM kernel option? It's aimed at KVM I think and in the kernel from 2.6.32. See: http://lwn.net/Articles/306704/ and http://lwn.net/Articles/330589/

Not sure if that could be used to help here - it seems a bit of a retrospective way to find data duplications - assuming we could enable it for whole containers...

KSM is enabled on my Ubuntu 10.04. When I do a compilation, ksm takes more cpu than the compilation itself and is always eating 10-30% of my cpu (Intel(R) Core(TM)2 Duo CPU T9500 @ 2.60GHz). So I disabled it definitively ...

Are you saying that KSM is performing memory de-duplication on bare metal, rather than inside a KVM VM? That can't be right. My guess is that you have it misconfigured to scan the memory too frequently and it's spinning empty?

Gordan
Re: [Lxc-users] Copy-on-write hard-link / hashify feature
On 06/11/2010 11:08 AM, Gordan Bobic wrote:

On 06/11/2010 09:57 AM, Daniel Lezcano wrote:

On 06/10/2010 10:54 PM, Gordon Henderson wrote:

On Thu, 10 Jun 2010, John Drescher wrote:

BTW, a second option is lessfs. http://www.lessfs.com/wordpress/?page_id=50

What about the KSM kernel option? It's aimed at KVM I think and in the kernel from 2.6.32. See: http://lwn.net/Articles/306704/ and http://lwn.net/Articles/330589/

Not sure if that could be used to help here - it seems a bit of a retrospective way to find data duplications - assuming we could enable it for whole containers...

KSM is enabled on my Ubuntu 10.04. When I do a compilation, ksm takes more cpu than the compilation itself and is always eating 10-30% of my cpu (Intel(R) Core(TM)2 Duo CPU T9500 @ 2.60GHz). So I disabled it definitively ...

Are you saying that KSM is performing memory de-duplication on bare metal, rather than inside a KVM VM? That can't be right. My guess is that you have it misconfigured to scan the memory too frequently and it's spinning empty?

Yes, I think that is probable. I didn't tune the Ubuntu default settings.
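The scan frequency suspected above is exposed through sysfs. Below is a minimal Python sketch, assuming the usual KSM tunables `run`, `pages_to_scan` and `sleep_millisecs` under `/sys/kernel/mm/ksm`. The helper names `read_ksm_settings` and `pages_scanned_per_second` are made up for illustration, and the computed rate is only a rough upper bound, not a measurement of what ksmd actually does.

```python
from pathlib import Path

# Usual sysfs location on kernels built with KSM support (assumption: not relocated).
KSM_DIR = Path("/sys/kernel/mm/ksm")
TUNABLES = ("run", "pages_to_scan", "sleep_millisecs")


def read_ksm_settings(ksm_dir=KSM_DIR):
    """Return the KSM tunables as ints; unreadable or missing files map to None."""
    settings = {}
    for name in TUNABLES:
        try:
            settings[name] = int((Path(ksm_dir) / name).read_text().strip())
        except (OSError, ValueError):
            settings[name] = None
    return settings


def pages_scanned_per_second(settings):
    """Rough upper bound on the scan rate: pages_to_scan pages per sleep interval."""
    pages = settings.get("pages_to_scan")
    sleep_ms = settings.get("sleep_millisecs")
    # run == 1 means ksmd is scanning; any other value means it is stopped.
    if settings.get("run") != 1 or pages is None or sleep_ms is None:
        return 0.0
    if sleep_ms == 0:
        return float("inf")  # no pause between batches
    return pages * 1000.0 / sleep_ms


if __name__ == "__main__":
    current = read_ksm_settings()
    print(current, pages_scanned_per_second(current))
```

A high value here would be consistent with ksmd using a noticeable share of CPU. Raising `sleep_millisecs` or lowering `pages_to_scan` (as root) is the usual way to slow the scan down.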
Re: [Lxc-users] Copy-on-write hard-link / hashify feature
On 06/10/2010 09:54 PM, Gordon Henderson wrote:

On Thu, 10 Jun 2010, John Drescher wrote:

BTW, a second option is lessfs. http://www.lessfs.com/wordpress/?page_id=50

What about the KSM kernel option? It's aimed at KVM I think and in the kernel from 2.6.32. See: http://lwn.net/Articles/306704/ and http://lwn.net/Articles/330589/

Not sure if that could be used to help here - it seems a bit of a retrospective way to find data duplications - assuming we could enable it for whole containers...

I've been thinking about that, too. Here's the thread I started on the KVM mailing list, detailing the approach I was thinking about: http://www.spinics.net/lists/kvm/msg36193.html

Gordan
Re: [Lxc-users] Copy-on-write hard-link / hashify feature
On Thu, 10 Jun 2010, John Drescher wrote:

BTW, a second option is lessfs. http://www.lessfs.com/wordpress/?page_id=50

What about the KSM kernel option? It's aimed at KVM I think and in the kernel from 2.6.32. See: http://lwn.net/Articles/306704/ and http://lwn.net/Articles/330589/

Not sure if that could be used to help here - it seems a bit of a retrospective way to find data duplications - assuming we could enable it for whole containers...

Gordon
Re: [Lxc-users] Copy-on-write hard-link / hashify feature
On 06/10/2010 11:25 AM, Gordan Bobic wrote:

On 06/09/2010 11:47 PM, Daniel Lezcano wrote:

On 06/09/2010 10:46 PM, Gordan Bobic wrote:

On 06/09/2010 09:08 PM, Daniel Lezcano wrote:

On 06/09/2010 08:45 PM, Gordan Bobic wrote:

Is there a feature that allows unifying identical files between guests via hard-links to save both space and memory (on shared libraries)? VServers has a feature for this called hashify, but I haven't been able to find such a thing in LXC documentation. Is there such a thing? Obviously, I could manually do the searching and hard-linking, but this is dangerous: without the copy-on-write feature for such hard-linked files that VServers provides, any guest could change a file on all guests. Is there a way to do this safely with LXC?

No, because it is supported by the system with the btrfs cow / snapshot file system. https://btrfs.wiki.kernel.org You can create your btrfs filesystem, mount it somewhere in your fs, install a distro and then make a snapshot, which will result in a directory. Assign this directory as the rootfs of your container. For each container you want to install, create a snapshot of the initial installation and assign each resulting directory to a container.

OK, this obviously saves the disk space. What about shared libraries memory conservation? Do the shared files in different snapshots have the same inodes?

Yes.

So this implicitly implements COW hard-linking?

I am not an expert with btrfs, but if I understand correctly what you mean by COW hard-linking, IMO yes. I created a btrfs image, put a file in it, checked the inode, did a snapshot, modified the file in the snapshot, and checked the inode again: it was the same, but the file content was different.

What about re-merging them after they get out of sync? For example, if I yum update, and a new glibc gets onto each of the virtual hosts, they will become unshared and each get different inode numbers, which will cause them to no longer be mmap()-ed as one, thus rapidly increasing the memory requirements. Is there a way to merge them back together with the approach you are suggesting? I ask because VServer tools handle this relatively gracefully, and I see it as a frequently occurring usage pattern.

The use case you are describing supposes the guests do not upgrade their OS, so there is no need for a cow fs for some private modifications, no?

No, the use-case I'm describing treats guests pretty independently. I am saying that I can see a lot of cases where I might update a package in the guest, which will cause those files to be COW-ed and unshared. I might then update another guest with the same package. Its files will now be COW-ed and unshared, too. Proceed until all guests are updated. Now all instances of files in this package are COW-ed and unshared, but they are again identical files. I want to merge them back into COW hard-links in order to save disk-space and memory.

Ok, I see, thanks for the explanation.

I know that BTRFS has a block-level deduplication feature (or will have such a feature soon), but that doesn't address the memory saving, does it? My understanding (potentially erroneous?) is that DLLs get mapped into the same shared memory iff their inodes are the same (i.e. if the two DLLs are hard-linked).

Mmmh, that needs to be investigated, but I don't think so.

VServer's hashify feature handles this unmerge-remerge scenario gracefully so as to preserve both the disk and memory savings. I can understand that BTRFS will preserve (some of) the disk savings with its features, but it is not at all clear to me that it will preserve the memory savings.

It's an interesting question; I think we should ask the btrfs maintainers.

In this case, an empty file hierarchy can serve as the rootfs, and the host's system library and tool directories can be bind-mounted read-only into this rootfs, with a private /etc and /home.

That is an interesting idea, and might work to some extent, but it is rather inflexible compared to the VServer alternative that is effectively fully dynamic.

Do you have any pointer explaining this feature?

Thanks -- Daniel
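The re-merge step discussed above starts by finding files that are identical again but no longer share an inode. Below is a minimal Python sketch of that detection step only. It is not VServer's hashify implementation; the helper names and the choice of SHA-256 are assumptions for illustration. It compares content only and ignores ownership, permissions and timestamps, which a real tool would also have to match. It deliberately reports candidates without linking them, since plain hard links lack the copy-on-write protection the thread is asking about.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """SHA-256 of a file's content, read in chunks to bound memory use."""
    h = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_remerge_candidates(guest_roots):
    """Group regular files with identical content that live in different inodes.

    Returns {digest: [path, ...]} with one representative path per distinct
    (st_dev, st_ino), keeping only groups that span more than one inode.
    """
    by_digest = defaultdict(dict)  # digest -> {(st_dev, st_ino): path}
    for root in guest_roots:
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                path = os.path.join(dirpath, name)
                # Skip symlinks and anything that is not a regular file.
                if os.path.islink(path) or not os.path.isfile(path):
                    continue
                st = os.stat(path)
                by_digest[file_digest(path)].setdefault((st.st_dev, st.st_ino), path)
    return {
        digest: sorted(inodes.values())
        for digest, inodes in by_digest.items()
        if len(inodes) > 1
    }
```

Calling `find_remerge_candidates(["/path/to/guest1", "/path/to/guest2"])` (hypothetical paths) after updating the same package in both guests would list the files that have become identical but unshared.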
Re: [Lxc-users] Copy-on-write hard-link / hashify feature
On 06/10/2010 09:35 PM, John Drescher wrote:

BTW, a second option is lessfs. http://www.lessfs.com/wordpress/?page_id=50

Interesting project, but again, in this context, similar problem to btrfs (or any other FS) - without VServer type COW hard-links, no RAM savings to go with the space savings. :(

Gordan
Re: [Lxc-users] Copy-on-write hard-link / hashify feature
On Thu, Jun 10, 2010 at 3:54 PM, Gordon Henderson gor...@drogon.net wrote:

On Thu, 10 Jun 2010, John Drescher wrote:

BTW, a second option is lessfs. http://www.lessfs.com/wordpress/?page_id=50

What about the KSM kernel option? It's aimed at KVM I think and in the kernel from 2.6.32.

I'm not really familiar with KSM, but I believe it is intended to be a generic solution for problems like this. However, I think it requires prodding from userspace applications to trigger scans for identical pages; it does not, IIRC, do any automatic scanning and merging... applications/modules must explicitly request/trigger merges.
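The "prodding from userspace" mentioned above corresponds, as far as the kernel documentation describes it, to `madvise(MADV_MERGEABLE)`: ksmd only considers memory regions that a process has marked this way, and then scans them on its own schedule. Below is a minimal Python sketch, assuming Linux and Python 3.8 or later, where `mmap.madvise` and `mmap.MADV_MERGEABLE` are available. The helper name `make_mergeable_buffer` is made up for illustration.

```python
import mmap


def make_mergeable_buffer(size):
    """Map private anonymous memory and ask KSM to consider it for merging.

    Returns (buffer, marked). marked is False when the platform or kernel
    does not offer MADV_MERGEABLE (for example, a kernel built without KSM).
    """
    # KSM works on private anonymous pages; Python's default for an anonymous
    # mapping is MAP_SHARED, so MAP_PRIVATE is requested explicitly.
    buf = mmap.mmap(-1, size, flags=mmap.MAP_PRIVATE)
    advice = getattr(mmap, "MADV_MERGEABLE", None)
    if advice is None:
        return buf, False
    try:
        buf.madvise(advice)  # covers the whole mapping by default
    except OSError:
        return buf, False
    return buf, True


if __name__ == "__main__":
    buf, marked = make_mergeable_buffer(16 * mmap.PAGESIZE)
    print("marked for KSM:", marked)
```

A successful call only registers the region. Whether pages are actually merged still depends on KSM being switched on (`run` set to 1) and on ksmd finding identical pages.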
Re: [Lxc-users] Copy-on-write hard-link / hashify feature
On 06/09/2010 09:08 PM, Daniel Lezcano wrote:

On 06/09/2010 08:45 PM, Gordan Bobic wrote:

Is there a feature that allows unifying identical files between guests via hard-links to save both space and memory (on shared libraries)? VServers has a feature for this called hashify, but I haven't been able to find such a thing in LXC documentation. Is there such a thing? Obviously, I could manually do the searching and hard-linking, but this is dangerous: without the copy-on-write feature for such hard-linked files that VServers provides, any guest could change a file on all guests. Is there a way to do this safely with LXC?

No, because it is supported by the system with the btrfs cow / snapshot file system. https://btrfs.wiki.kernel.org You can create your btrfs filesystem, mount it somewhere in your fs, install a distro and then make a snapshot, which will result in a directory. Assign this directory as the rootfs of your container. For each container you want to install, create a snapshot of the initial installation and assign each resulting directory to a container.

OK, this obviously saves the disk space. What about shared libraries memory conservation? Do the shared files in different snapshots have the same inodes? What about re-merging them after they get out of sync? For example, if I yum update, and a new glibc gets onto each of the virtual hosts, they will become unshared and each get different inode numbers, which will cause them to no longer be mmap()-ed as one, thus rapidly increasing the memory requirements. Is there a way to merge them back together with the approach you are suggesting? I ask because VServer tools handle this relatively gracefully, and I see it as a frequently occurring usage pattern.

Gordan
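Whether two paths really are "the same inode" can be checked from userspace. Below is a minimal Python sketch, assuming the POSIX rule that a file is identified by the pair (`st_dev`, `st_ino`) rather than by the inode number alone. The helper names are made up for illustration. One assumption worth verifying on the target kernel: btrfs snapshots may report the same inode number under a different `st_dev`, in which case the third outcome below applies and the files are not the same object as far as the kernel is concerned.

```python
import os


def file_identity(path):
    """Return the (device, inode) pair that identifies a file on POSIX systems."""
    st = os.stat(path)
    return st.st_dev, st.st_ino


def describe_sharing(path_a, path_b):
    """Classify two paths by comparing device and inode numbers."""
    dev_a, ino_a = file_identity(path_a)
    dev_b, ino_b = file_identity(path_b)
    if (dev_a, ino_a) == (dev_b, ino_b):
        return "same file"  # e.g. hard links to one inode
    if ino_a == ino_b:
        return "same inode number, different device"
    return "different files"
```

For hard-linked files this reports "same file". Running it on a file and its counterpart in a snapshot would show which of the three cases a given filesystem produces.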