Re: [Lxc-users] Copy-on-write hard-link / hashify feature

2010-06-11 Thread Daniel Lezcano
On 06/10/2010 10:54 PM, Gordon Henderson wrote:
 On Thu, 10 Jun 2010, John Drescher wrote:


 BTW, a second option is lessfs.

 http://www.lessfs.com/wordpress/?page_id=50
  
 What about the KSM kernel option? It's aimed at KVM I think and in the
 kernel from 2.6.32. See:

http://lwn.net/Articles/306704/
 and
http://lwn.net/Articles/330589/

 Not sure if that could be used to help here - it seems a bit of a
 retrospective way to find data duplications - assuming we could enable it
 for whole containers...


KSM is enabled on my Ubuntu 10.04. When I compile something, ksm takes 
more CPU than the compilation itself and is constantly eating 10-30% of my 
CPU (Intel(R) Core(TM)2 Duo CPU T9500  @ 2.60GHz). So I disabled it 
for good ...



--
ThinkGeek and WIRED's GeekDad team up for the Ultimate 
GeekDad Father's Day Giveaway. ONE MASSIVE PRIZE to the 
lucky parental unit.  See the prize list and enter to win: 
http://p.sf.net/sfu/thinkgeek-promo
___
Lxc-users mailing list
Lxc-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/lxc-users


Re: [Lxc-users] Copy-on-write hard-link / hashify feature

2010-06-11 Thread Gordan Bobic
On 06/11/2010 09:57 AM, Daniel Lezcano wrote:
 On 06/10/2010 10:54 PM, Gordon Henderson wrote:
 On Thu, 10 Jun 2010, John Drescher wrote:


 BTW, a second option is lessfs.

 http://www.lessfs.com/wordpress/?page_id=50

 What about the KSM kernel option? It's aimed at KVM I think and in the
 kernel from 2.6.32. See:

 http://lwn.net/Articles/306704/
 and
 http://lwn.net/Articles/330589/

 Not sure if that could be used to help here - it seems a bit of a
 retrospective way to find data duplications - assuming we could enable it
 for whole containers...


 KSM is enabled on my ubuntu 10.04. When I do a compilation, ksm takes
 more cpu than the compilation itself and is always eating 10-30% of my
 cpu (Intel(R) Core(TM)2 Duo CPU T9500  @ 2.60GHz). So I disabled it
 definitively ...

Are you saying that KSM is performing memory de-duplication on bare 
metal, rather than inside a KVM VM? That can't be right.

My guess is that you have it misconfigured to be scanning the memory too 
frequently and it's spinning empty?

Gordan



Re: [Lxc-users] Copy-on-write hard-link / hashify feature

2010-06-11 Thread Daniel Lezcano
On 06/11/2010 11:08 AM, Gordan Bobic wrote:
 On 06/11/2010 09:57 AM, Daniel Lezcano wrote:
 On 06/10/2010 10:54 PM, Gordon Henderson wrote:
 On Thu, 10 Jun 2010, John Drescher wrote:


 BTW, a second option is lessfs.

 http://www.lessfs.com/wordpress/?page_id=50

 What about the KSM kernel option? It's aimed at KVM I think and in the
 kernel from 2.6.32. See:

  http://lwn.net/Articles/306704/
 and
  http://lwn.net/Articles/330589/

 Not sure if that could be used to help here - it seems a bit of a
 retrospective way to find data duplications - assuming we could enable it
 for whole containers...


 KSM is enabled on my ubuntu 10.04. When I do a compilation, ksm takes
 more cpu than the compilation itself and is always eating 10-30% of my
 cpu (Intel(R) Core(TM)2 Duo CPU T9500  @ 2.60GHz). So I disabled it
 definitively ...

 Are you saying that KSM is performing memory de-duplication on bare
 metal, rather than inside a KVM VM? That can't be right.

 My guess is that you have it misconfigured to be scanning the memory too
 frequently and it's spinning empty?

Yes, I think it is probable. I didn't tune the ubuntu default settings.
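
For anyone wanting to check how aggressively ksmd is configured before switching it off: the KSM tunables are exposed as files under /sys/kernel/mm/ksm. What follows is a minimal Python sketch, not a tuning recommendation. It reads `run`, `pages_to_scan` and `sleep_millisecs` (plus two counters) and derives a rough ceiling on the scan rate. The helper names `read_ksm_settings` and `pages_scanned_per_second` are invented for this illustration, and the directory is a parameter so the code can be pointed at a copy of the files.

```python
from pathlib import Path

# Standard sysfs location of the KSM controls on Linux.
KSM_DIR = Path("/sys/kernel/mm/ksm")
FIELDS = ("run", "pages_to_scan", "sleep_millisecs",
          "pages_shared", "pages_sharing")


def read_ksm_settings(base=KSM_DIR):
    """Return {name: int} for each KSM file that exists and parses under base."""
    settings = {}
    for name in FIELDS:
        try:
            settings[name] = int((Path(base) / name).read_text().strip())
        except (OSError, ValueError):
            # File missing (no KSM support) or unexpected content: skip it.
            continue
    return settings


def pages_scanned_per_second(settings):
    """Upper bound on the scan rate: ksmd scans pages_to_scan pages,
    then sleeps sleep_millisecs, so it cannot exceed this figure."""
    sleep_ms = settings.get("sleep_millisecs")
    pages = settings.get("pages_to_scan")
    if not sleep_ms or pages is None:
        return None  # unknown, or no sleep between batches
    return pages * 1000 / sleep_ms


if __name__ == "__main__":
    current = read_ksm_settings()
    print(current or "KSM sysfs directory not found")
    print("scan ceiling (pages/s):", pages_scanned_per_second(current))
```

A large `pages_to_scan` combined with a small `sleep_millisecs` would be consistent with the constant CPU load described above, although the distribution defaults are not given in this thread.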




Re: [Lxc-users] Copy-on-write hard-link / hashify feature

2010-06-10 Thread Gordan Bobic
On 06/10/2010 09:54 PM, Gordon Henderson wrote:
 On Thu, 10 Jun 2010, John Drescher wrote:

 BTW, a second option is lessfs.

 http://www.lessfs.com/wordpress/?page_id=50

 What about the KSM kernel option? It's aimed at KVM I think and in the
 kernel from 2.6.32. See:

http://lwn.net/Articles/306704/
 and
http://lwn.net/Articles/330589/

 Not sure if that could be used to help here - it seems a bit of a
 retrospective way to find data duplications - assuming we could enable it
 for whole containers...

I've been thinking about that, too. Here's the thread I started on the 
KVM mailing list, detailing the approach I was thinking about:

http://www.spinics.net/lists/kvm/msg36193.html

Gordan



Re: [Lxc-users] Copy-on-write hard-link / hashify feature

2010-06-10 Thread Gordon Henderson
On Thu, 10 Jun 2010, John Drescher wrote:

 BTW, a second option is lessfs.

 http://www.lessfs.com/wordpress/?page_id=50

What about the KSM kernel option? It's aimed at KVM, I think, and has been 
in the kernel since 2.6.32. See:

  http://lwn.net/Articles/306704/
and
  http://lwn.net/Articles/330589/

Not sure if that could be used to help here - it seems a bit of a 
retrospective way to find data duplications - assuming we could enable it 
for whole containers...

Gordon



Re: [Lxc-users] Copy-on-write hard-link / hashify feature

2010-06-10 Thread Daniel Lezcano
On 06/10/2010 11:25 AM, Gordan Bobic wrote:
 On 06/09/2010 11:47 PM, Daniel Lezcano wrote:

 On 06/09/2010 10:46 PM, Gordan Bobic wrote:
  
 On 06/09/2010 09:08 PM, Daniel Lezcano wrote:

 On 06/09/2010 08:45 PM, Gordan Bobic wrote:
  
 Is there a feature that allows unifying identical files between guests
 via hard-links to save both space and memory (on shared libraries)?
 VServers has a feature for this called hashify, but I haven't been able
 to find such a thing in LXC documentation. Is there such a thing?

 Obviously, I could manually do the searching and hard-linking, but
 without the copy-on-write feature that VServers provides for such
 hard-linked files, this would be dangerous, as any guest could change a
 file on all guests.

 Is there a way to do this safely with LXC?

 No, because this is already supported at the system level by the btrfs
 COW / snapshot file system.

 https://btrfs.wiki.kernel.org

 You can create your btrfs filesystem, mount it somewhere in your fs,
 install a distro and then make a snapshot, that will result in a
 directory. Assign this directory as the rootfs of your container. For
 each container you want to install, create a snapshot of the initial
 installation and assign each resulting directory for a container.
  
 OK, this obviously saves the disk space. What about shared libraries
 memory conservation? Do the shared files in different snapshots have the
 same inodes?

 Yes.
  
 So this implicitly implements COW hard-linking?

I am not an expert with btrfs, but if I understand correctly what you 
mean by COW hard-linking, IMO yes.

I created a btrfs image, put a file in it, checked the inode, took a 
snapshot, modified the file in the snapshot, and checked the inode again: 
it was the same, but the file content was different.
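
The check described above can be repeated with a few lines of Python. This is only a sketch of the comparison, and the function names are invented for the illustration. It reports the inode number separately from the device number because, at the stat() level, a file is identified by the (st_dev, st_ino) pair rather than by the inode number alone, so two paths with equal inode numbers are not necessarily the same file.

```python
import hashlib
import os


def file_identity(path):
    """Return the (device, inode) pair that stat() reports for path."""
    st = os.stat(path)
    return st.st_dev, st.st_ino


def content_digest(path, chunk_size=1 << 20):
    """SHA-256 of the file content, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def compare(path_a, path_b):
    """Compare two paths by inode number, device and content."""
    dev_a, ino_a = file_identity(path_a)
    dev_b, ino_b = file_identity(path_b)
    return {
        "same_inode_number": ino_a == ino_b,
        "same_device": dev_a == dev_b,
        "same_content": content_digest(path_a) == content_digest(path_b),
    }
```

Calling `compare()` on the file in the original tree and on its counterpart in the snapshot would show whether the result above (same inode number, different content) also comes with the same device number. The paths to pass depend on where the image is mounted.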

 What about re-merging them after they get out of sync? For example, if I
 yum update, and a new glibc gets onto each of the virtual hosts, they
 will become unshared and each get different inode numbers which will
 cause them to no longer be mmap()-ed as one, thus rapidly increasing the
 memory requirements. Is there a way to merge them back together with the
 approach you are suggesting? I ask because VServer tools handle this
 relatively gracefully, and I see it as a frequently occurring usage
 pattern.

 The use case you are describing supposes the guests do not upgrade their
 OS, so there is no need for a COW fs for private modifications, no?
  
 No, the use-case I'm describing treats guests pretty independently. I am
 saying that I can see a lot of cases where I might update a package in
 the guest which will cause those files to be COW-ed and unshared. I
 might then update another guest with the same package. Its files will
 now be COW-ed and unshared, too. Proceed until all guests are updated.
 Now all instances of files in this package are COW-ed and unshared, but
 they are again identical files. I want to merge them back into COW
 hard-links in order to save disk-space and memory.

OK, I see, thanks for the explanation.

 I know that BTRFS has block-level deduplication feature (or will have
 such a feature soon), but that doesn't address the memory saving, does
 it? My understanding (potentially erroneous?) is that DLLs get mapped
 into the same shared memory iff their inodes are the same (i.e. if the two
 DLLs are hard-linked).

Hmm, that needs to be investigated, but I don't think so.

 VServer's hashify feature handles this unmerge-remerge scenario
 gracefully so as to preserve both the disk and memory savings. I can
 understand that BTRFS will preserve (some of) the disk savings with its
 features, but it is not at all clear to me that it will preserve the
 memory savings.


It's an interesting question; I think we should put it to the 
btrfs maintainers.
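
While that question is open, one way to see which file a process has actually mapped is /proc/<pid>/maps, where every file-backed mapping is listed with its device and inode. The sketch below parses that format in Python. The function names are invented for the illustration, and it only reports what is mapped; it does not by itself prove whether two guests share the underlying pages.

```python
def parse_maps_line(line):
    """Parse one line of /proc/<pid>/maps into a small dict.

    Format: address perms offset dev inode [pathname]
    """
    fields = line.split(None, 5)
    if len(fields) < 5:
        return None
    _addr, perms, _offset, dev, inode = fields[:5]
    path = fields[5].strip() if len(fields) > 5 else ""
    return {"perms": perms, "dev": dev, "inode": int(inode), "path": path}


def file_backed_mappings(maps_text):
    """Return {(dev, inode): path} for mappings backed by a file.

    Anonymous mappings, the stack and the heap have inode 0 and are skipped.
    """
    seen = {}
    for line in maps_text.splitlines():
        entry = parse_maps_line(line)
        if entry and entry["inode"] != 0:
            seen[(entry["dev"], entry["inode"])] = entry["path"]
    return seen


if __name__ == "__main__":
    with open("/proc/self/maps") as f:
        for (dev, inode), path in sorted(file_backed_mappings(f.read()).items()):
            print(dev, inode, path)
```

Running this against a process in each of two guests and comparing the (dev, inode) pairs for the same library would show whether the two processes map the same file or two distinct ones.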

 In this case, an empty file hierarchy can serve as the rootfs, with the
 host system's library and tool directories bind-mounted read-only into
 it, alongside a private /etc and /home.
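
As an illustration of the bind-mount layout just described, here is a minimal Python sketch that prints `lxc.mount.entry` lines (which take an fstab-style value) for a set of host directories. The rootfs path and the list of shared directories are assumptions chosen for the example, not values from this thread, and whether `ro,bind` takes effect in a single step or needs a separate read-only remount depends on the kernel and LXC version in use.

```python
def bind_mount_entries(rootfs, shared_dirs=("/bin", "/sbin", "/lib", "/usr")):
    """Build one read-only bind-mount config line per shared host directory.

    rootfs and shared_dirs are hypothetical example values.
    """
    lines = []
    for src in shared_dirs:
        # Mount the host directory at the same path inside the container rootfs.
        target = rootfs.rstrip("/") + src
        lines.append(f"lxc.mount.entry = {src} {target} none ro,bind 0 0")
    return lines


if __name__ == "__main__":
    for line in bind_mount_entries("/var/lib/lxc/guest1/rootfs"):
        print(line)
```

Directories that must stay private to the guest, such as /etc and /home, are simply left out of the list and created inside the rootfs.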
  
 That is an interesting idea, and might work to some extent, but it is
 rather inflexible compared to the VServer alternative that is
 effectively fully dynamic.


Do you have any pointers explaining this feature?

Thanks
   -- Daniel



Re: [Lxc-users] Copy-on-write hard-link / hashify feature

2010-06-10 Thread Gordan Bobic
On 06/10/2010 09:35 PM, John Drescher wrote:
 BTW, a second option is lessfs.

 http://www.lessfs.com/wordpress/?page_id=50

Interesting project, but again, in this context, similar problem to 
btrfs (or any other FS) - without VServer type COW hard-links, no RAM 
savings to go with the space savings. :(

Gordan



Re: [Lxc-users] Copy-on-write hard-link / hashify feature

2010-06-10 Thread C Anthony Risinger
On Thu, Jun 10, 2010 at 3:54 PM, Gordon Henderson gor...@drogon.net wrote:
 On Thu, 10 Jun 2010, John Drescher wrote:

 BTW, a second option is lessfs.

 http://www.lessfs.com/wordpress/?page_id=50

 What about the KSM kernel option? It's aimed at KVM I think and in the
 kernel from 2.6.32.

i'm not really familiar with KSM, but i believe it is intended to be
a generic solution for problems like this.

however, i think it requires prodding from userspace applications to
trigger scans for identical pages; it does not IIRC do any automatic
scanning and merging... applications/modules must explicitly
request/trigger merges.
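
To make the opt-in concrete: as far as the kernel documentation describes it, KSM only considers memory that a process has marked with madvise(MADV_MERGEABLE), and ksmd then scans those regions by itself once /sys/kernel/mm/ksm/run is set to 1. The Python sketch below marks a private anonymous mapping as mergeable. The function name `mark_mergeable` is invented for the illustration, and the call is allowed to fail because it is rejected on kernels built without KSM.

```python
import mmap


def mark_mergeable(length=mmap.PAGESIZE * 16):
    """Map private anonymous memory and ask KSM to consider it for merging.

    Returns (mapping, ok); ok is False when the advice is unavailable
    or the kernel rejects it.
    """
    # KSM works on private anonymous pages, so request MAP_PRIVATE explicitly
    # (Python's default for mmap on Unix is MAP_SHARED).
    m = mmap.mmap(-1, length,
                  flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS,
                  prot=mmap.PROT_READ | mmap.PROT_WRITE)
    advice = getattr(mmap, "MADV_MERGEABLE", None)
    if advice is None or not hasattr(m, "madvise"):
        return m, False  # platform or Python build without this advice
    try:
        m.madvise(advice)
    except OSError:
        return m, False  # e.g. kernel built without CONFIG_KSM
    return m, True


if __name__ == "__main__":
    mapping, ok = mark_mergeable()
    print("marked mergeable:", ok)
    mapping.close()
```

This also means an unmodified program running in a container does not take part in KSM unless something marks its memory this way, which fits the doubt raised earlier about enabling it for whole containers.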



Re: [Lxc-users] Copy-on-write hard-link / hashify feature

2010-06-09 Thread Gordan Bobic
On 06/09/2010 09:08 PM, Daniel Lezcano wrote:
 On 06/09/2010 08:45 PM, Gordan Bobic wrote:
 Is there a feature that allows unifying identical files between guests
 via hard-links to save both space and memory (on shared libraries)?
 VServers has a feature for this called hashify, but I haven't been able
 to find such a thing in LXC documentation. Is there such a thing?

 Obviously, I could manually do the searching and hard-linking, but
 without the copy-on-write feature that VServers provides for such
 hard-linked files, this would be dangerous, as any guest could change a
 file on all guests.

 Is there a way to do this safely with LXC?

 No, because this is already supported at the system level by the btrfs
 COW / snapshot file system.

 https://btrfs.wiki.kernel.org

 You can create your btrfs filesystem, mount it somewhere in your fs,
 install a distro and then make a snapshot, that will result in a
 directory. Assign this directory as the rootfs of your container. For
 each container you want to install, create a snapshot of the initial
 installation and assign each resulting directory for a container.
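
The snapshot-per-container procedure above can be sketched as a small Python helper that builds the commands without running them. It assumes the present-day `btrfs subvolume snapshot <source> <dest>` command form, which may differ from the tools available when this thread was written, and it assumes the base installation lives in its own subvolume. All paths and container names are hypothetical.

```python
import os


def snapshot_commands(base_subvolume, container_dir, names):
    """Return one snapshot command (as an argv list) per container name.

    base_subvolume: subvolume holding the initial distro installation.
    container_dir:  directory on the same btrfs filesystem for the snapshots.
    """
    return [
        ["btrfs", "subvolume", "snapshot",
         base_subvolume, os.path.join(container_dir, name)]
        for name in names
    ]


if __name__ == "__main__":
    for argv in snapshot_commands("/srv/lxc/base", "/srv/lxc", ["guest1", "guest2"]):
        # Printed rather than executed; review before running as root.
        print(" ".join(argv))
```

Each resulting directory would then be assigned as the rootfs of one container, as described above.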

OK, this obviously saves the disk space. What about memory conservation 
for shared libraries? Do the shared files in different snapshots have the 
same inodes?

What about re-merging them after they get out of sync? For example, if I 
yum update, and a new glibc gets onto each of the virtual hosts, they 
will become unshared and each get different inode numbers which will 
cause them to no longer be mmap()-ed as one, thus rapidly increasing the 
memory requirements. Is there a way to merge them back together with the 
approach you are suggesting? I ask because VServer tools handle this 
relatively gracefully, and I see it as a frequently occurring usage pattern.
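
To make the re-merge case concrete, here is a minimal Python sketch that only finds the candidates: files that sit at the same relative path in several guest trees, have identical content, and yet occupy different inodes. It deliberately does not hard-link anything, since without copy-on-write that would be unsafe for the reason given in the original question. This is not how VServer's hashify is implemented; the function names and the grouping key are assumptions made for the illustration.

```python
import hashlib
import os
from collections import defaultdict


def sha256_of(path, chunk_size=1 << 20):
    """SHA-256 of a file's content, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def identity(path):
    """(device, inode) pair for path."""
    st = os.stat(path)
    return st.st_dev, st.st_ino


def remerge_candidates(roots):
    """Return {(relative_path, digest): [paths]} for identical files that
    are stored under more than one inode across the given guest roots."""
    groups = defaultdict(list)
    for root in roots:
        for dirpath, _dirs, files in os.walk(root):
            for name in files:
                full = os.path.join(dirpath, name)
                # Skip symlinks and anything that is not a regular file.
                if os.path.islink(full) or not os.path.isfile(full):
                    continue
                rel = os.path.relpath(full, root)
                groups[(rel, sha256_of(full))].append(full)
    result = {}
    for key, paths in groups.items():
        if len(paths) > 1 and len({identity(p) for p in paths}) > 1:
            result[key] = sorted(paths)
    return result
```

After the same package update has been applied to every guest, the updated library files would be expected to show up in the result, since they are byte-identical again but no longer shared.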

Gordan
