On Fri, Apr 17, 2009 at 2:45 PM, Nitin Gupta <[email protected]> wrote:
> On Fri, Apr 17, 2009 at 4:02 AM, Peter Dolding <[email protected]> wrote:
>> The copy on write system also appears to provide something else
>> interesting. ksm and compcache are both after allocation. The
>> interesting question is if Linux kernel should provide a calloc
>> function. So that on commit its is automatically stacked. This
>> would massively reduce the numbers of blank matching pages. Linux
>> system already has something that deals with malloc allowing over
>> commits until accessed.
>>
>
> Not sure if I understand you here. You mean all new allocation should
> be zeroed to play better with KSM?
Not sure either, but it seems similar to my suggestion that we could use existing techniques to zero garbage. The suggested purpose of these techniques was security, but this would presumably also improve the compression ratio of compcache. Apparently they require only ~1% overhead, and we may be able to do even better than this if the goal is performance rather than security:
http://www.usenix.org/events/sec05/tech/full_papers/chow/chow_html/index.html
Unfortunately they have lost the code, so we would have to reimplement it from scratch. (A toy illustration of why zeroed pages are essentially free for compcache to hold is further down in this mail.)

> For simplicity, code currently in SVN
> decompresses individual objects in a page before writing out to backing
> swap device.

One complexity is that compressed pages could get fragmented. I am not sure whether pages being adjacent on the swap device means that they are related, but even if not, there would be some bookkeeping regarding free-space fragmentation.

As an aside, with decent wear leveling, swap on SSD is feasible, and compressing pages first would seem a good way to reduce wear on the SSD device. I was thinking of some algorithms to write out pages to an SSD in an optimized way. One obvious technique would be to write pages in a round-robin fashion, consolidating free space as we go (rough sketch further down). This would theoretically give perfect wear leveling, although skipping over sectors that have little free space would seem a good idea. However, most PCs have SSD devices that do their own wear leveling, and I am not sure what the best strategy for those devices would be. Still, it does seem to me that as SSD devices become more common, SSD-optimized swap would be useful: despite the obvious disadvantages, SSD has the advantage of fast random reads, so swapping to SSD is less likely to kill performance than swapping to HDD. Modern SSD drives can survive years of properly wear-levelled writes, and although SSD is reasonably expensive per MB, a particular machine may well happen to have substantial free SSD space but limited memory. Perhaps I should find some SSD-related people to ask about SSD-optimized writes?

> For duplicate page removal COW has lower space/time overhead than
> compressing all individual copies with ramzswap approach. But this

If we wanted, we could keep only a single copy of duplicated pages in compcache. Since we compress the pages anyway, we may be able to assume that the non-duplicated pages are fairly random, allowing us to implement a hash table with minor overhead (sketch further down). This might be worthwhile if we have many VMs running the same OS. However, maybe it would be better to just run compcache and KSM together and let each one handle its own strengths. (Hopefully this would also mean less work for Nitin :)

> virtual block device approach has big advantages - we need not patch
> kernel and keeps things simple. Going forward, we can better take
> advantage of stacked block devices to provide compressed disk based
> swapping, generic R/W compressed caches for arbitrary block devices
> (compressed cached /tmpfs comes to mind) etc.

I understand tmpfs will swap out unused pages to disk, so we already have a compressed cached tmpfs of sorts. I can see a number of advantages to an explicitly compressed-cached tmpfs, e.g. the option of larger blocks with a better compression ratio, and smarter decisions about when to compress pages. However, the current setup has the advantage that it is very simple: compcache doesn't have to worry about any of this, and presumably tmpfs is optimized to be very fast when it does not need to swap out.
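To make the zero-garbage point concrete, here is the toy illustration I mentioned. It is not real compcache code (and I don't know whether compcache already special-cases this), but an all-zero page can be recorded with a single flag in the store's page table, so every garbage page we manage to zero becomes essentially free to hold:

#include <stdbool.h>
#include <stddef.h>

#define PAGE_SIZE 4096

/* Returns true if the page contains only zero bytes; such a page can be
 * represented by a flag rather than by any compressed data at all. */
static bool page_is_zero_filled(const unsigned long *page)
{
    for (size_t i = 0; i < PAGE_SIZE / sizeof(unsigned long); i++)
        if (page[i] != 0)
            return false;
    return true;
}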
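By the round-robin SSD idea I mean roughly the following. This is only a user-space sketch with made-up names and constants, nothing like the real swap slot allocator; the point is that always moving the write cursor forward spreads writes evenly across erase blocks (the wear-levelling part), while skipping nearly-full blocks avoids wasting the pass on them:

#include <stdint.h>
#include <stddef.h>

#define NR_BLOCKS       1024    /* erase blocks set aside for swap       */
#define PAGES_PER_BLOCK 64      /* swap slots per erase block            */
#define MIN_FREE        8       /* skip blocks with fewer free slots     */

struct erase_block {
    uint16_t free_pages;        /* unused swap slots left in this block  */
};

static struct erase_block blocks[NR_BLOCKS];
static size_t cursor;           /* round-robin write position            */

/* Pick the next erase block to write swapped pages into.  The cursor only
 * moves forward, so writes are spread evenly over the device, but blocks
 * that are nearly full are skipped because there is little to gain from
 * squeezing the last few slots out of them on this pass. */
static struct erase_block *next_write_block(void)
{
    for (size_t tried = 0; tried < NR_BLOCKS; tried++) {
        struct erase_block *b = &blocks[cursor];

        cursor = (cursor + 1) % NR_BLOCKS;
        if (b->free_pages >= MIN_FREE)
            return b;           /* caller decrements free_pages per write */
    }
    return NULL;                /* everything is nearly full              */
}

A real version would also need to track which slots inside each block have been freed and consolidate them as the cursor passes over.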
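And the duplicate-page hash table I had in mind would look roughly like this (again, invented names and sizes; a real version would have to confirm probable duplicates with a full compare and reference-count the shared copies as swap slots are freed):

#include <stdint.h>
#include <stddef.h>

#define PAGE_SIZE  4096
#define TABLE_SIZE (1 << 16)

struct stored_page {
    uint64_t hash;          /* hash of the original (uncompressed) page  */
    void    *compressed;    /* shared compressed copy, NULL if slot free */
    unsigned refcount;      /* swap slots currently pointing at the copy */
};

static struct stored_page table[TABLE_SIZE];

/* FNV-1a over the whole page; cheap compared to compressing it anyway. */
static uint64_t hash_page(const unsigned char *page)
{
    uint64_t h = 14695981039346656037ULL;

    for (size_t i = 0; i < PAGE_SIZE; i++) {
        h ^= page[i];
        h *= 1099511628211ULL;
    }
    return h;
}

/* Returns the slot holding a probable duplicate of this page, or NULL if
 * the page looks new and the caller should compress and store it. */
static struct stored_page *find_duplicate(const unsigned char *page)
{
    uint64_t h = hash_page(page);
    struct stored_page *slot = &table[h % TABLE_SIZE];

    if (slot->compressed != NULL && slot->hash == h)
        return slot;
    return NULL;
}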
Finally, I am not sure if a block device can hold onto pages the kernel hands to it without needing to memcpy (mostly because I know very little about the Linux kernel internals).

--
John C. McCabe-Dansted
PhD Student
University of Western Australia

_______________________________________________
linux-mm-cc mailing list
[email protected]
http://lists.laptop.org/listinfo/linux-mm-cc
