Re: Reiser4 und LZO compression
Clemens Eisserer wrote:
>> But speaking of single-threadedness, more and more desktops are shipping
>> with ridiculously more power than people need. Even a gamer really
>
> Will the LZO compression code in reiser4 be able to use multi-processor
> systems?

Good point, but it wasn't what I was talking about. I was talking about the compression happening on one CPU: even if it takes most of that CPU to saturate disk throughput, your other CPU is still 100% available, so the typical desktop user won't notice their apps running slower; they'll just notice disk access being faster.
Re: Reiser4 und LZO compression
Hans Reiser wrote:
> Edward Shishkin wrote:
>> Clemens Eisserer wrote:
>>>> But speaking of single-threadedness, more and more desktops are
>>>> shipping with ridiculously more power than people need. Even a gamer
>>>> really
>>>
>>> Will the LZO compression code in reiser4 be able to use multi-processor
>>> systems? E.g. if I have a Turion X2 in my laptop, will it use 2 threads
>>> for compression/decompression, making CPU throughput much better than
>>> what the disk could do?
>>
>> Compression happens at flush time, and there can be more than one flush
>> thread that processes the same transaction atom. Decompression happens
>> in the context of readpage/readpages. So if you mean per file, then yes
>> for compression and no for decompression.
>
> I don't think your explanation above is a good one. If there is more than
> one process reading a file, then you can have multiple decompressions of
> the same file at one time, yes?

You are almost right. Unless they read the same logical cluster.

> Just because there can be more than one flush thread per file does not
> mean it is likely there will be. CPU scheduling of
> compression/decompression is an area that could use work in the future.
> For now, just understand that what we do is better than doing nothing. ;-/

Edward.
Re: Reiser4 und LZO compression
Edward Shishkin wrote:
> Clemens Eisserer wrote:
>>> But speaking of single-threadedness, more and more desktops are
>>> shipping with ridiculously more power than people need. Even a gamer
>>> really
>>
>> Will the LZO compression code in reiser4 be able to use multi-processor
>> systems? E.g. if I have a Turion X2 in my laptop, will it use 2 threads
>> for compression/decompression, making CPU throughput much better than
>> what the disk could do?
>
> Compression happens at flush time, and there can be more than one flush
> thread that processes the same transaction atom. Decompression happens in
> the context of readpage/readpages. So if you mean per file, then yes for
> compression and no for decompression.

I don't think your explanation above is a good one. If there is more than one process reading a file, then you can have multiple decompressions of the same file at one time, yes?

Just because there can be more than one flush thread per file does not mean it is likely there will be. CPU scheduling of compression/decompression is an area that could use work in the future. For now, just understand that what we do is better than doing nothing. ;-/

> Edward.
Re: Reiser4 und LZO compression
Hi Edward,

Thanks a lot for answering.

> Compression happens at flush time, and there can be more than one flush
> thread that processes the same transaction atom. Decompression happens in
> the context of readpage/readpages. So if you mean per file, then yes for
> compression and no for decompression.

So the parallelism is not really explicit -- it's more or less accidental. Are threads possible in the kernel, and if so, how large is the typical unit of work that gets decompressed? I guess for several hundred KB, using more than one thread could speed things up quite a bit?

lg Clemens
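[What Clemens is asking about can be shown concretely in userspace. A minimal sketch, assuming liblzf (which comes up later in this thread) and clusters that were compressed independently; the struct layout, function names, and one-thread-per-cluster scheme are illustrative assumptions, not reiser4 code:

#include <pthread.h>
#include <stdlib.h>
#include "lzf.h"

#define CLUSTER_SIZE (64 * 1024)   /* reiser4's default logical cluster */

struct cluster {
    const unsigned char *in;       /* independently compressed payload */
    unsigned int in_len;
    unsigned char out[CLUSTER_SIZE];
    unsigned int out_len;
};

static void *decompress_one(void *arg)
{
    struct cluster *c = arg;
    /* clusters are self-contained, so no shared state and no locking */
    c->out_len = lzf_decompress(c->in, c->in_len, c->out, CLUSTER_SIZE);
    return NULL;
}

/* one thread per cluster for clarity; a real design would use a pool */
static void decompress_parallel(struct cluster *clusters, int n)
{
    pthread_t *tids = malloc(n * sizeof(*tids));
    for (int i = 0; i < n; i++)
        pthread_create(&tids[i], NULL, decompress_one, &clusters[i]);
    for (int i = 0; i < n; i++)
        pthread_join(tids[i], NULL);
    free(tids);
}

Whether this pays off for a few hundred KB depends on thread startup overhead relative to the decompression cost of each 64K cluster, which is why a thread pool rather than thread-per-cluster would be the sane design.]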
Re: Reiser4 und LZO compression
Clemens Eisserer wrote:
>>> But speaking of single-threadedness, more and more desktops are
>>> shipping with ridiculously more power than people need. Even a gamer
>>> really
>>
>> Will the LZO compression code in reiser4 be able to use multi-processor
>> systems? E.g. if I have a Turion X2 in my laptop, will it use 2 threads
>> for compression/decompression, making CPU throughput much better than
>> what the disk could do?

Compression happens at flush time, and there can be more than one flush thread that processes the same transaction atom. Decompression happens in the context of readpage/readpages. So if you mean per file, then yes for compression and no for decompression.

Edward.
Re: Reiser4 und LZO compression
> But speaking of single-threadedness, more and more desktops are shipping
> with ridiculously more power than people need. Even a gamer really

Will the LZO compression code in reiser4 be able to use multi-processor systems? E.g. if I have a Turion X2 in my laptop, will it use 2 threads for compression/decompression, making CPU throughput much better than what the disk could do?

lg Clemens

2006/8/30, Hans Reiser <[EMAIL PROTECTED]>:
> Edward Shishkin wrote:
>> A (plain) file is considered as a set of logical clusters (64K by
>> default). The minimal unit occupied in memory by a (plain) file is one
>> page. A compressed logical cluster is stored on disk in a so-called
>> "disk cluster". A disk cluster is a set of special items (aka "ctails",
>> or "compressed bodies"), so that one block can contain (compressed)
>> data of many files and everything is packed tightly on disk.
>
> So the compression unit is 64k for purposes of your benchmarks.
Re: Reiser4 und LZO compression
Edward Shishkin wrote:
> A (plain) file is considered as a set of logical clusters (64K by
> default). The minimal unit occupied in memory by a (plain) file is one
> page. A compressed logical cluster is stored on disk in a so-called
> "disk cluster". A disk cluster is a set of special items (aka "ctails",
> or "compressed bodies"), so that one block can contain (compressed) data
> of many files and everything is packed tightly on disk.

So the compression unit is 64k for purposes of your benchmarks.
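[To make Edward's description concrete: with 64K logical clusters, the mapping from file offset to compression unit is pure bit arithmetic. A toy model, illustrative only -- not the reiser4 on-disk format:

#include <stdio.h>

#define CLUSTER_SHIFT 16                     /* 64K = 2^16 */
#define CLUSTER_SIZE  (1UL << CLUSTER_SHIFT)

int main(void)
{
    unsigned long long offset = 200000;      /* arbitrary file offset */
    unsigned long long index  = offset >> CLUSTER_SHIFT;
    unsigned long within      = offset & (CLUSTER_SIZE - 1);

    /* prints: offset 200000 -> logical cluster 3, byte 3392 of 65536 */
    printf("offset %llu -> logical cluster %llu, byte %lu of %lu\n",
           offset, index, within, CLUSTER_SIZE);
    return 0;
}

Each such cluster is compressed as a unit; the compressed result becomes the "disk cluster" of ctail items, which is where the tight packing across files happens.]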
Re: Reiser4 und LZO compression
PFC wrote:
>> Maybe, but Reiser4 is supposed to be a general purpose filesystem, so
>> talking about its advantages/disadvantages wrt. gaming makes sense,
>
> I don't see a lot of gamers using Linux ;) But yes, gaming is what pushes
> hardware development these days, at least on the desktop.
>
> Also, as you said, gamers (like many others) reinvent filesystems and
> generally use the Big Zip File paradigm, which is not that stupid for a
> read-only FS (if you cache all file offsets, reading can be pretty fast).
> However, when you start storing ogg-compressed sound and JPEG images
> inside a zip file, it starts to stink.
>
> *** Does the CPU power necessary to do the compression cost more or less
> than another drive? ***
>
> It depends; you have to consider several distinct scenarios. For
> instance, on a big Postgres database server, the rule is to have as many
> spindles as you can.
> - If you are doing a lot of full table scans (like data mining etc.),
> more spindles means reads can be parallelized; of course this will mean
> more data will have to be decompressed.
> - If you are doing a lot of little transactions (web sites), it means
> seeks can be distributed around the various disks. In this case
> compression would be a big win because there is free CPU to use;
> besides, it would virtually double the RAM cache size.
> You have to weigh the cost (in CPU $) of compression against the cost in
> "virtual RAM" saved for caching and the cost in disks not bought.
>
> *** Do the two processors have separate caches, and thus being overly
> fine-grained makes you memory-transfer bound? ***
>
> It depends on which dual-core system you use; future systems (like Core)
> will definitely share cache, as this is the best option.
>
> ***
>
> If we analyze the results of my little compression benchmarks, we find
> that:
> - gzip is way too slow.
> - lzo and lzf are pretty close. LZF is faster than LZO (especially on
> decompression) but compresses worse. So, when we are disk-bound, LZF
> will be slower. When we are CPU-bound, LZF will be faster. The
> differences are not that huge, though, so it might be worthwhile to
> weigh this against the respective code cleanliness, of which I have no
> idea.
>
> However, my compression benchmarks mean nothing, because I'm compressing
> whole files whereas reiser4 will be compressing little blocks of files.
> We must therefore evaluate the performance of compressors on little
> blocks, which is very different from 300-megabyte files. For instance,
> the setup time of the compressor will be important (whether some Huffman
> table needs to be constructed etc.), and the compression ratios will be
> worse.
>
> Let's redo a benchmark then. For that I need to know if a compression
> block in reiser4 will be either:
> - an FS block containing several files (i.e. a block will contain
> several small files), or
> - a part of a file (i.e. a small file will be 1 block).
> I think it's the second option, right?

A (plain) file is considered as a set of logical clusters (64K by default). The minimal unit occupied in memory by a (plain) file is one page. A compressed logical cluster is stored on disk in a so-called "disk cluster". A disk cluster is a set of special items (aka "ctails", or "compressed bodies"), so that one block can contain (compressed) data of many files and everything is packed tightly on disk.
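[PFC's small-block question is easy to probe in userspace. A rough harness, assuming liblzf is available (lzf.h, -llzf); the fallback-to-stored trick and all names here are illustrative assumptions, not how reiser4 actually stores blocks:

#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include "lzf.h"

#define BLOCK (64 * 1024)   /* reiser4's default logical cluster size */

int main(int argc, char **argv)
{
    unsigned char in[BLOCK], out[BLOCK];
    unsigned long long total_in = 0, total_out = 0;
    FILE *f;
    size_t n;

    if (argc < 2 || !(f = fopen(argv[1], "rb")))
        return 1;

    clock_t t0 = clock();
    while ((n = fread(in, 1, BLOCK, f)) > 0) {
        /* lzf_compress returns 0 if the result would not fit in n-1
         * bytes, i.e. would not shrink; store such blocks raw */
        unsigned int c = lzf_compress(in, n, out, n - 1);
        total_in  += n;
        total_out += c ? c : n;
    }
    double secs = (double)(clock() - t0) / CLOCKS_PER_SEC;
    if (secs <= 0)
        secs = 1e-9;   /* guard for tiny inputs */
    fclose(f);

    printf("%llu -> %llu bytes (%.1f%%), %.1f MB/s CPU-side\n",
           total_in, total_out, 100.0 * total_out / total_in,
           total_in / secs / 1e6);
    return 0;
}

Comparing its output against a whole-file run exposes exactly the per-block ratio and setup-time penalty PFC wants measured; dictionary codecs like lzf/lzo should lose less on 64K blocks than codecs that rebuild Huffman tables per block.]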
Re: Reiser4 und LZO compression
PFC wrote:
>> Maybe, but Reiser4 is supposed to be a general purpose filesystem, so
>> talking about its advantages/disadvantages wrt. gaming makes sense,
>
> I don't see a lot of gamers using Linux ;)

There have to be some. Transgaming seems to still be making a successful business out of making games work out-of-the-box under Wine. While I don't imagine there are as many who attempt gaming on Linux, I'd guess a significant portion of Linux users, if not the majority, are at least casual gamers.

Some will have given up on the PC as a gaming platform long ago, tired of its upgrade cycle, crashes, game patches, and install times. These people will have a console for games, probably a PS2 so they can watch DVDs, and use their computer for real work, with as much free software as they can manage.

Others will compromise somewhat. I compromise by running the binary nVidia drivers, keeping a Windows partition around sometimes, and enjoying many old games which have released their source recently and now run under Linux -- as well as a few native Linux games, some Cedega games, and some under straight Wine. Basically, I'll play it on Linux if it works well; otherwise I boot Windows. I'm migrating away from that Windows dependency by making sure all my new game purchases work on Linux.

Others will use some or all of the above -- stick to old games, use exclusively stuff that works on Linux (one way or the other), or give up on Linux gaming entirely and use a Windows partition.

Anything Linux can do to become more game-friendly is one less reason for gamers to have to compromise. Not all gamers are willing to do that. I know at least two who ultimately decided that, with dual boot, they end up spending most of their time on Windows anyway. These are the people who would use Linux if they didn't have a good reason to use something else, but right now, they do. This is not the fault of the filesystem, but taking the attitude of "There aren't many Linux gamers anyway" -- that's a self-fulfilling prophecy; gamers WILL leave because of it.

> Also, as you said, gamers (like many others) reinvent filesystems and
> generally use the Big Zip File paradigm, which is not that stupid for a
> read-only FS (if you cache all file offsets, reading can be pretty fast).
> However, when you start storing ogg-compressed sound and JPEG images
> inside a zip file, it starts to stink.

I don't like it as a read-only FS, either. Take an MMO -- while most commercial ones load the entire game to disk from install DVDs, there are some smaller ones which only cache the data as you explore the world. Also, even with the bigger ones, the world is always changing with patches, and I've seen patches take several hours to install -- not download, install -- on a 2.4 GHz amd64 with 2 gigs of RAM, on a striped RAID. You can trust me when I say this was mostly disk-bound, which is retarded, because it took less than half an hour to install in the first place.

Even simple multiplayer games -- hell, even single-player games can get fairly massive updates relatively often. Half-Life 2 is one example -- they've now added HDR to the engine. In these cases, you still need as fast access as possible to the data (to cut down on load time), and it would be nice to save on space as well, but a zipfile starts to make less sense. And yet, I still see people using _cabinet_ files. Compression at the FS layer, plus efficient storing of small files, makes this much simpler.

While you can make the zipfile-FS transparent to a game, even to your mapping tools, it's still not efficient, and it's not transparent to your modeling package, Photoshop-alike, audio software, or gcc. But everything understands a filesystem.

> It depends; you have to consider several distinct scenarios. For
> instance, on a big Postgres database server, the rule is to have as many
> spindles as you can.
> - If you are doing a lot of full table scans (like data mining etc.),
> more spindles means reads can be parallelized; of course this will mean
> more data will have to be decompressed.

I don't see why more spindles means more data decompressed. If anything, I'd imagine it would be fewer reads, total, if there's any kind of data locality. But I'll leave this to the database experts, for now.

> - If you are doing a lot of little transactions (web sites), it means
> seeks can be distributed around the various disks. In this case
> compression would be a big win because there is free CPU to use;

Dangerous assumption. Three words: Ruby on Rails. There goes your free CPU. Suddenly, compression makes no sense at all. But then, Ruby makes no sense at all for any serious load, unless you really have that much money to spend, or until the Ruby.NET compiler is finished -- that should speed things up.

> besides, it would virtually double the RAM cache size.

No it wouldn't, not the way Reiser4 does it. Cur
Re: Reiser4 und LZO compression
> Maybe, but Reiser4 is supposed to be a general purpose filesystem, so
> talking about its advantages/disadvantages wrt. gaming makes sense,

I don't see a lot of gamers using Linux ;) But yes, gaming is what pushes hardware development these days, at least on the desktop.

Also, as you said, gamers (like many others) reinvent filesystems and generally use the Big Zip File paradigm, which is not that stupid for a read-only FS (if you cache all file offsets, reading can be pretty fast). However, when you start storing ogg-compressed sound and JPEG images inside a zip file, it starts to stink.

*** Does the CPU power necessary to do the compression cost more or less than another drive? ***

It depends; you have to consider several distinct scenarios. For instance, on a big Postgres database server, the rule is to have as many spindles as you can.
- If you are doing a lot of full table scans (like data mining etc.), more spindles means reads can be parallelized; of course this will mean more data will have to be decompressed.
- If you are doing a lot of little transactions (web sites), it means seeks can be distributed around the various disks. In this case compression would be a big win because there is free CPU to use; besides, it would virtually double the RAM cache size.

You have to weigh the cost (in CPU $) of compression against the cost in "virtual RAM" saved for caching and the cost in disks not bought.

*** Do the two processors have separate caches, and thus being overly fine-grained makes you memory-transfer bound? ***

It depends on which dual-core system you use; future systems (like Core) will definitely share cache, as this is the best option.

***

If we analyze the results of my little compression benchmarks, we find that:
- gzip is way too slow.
- lzo and lzf are pretty close. LZF is faster than LZO (especially on decompression) but compresses worse. So, when we are disk-bound, LZF will be slower. When we are CPU-bound, LZF will be faster. The differences are not that huge, though, so it might be worthwhile to weigh this against the respective code cleanliness, of which I have no idea.

However, my compression benchmarks mean nothing, because I'm compressing whole files whereas reiser4 will be compressing little blocks of files. We must therefore evaluate the performance of compressors on little blocks, which is very different from 300-megabyte files. For instance, the setup time of the compressor will be important (whether some Huffman table needs to be constructed etc.), and the compression ratios will be worse.

Let's redo a benchmark then. For that I need to know if a compression block in reiser4 will be either:
- an FS block containing several files (i.e. a block will contain several small files), or
- a part of a file (i.e. a small file will be 1 block).

I think it's the second option, right?
Re: Reiser4 und LZO compression
Toby Thain wrote:
> Gamer systems, whether from coder's or player's p.o.v., would appear
> fairly irrelevant to reiserfs and this list. I'd trust Carmack's eye
> candy credentials but doubt he has much to say about filesystems or
> server threading...

Maybe, but Reiser4 is supposed to be a general purpose filesystem, so talking about its advantages/disadvantages wrt. gaming makes sense, especially considering gamers are the most likely to tune their desktop for performance.

That was a bit much, though. I apologize.
Re: Reiser4 und LZO compression
On 29-Aug-06, at 4:03 PM, David Masover wrote:
> Hans Reiser wrote:
>> David Masover wrote:
>>> John Carmack is pretty much the only superstar programmer in video
>>> games, and his first fairly massive attempt to make Quake 3 have two
>>> threads (since he'd just gotten a dual-core machine to play with)
>>> actually resulted in the game running some 30-40% slower than it did
>>> with a single thread.
>>
>> Do the two processors have separate caches, and thus being overly
>> fine-grained makes you memory-transfer bound?
>
> It wasn't anything that intelligent. Let me see if I can find it...
>
> Taken from http://techreport.com/etc/2005q3/carmack-quakecon/index.x?pg=1
>
> "Graphics accelerators are a great example of parallelism working well,
> he noted, but game code is not similarly parallelizable. ...
>
> The downside is, most game developers are working on Windows, for which
> FS compression has always sucked. Thus, they most often implement their
> own compression, often something horrible, like storing the whole game
> in CAB or ZIP files, and loading the entire level into RAM before play
> starts, making load times less relevant for gameplay. Reiser4's
> cryptocompress would be a marked improvement over that, but it would
> also not be used in many games.

Gamer systems, whether from coder's or player's p.o.v., would appear fairly irrelevant to reiserfs and this list. I'd trust Carmack's eye candy credentials but doubt he has much to say about filesystems or server threading...
Re: Reiser4 und LZO compression
Hi.

On Tue, 2006-08-29 at 15:38 +0400, Edward Shishkin wrote:
> Nigel Cunningham wrote:
>> Wow. That's a lot better; I guess I did get something wrong in trying
>> to tune deflate. That was pre-cryptoapi though; looking at
>> cryptoapi/deflate.c, I don't see any way of controlling the compression
>> level. Am I missing anything?
>
> zlib is tunable, not cryptoapi's deflate.
> Look at zlib_deflateInit2().

Ok; thanks. I wasn't mistaken then :)

Regards,

Nigel
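[For anyone else who goes looking: the level knob Edward means sits in deflateInit2(). A minimal userspace sketch -- the kernel's zlib_deflateInit2() in include/linux/zlib.h exposes the same parameter list, which cryptoapi's deflate plugin hardwires:

#include <stdio.h>
#include <string.h>
#include <zlib.h>

/* compress src into dst at the given level (1 = fast, 9 = best);
 * returns the compressed size, or -1 on error; assumes dcap is large
 * enough to hold the whole result */
int deflate_buf(int level, const unsigned char *src, unsigned int slen,
                unsigned char *dst, unsigned int dcap)
{
    z_stream s;
    memset(&s, 0, sizeof(s));
    /* the tunables: level, 15-bit window, memLevel 8, default strategy */
    if (deflateInit2(&s, level, Z_DEFLATED, 15, 8,
                     Z_DEFAULT_STRATEGY) != Z_OK)
        return -1;
    s.next_in   = (unsigned char *)src;
    s.avail_in  = slen;
    s.next_out  = dst;
    s.avail_out = dcap;
    deflate(&s, Z_FINISH);
    deflateEnd(&s);
    return (int)(dcap - s.avail_out);
}

Calling it with level 1 versus 9 over the same buffer reproduces the -1/-9 spread David measured with the gzip binary.]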
Re: Reiser4 und LZO compression
Hans Reiser wrote:
> David Masover wrote:
>> John Carmack is pretty much the only superstar programmer in video
>> games, and his first fairly massive attempt to make Quake 3 have two
>> threads (since he'd just gotten a dual-core machine to play with)
>> actually resulted in the game running some 30-40% slower than it did
>> with a single thread.
>
> Do the two processors have separate caches, and thus being overly
> fine-grained makes you memory-transfer bound?

It wasn't anything that intelligent. Let me see if I can find it...

Taken from http://techreport.com/etc/2005q3/carmack-quakecon/index.x?pg=1

"Graphics accelerators are a great example of parallelism working well, he noted, but game code is not similarly parallelizable. Carmack cited his Quake III Arena engine, whose renderer was multithreaded and achieved up to 40% performance increases on multiprocessor systems, as a good example of where games would have to go. (Q3A's SMP mode was notoriously crash-prone and fragile, working only with certain graphics driver revisions and the like.) Initial returns on multithreading, he projected, will be disappointing."

Basically, it's hard enough to split what we currently do onto even 2 CPUs, and it definitely seems like we're about to hit a wall in CPU frequency just as multicore becomes a practical reality, so future CPUs may be measured in how many cores they have, not how fast each core is.

There's also a question of what to use the extra power for. From the same presentation:

"Part of the problem with multithreading, argued Carmack, is knowing how to use the power of additional CPU cores to enhance the game experience. A.I. can be effective when very simple, as some of the first Doom logic was. It was less than a page of code, but players ascribed complex behaviors and motivations to the bad guys. However, more complex A.I. seems hard to improve to the point where it really changes the game. More physics detail, meanwhile, threatens to make games too fragile as interactions in the game world become more complex."

So, I humbly predict that physics cards (so-called PPUs) will fail and be replaced by ever-increasing numbers of cores, which will, for a while, be one step ahead of what we can think of to fill them with. Thus, anything useful (like compression) that can be split off into a separate thread is going to be useful for games, and won't hurt performance on future mega-multicore monstrosities.

The downside is, most game developers are working on Windows, for which FS compression has always sucked. Thus, they most often implement their own compression, often something horrible, like storing the whole game in CAB or ZIP files, and loading the entire level into RAM before play starts, making load times less relevant for gameplay. Reiser4's cryptocompress would be a marked improvement over that, but it would also not be used in many games.
Re: Reiser4 und LZO compression
David Masover wrote:
> John Carmack is pretty much the only superstar programmer in video
> games, and his first fairly massive attempt to make Quake 3 have two
> threads (since he'd just gotten a dual-core machine to play with)
> actually resulted in the game running some 30-40% slower than it did
> with a single thread.

Do the two processors have separate caches, and thus being overly fine-grained makes you memory-transfer bound?

Two processors tend to create a snappier user experience, in that big CPU processes get throttled nicely.

Hans
Re: Reiser4 und LZO compression
Gregory Maxwell wrote:
> On 8/29/06, David Masover <[EMAIL PROTECTED]> wrote:
> [snip]
>> Conversely, compression does NOT make sense if:
>> - You spend a lot of time with the CPU busy and the disk idle.
>> - You have more than enough disk space.
>> - Disk space is cheaper than buying enough CPU to handle compression.
>> - You've tried compression, and the CPU requirements slowed you more
>> than you saved in disk access.
> [snip]
>
> It's also not always this simple... if you have a single-threaded
> workload that doesn't overlap CPU and disk well, (de)compression may be
> free even if you're still CPU-bound a lot, as the compression is using
> CPU cycles which would have been otherwise idle.

Isn't that implied, though -- if the CPU is not busy (run top under a 2.6 kernel and you'll see an IO-Wait number), then the first condition isn't satisfied -- CPU is not busy, disk is not idle.

But speaking of single-threadedness, more and more desktops are shipping with ridiculously more power than people need. Even a gamer really won't benefit that much from having a dual-core system, because multithreading is hard, and games haven't been doing it properly. John Carmack is pretty much the only superstar programmer in video games, and his first fairly massive attempt to make Quake 3 have two threads (since he'd just gotten a dual-core machine to play with) actually resulted in the game running some 30-40% slower than it did with a single thread.

So, for the desktop, compression makes perfect sense. We don't have massive amounts of RAID. If we have newer machines, there's a good chance we'll have one CPU sitting mostly idle while playing games. Short of gaming, there are few desktop applications that will fully utilize even one reasonably fast CPU. The reason gamers buy dual-core systems is that they're getting cheap enough to be worth it, and that one core sitting idle is a perfect place to do OS/system work not related to the game -- antivirus, automatic update checks, the inevitable background processes leeching a few % off your available CPU.

So for the typical new desktop with about 2 GHz of 64-bit processor sitting idle, compression is essentially free.
Re: Reiser4 und LZO compression
On 8/29/06, David Masover <[EMAIL PROTECTED]> wrote:
[snip]
> Conversely, compression does NOT make sense if:
> - You spend a lot of time with the CPU busy and the disk idle.
> - You have more than enough disk space.
> - Disk space is cheaper than buying enough CPU to handle compression.
> - You've tried compression, and the CPU requirements slowed you more
> than you saved in disk access.
[snip]

It's also not always this simple... if you have a single-threaded workload that doesn't overlap CPU and disk well, (de)compression may be free even if you're still CPU-bound a lot, as the compression is using CPU cycles which would have been otherwise idle.
Re: Reiser4 und LZO compression
Hans Reiser wrote:
> PFC wrote:
>> A big-ass RAID will not get much benefit unless:
>> - the buffer cache stores compressed pages, so compression virtually
>> doubles the RAM cache
>> - or the CPU is really fast
>> - or you put one of these neat FPGA modules in a free Opteron socket
>> and upload a soft-hardware LZF in it with a few gigabytes/s throughput
>
> Or you look the sysadmin in the eyes, and say: your file servers have
> more out-of-disk-space problems than load problems, yes?

I'd look at the IO-Wait number, also.

Compression makes sense if:
- You spend a lot of time waiting for the disk.
- You need disk space, and either:
  - you already have enough spare CPU to do compression, or
  - it's cheaper to buy enough CPU than to buy the space compression would save you.

Conversely, compression does NOT make sense if:
- You spend a lot of time with the CPU busy and the disk idle.
- You have more than enough disk space.
- Disk space is cheaper than buying enough CPU to handle compression.
- You've tried compression, and the CPU requirements slowed you more than you saved in disk access.

After a certain amount of RAID -- really, after the second or third disk in a mirrored array, or the third or fourth disk in RAID 5 -- I don't think adding more disks is really doing a huge amount to increase reliability, which means you're either trying to increase speed or space. You can increase both of these by using compression, if you have the spare CPU, so the question becomes: does the CPU power necessary to do the compression cost more or less than another drive? Especially in a big-ass RAID, you'll also want to be thinking about heat and power consumption.

There are still cases where compression loses, but they seem pathological enough that you'd want to benchmark to see if they really apply to you. For instance, if you're dealing with lots of quick, read-only access to very tiny amounts of data, compression will likely slow you down, whereas adding another disk can speed you up. If your data isn't very compressible, then you're just burning cycles for no point. And, of course, the price/performance ratio (CPUs) and price/gig ratio (disk space) changes all the time.

And all of this is ignoring the very real possibility of a dedicated hardware compressor -- at which point, we could afford pretty much any algorithm you like, as long as the hardware can do it quickly enough. This is an advantage to using cryptoapi for the cryptocompress plugin, by the way -- it's one place where we could call out to the hardware later.
Re: Reiser4 und LZO compression
PFC, thanks for giving us some real data. May I post it to the lkml thread? In essence, LZO wins the benchmarks, and the code is hard to read. I guess I have to go with LZO, and encourage people to take a stab at dethroning it.

Hans

PFC wrote:
> I have made a little openoffice spreadsheet with the results. You can
> have fun entering stuff and seeing the results.
>
> http://peufeu.free.fr/compression.ods
>
> Basically, a laptop having the same processor as my PC and a crummy
> 15 MB/s drive (like most laptop drives) will get a 2.5x speedup using
> lzf, while using 40% CPU for compression and 15% CPU for decompression.
> I'd say it's a clear, huge win.
>
> A desktop computer with a modern IDE drive doing 50 MB/s will still get
> nice speedups (1.8x on write, 2.5x on read) but of course, more CPU will
> be used because of the higher throughput. In this case it is CPU-limited
> on compression and disk-limited on decompression. However, soon everyone
> will have dual-core monsters so...
>
> A big-ass RAID will not get much benefit unless:
> - the buffer cache stores compressed pages, so compression virtually
> doubles the RAM cache
> - or the CPU is really fast
> - or you put one of these neat FPGA modules in a free Opteron socket and
> upload a soft-hardware LZF in it with a few gigabytes/s throughput

Or you look the sysadmin in the eyes, and say: your file servers have more out-of-disk-space problems than load problems, yes?
Re: Reiser4 und LZO compression
PFC wrote:
> I made a little benchmark on my own PC (Athlon64 3200+ in 64-bit
> gentoo):
>
> http://peufeu.free.fr/compression.html
>
> So, gzip could be used on PCs having very fast processors and very slow
> hard drives, like Core Duo laptops. However, lzo compresses nearly as
> much and is still a lot faster. I don't see a reason for gzip in a FS
> application.
>
> Anyone has a bench for lzf?

Yes, Edward did equivalent tests, and thus we selected LZO.
Re: Reiser4 und LZO compression
PFC wrote:
> Would it be, by any chance, possible to tweak the thing so that reiserfs
> plugins become kernel modules, so that the reiserfs core can be put in
> the kernel without the plugins slowing down its acceptance?

I don't see what this has to do with cryptoapi plugins -- those are not related to Reiser plugins.

As for the plugins slowing down acceptance, it's actually the concept of plugins and the plugin API -- in other words, the fact that Reiser4 supports plugins -- that is slowing it down, if anything about plugins is still an issue at all. Making them modules would make it worse. Last I saw, Linus doesn't particularly like the idea of plugins because of a few misconceptions, like the possibility of proprietary (possibly GPL-violating) plugins distributed as modules -- basically, something like what nVidia and ATI do with their video drivers. As it is, a good argument in favor of plugins is that this kind of thing isn't possible -- we often put "plugins" in quotes because really, it's just a nice abstraction layer. They aren't any more plugins than iptables modules or cryptoapi plugins are. If anything, they're less, because they must be compiled into Reiser4, which means either one huge monolithic Reiser4 module (including all plugins), or everything compiled into the kernel image.

> (and updating plugins without rebooting would be a nice extra)

It probably wouldn't be as nice as you think. Remember, if you're using a certain plugin in your root FS, it's part of the FS, so I don't think you'd be able to remove that plugin any more than you're able to remove reiser4.ko if that's your root FS. You'd have to unmount every FS that uses that plugin. At this point, you don't really gain much -- if you unmount every last Reiser4 filesystem, you can then remove reiser4.ko, recompile it, and load a new one with different plugins enabled. Also, these things would typically be part of a kernel update anyway, meaning a reboot anyway.

But suppose you could remove a plugin -- what then? What would that mean? Suppose half your files are compressed and you remove cryptocompress -- are those files uncompressed when the plugin goes away? Probably not. The only smart way to handle this that I can think of is to make those files unavailable, which is probably not what you want -- how do you update cryptocompress when the new reiser4_cryptocompress.ko is itself compressed? That may be an acceptable solution for some plugins, but you'd have to be extremely careful which ones you remove.

The only safe way I can imagine doing this may not be possible, and if it is, it's extremely hackish -- load the plugin under another module name, so r4_cryptocompress would be r4_cryptocompress_init -- and have the module, once loaded, do an atomic switch from the old one to the new one, effectively in-place. But that kind of solution is something I've never seen attempted, and have only really heard of in strange environments like Erlang. It would probably require much more engineering than the Reiser team can handle right now, especially with their hands full with inclusion.

>> The patch below is the so-called reiser4 LZO compression plugin as
>> extracted from 2.6.18-rc4-mm3. I think it is an unauditable piece of
>> shit and thus should not enter mainline.
>
> Like lib/inflate.c (and this new code should arguably be in lib/). The
> problem is that if we clean this up, we've diverged very much from the
> upstream implementation, so taking in fixes and features from upstream
> becomes harder and more error-prone. I'd suspect that the maturity of
> these utilities is such that we could afford to turn them into kernel
> code in the expectation that any future changes will be small. But it's
> not a completely simple call. (iirc the inflate code had a buffer
> overrun a while back, which was found and fixed in the upstream
> version.)
Re: Reiser4 und LZO compression
I have made a little openoffice spreadsheet with the results. You can have fun entering stuff and seeing the results.

http://peufeu.free.fr/compression.ods

Basically, a laptop having the same processor as my PC and a crummy 15 MB/s drive (like most laptop drives) will get a 2.5x speedup using lzf, while using 40% CPU for compression and 15% CPU for decompression. I'd say it's a clear, huge win.

A desktop computer with a modern IDE drive doing 50 MB/s will still get nice speedups (1.8x on write, 2.5x on read) but of course, more CPU will be used because of the higher throughput. In this case it is CPU-limited on compression and disk-limited on decompression. However, soon everyone will have dual-core monsters so...

A big-ass RAID will not get much benefit unless:
- the buffer cache stores compressed pages, so compression virtually doubles the RAM cache
- or the CPU is really fast
- or you put one of these neat FPGA modules in a free Opteron socket and upload a soft-hardware LZF in it with a few gigabytes/s throughput

...
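[The model behind those spreadsheet numbers, spelled out. The 0.4 ratio here is back-solved from PFC's quoted 2.5x, i.e., an assumption rather than a measurement:

#include <stdio.h>

int main(void)
{
    double disk  = 15.0;   /* MB/s, the "crummy laptop drive" above */
    double ratio = 0.4;    /* compressed/original size (assumed) */

    /* if the codec keeps up, the disk moves only the compressed bytes */
    printf("effective %.1f MB/s, speedup %.1fx\n",
           disk / ratio, 1.0 / ratio);   /* 37.5 MB/s, 2.5x */
    return 0;
}

The same division explains the desktop case: at 50 MB/s the codec rather than the disk becomes the limit on writes, which is why the write speedup drops to 1.8x there.]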
Re: Reiser4 und LZO compression
On 8/29/06, PFC <[EMAIL PROTECTED]> wrote:
> Anyone has a bench for lzf?

This is on an Opteron 1.8GHz box. Everything tested with a hot cache. Testing on a fairly repetitive but real test case (an SQL dump of one of the Wikipedia tables):

-rw-rw-r-- 1 gmaxwell gmaxwell 426162134 Jul 20 06:54 ../page.sql

$ time lzop -c ../page.sql > page.sql.lzo
real 0m8.618s
user 0m7.800s
sys  0m0.808s

$ time lzop -9c ../page.sql > page.sql.lzo-9
real 4m45.299s
user 4m44.474s
sys  0m0.712s

$ time gzip -1 -c ../page.sql > page.sql.gz
real 0m19.292s
user 0m18.545s
sys  0m0.748s

$ time lzop -d -c ./page.sql.lzo > /dev/null
real 0m3.061s
user 0m2.836s
sys  0m0.224s

$ time gzip -dc page.sql.gz > /dev/null
real 0m7.199s
user 0m7.020s
sys  0m0.176s

$ time ./lzf -d < page.sql.lzf > /dev/null
real 0m2.398s
user 0m2.224s
sys  0m0.172s

-rw-rw-r-- 1 gmaxwell gmaxwell 193853815 Aug 29 10:59 page.sql.gz
-rw-rw-r-- 1 gmaxwell gmaxwell 243497298 Aug 29 10:47 page.sql.lzf
-rw-rw-r-- 1 gmaxwell gmaxwell 259986955 Jul 20 06:54 page.sql.lzo
-rw-rw-r-- 1 gmaxwell gmaxwell 204930904 Jul 20 06:54 page.sql.lzo-9

(Decompression of the differing lzo levels is the same speed.)

None of them really decompress fast enough to keep up with the disks in this system; lzf or lzo wouldn't be a big loss. (Bonnie scores: floodlamp,64G,,,246163,52,145536,35,,,365198,42,781.2,2,16,4540,69,+,+++,2454,31,4807,76,+,+++,2027,36)
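[For convenience, the compression ratios implied by those byte counts -- simple division over the figures quoted above:

#include <stdio.h>

int main(void)
{
    double orig = 426162134;   /* ../page.sql */
    struct { const char *name; double bytes; } r[] = {
        { "gzip -1", 193853815 },
        { "lzop -9", 204930904 },
        { "lzf    ", 243497298 },
        { "lzop   ", 259986955 },
    };
    /* prints 45.5%, 48.1%, 57.1%, 61.0% respectively */
    for (int i = 0; i < 4; i++)
        printf("%s %.1f%% of original\n",
               r[i].name, 100.0 * r[i].bytes / orig);
    return 0;
}

So lzf's ratio sits between lzop's default and gzip -1, while its decompression time above is the best of the three.]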
Re: Reiser4 und LZO compression
On Tue, Aug 29, 2006 at 03:45:59PM +0200, PFC wrote:
> Anyone has a bench for lzf?

It's easy; try something like:

wget http://www.goof.com/pcg/marc/data/liblzf-1.6.tar.gz
tar zxvpf liblzf-1.6.tar.gz
cd liblzf-1.6
./configure && make

Now you have a small lzf binary that you can use for testing:

cat bigfile | ./lzf > bigfile.lzf

Use "./lzf -d" for decompression tests.

--
ciao - Stefan
Re: Reiser4 und LZO compression
I made a little benchmark on my own PC (Athlon64 3200+ in 64-bit gentoo):

http://peufeu.free.fr/compression.html

So, gzip could be used on PCs having very fast processors and very slow hard drives, like Core Duo laptops. However, lzo compresses nearly as much and is still a lot faster. I don't see a reason for gzip in a FS application.

Anyone has a bench for lzf?
Re: Reiser4 und LZO compression
> Would it be, by any chance, possible to tweak the thing so that reiserfs
> plugins become kernel modules, so that the reiserfs core can be put in
> the kernel without the plugins slowing down its acceptance? (and
> updating plugins without rebooting would be a nice extra)

>> The patch below is the so-called reiser4 LZO compression plugin as
>> extracted from 2.6.18-rc4-mm3. I think it is an unauditable piece of
>> shit and thus should not enter mainline.

Like lib/inflate.c (and this new code should arguably be in lib/). The problem is that if we clean this up, we've diverged very much from the upstream implementation, so taking in fixes and features from upstream becomes harder and more error-prone. I'd suspect that the maturity of these utilities is such that we could afford to turn them into kernel code in the expectation that any future changes will be small. But it's not a completely simple call. (iirc the inflate code had a buffer overrun a while back, which was found and fixed in the upstream version.)
Re: Reiser4 und LZO compression
Nigel Cunningham wrote:
> Wow. That's a lot better; I guess I did get something wrong in trying to
> tune deflate. That was pre-cryptoapi though; looking at
> cryptoapi/deflate.c, I don't see any way of controlling the compression
> level. Am I missing anything?

zlib is tunable, not cryptoapi's deflate.
Look at zlib_deflateInit2().

>> Well, you use cryptoapi anyway, so it should be easy to just let the
>> user pick a plugin, right?
>
> Right. They can already pick deflate if they want to.
>
> Regards,
>
> Nigel
Re: Reiser4 und LZO compression
On 8/29/06, Nigel Cunningham <[EMAIL PROTECTED]> wrote:
> On Tue, 2006-08-29 at 03:23 -0500, David Masover wrote:
>> From what I remember, gzip -1 wasn't faster than the disk. But at least
>> for (very) repetitive data, I was wrong:
>>
>> eve:~ sanity$ time bash -c 'dd if=/dev/zero bs=10m count=10 | gzip -v1 > test; sync'
>> 10+0 records in
>> 10+0 records out
>> 104857600 bytes transferred in 2.404093 secs (43616282 bytes/sec)
>> 99.5%
>
> Wow. That's a lot better; I guess I did get something wrong in trying to
> tune deflate.

Compressing /dev/zero isn't a great test. The timings are really data-dependent:

[EMAIL PROTECTED]:~$ time bash -c 'sudo dd if=/dev/zero bs=8M count=64 | gzip -v1 >/dev/null'
64+0 records in
64+0 records out
536870912 bytes (537 MB) copied, 7.60817 seconds, 70.6 MB/s
99.6%

real 0m7.652s
user 0m6.581s
sys  0m0.701s

[EMAIL PROTECTED]:~$ time bash -c 'sudo dd if=/dev/mem bs=8M count=64 | gzip -v1 >/dev/null'
64+0 records in
64+0 records out
536870912 bytes (537 MB) copied, 21.5863 seconds, 24.9 MB/s
70.4%

real 0m21.626s
user 0m18.763s
sys  0m1.762s

This is on an AMD64 laptop.

Ray
Re: Reiser4 und LZO compression
Hi.

On Tue, 2006-08-29 at 03:23 -0500, David Masover wrote:
> Nigel Cunningham wrote:
>> We used gzip when we first implemented compression support, and found
>> it to be far too slow. Even with the fastest compression options, we
>> were only getting a few megabytes per second. Perhaps I did something
>> wrong in configuring it, but there's not that many things to get wrong!
>
> All that comes to mind is the speed/quality setting -- the number from 1
> to 9. Recently, I backed up someone's hard drive using -1, and I believe
> I was still able to saturate... the _network_. Definitely try again if
> you haven't changed this, but I can't imagine I'm the first person to
> think of it.
>
> From what I remember, gzip -1 wasn't faster than the disk. But at least
> for (very) repetitive data, I was wrong:
>
> eve:~ sanity$ time bash -c 'dd if=/dev/zero of=test bs=10m count=10; sync'
> 10+0 records in
> 10+0 records out
> 104857600 bytes transferred in 3.261990 secs (32145287 bytes/sec)
>
> real 0m3.746s
> user 0m0.005s
> sys  0m0.627s
>
> eve:~ sanity$ time bash -c 'dd if=/dev/zero bs=10m count=10 | gzip -v1 > test; sync'
> 10+0 records in
> 10+0 records out
> 104857600 bytes transferred in 2.404093 secs (43616282 bytes/sec)
> 99.5%
>
> real 0m2.558s
> user 0m1.554s
> sys  0m0.680s
>
> This was on OS X, but I think it's still valid -- this is a slightly
> older Powerbook, with a 5400 RPM drive, 1.6 GHz G4.
>
> -1 is still worlds better than nothing. The backup was over 15 gigs,
> down to about 6 -- loads of repetitive data, I'm sure, but that's where
> you win with compression anyway.

Wow. That's a lot better; I guess I did get something wrong in trying to tune deflate. That was pre-cryptoapi though; looking at cryptoapi/deflate.c, I don't see any way of controlling the compression level. Am I missing anything?

> Well, you use cryptoapi anyway, so it should be easy to just let the
> user pick a plugin, right?

Right. They can already pick deflate if they want to.

Regards,

Nigel
Re: Reiser4 und LZO compression
Nigel Cunningham wrote:
> Hi.
>
> On Mon, 2006-08-28 at 22:15 +0400, Edward Shishkin wrote:
>> Stefan Traby wrote:
>>> On Mon, Aug 28, 2006 at 10:06:46AM -0700, Hans Reiser wrote:
>>>> Hmm. LZO is the best compression algorithm for the task as measured
>>>> by the objectives of good compression effectiveness while still
>>>> having very low CPU usage (the best of those written and GPL'd; there
>>>> is a slightly better one which is proprietary and uses more CPU, LZRW
>>>> if I remember right. The gzip code base uses too much CPU, though I
>>>> think Edward made
>>>
>>> I don't think that LZO beats LZF in both speed and compression ratio.
>>>
>>> LZF is also available under the GPL (dual-licensed BSD) and was chosen
>>> in favor of LZO for the next-generation suspend-to-disk code of the
>>> Linux kernel.
>>>
>>> see: http://www.goof.com/pcg/marc/liblzf.html
>>
>> thanks for the info, we will compare them
>
> For Suspend2, we ended up converting the LZF support to a cryptoapi
> plugin. Is there any chance that you could use cryptoapi modules? We
> could then have a hope of sharing the support.

No problem with using cryptoapi. Reiser4 bypasses it because cryptoapi currently supplies only one compression level, which is fairly bad for compressed file systems.

Edward.
Re: Reiser4 und LZO compression
Nigel Cunningham wrote:
> Hi.
>
> On Tue, 2006-08-29 at 06:05 +0200, Jan Engelhardt wrote:
>>>>> Hmm. LZO is the best compression algorithm for the task as measured
>>>>> by the objectives of good compression effectiveness while still
>>>>> having very low CPU usage (the best of those written and GPL'd;
>>>>> there is a slightly better one which is proprietary and uses more
>>>>> CPU, LZRW if I remember right. The gzip code base uses too much CPU,
>>>>> though I think Edward made
>>>>
>>>> I don't think that LZO beats LZF in both speed and compression ratio.
>>>>
>>>> LZF is also available under the GPL (dual-licensed BSD) and was
>>>> chosen in favor of LZO for the next-generation suspend-to-disk code
>>>> of the Linux kernel.
>>>>
>>>> see: http://www.goof.com/pcg/marc/liblzf.html
>>>
>>> thanks for the info, we will compare them
>>
>> For Suspend2, we ended up converting the LZF support to a cryptoapi
>> plugin. Is there any chance that you could use cryptoapi modules? We
>> could then have a hope of sharing the support.
>
> I am throwing in gzip: would it be meaningful to use that instead? The
> decoder (inflate.c) is already there.
>
> 06:04 shanghai:~/liblzf-1.6 > l configure*
> -rwxr-xr-x 1 jengelh users 154894 Mar 3 2005 configure
> -rwxr-xr-x 1 jengelh users 26810 Mar 3 2005 configure.bz2
> -rw-r--r-- 1 jengelh users 30611 Aug 28 20:32 configure.gz-z9
> -rw-r--r-- 1 jengelh users 30693 Aug 28 20:32 configure.gz-z6
> -rw-r--r-- 1 jengelh users 53077 Aug 28 20:32 configure.lzf

> We used gzip when we first implemented compression support, and found it
> to be far too slow. Even with the fastest compression options, we were
> only getting a few megabytes per second. Perhaps I did something wrong
> in configuring it, but there's not that many things to get wrong!

All that comes to mind is the speed/quality setting -- the number from 1 to 9. Recently, I backed up someone's hard drive using -1, and I believe I was still able to saturate... the _network_. Definitely try again if you haven't changed this, but I can't imagine I'm the first person to think of it.

From what I remember, gzip -1 wasn't faster than the disk. But at least for (very) repetitive data, I was wrong:

eve:~ sanity$ time bash -c 'dd if=/dev/zero of=test bs=10m count=10; sync'
10+0 records in
10+0 records out
104857600 bytes transferred in 3.261990 secs (32145287 bytes/sec)

real 0m3.746s
user 0m0.005s
sys  0m0.627s

eve:~ sanity$ time bash -c 'dd if=/dev/zero bs=10m count=10 | gzip -v1 > test; sync'
10+0 records in
10+0 records out
104857600 bytes transferred in 2.404093 secs (43616282 bytes/sec)
99.5%

real 0m2.558s
user 0m1.554s
sys  0m0.680s

This was on OS X, but I think it's still valid -- this is a slightly older Powerbook, with a 5400 RPM drive, 1.6 GHz G4.

-1 is still worlds better than nothing. The backup was over 15 gigs, down to about 6 -- loads of repetitive data, I'm sure, but that's where you win with compression anyway.

Well, you use cryptoapi anyway, so it should be easy to just let the user pick a plugin, right?
Re: Reiser4 und LZO compression
Hi.

On Tue, 2006-08-29 at 13:59 +0900, Paul Mundt wrote:
> On Tue, Aug 29, 2006 at 07:48:25AM +1000, Nigel Cunningham wrote:
>> For Suspend2, we ended up converting the LZF support to a cryptoapi
>> plugin. Is there any chance that you could use cryptoapi modules? We
>> could then have a hope of sharing the support.
>
> Using cryptoapi plugins for the compression methods is an interesting
> approach; there are a few other places in the kernel that could probably
> benefit from this as well, such as jffs2 (which at the moment rolls its
> own compression subsystem), and the out-of-tree page and swap cache
> compression work.
>
> Assuming you were wrapping in to LZF directly prior to the cryptoapi
> integration, do you happen to have before-and-after numbers to determine
> how heavyweight the rest of the cryptoapi overhead is? It would be
> interesting to profile this and consider migrating the in-tree users,
> rather than duplicating the compress/decompress routines all over the
> place.

I was, but I don't have numbers right now. I'm about to go out, but
will see if I can find them when I get back later. From memory, it
wasn't a huge change in terms of lines of code.

Regards,

Nigel
Re: Reiser4 und LZO compression
Hi.

On Tue, 2006-08-29 at 06:05 +0200, Jan Engelhardt wrote:
>>>>> Hmm. LZO is the best compression algorithm for the task as measured by
>>>>> the objectives of good compression effectiveness while still having very
>>>>> low CPU usage (the best of those written and GPL'd; there is a slightly
>>>>> better one which is proprietary and uses more CPU, LZRW if I remember
>>>>> right). The gzip code base uses too much CPU, though I think Edward made
>>>> I don't think that LZO beats LZF in both speed and compression ratio.
>>>>
>>>> LZF is also available under GPL (dual-licensed BSD) and was chosen in
>>>> favor of LZO for the next generation suspend-to-disk code of the Linux
>>>> kernel.
>>>>
>>>> see: http://www.goof.com/pcg/marc/liblzf.html
>>> thanks for the info, we will compare them
>> For Suspend2, we ended up converting the LZF support to a cryptoapi
>> plugin. Is there any chance that you could use cryptoapi modules? We
>> could then have a hope of sharing the support.
>
> I am throwing in gzip: would it be meaningful to use that instead? The
> decoder (inflate.c) is already there.
>
> 06:04 shanghai:~/liblzf-1.6 > l configure*
> -rwxr-xr-x  1 jengelh users 154894 Mar  3  2005 configure
> -rwxr-xr-x  1 jengelh users  26810 Mar  3  2005 configure.bz2
> -rw-r--r--  1 jengelh users  30611 Aug 28 20:32 configure.gz-z9
> -rw-r--r--  1 jengelh users  30693 Aug 28 20:32 configure.gz-z6
> -rw-r--r--  1 jengelh users  53077 Aug 28 20:32 configure.lzf

We used gzip when we first implemented compression support, and found
it to be far too slow. Even with the fastest compression options, we
were only getting a few megabytes per second. Perhaps I did something
wrong in configuring it, but there's not that many things to get wrong!

In contrast, with LZF we get very high throughput. My current laptop is
a 1.8GHz Turion with a 7200 RPM (PATA) drive. Without LZF compression,
my throughput in writing an image is the maximum the drive & interface
can manage - 38MB/s. With LZF, I get roughly that divided by the
compression ratio achieved, so if the compression ratio is ~50%, as it
generally is, I'm reading and writing the image at 75-80MB/s. During
this time, all the computer is doing is compressing pages using LZF and
submitting bios, with the odd message being sent to the userspace
interface app via netlink.

I realise this is very different to the workload you'll be doing, but
hopefully the numbers are somewhat useful:

[EMAIL PROTECTED]:~$ cat /sys/power/suspend2/debug_info
Suspend2 debugging info:
- SUSPEND core   : 2.2.7.4
- Kernel Version : 2.6.18-rc4
- Compiler vers. : 4.1
- Attempt number : 1
- Parameters     : 0 32785 0 0 0 0
- Overall expected compression percentage: 0.
- Compressor is 'lzf'.
  Compressed 820006912 bytes into 430426371 (47 percent compression).
- Swapwriter active.
  Swap available for image: 487964 pages.
- Filewriter inactive.
- I/O speed: Write 74 MB/s, Read 70 MB/s.
- Extra pages    : 1913 used/2100.
[EMAIL PROTECTED]:~$

(Modify hibernate.conf to disable compression, suspend again...)

[EMAIL PROTECTED]:~$ cat /sys/power/suspend2/debug_info
Suspend2 debugging info:
- SUSPEND core   : 2.2.7.4
- Kernel Version : 2.6.18-rc4
- Compiler vers. : 4.1
- Attempt number : 2
- Parameters     : 0 32785 0 0 0 0
- Overall expected compression percentage: 0.
- Swapwriter active.
  Swap available for image: 487964 pages.
- Filewriter inactive.
- I/O speed: Write 38 MB/s, Read 39 MB/s.
- Extra pages    : 1907 used/2100.
[EMAIL PROTECTED]:~$

Oh, I also have a debugging mode where I can get Suspend2 to just
compress the pages but not actually write anything. If I do that, it
says it can do 80MB/s on my kernel image, so the disk is still the
bottleneck, it seems.

Hope this all helps (and isn't information overload!)

Nigel
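(Checking these numbers: "47 percent compression" means the image shrank
to 430426371 / 820006912 = ~52.5% of its original size. The drive's raw
speed is 38 MB/s, so the expected effective rate with compression is

    38 MB/s / 0.525 = ~72 MB/s

which agrees nicely with the reported 74 MB/s write figure.)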
Re: Reiser4 und LZO compression
On Tue, Aug 29, 2006 at 07:48:25AM +1000, Nigel Cunningham wrote:
> For Suspend2, we ended up converting the LZF support to a cryptoapi
> plugin. Is there any chance that you could use cryptoapi modules? We
> could then have a hope of sharing the support.

Using cryptoapi plugins for the compression methods is an interesting
approach; there are a few other places in the kernel that could probably
benefit from this as well, such as jffs2 (which at the moment rolls its
own compression subsystem), and the out-of-tree page and swap cache
compression work.

Assuming you were wrapping in to LZF directly prior to the cryptoapi
integration, do you happen to have before-and-after numbers to determine
how heavyweight the rest of the cryptoapi overhead is? It would be
interesting to profile this and consider migrating the in-tree users,
rather than duplicating the compress/decompress routines all over the
place.
Re: Reiser4 und LZO compression
>>>> Hmm. LZO is the best compression algorithm for the task as measured by
>>>> the objectives of good compression effectiveness while still having very
>>>> low CPU usage (the best of those written and GPL'd; there is a slightly
>>>> better one which is proprietary and uses more CPU, LZRW if I remember
>>>> right). The gzip code base uses too much CPU, though I think Edward made
>>>
>>> I don't think that LZO beats LZF in both speed and compression ratio.
>>>
>>> LZF is also available under GPL (dual-licensed BSD) and was chosen in
>>> favor of LZO for the next generation suspend-to-disk code of the Linux
>>> kernel.
>>>
>>> see: http://www.goof.com/pcg/marc/liblzf.html
>>
>> thanks for the info, we will compare them
>
> For Suspend2, we ended up converting the LZF support to a cryptoapi
> plugin. Is there any chance that you could use cryptoapi modules? We
> could then have a hope of sharing the support.

I am throwing in gzip: would it be meaningful to use that instead? The
decoder (inflate.c) is already there.

06:04 shanghai:~/liblzf-1.6 > l configure*
-rwxr-xr-x  1 jengelh users 154894 Mar  3  2005 configure
-rwxr-xr-x  1 jengelh users  26810 Mar  3  2005 configure.bz2
-rw-r--r--  1 jengelh users  30611 Aug 28 20:32 configure.gz-z9
-rw-r--r--  1 jengelh users  30693 Aug 28 20:32 configure.gz-z6
-rw-r--r--  1 jengelh users  53077 Aug 28 20:32 configure.lzf

Jan Engelhardt
--
Re: Reiser4 und LZO compression
Nigel Cunningham wrote:
> For Suspend2, we ended up converting the LZF support to a cryptoapi
> plugin. Is there any chance that you could use cryptoapi modules? We
> could then have a hope of sharing the support.

It is in principle a good idea, and I hope we will be able to say yes.
However, I have to see the numbers, as we are more performance-sensitive
than you folks probably are, and every 10% is a big deal for us.
Re: Reiser4 und LZO compression
Hi.

On Mon, 2006-08-28 at 22:15 +0400, Edward Shishkin wrote:
> Stefan Traby wrote:
>> On Mon, Aug 28, 2006 at 10:06:46AM -0700, Hans Reiser wrote:
>>> Hmm. LZO is the best compression algorithm for the task as measured by
>>> the objectives of good compression effectiveness while still having very
>>> low CPU usage (the best of those written and GPL'd; there is a slightly
>>> better one which is proprietary and uses more CPU, LZRW if I remember
>>> right). The gzip code base uses too much CPU, though I think Edward made
>>
>> I don't think that LZO beats LZF in both speed and compression ratio.
>>
>> LZF is also available under GPL (dual-licensed BSD) and was chosen in
>> favor of LZO for the next generation suspend-to-disk code of the Linux
>> kernel.
>>
>> see: http://www.goof.com/pcg/marc/liblzf.html
>
> thanks for the info, we will compare them

For Suspend2, we ended up converting the LZF support to a cryptoapi
plugin. Is there any chance that you could use cryptoapi modules? We
could then have a hope of sharing the support.

Regards,

Nigel
Re: Reiser4 und LZO compression
Stefan Traby wrote:
> On Mon, Aug 28, 2006 at 10:06:46AM -0700, Hans Reiser wrote:
>> Hmm. LZO is the best compression algorithm for the task as measured by
>> the objectives of good compression effectiveness while still having very
>> low CPU usage (the best of those written and GPL'd; there is a slightly
>> better one which is proprietary and uses more CPU, LZRW if I remember
>> right). The gzip code base uses too much CPU, though I think Edward made
>
> I don't think that LZO beats LZF in both speed and compression ratio.
>
> LZF is also available under GPL (dual-licensed BSD) and was chosen in
> favor of LZO for the next generation suspend-to-disk code of the Linux
> kernel.
>
> see: http://www.goof.com/pcg/marc/liblzf.html

thanks for the info, we will compare them
Re: Reiser4 und LZO compression
Jindrich Makovicka wrote:
> On Sun, 27 Aug 2006 04:42:59 -0500, David Masover <[EMAIL PROTECTED]> wrote:
>> Andrew Morton wrote:
>>> On Sun, 27 Aug 2006 04:34:26 +0400 Alexey Dobriyan <[EMAIL PROTECTED]> wrote:
>>>> The patch below is so-called reiser4 LZO compression plugin as
>>>> extracted from 2.6.18-rc4-mm3.
>>>>
>>>> I think it is an unauditable piece of shit and thus should not
>>>> enter mainline.
>>>
>>> Like lib/inflate.c (and this new code should arguably be in lib/).
>>>
>>> The problem is that if we clean this up, we've diverged very much
>>> from the upstream implementation. So taking in fixes and features
>>> from upstream becomes harder and more error-prone.
>>
>> Well, what kinds of changes have to happen? I doubt upstream would
>> care about moving some of it to lib/ -- and anyway, reiserfs-list is
>> on the CC. We are speaking of upstream in the third person in the
>> presence of upstream, so...
>
> The ifdef jungle is ugly, and especially the WIN / 16-bit DOS stuff is
> completely useless here.

I agree that it needs some brushing up; putting it on the todo list.

>> Maybe just ask upstream?
>
> I am not sure if Mr. Oberhumer still cares about LZO 1.x; AFAIK he now
> develops a new compressor under a commercial license.
>
> Regards,
Re: Reiser4 und LZO compression
On Mon, Aug 28, 2006 at 10:06:46AM -0700, Hans Reiser wrote:
> Hmm. LZO is the best compression algorithm for the task as measured by
> the objectives of good compression effectiveness while still having very
> low CPU usage (the best of those written and GPL'd; there is a slightly
> better one which is proprietary and uses more CPU, LZRW if I remember
> right). The gzip code base uses too much CPU, though I think Edward made

I don't think that LZO beats LZF in both speed and compression ratio.

LZF is also available under GPL (dual-licensed BSD) and was chosen in
favor of LZO for the next generation suspend-to-disk code of the Linux
kernel.

see: http://www.goof.com/pcg/marc/liblzf.html

-- 
ciao - Stefan
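(For reference, liblzf's entire interface is two calls. Below is a round-trip sketch assuming the lzf.h signatures from the distribution at the URL above -- they may vary between versions, so treat this as illustrative. lzf_compress() returns the compressed length, or 0 if the output did not fit, which is also how incompressible input is detected.)

    #include <stdio.h>
    #include <string.h>
    #include "lzf.h"   /* from the liblzf distribution linked above */

    int main(void)
    {
            char in[4096], out[4096], back[4096];
            unsigned int clen, dlen;

            memset(in, 'x', sizeof(in));   /* trivially compressible input */

            /* ask for output strictly smaller than the input, so a return
             * of 0 means "compression did not pay off" */
            clen = lzf_compress(in, (unsigned int)sizeof(in),
                                out, (unsigned int)sizeof(out) - 1);
            if (clen == 0)
                    return 1;

            /* returns the decompressed length, or 0 on corrupt input */
            dlen = lzf_decompress(out, clen, back, (unsigned int)sizeof(back));

            printf("%zu -> %u -> %u bytes\n", sizeof(in), clen, dlen);
            return (dlen == sizeof(in) && memcmp(in, back, dlen) == 0) ? 0 : 1;
    }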
Re: Reiser4 und LZO compression
On Sun, 27 Aug 2006 04:42:59 -0500, David Masover <[EMAIL PROTECTED]> wrote:
> Andrew Morton wrote:
>> On Sun, 27 Aug 2006 04:34:26 +0400 Alexey Dobriyan <[EMAIL PROTECTED]> wrote:
>>> The patch below is so-called reiser4 LZO compression plugin as
>>> extracted from 2.6.18-rc4-mm3.
>>>
>>> I think it is an unauditable piece of shit and thus should not
>>> enter mainline.
>>
>> Like lib/inflate.c (and this new code should arguably be in lib/).
>>
>> The problem is that if we clean this up, we've diverged very much
>> from the upstream implementation. So taking in fixes and features
>> from upstream becomes harder and more error-prone.
>
> Well, what kinds of changes have to happen? I doubt upstream would
> care about moving some of it to lib/ -- and anyway, reiserfs-list is
> on the CC. We are speaking of upstream in the third person in the
> presence of upstream, so...

The ifdef jungle is ugly, and especially the WIN / 16-bit DOS stuff is
completely useless here.

> Maybe just ask upstream?

I am not sure if Mr. Oberhumer still cares about LZO 1.x; AFAIK he now
develops a new compressor under a commercial license.

Regards,
-- 
Jindrich Makovicka
Re: Reiser4 und LZO compression
Alexey Dobriyan wrote:
> Reiser4 developers, Andrew,
>
> The patch below is so-called reiser4 LZO compression plugin as extracted
> from 2.6.18-rc4-mm3.
>
> I think it is an unauditable piece of shit and thus should not enter
> mainline.

Hmm. LZO is the best compression algorithm for the task as measured by
the objectives of good compression effectiveness while still having very
low CPU usage (the best of those written and GPL'd; there is a slightly
better one which is proprietary and uses more CPU, LZRW if I remember
right). The gzip code base uses too much CPU, though I think Edward made
an option of it. If you could be kind enough to send me a plugin which is
better by those two measures, I'd be quite grateful.

By the way, could you tell me about this "auditing" stuff? Last I
remember, when I mentioned that the US Defense community had coding
practices worth adopting by the Kernel Community, I was pretty much
disregarded. So, while I understand that the FSB has serious security
issues what with all these Americans seeking to crack their Linux boxen,
complaining to me about auditability seems a bit graceless. ;-)
Especially if there is no offer of replacement compression code.

Oh, and this LZO code is not written by Namesys. You can tell by the
utter lack of comments, assertions, etc. We are just seeking to reuse
well-known, widely used code. I have in the past been capable of
demanding that my programmers comment code not written by them before we
use it, but this time I did not. I have mixed feelings about us adding
our comments to code written by a compression specialist.

If Andrew wants us to write our own compression code, or comment this
code and fill it with asserts, we will grumble a bit and do it. It is
not a task I am eager for, as compression code is a highly competitive
field, which gives me the surface impression that if you are not gripped
by what you are sure is an inspiration, you should stay out of it.

Jörn wrote:
> I've had an identical argument with Linus about lib/zlib_*. He decided
> that he didn't care about diverging, I went ahead and changed the code.
> In the process, I merged a couple of outstanding bugfixes and reduced
> memory consumption by 25%. Looks like Linus was right on that one.

If anyone sends me or Edward such a patch, that's great. Jörn, sounds
like you did a good job on that one.

Hans
Re: Reiser4 und LZO compression
On Sun, 27 August 2006 01:04:28 -0700, Andrew Morton wrote:
> Like lib/inflate.c (and this new code should arguably be in lib/).
>
> The problem is that if we clean this up, we've diverged very much from the
> upstream implementation. So taking in fixes and features from upstream
> becomes harder and more error-prone.

I've had an identical argument with Linus about lib/zlib_*. He decided
that he didn't care about diverging, I went ahead and changed the code.
In the process, I merged a couple of outstanding bugfixes and reduced
memory consumption by 25%. Looks like Linus was right on that one.

> I'd suspect that the maturity of these utilities is such that we could
> afford to turn them into kernel code in the expectation that any future
> changes will be small. But it's not a completely simple call.
>
> (iirc the inflate code had a buffer overrun a while back, which was found
> and fixed in the upstream version).

Ditto in lib/zlib_*. lib/inflate.c is only used by the various in-kernel
bootloaders to uncompress a kernel image. Anyone tampering with the image
to cause a buffer overrun already owns the machine anyway. Whether any of
our experiences with zlib apply to lzo remains a question, though.

Jörn

-- 
I've never met a human being who would want to read 17,000 pages of
documentation, and if there was, I'd kill him to get him out of the
gene pool.
-- Joseph Costello
Re: Reiser4 und LZO compression
Andrew Morton wrote:
> On Sun, 27 Aug 2006 04:34:26 +0400 Alexey Dobriyan <[EMAIL PROTECTED]> wrote:
>> The patch below is so-called reiser4 LZO compression plugin as
>> extracted from 2.6.18-rc4-mm3.
>>
>> I think it is an unauditable piece of shit and thus should not enter
>> mainline.
>
> Like lib/inflate.c (and this new code should arguably be in lib/).
>
> The problem is that if we clean this up, we've diverged very much from
> the upstream implementation. So taking in fixes and features from
> upstream becomes harder and more error-prone.

Well, what kinds of changes have to happen? I doubt upstream would care
about moving some of it to lib/ -- and anyway, reiserfs-list is on the
CC. We are speaking of upstream in the third person in the presence of
upstream, so... Maybe just ask upstream?
Re: Reiser4 und LZO compression
On 8/27/06, Andrew Morton <[EMAIL PROTECTED]> wrote:
> On Sun, 27 Aug 2006 04:34:26 +0400 Alexey Dobriyan <[EMAIL PROTECTED]> wrote:
>> The patch below is so-called reiser4 LZO compression plugin as extracted
>> from 2.6.18-rc4-mm3.
>>
>> I think it is an unauditable piece of shit and thus should not enter
>> mainline.
>
> Sheesh.
>
> Like lib/inflate.c (and this new code should arguably be in lib/).
>
> The problem is that if we clean this up, we've diverged very much from
> the upstream implementation. So taking in fixes and features from
> upstream becomes harder and more error-prone.

Right. How about putting it in so that everyone can track divergences,
but not using it for a real compile? Rather, consider it meta-source,
and do mechanical, repeatable transformations only, starting with
something like:

  mv minilzo.c minilzo._c
  cpp 2>/dev/null -w -P -C -nostdinc -dI minilzo._c >minilzo.c
  lindent minilzo.c

to generate a version that can be audited. Doing so on a version of
minilzo.c google found on the web generated something that looked much
like any other stream coder source I've read, so it approaches
readability. Of a sort. Further cleanups could be done with cpp -D to
rename some of the more bizarre symbols.

Downside is that bugs would have to be fixed in the 'meta-source'
(horrible name, but it's late here), but at least they could be found
(potentially) more easily than in the original.

Ray
Re: Reiser4 und LZO compression
On Sun, 27 Aug 2006 04:34:26 +0400 Alexey Dobriyan <[EMAIL PROTECTED]> wrote:
> The patch below is so-called reiser4 LZO compression plugin as extracted
> from 2.6.18-rc4-mm3.
>
> I think it is an unauditable piece of shit and thus should not enter
> mainline.

Like lib/inflate.c (and this new code should arguably be in lib/).

The problem is that if we clean this up, we've diverged very much from the
upstream implementation. So taking in fixes and features from upstream
becomes harder and more error-prone.

I'd suspect that the maturity of these utilities is such that we could
afford to turn them into kernel code in the expectation that any future
changes will be small. But it's not a completely simple call.

(iirc the inflate code had a buffer overrun a while back, which was found
and fixed in the upstream version).