Re: [HACKERS] Compression and on-disk sorting

2006-06-07 Thread Jim C. Nasby
On Fri, May 26, 2006 at 09:21:44PM +0100, Simon Riggs wrote: On Fri, 2006-05-26 at 14:47 -0500, Jim C. Nasby wrote: But the meat is: -- work_mem -- Scale 20002 not compressed 150 805.7

Re: [HACKERS] Compression and on-disk sorting

2006-06-07 Thread Simon Riggs
On Wed, 2006-06-07 at 01:33 -0500, Jim C. Nasby wrote: On Fri, May 26, 2006 at 09:21:44PM +0100, Simon Riggs wrote: On Fri, 2006-05-26 at 14:47 -0500, Jim C. Nasby wrote: But the meat is: -- work_mem -- Scale

Re: [HACKERS] Compression and on-disk sorting

2006-06-07 Thread Jim C. Nasby
On Wed, Jun 07, 2006 at 11:59:50AM +0100, Simon Riggs wrote: On Wed, 2006-06-07 at 01:33 -0500, Jim C. Nasby wrote: On Fri, May 26, 2006 at 09:21:44PM +0100, Simon Riggs wrote: On Fri, 2006-05-26 at 14:47 -0500, Jim C. Nasby wrote: But the meat is:

Re: [HACKERS] Compression and on-disk sorting

2006-06-07 Thread Simon Riggs
On Wed, 2006-06-07 at 09:35 -0500, Jim C. Nasby wrote: Would simply changing the ORDER BY to DESC suffice for this? FWIW: Try sorting on aid also, both ascending and descending. We need to try lots of tests, not just one that's chosen to show the patch in the best light. I want this, but we

Re: [HACKERS] Compression and on-disk sorting

2006-06-07 Thread Jim C. Nasby
On Wed, Jun 07, 2006 at 04:11:57PM +0100, Simon Riggs wrote: On Wed, 2006-06-07 at 09:35 -0500, Jim C. Nasby wrote: Would simply changing the ORDER BY to DESC suffice for this? FWIW: Try sorting on aid also, both ascending and descending. We need to try lots of tests, not just one that's

Re: [HACKERS] Compression and on-disk sorting

2006-05-26 Thread Jim C. Nasby
I've done some more testing with Tom's recently committed changes to tuplesort.c, which remove the tupleheaders from the sort data. It does about 10% better than compression alone does. What's interesting is that the gains are about 10% regardless of compression, which means compression isn't

Re: [HACKERS] Compression and on-disk sorting

2006-05-26 Thread Tom Lane
Jim C. Nasby [EMAIL PROTECTED] writes: Something else worth mentioning is that sort performance is worse with larger work_mem for all cases except the old HEAD, prior to the tuplesort.c changes. It looks like whatever was done to fix that will need to be adjusted/rethought pending the outcome

Re: [HACKERS] Compression and on-disk sorting

2006-05-26 Thread Jim C. Nasby
On Fri, May 26, 2006 at 12:35:36PM -0400, Tom Lane wrote: Jim C. Nasby [EMAIL PROTECTED] writes: Something else worth mentioning is that sort performance is worse with larger work_mem for all cases except the old HEAD, prior to the tuplesort.c changes. It looks like whatever was done to fix

Re: [HACKERS] Compression and on-disk sorting

2006-05-26 Thread Simon Riggs
On Fri, 2006-05-26 at 14:47 -0500, Jim C. Nasby wrote: But the meat is: -- work_mem -- Scale 20002 not compressed 150 805.7 797.7 not compressed 300017820 17436

Re: [HACKERS] Compression and on-disk sorting

2006-05-26 Thread Tom Lane
Simon Riggs [EMAIL PROTECTED] writes: There is a noticeable rise in sort time with increasing work_mem, but that needs to be offset from the benefit that in-general comes from using a large Heap for the sort. With the data you're using that always looks like a loss, but that isn't true with

Re: [HACKERS] Compression and on-disk sorting

2006-05-26 Thread Jim C. Nasby
On Fri, May 26, 2006 at 04:41:51PM -0400, Tom Lane wrote: Simon Riggs [EMAIL PROTECTED] writes: There is a noticeable rise in sort time with increasing work_mem, but that needs to be offset from the benefit that in-general comes from using a large Heap for the sort. With the data you're

Re: [HACKERS] Compression and on-disk sorting

2006-05-24 Thread Joshua D. Drake
Jim C. Nasby wrote: Finally completed testing of a dataset that doesn't fit in memory with compression enabled. Results are at http://jim.nasby.net/misc/pgsqlcompression . Summary: work_memcompressed not compressed gain in-memory 2 400.1 797.7

Re: [HACKERS] Compression and on-disk sorting

2006-05-24 Thread Jim C. Nasby
On Wed, May 24, 2006 at 02:20:43PM -0700, Joshua D. Drake wrote: Jim C. Nasby wrote: Finally completed testing of a dataset that doesn't fit in memory with compression enabled. Results are at http://jim.nasby.net/misc/pgsqlcompression . Summary: work_memcompressed

Re: [HACKERS] Compression and on-disk sorting

2006-05-21 Thread Martijn van Oosterhout
On Fri, May 19, 2006 at 01:39:45PM -0500, Jim C. Nasby wrote: Do you have any stats on CPU usage? Memory usage? I've only been taking a look at vmstat from time-to-time, and I have yet to see the machine get CPU-bound. Haven't really paid much attention to memory. Is there anything in

Re: [HACKERS] Compression and on-disk sorting

2006-05-19 Thread Luke Lonergan
Jim, http://jim.nasby.net/misc/compress_sort.txt is preliminary results. I've run into a slight problem in that even at a compression level of -3, zlib is cutting the on-disk size of sorts by 25x. So my pgbench sort test with scale=150 that was producing a 2G on-disk sort is now

Re: [HACKERS] Compression and on-disk sorting

2006-05-19 Thread Tom Lane
Martijn van Oosterhout kleptog@svana.org writes: I'm seeing 250,000 blocks being cut down to 9,500 blocks. That's almost unbelievable. What's in the table? Yeah, I'd tend to question the test data being used. gzip does not do that well on typical text (especially not at the lower settings

Re: [HACKERS] Compression and on-disk sorting

2006-05-19 Thread Martijn van Oosterhout
On Fri, May 19, 2006 at 09:03:31AM -0400, Tom Lane wrote: Martijn van Oosterhout kleptog@svana.org writes: I'm seeing 250,000 blocks being cut down to 9,500 blocks. That's almost unbelievable. What's in the table? Yeah, I'd tend to question the test data being used. gzip does not do

Re: [HACKERS] Compression and on-disk sorting

2006-05-19 Thread Tom Lane
Martijn van Oosterhout kleptog@svana.org writes: However, postgres tables are very highly compressible, 10-to-1 is not that uncommon. pg_proc and pg_index compress by that for example. Indexes compress even more (a few on my system compress 25-to-1 but that could just be slack space, the

Re: [HACKERS] Compression and on-disk sorting

2006-05-19 Thread Jim C. Nasby
On Fri, May 19, 2006 at 09:29:03AM +0200, Martijn van Oosterhout wrote: On Thu, May 18, 2006 at 10:02:44PM -0500, Jim C. Nasby wrote: http://jim.nasby.net/misc/compress_sort.txt is preliminary results. I've run into a slight problem in that even at a compression level of -3, zlib is cutting

Re: [HACKERS] Compression and on-disk sorting

2006-05-19 Thread Tom Lane
Jim C. Nasby [EMAIL PROTECTED] writes: On Fri, May 19, 2006 at 09:29:03AM +0200, Martijn van Oosterhout wrote: I'm seeing 250,000 blocks being cut down to 9,500 blocks. That's almost unbelievable. What's in the table? It would seem to imply that our tuple format is far more compressible than

Re: [HACKERS] Compression and on-disk sorting

2006-05-19 Thread Hannu Krosing
On a fine day, Fri, 2006-05-19 at 14:53, Tom Lane wrote: Jim C. Nasby [EMAIL PROTECTED] writes: On Fri, May 19, 2006 at 09:29:03AM +0200, Martijn van Oosterhout wrote: I'm seeing 250,000 blocks being cut down to 9,500 blocks. That's almost unbelievable. What's in the table? It would

Re: [HACKERS] Compression and on-disk sorting

2006-05-19 Thread Martijn van Oosterhout
On Fri, May 19, 2006 at 10:02:50PM +0300, Hannu Krosing wrote: It's just SELECT count(*) FROM (SELECT * FROM accounts ORDER BY bid) a; If the tape routines were actually storing visibility information, I'd expect that to be pretty compressible in this case since all the tuples were

Re: [HACKERS] Compression and on-disk sorting

2006-05-19 Thread Jim C. Nasby
On Fri, May 19, 2006 at 10:02:50PM +0300, Hannu Krosing wrote: On a fine day, Fri, 2006-05-19 at 14:53, Tom Lane wrote: Jim C. Nasby [EMAIL PROTECTED] writes: On Fri, May 19, 2006 at 09:29:03AM +0200, Martijn van Oosterhout wrote: I'm seeing 250,000 blocks being cut down to 9,500

Re: [HACKERS] Compression and on-disk sorting

2006-05-19 Thread Jim C. Nasby
On Fri, May 19, 2006 at 09:29:44PM +0200, Martijn van Oosterhout wrote: On Fri, May 19, 2006 at 10:02:50PM +0300, Hannu Krosing wrote: It's just SELECT count(*) FROM (SELECT * FROM accounts ORDER BY bid) a; If the tape routines were actually storing visibility information, I'd expect

Re: [HACKERS] Compression and on-disk sorting

2006-05-19 Thread Tom Lane
Jim C. Nasby [EMAIL PROTECTED] writes: True random data wouldn't be such a great test either; what would probably be best is a set of random words, since in real life you're unlikely to have truly random data. True random data would provide worst-case compression behavior, so we'd want to try
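The contrast between very repetitive rows and worst-case random input can be checked outside the backend. The sketch below is not the patch under discussion: it only uses Python's zlib module at a low compression level, and the row layout (three integers plus a blank 84-byte filler, loosely modelled on pgbench's accounts table) and all function names are assumptions made for illustration.

```python
import random
import struct
import zlib


def fake_account_rows(n):
    """Build n fixed-width rows: aid, bid, abalance and a blank 84-byte filler.

    Assumed layout for illustration only; this is not PostgreSQL's tuple format.
    """
    rows = []
    for aid in range(1, n + 1):
        bid = (aid - 1) // 100000 + 1
        rows.append(struct.pack("<iii84s", aid, bid, 0, b" " * 84))
    return b"".join(rows)


def pseudo_random_bytes(n, seed=0):
    """Return n pseudo-random bytes, a near worst case for a generic compressor."""
    rng = random.Random(seed)
    return bytes(rng.getrandbits(8) for _ in range(n))


def compression_ratio(data, level=1):
    """Uncompressed size divided by zlib-compressed size at the given level."""
    return len(data) / len(zlib.compress(data, level))


if __name__ == "__main__":
    repetitive = fake_account_rows(10000)
    noise = pseudo_random_bytes(len(repetitive))
    print("repetitive rows: %.1fx" % compression_ratio(repetitive))
    print("random bytes:    %.2fx" % compression_ratio(noise))
```

Rows with a constant filler should shrink dramatically while the random bytes should barely change in size, which is why a benchmark on either extreme alone says little about typical data.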

Re: [HACKERS] Compression and on-disk sorting

2006-05-19 Thread Hannu Krosing
On a fine day, Fri, 2006-05-19 at 14:57, Jim C. Nasby wrote: On Fri, May 19, 2006 at 09:29:44PM +0200, Martijn van Oosterhout wrote: On Fri, May 19, 2006 at 10:02:50PM +0300, Hannu Krosing wrote: It's just SELECT count(*) FROM (SELECT * FROM accounts ORDER BY bid) a; If

Re: [HACKERS] Compression and on-disk sorting

2006-05-18 Thread Zeugswetter Andreas DCP SD
1) Use n sort areas for n tapes making everything purely sequential access. Some time ago testing I did has shown, that iff the IO block size is large enough (256k) it does not really matter that much if the blocks are at random locations. I think that is still true for current model disks. So

Re: [HACKERS] Compression and on-disk sorting

2006-05-18 Thread Simon Riggs
On Tue, 2006-05-16 at 15:42 -0500, Jim C. Nasby wrote: On Tue, May 16, 2006 at 12:31:07PM -0500, Jim C. Nasby wrote: In any case, my curiosity is aroused, so I'm currently benchmarking pgbench on both a compressed and uncompressed $PGDATA/base. I'll also do some benchmarks with pg_tmp

Re: [HACKERS] Compression and on-disk sorting

2006-05-18 Thread Bruce Momjian
Uh, TODO already has: o %Add a GUC variable to control the tablespace for temporary objects and sort files It could start with a random tablespace from a supplied list and cycle through the list. Do we need to add to this?
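The TODO item's "start with a random tablespace from a supplied list and cycle through the list" can be sketched in a few lines. This is only an illustration of the selection policy, not backend code; the function name and the tablespace names are hypothetical.

```python
import itertools
import random


def temp_tablespace_cycle(tablespaces, rng=random):
    """Return an endless iterator over tablespaces, starting at a random position."""
    if not tablespaces:
        raise ValueError("need at least one tablespace")
    start = rng.randrange(len(tablespaces))
    # Rotate the list so iteration begins at the randomly chosen entry,
    # then repeat it forever in the original order.
    rotated = tablespaces[start:] + tablespaces[:start]
    return itertools.cycle(rotated)


if __name__ == "__main__":
    chooser = temp_tablespace_cycle(["temp_a", "temp_b", "temp_c"])
    for _ in range(5):
        print(next(chooser))
```

The random starting point spreads concurrent backends across the list, while the cycling keeps successive temporary files of one backend on different entries.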

Re: [HACKERS] Compression and on-disk sorting

2006-05-18 Thread Jim C. Nasby
On Thu, May 18, 2006 at 11:34:51AM +0100, Simon Riggs wrote: On Tue, 2006-05-16 at 15:42 -0500, Jim C. Nasby wrote: On Tue, May 16, 2006 at 12:31:07PM -0500, Jim C. Nasby wrote: In any case, my curiosity is aroused, so I'm currently benchmarking pgbench on both a compressed and

Re: [HACKERS] Compression and on-disk sorting

2006-05-18 Thread Jim C. Nasby
On Thu, May 18, 2006 at 10:57:16AM +0200, Zeugswetter Andreas DCP SD wrote: 1) Use n sort areas for n tapes making everything purely sequential access. Some time ago testing I did has shown, that iff the IO block size is large enough (256k) it does not really matter that much if the

Re: [HACKERS] Compression and on-disk sorting

2006-05-18 Thread Tom Lane
Bruce Momjian pgman@candle.pha.pa.us writes: Uh, TODO already has: o %Add a GUC variable to control the tablespace for temporary objects and sort files It could start with a random tablespace from a supplied list and cycle through the list. Do we need

Re: [HACKERS] Compression and on-disk sorting

2006-05-18 Thread Bruce Momjian
Tom Lane wrote: Bruce Momjian pgman@candle.pha.pa.us writes: Uh, TODO already has: o %Add a GUC variable to control the tablespace for temporary objects and sort files It could start with a random tablespace from a supplied list and cycle

Re: [HACKERS] Compression and on-disk sorting

2006-05-18 Thread Martijn van Oosterhout
On Thu, May 18, 2006 at 11:22:46AM -0500, Jim C. Nasby wrote: AFAIK logtape currently reads in much less than 256k blocks. Of course if you get lucky you'll read from one tape for some time before switching to another, which should have a sort-of similar effect if the drives aren't very busy

Re: [HACKERS] Compression and on-disk sorting

2006-05-18 Thread Jim C. Nasby
On Thu, May 18, 2006 at 08:32:10PM +0200, Martijn van Oosterhout wrote: On Thu, May 18, 2006 at 11:22:46AM -0500, Jim C. Nasby wrote: AFAIK logtape currently reads in much less than 256k blocks. Of course if you get lucky you'll read from one tape for some time before switching to another,

Re: [HACKERS] Compression and on-disk sorting

2006-05-18 Thread Tom Lane
Jim C. Nasby [EMAIL PROTECTED] writes: Actually, I guess the amount of memory used for zlib's lookback buffer (or whatever they call it) could be pretty substantial, and I'm not sure if there would be a way to combine that across all tapes. But there's only one active write tape at a time. My
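For a rough sense of scale, zlib's own header comments give the state size of a stream as a function of its windowBits and memLevel parameters (plus a few kilobytes of small objects that the formulas below leave out). The sketch assumes, purely for illustration, one deflate stream for the single active write tape and one inflate stream per tape being read; it does not describe how the patch allocates memory.

```python
def deflate_state_bytes(window_bits=15, mem_level=8):
    """Approximate deflate state size, per the formula in zlib's zconf.h comments."""
    return (1 << (window_bits + 2)) + (1 << (mem_level + 9))


def inflate_state_bytes(window_bits=15):
    """Approximate inflate state size: the sliding window itself."""
    return 1 << window_bits


def sort_compression_memory(read_tapes, write_tapes=1):
    """Total zlib state under the assumed one-stream-per-tape model."""
    return (write_tapes * deflate_state_bytes()
            + read_tapes * inflate_state_bytes())


if __name__ == "__main__":
    for tapes in (6, 30, 100):
        kib = sort_compression_memory(tapes) // 1024
        print("%3d read tapes + 1 write tape: about %d KiB" % (tapes, kib))
```

With default parameters a compressor costs 256 KiB and each decompressor 32 KiB, so the write side is cheap when only one tape is written at a time, and the read side grows linearly with the merge order.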

Re: [HACKERS] Compression and on-disk sorting

2006-05-18 Thread Jim C. Nasby
On Thu, May 18, 2006 at 04:55:17PM -0400, Tom Lane wrote: Jim C. Nasby [EMAIL PROTECTED] writes: Actually, I guess the amount of memory used for zlib's lookback buffer (or whatever they call it) could be pretty substantial, and I'm not sure if there would be a way to combine that across all

Re: [HACKERS] Compression and on-disk sorting

2006-05-17 Thread Albe Laurenz
Andrew Piskorski wrote: Rod Taylor wrote: Disk storage is cheap. Disk bandwidth or throughput is very expensive. Oracle has included table compression since 9iR2. They report table size reductions of 2x to 4x as typical, with proportional reductions in I/O, and supposedly, usually low to

Re: [HACKERS] Compression and on-disk sorting

2006-05-17 Thread Martijn van Oosterhout
On Wed, May 17, 2006 at 09:45:35AM +0200, Albe Laurenz wrote: Oracle's compression seems to work as follows: - At the beginning of each data block, there is a 'lookup table' containing frequently used values in table entries (of that block). - This lookup table is referenced from within the
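The per-block lookup table described above is a form of dictionary encoding. The following is a minimal sketch of that general idea, assuming rows are tuples of hashable values; it is not Oracle's actual format, and encode_block, decode_block and the "ref"/"lit" tags are hypothetical names.

```python
from collections import Counter


def encode_block(rows, min_repeats=2):
    """Replace frequently used values in one block with indexes into a lookup table.

    Values seen at least min_repeats times go into the per-block lookup table.
    Each cell becomes ("ref", index) or ("lit", value).
    """
    counts = Counter(value for row in rows for value in row)
    lookup = [value for value, n in counts.items() if n >= min_repeats]
    index_of = {value: i for i, value in enumerate(lookup)}
    encoded = [
        tuple(("ref", index_of[v]) if v in index_of else ("lit", v) for v in row)
        for row in rows
    ]
    return lookup, encoded


def decode_block(lookup, encoded):
    """Rebuild the original rows from the lookup table and the encoded cells."""
    return [
        tuple(lookup[payload] if kind == "ref" else payload for kind, payload in row)
        for row in encoded
    ]


if __name__ == "__main__":
    block = [
        ("2006-05-17", "open", 1),
        ("2006-05-17", "open", 2),
        ("2006-05-17", "closed", 3),
    ]
    lookup, encoded = encode_block(block)
    print(lookup)
    print(encoded)
    assert decode_block(lookup, encoded) == block
```

Because the table is local to the block, a block can be decoded without reading any other block, which is the property that makes the scheme compatible with random access.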

Re: [HACKERS] Compression and on-disk sorting

2006-05-17 Thread Martijn van Oosterhout
On Wed, May 17, 2006 at 12:03:15AM -0400, Tom Lane wrote: AFAICS the only sane choice here is to use src/backend/utils/adt/pg_lzcompress.c, on the grounds that (1) it's already in the backend, and (2) data compression in general is such a minefield of patents that we'd be foolish to expose

Re: [HACKERS] Compression and on-disk sorting

2006-05-17 Thread Andrew Piskorski
On Tue, May 16, 2006 at 11:48:21PM -0400, Greg Stark wrote: There are some very fast decompression algorithms: http://www.oberhumer.com/opensource/lzo/ Sure, and for some tasks in PostgreSQL perhaps it would be useful. But at least as of July 2005, a Sandor Heman, one of the MonetDB guys,

Re: [HACKERS] Compression and on-disk sorting

2006-05-17 Thread Zeugswetter Andreas DCP SD
Certainly, if you can't prototype a convincing performance win using that algorithm, it's unlikely to be worth anyone's time to look harder. That should be easily possible with LZO. It would need to be the lib that we can optionally link to (--with-lzo), since the lib is GPL. lzo even

Re: [HACKERS] Compression and on-disk sorting

2006-05-17 Thread Hannu Krosing
On a fine day, Wed, 2006-05-17 at 12:20, Zeugswetter Andreas DCP SD wrote: Certainly, if you can't prototype a convincing performance win using that algorithm, it's unlikely to be worth anyone's time to look harder. That should be easily possible with LZO. It would need to be the

Re: [HACKERS] Compression and on-disk sorting

2006-05-17 Thread Zeugswetter Andreas DCP SD
Unfortunately, the interface provided by pg_lzcompress.c is probably insufficient for this purpose. You want to be able to compress tuples as they get inserted and start a new block once the output reaches a I don't think anything that compresses single tuples without context is going to be a

Re: [HACKERS] Compression and on-disk sorting

2006-05-17 Thread Jonah H. Harris
On 5/17/06, Martijn van Oosterhout kleptog@svana.org wrote: Clever idea, pity we can't use it (what's the bet it's patented?). I'd wager anything beyond simple compression is patented by someone. Oracle's patent application 20040054858 covers the method itself including the process for storing

Re: [HACKERS] Compression and on-disk sorting

2006-05-17 Thread Tom Lane
Martijn van Oosterhout kleptog@svana.org writes: Clever idea, pity we can't use it (what's the bet it's patented?). I'd wager anything beyond simple compression is patented by someone. You're in for a rude awakening: even simple compression is anything but simple. As I said, it's a minefield

Re: [HACKERS] Compression and on-disk sorting

2006-05-17 Thread Jim C. Nasby
On Tue, May 16, 2006 at 06:48:25PM -0400, Greg Stark wrote: Martijn van Oosterhout kleptog@svana.org writes: It might be easier to switch to giving each tape its own file... I don't think it would make much difference. OTOH, if this turns out to be a win, the tuplestore could have

Re: [HACKERS] Compression and on-disk sorting

2006-05-17 Thread Jim C. Nasby
If we're going to consider table-level compression, ISTM the logical first step is to provide greater control over TOASTing; namely thresholds for when to compress and/or go to external storage that can be set on a per-field or at least per-table basis. -- Jim C. Nasby, Sr. Engineering Consultant

Re: [HACKERS] Compression and on-disk sorting

2006-05-17 Thread Jim C. Nasby
On Wed, May 17, 2006 at 10:06:04AM +0200, Martijn van Oosterhout wrote: On Wed, May 17, 2006 at 09:45:35AM +0200, Albe Laurenz wrote: Oracle's compression seems to work as follows: - At the beginning of each data block, there is a 'lookup table' containing frequently used values in table

Re: [HACKERS] Compression and on-disk sorting

2006-05-17 Thread Tom Lane
Jim C. Nasby [EMAIL PROTECTED] writes: What *might* make sense would be to provide two locations for pgsql_tmp, because a lot of operations in there involve reading and writing at the same time: Read from heap while writing tapes to pgsql_tmp read from tapes while writing final version to

Re: [HACKERS] Compression and on-disk sorting

2006-05-17 Thread Jim C. Nasby
On Wed, May 17, 2006 at 11:38:05AM -0400, Tom Lane wrote: Jim C. Nasby [EMAIL PROTECTED] writes: What *might* make sense would be to provide two locations for pgsql_tmp, because a lot of operations in there involve reading and writing at the same time: Read from heap while writing tapes

Re: [HACKERS] Compression and on-disk sorting

2006-05-17 Thread Tom Lane
Jim C. Nasby [EMAIL PROTECTED] writes: On Wed, May 17, 2006 at 11:38:05AM -0400, Tom Lane wrote: Note that a large part of the reason for the current logtape.c design is to avoid requiring 2X or more disk space to sort X amount of data. Actually, I suspect in most cases it won't matter; I

Re: [HACKERS] Compression and on-disk sorting

2006-05-17 Thread Rod Taylor
Actually, I suspect in most cases it won't matter; I don't think people make a habit of trying to sort their entire database. :) But we'd want to protect for the oddball cases... yech. I can make query result sets that are far larger than the database itself. create table

Re: [HACKERS] Compression and on-disk sorting

2006-05-17 Thread Martijn van Oosterhout
For all those people not subscribed to -patches (should appear in archive soon), I just posted a patch there implementing zlib compression for logtape.c. If people have test machines for speed-testing this sort of stuff, please have at it. You can also download it here:

Re: [HACKERS] Compression and on-disk sorting

2006-05-17 Thread Greg Stark
Jim C. Nasby [EMAIL PROTECTED] writes: Only if those spindles weren't all in a single RAID array and if we went through the trouble of creating all the machinery so you could tell PostgreSQL where all those spindles were mounted in the filesystem. I think the way you do this is simply by

Re: [HACKERS] Compression and on-disk sorting

2006-05-17 Thread Greg Stark
Andrew Piskorski [EMAIL PROTECTED] writes: Things like enums and 1 bit booleans certainly could be useful, but they cannot take advantage of duplicate values across multiple rows at all, even if 1000 rows have the exact same value in their date column and are all in the same disk block,

Re: [HACKERS] Compression and on-disk sorting

2006-05-17 Thread Tom Lane
Greg Stark [EMAIL PROTECTED] writes: The ideal way to handle the situation you're describing would be to interleave the tuples so that you have all 1000 values of the first column, followed by all 1000 values of the second column and so on. Then you run a generic algorithm on this and it
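The interleaving Greg describes (all values of the first column, then all values of the second, then a generic compressor over the result) can be tried on toy data. The sketch below assumes a fixed-width two-column row of 4-byte integers; the layout and function names are illustrative only, and whether the columnar layout actually compresses better depends on the data.

```python
import struct
import zlib

ROW_FORMAT = "<ii"  # assumed row: (id, date stored as a day number)
ROW_SIZE = struct.calcsize(ROW_FORMAT)


def row_major(rows):
    """Serialize row by row: id, date, id, date, ..."""
    return b"".join(struct.pack(ROW_FORMAT, *row) for row in rows)


def column_major(rows):
    """Serialize column by column: every id first, then every date."""
    ids = b"".join(struct.pack("<i", row[0]) for row in rows)
    dates = b"".join(struct.pack("<i", row[1]) for row in rows)
    return ids + dates


def rows_from_column_major(blob):
    """Invert column_major, showing that the interleaving loses nothing."""
    n = len(blob) // ROW_SIZE
    ids = struct.unpack("<%di" % n, blob[: 4 * n])
    dates = struct.unpack("<%di" % n, blob[4 * n:])
    return list(zip(ids, dates))


if __name__ == "__main__":
    # 1000 rows that all share the same date value, as in the quoted example.
    rows = [(i, 13285) for i in range(1000)]
    print("row-major compressed bytes:   ", len(zlib.compress(row_major(rows))))
    print("column-major compressed bytes:", len(zlib.compress(column_major(rows))))
    assert rows_from_column_major(column_major(rows)) == rows
```

The transposition is cheap and fully reversible within a block; the cost is that a single row can no longer be read without touching every column's region.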

Re: [HACKERS] Compression and on-disk sorting

2006-05-17 Thread Jim C. Nasby
On Wed, May 17, 2006 at 12:55:53PM -0400, Greg Stark wrote: Jim C. Nasby [EMAIL PROTECTED] writes: Only if those spindles weren't all in a single RAID array and if we went through the trouble of creating all the machinery so you could tell PostgreSQL where all those spindles were

Re: [HACKERS] Compression and on-disk sorting

2006-05-17 Thread Jim C. Nasby
On Wed, May 17, 2006 at 12:16:13PM -0400, Rod Taylor wrote: Actually, I suspect in most cases it won't matter; I don't think people make a habit of trying to sort their entire database. :) But we'd want to protect for the oddball cases... yech. I can make query result sets that are far

Re: [HACKERS] Compression and on-disk sorting

2006-05-17 Thread Hannu Krosing
On a fine day, Wed, 2006-05-17 at 10:01, Jim C. Nasby wrote: If we're going to consider table-level compression, ISTM the logical first step is to provide greater control over TOASTing; namely thresholds for when to compress and/or go to external storage that can be set on a per-field

Re: [HACKERS] Compression and on-disk sorting

2006-05-17 Thread Jim C. Nasby
On Wed, May 17, 2006 at 10:55:19PM +0300, Hannu Krosing wrote: On a fine day, Wed, 2006-05-17 at 10:01, Jim C. Nasby wrote: If we're going to consider table-level compression, ISTM the logical first step is to provide greater control over TOASTing; namely thresholds for when to

Re: [HACKERS] Compression and on-disk sorting

2006-05-17 Thread Greg Stark
Jim C. Nasby [EMAIL PROTECTED] writes: On Wed, May 17, 2006 at 12:55:53PM -0400, Greg Stark wrote: Jim C. Nasby [EMAIL PROTECTED] writes: Only if those spindles weren't all in a single RAID array and if we went through the trouble of creating all the machinery so you could tell

Re: [HACKERS] Compression and on-disk sorting

2006-05-17 Thread Jim C. Nasby
On Wed, May 17, 2006 at 05:44:22PM -0400, Greg Stark wrote: Jim C. Nasby [EMAIL PROTECTED] writes: On Wed, May 17, 2006 at 12:55:53PM -0400, Greg Stark wrote: Jim C. Nasby [EMAIL PROTECTED] writes: Only if those spindles weren't all in a single RAID array and if we went

Re: [HACKERS] Compression and on-disk sorting

2006-05-17 Thread Greg Stark
Jim C. Nasby [EMAIL PROTECTED] writes: Which means we need all the interface bits to be able to tell PostgreSQL where every single temp storage area is. Presumably much of the tablespace mechanism could be used for this, but it's still a bunch of work. And you can't just say I have 8

Re: [HACKERS] Compression and on-disk sorting

2006-05-16 Thread Zeugswetter Andreas DCP SD
Given that any time that happens we end up caring much less about CPU usage and much more about disk IO, for any of these cases that use non-random access, compressing the data before sending it to disk would potentially be a sizeable win. Note however that what the code thinks is a

Re: [HACKERS] Compression and on-disk sorting

2006-05-16 Thread Zeugswetter Andreas DCP SD
Personally, I believe it would be worth it - but only to a few. And most of these few are likely using Oracle. So, no gain unless you can convince them to switch back... :-) We do know that the benefit for commercial databases that use raw and file system storage is that raw

Re: [HACKERS] Compression and on-disk sorting

2006-05-16 Thread Bort, Paul
Compressed-filesystem extension (like e2compr, and I think either Fat or NTFS) can do that. Windows (NT/2000/XP) can compress individual directories and files under NTFS; new files in a compressed directory are compressed by default. So if the 'spill-to-disk' all happened in its own

Re: [HACKERS] Compression and on-disk sorting

2006-05-16 Thread Andrew Dunstan
Bort, Paul wrote: Compressed-filesystem extension (like e2compr, and I think either Fat or NTFS) can do that. Windows (NT/2000/XP) can compress individual directories and files under NTFS; new files in a compressed directory are compressed by default. So if the 'spill-to-disk' all

Re: [HACKERS] Compression and on-disk sorting

2006-05-16 Thread Andrew Dunstan
Rod Taylor wrote: I habitually turn off all compression on my Windows boxes, because it's a performance hit in my experience. Disk is cheap ... Disk storage is cheap. Disk bandwidth or throughput is very expensive. Sure, but in my experience using Windows File System compression is

Re: [HACKERS] Compression and on-disk sorting

2006-05-16 Thread Jim C. Nasby
On Tue, May 16, 2006 at 09:24:38AM +0200, Zeugswetter Andreas DCP SD wrote: Given that any time that happens we end up caring much less about CPU usage and much more about disk IO, for any of these cases that use non-random access, compressing the data before sending it to disk would

Re: [HACKERS] Compression and on-disk sorting

2006-05-16 Thread Jim C. Nasby
On Tue, May 16, 2006 at 12:27:42PM -0400, Andrew Dunstan wrote: Rod Taylor wrote: I habitually turn off all compression on my Windows boxes, because it's a performance hit in my experience. Disk is cheap ... Disk storage is cheap. Disk bandwidth or throughput is very expensive. Hey,

Re: [HACKERS] Compression and on-disk sorting

2006-05-16 Thread Jim C. Nasby
On Tue, May 16, 2006 at 12:31:07PM -0500, Jim C. Nasby wrote: In any case, my curiosity is aroused, so I'm currently benchmarking pgbench on both a compressed and uncompressed $PGDATA/base. I'll also do some benchmarks with pg_tmp compressed. Results: http://jim.nasby.net/bench.log As

Re: [HACKERS] Compression and on-disk sorting

2006-05-16 Thread Martijn van Oosterhout
On Tue, May 16, 2006 at 12:31:07PM -0500, Jim C. Nasby wrote: Does anyone have time to hack some kind of compression into the on-disk sort code just to get some benchmark numbers? Unfortunately, doing so is beyond my meager C ability... I had a look at this. At first glance it doesn't seem

Re: [HACKERS] Compression and on-disk sorting

2006-05-16 Thread Jim C. Nasby
On Tue, May 16, 2006 at 11:46:15PM +0200, Martijn van Oosterhout wrote: On Tue, May 16, 2006 at 12:31:07PM -0500, Jim C. Nasby wrote: Does anyone have time to hack some kind of compression into the on-disk sort code just to get some benchmark numbers? Unfortunately, doing so is beyond my

Re: [HACKERS] Compression and on-disk sorting

2006-05-16 Thread Martijn van Oosterhout
On Tue, May 16, 2006 at 04:50:22PM -0500, Jim C. Nasby wrote: I had a look at this. At first glance it doesn't seem too hard, except the whole logtape process kinda gets in the way. If it weren't for the mark/restore it'd be trivial. Might take a stab at it some time, if I can think of a

Re: [HACKERS] Compression and on-disk sorting

2006-05-16 Thread Tom Lane
Martijn van Oosterhout kleptog@svana.org writes: Not seek, mark/restore. As the code describes, sometimes you go back a tuple. The primary reason I think is for the final pass, a merge sort might read the tuples multiple times, so it needs to support it there. However it'd be possible to tell

Re: [HACKERS] Compression and on-disk sorting

2006-05-16 Thread Greg Stark
Martijn van Oosterhout kleptog@svana.org writes: It might be easier to switch to giving each tape its own file... I don't think it would make much difference. OTOH, if this turns out to be a win, the tuplestore could have the same optimisation. Would giving each tape its own file make it

Re: [HACKERS] Compression and on-disk sorting

2006-05-16 Thread Andrew Piskorski
On Tue, May 16, 2006 at 12:31:07PM -0500, Jim C. Nasby wrote: On Tue, May 16, 2006 at 12:27:42PM -0400, Andrew Dunstan wrote: Rod Taylor wrote: I habitually turn off all compression on my Windows boxes, because it's a performance hit in my experience. Disk is cheap ... Disk storage is

Re: [HACKERS] Compression and on-disk sorting

2006-05-16 Thread Greg Stark
Andrew Piskorski [EMAIL PROTECTED] writes: The main tricks seem to be: One, EXTREMELY lightweight compression schemes - basically table lookups designed to be as cpu friendly as possible. Two, keep the data compressed in RAM as well so that you can also cache more of the data, and indeed

Re: [HACKERS] Compression and on-disk sorting

2006-05-16 Thread Tom Lane
Greg Stark [EMAIL PROTECTED] writes: Andrew Piskorski [EMAIL PROTECTED] writes: A corollary of that is forget compression schemes like gzip - it reduces data size nicely but is far too slow on the cpu to be particularly useful in improving overall throughput rates. There are some very fast

Re: [HACKERS] Compression and on-disk sorting

2006-05-15 Thread Tom Lane
Jim C. Nasby [EMAIL PROTECTED] writes: A recent post Tom made in -bugs about how bad performance would be if we spilled after-commit triggers to disk got me thinking... There are several operations the database performs that potentially spill to disk. Given that any time that happens we end up

Re: [HACKERS] Compression and on-disk sorting

2006-05-15 Thread Jim C. Nasby
On Mon, May 15, 2006 at 02:18:03PM -0400, Tom Lane wrote: Jim C. Nasby [EMAIL PROTECTED] writes: A recent post Tom made in -bugs about how bad performance would be if we spilled after-commit triggers to disk got me thinking... There are several operations the database performs that

Re: [HACKERS] Compression and on-disk sorting

2006-05-15 Thread Andrew Dunstan
Jim C. Nasby wrote: On Mon, May 15, 2006 at 02:18:03PM -0400, Tom Lane wrote: Jim C. Nasby [EMAIL PROTECTED] writes: A recent post Tom made in -bugs about how bad performance would be if we spilled after-commit triggers to disk got me thinking... There are several operations the

Re: [HACKERS] Compression and on-disk sorting

2006-05-15 Thread Jim C. Nasby
On Mon, May 15, 2006 at 03:44:50PM -0400, Andrew Dunstan wrote: Jim C. Nasby wrote: On Mon, May 15, 2006 at 02:18:03PM -0400, Tom Lane wrote: Jim C. Nasby [EMAIL PROTECTED] writes: A recent post Tom made in -bugs about how bad performance would be if we spilled after-commit triggers

Re: [HACKERS] Compression and on-disk sorting

2006-05-15 Thread Martijn van Oosterhout
On Mon, May 15, 2006 at 03:02:07PM -0500, Jim C. Nasby wrote: The problem is that it seems like there's never enough ability to clue the OS in on what the application is trying to accomplish. For a long time we didn't have a background writer, because the OS should be able to flush things out

Re: [HACKERS] Compression and on-disk sorting

2006-05-15 Thread Jim C. Nasby
On Mon, May 15, 2006 at 10:09:47PM +0200, Martijn van Oosterhout wrote: In this case the problem is that we want to tell the OS Hey, if this stuff is actually going to go out to the spindles then compress it. And by the way, we won't be doing any random access on it, either. But AFAIK

Re: [HACKERS] Compression and on-disk sorting

2006-05-15 Thread Ron Mayer
Jim C. Nasby wrote: There's an fadvise that tells the OS to compress the data if it actually makes it to disk? Compressed-filesystem extension (like e2compr, and I think either Fat or NTFS) can do that. I think the reasons against adding this feature to postgresql are largely the same as the

Re: [HACKERS] Compression and on-disk sorting

2006-05-15 Thread Tom Lane
Ron Mayer [EMAIL PROTECTED] writes: I think the real reason Oracle and others practically re-wrote their own VM-system and filesystems is that at the time it was important for them to run under Windows98; where it was rather easy to write better filesystems than your customer's OS was bundled

Re: [HACKERS] Compression and on-disk sorting

2006-05-15 Thread Joshua D. Drake
Tom Lane wrote: Ron Mayer [EMAIL PROTECTED] writes: I think the real reason Oracle and others practically re-wrote their own VM-system and filesystems is that at the time it was important for them to run under Windows98; where it was rather easy to write better filesystems than your customer's

Re: [HACKERS] Compression and on-disk sorting

2006-05-15 Thread mark
On Mon, May 15, 2006 at 05:42:53PM -0700, Joshua D. Drake wrote: Windows98? No, those decisions predate any thought of running Oracle on Windows, probably by decades. But I think the thought process was about as above whenever they did make it; they were running on some pretty stupid OSes

Re: [HACKERS] Compression and on-disk sorting

2006-05-15 Thread Gregory Maxwell
Oh come on, Sorry to troll but this is too easy. On 5/15/06, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote: You guys have to kill your Windows hate - in jest or otherwise. It's zealous, and blinding. [snip] Why would it be assumed, that a file system designed for use from a desktop, would be

Re: [HACKERS] Compression and on-disk sorting

2006-05-15 Thread Bruce Momjian
[EMAIL PROTECTED] wrote: The real question - and I believe Tom and others have correctly harped on it in the past is - is it worth it? Until somebody actually pulls up their sleeves, invests a month or more of their life to it, and does it, we really won't know. And even then, the cost of