Jim,

> http://jim.nasby.net/misc/compress_sort.txt is preliminary results.
> I've run into a slight problem in that even at a compression
> level of -3, zlib is cutting the on-disk size of sorts by
> 25x. So my pgbench sort test with scale=150 that was
> producing a 2G on-disk sort is now producing a 80M sort,
> which obviously fits in memory. And cuts sort times by more than half.

When you're ready, we can test this on some other interesting cases and on fast hardware.

BTW - external sorting is *still* 4x slower than a popular commercial DBMS (PCDB) on real workloads when full rows are used in queries. The final results we had after the last round of sort improvements were limited to cases where only the sort column was used in the query. For that case, the improved external sort code was as fast as PCDB, provided plenty of work_mem was available. But when the whole contents of the row are consumed (as with TPC-H and in many real-world cases), performance is still far slower.

So, compression of the tuples may be just what we're looking for.

- Luke
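
For anyone who wants a rough feel for why full-row sort data can compress so well, here is a minimal Python sketch. It is not the patch under discussion and says nothing about PostgreSQL's tape format: the row layout (three 4-byte integers plus an 84-byte blank filler, loosely modelled on a pgbench accounts row), the helper names make_rows and compression_ratio, and the reading of "-3" as zlib level 3 are all assumptions made for illustration.

```python
import random
import struct
import zlib


def make_rows(n, filler_width=84, seed=0):
    """Build n synthetic fixed-width rows (hypothetical layout, not the real tuple format)."""
    rng = random.Random(seed)
    rows = []
    for aid in range(n):
        # Three little-endian int32 columns: id, branch id, balance (assumed layout).
        header = struct.pack("<iii", aid, rng.randrange(150), 0)
        # Blank-padded filler column, the part that compresses extremely well.
        rows.append(header + b" " * filler_width)
    return rows


def compression_ratio(rows, level=3):
    """Return raw size divided by zlib-compressed size for the concatenated rows."""
    raw = b"".join(rows)
    packed = zlib.compress(raw, level)
    return len(raw) / len(packed)


if __name__ == "__main__":
    rows = make_rows(100_000)
    # Shuffle so the input resembles unsorted run data rather than sequential ids.
    random.Random(1).shuffle(rows)
    print(f"full rows: {compression_ratio(rows):.1f}x smaller at zlib level 3")
```

The ratio this prints depends entirely on the synthetic data, so treat it as an indication of the effect rather than a reproduction of the numbers above. Rows with little padding or high-entropy columns would compress far less.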