"Tom Lane" <[EMAIL PROTECTED]> writes: > This whole structure seems a bit broken, independently of whether the > particular parameter values are good. If the compressor is given an > input of 1000000 bytes and manages to compress it to 999999 bytes, > we'll store it compressed, and pay for decompression cycles on every > access, even though the I/O savings are nonexistent. That's not sane.

Especially given that uncompressed toasted data is quite a bit more flexible, in that it can handle substr() efficiently.

Thinking about it, if the datum is stored inline then even a single byte saved is at least theoretically helpful. If it's stored in a toast table then saving anything less than 2k has pretty slim odds of being helpful at all, even if the percentage gain is pretty big.

I don't know what the right answer is yet, but it looks to me like there do need to be two strategies, one for inline toasted tuples and one for externally toasted tuples.

Unfortunately that's not the way the toaster is structured. First it goes through and compresses all the fields, starting with the largest, and then it starts pushing fields out to external storage, starting with the largest remaining. It doesn't really know whether something is going to be stored externally at the time it's compressing it.

It seems to me that having a fairly high minimum savings of 25% would get pretty close to the intended behaviour. Small data which happens to be highly compressible would only have to save 8-32 bytes to be compressed. Data over 8k would have to save at least 2k to be compressed.

(Incidentally, this means what I said earlier about uselessly trying to compress objects below 256 bytes is even grosser than I realized. If you have a single large object which even after compression will be over the toast target, it forces *every* varlena in the tuple to be considered for compression, even though they mostly can't be compressed. Considering a varlena smaller than 256 bytes for compression only costs a useless palloc, so it's not the end of the world, but still. It does seem kind of strange that a tuple which otherwise wouldn't be toasted at all suddenly gets all its fields compressed if you add one more field which ends up being stored externally.)

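
To put numbers on the 25% idea, here is a rough sketch of the check I have in mind. It is plain Python for illustration, not the actual toaster or compressor code, and the names min_savings and worth_storing_compressed are made up for this example.

```python
# Illustrative only: hypothetical names, not PostgreSQL's real compression code.
# Integer percentages are used so the thresholds are exact.

def min_savings(raw_size: int, min_rate_pct: int = 25) -> int:
    """Bytes that compression must save before the result is worth keeping."""
    return raw_size * min_rate_pct // 100


def worth_storing_compressed(raw_size: int, compressed_size: int,
                             min_rate_pct: int = 25) -> bool:
    """True if the compressed form saves at least min_rate_pct of the input."""
    return raw_size - compressed_size >= min_savings(raw_size, min_rate_pct)


if __name__ == "__main__":
    # Required savings for a few input sizes at the 25% threshold.
    for raw in (32, 128, 8192, 1_000_000):
        print(raw, "bytes must save at least", min_savings(raw), "bytes")
```

With this rule a 32-128 byte datum needs to save 8-32 bytes, an 8k datum needs to save 2k, and the 1000000 -> 999999 case above is rejected outright. It still doesn't distinguish inline from external storage, which is the part I don't have an answer for.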
--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com