On Wed, Feb 15, 2012 at 8:29 AM, Peter Geoghegan <pe...@2ndquadrant.com> wrote:
> Cool. I agree that we should do this. It doesn't need to be justified
> as a performance optimisation - it makes sense to refactor in this
> way. If that makes things faster, then so much the better.
Well, maybe so, but I think if the performance had been the same I
might not have messed with it.

> I am inclined to agree that given that we already use Perl to generate
> source code like this, it seems natural that we should prefer to do
> that, if only to avoid paranoia about the inclusion of a dial-a-bloat
> knob. I am at a disadvantage here, since I've never written a line of
> Perl.

I think it's still dial-a-bloat, but I feel pretty comfortable about
how we've got that knob adjusted in this version. It delivers almost
as much improvement as any previous version, it applies to more cases,
and its code footprint is the smallest of any version I've measured.

> Hmm. I'll run some tests myself when I get a chance, and attempt to
> determine what's going on here. I don't have immediate access to a
> good benchmarking server, which would be preferable, and I'd like to
> post a revision of pg_stat_statements normalisation today, as it's
> been too long. Nice work though.

Thanks.

>> It strikes me that if we wanted to take this further, we could look at
>> squeezing out ApplySortComparator. For example, suppose that, upon
>> discovering that we can do an in-memory quicksort on a single sort
>> key, we make an initial pass over the data where we check whether it's
>> sorted and, as we go, swap all the entries with isnull1 = true to the
>> end of the memtuples array. We then sort the isnull1 = true entries
>> with the standard comparator, and the isnull1 = false entries with an
>> optimized comparator that elides most of ApplySortComparator and
>> instead just calls the comparison function directly. We then decide
>> on whether to emit the isnull1 = true entries first or last based on
>> NULLS FIRST/LAST, and decide whether to emit the remaining entries in
>> forward or reverse order based on ASC/DESC. Or maybe not exactly that
>> thing, but something like that, so that we sort the null and non-null
>> entries separately.
>> The additional complexity in the read-out logic
>> would probably be more than offset by being able to use a simpler
>> comparator.
>
> While it's really easy to be wrong about these things, I'd venture to
> guess that branch prediction makes this a less than compelling win in
> practice. That said, if you want to know how good a win it might be,
> you need only knock out a quick-and-dirty prototype and see how fast a
> sort is - however well this does will be pretty close to your best
> case, but not quite, since presumably such a prototype wouldn't worry
> about the applicability of the optimisation.

You might be right. But note that the swapcode improvements were also
vulnerable to branch prediction, though there could be other effects
there as well. One effect that should perhaps be considered is the
cost of saving and restoring registers. Even if the branch itself
doesn't cost much because we know what the result will be, the
operands still have to be loaded into registers, which isn't free in
itself and also potentially forces those registers to be saved and
restored across function calls.

Still, I'm inclined not to poke at this right now: there are a lot of
patches that still need to be looked at, and at the rate we're
currently disposing of patches this CommitFest will still be ongoing
come Easter.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (firstname.lastname@example.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers