On Fri, Jun 24, 2016 at 4:29 PM, Andres Freund <and...@anarazel.de> wrote: > As a motivation, here's a somewhat juicy example of the benefits that > can be gained (disabled parallelism, results vary too much): > SELECT l_linestatus, SUM(l_quantity) FROM lineitem GROUP BY 1 ORDER BY 2 DESC > LIMIT 10 ; > non-optimized: 9075.835 ms > optimized: 5194.024 ms > > and that's far from doing all the possible work for that query; > especially the batching is far from optimal.
That's pretty great. Let me first just say that I think that your broadly have the right idea here, and that it will likely be a big area to work on in the years ahead. This may become a big, general area with many sub-projects, kind of like parallelism. Also, I think it's very much complementary to parallelism. > So far I've mostly looked at performance for tpc-h, not because it's a > particularly good workload, but because it's available. If somebody has > good suggestions for an analytics heavy workload... I think that it's the only thing available that isn't a pain to setup. > My observations about the performance bottlenecks I found, and partially > addressed, are in rough order of importance (there's interdependence > between most of them): > > (List of various bottlenecks) > I'm planning to start individual subthreads for some of these, with > in-detail discussions and/or patches. But I wanted to present this on a > higher level once, before spending even more time. Good call. > Have I missed concrete issues others noticed? Missed significant > improvements (please, please, only realistic stuff)? I'll add one: We should have a way to make routines like heap_copytuple() "allocate" into the caller's own batch memory. We may be able to come up with a reasonably generic abstraction that minimizes the special case handling of things like exhausting caller's batch allocation. Failure to do something like this wastes an awful lot of memory, but more importantly causes a lot of unnecessary cache misses in tuplestore.c, and even with the 9.6 work in tuplesort.c. Someone should also work on palloc() fragmentation, but I tend to think that that's a less efficient way of improving matters. I'd leave that until later. Having an excessive number of retail palloc() calls for cases where the caller really does process tuples in a batch fashion is inherently inefficient. I'm all for the simplicity of the memory context abstraction, but ISTM that a little specialization gets you very far. > Unfortunately these items are heavily interdependent. For some queries > you'll need issue n) addressed before n+2) shows a benefit, for some > it's the other way, and such. Given the size of the overall task, the > only way I can see this being reasonably addressable, is trying to get > smaller steps in individually, even if not a huge win on their own. And > then focus on the the larger architectural changes (primarily 3)) > issues. I could not agree more. This should be strongly emphasized. I have parallel CREATE INDEX working pretty well now. I don't think I could have done that without the 9.6 work on external sorting's cache efficiency, so in some sense that earlier work has enabled parallelism in a non-obvious way. I think that abbreviated keys will prove really important for getting the best out of parallel sort (and that a cost model will have to consider stuff like leading attribute cardinality -- in other words, it must consider CPU cache efficiency). I also suspect that solving the problems with heap_deform_tuple() first would make my pending parallel CREATE INDEX patch look a lot better relative to a serial CREATE INDEX. That's the kind of intuitive speculation that's really hard to prove. It may be that it doesn't matter there, because working on fixing that yields obvious benefits. But, as you suggest, we should consider that that might not always be true. That's really tricky, and has historically been the kind of thing we've managed very badly. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (email@example.com) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers