On Wed, Oct 15, 2014 at 2:55 PM, Simon Riggs <si...@2ndquadrant.com> wrote:
> Something usable, with severe restrictions, is actually better than we
> have now. I understand the journey this work represents, so don't be
> embarrassed by submitting things with heuristics and good-enoughs in
> it. Our mentor, Mr.Lane, achieved much by spreading work over many
> releases, leaving others to join in the task.

It occurs to me that, now that the custom-scan stuff is committed, it wouldn't be that hard to use that, plus the other infrastructure we already have, to write a prototype of parallel sequential scan. Given where we are with the infrastructure, there would be a number of unhandled problems, such as deadlock detection (needs group locking or similar), assessment of quals as to parallel-safety (needs proisparallel or similar), general waterproofing to make sure that pushing down a qual we shouldn't doesn't do anything really dastardly like crash the server (another written but yet-to-be-published patch adds a bunch of relevant guards), and snapshot sharing (likewise). But if you don't do anything weird, it should basically work.

I think this would be useful for a couple of reasons. First, it would be a demonstrable show of progress, illustrating how close we are to actually having something you can really deploy. Second, we could use it to demonstrate how the remaining infrastructure patches close up gaps in the initial prototype. Third, it would let us start doing real performance testing.

It seems pretty clear that a parallel sequential scan of data that's in memory (whether the page cache or the OS cache) can be accelerated by having multiple processes scan it in parallel. But it's much less clear what will happen when the data is being read in from disk. Does parallelism help at all? What degree of parallelism helps? Do we break OS readahead so badly that performance actually regresses? These are things that are likely to need a fair amount of tuning before this is ready for prime time, so being able to start experimenting with them in advance of all of the infrastructure being completely ready seems like it might help.

Thoughts?
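
For anyone who wants to reason about the access-pattern question before a prototype exists, here is a toy model of one possible block-handout scheme. It is only a sketch under loud assumptions: it is not PostgreSQL code, the names (parallel_seq_scan, blocks, qual, nworkers) are made up for illustration, and it uses Python threads where a real implementation would use separate backend processes and shared memory, so it says nothing about actual speedup. What it does show is that when workers claim the next unscanned block from a shared cursor, each worker ends up reading a non-contiguous subset of the relation, which is the sort of pattern that might confuse OS readahead.

```python
import threading


def parallel_seq_scan(blocks, qual, nworkers):
    """Toy model: scan `blocks` with `nworkers` threads.

    Returns (matching_rows, claimed), where claimed[i] lists the block
    numbers worker i read, in the order it read them.
    All names here are hypothetical; this is not a PostgreSQL API.
    """
    next_block = 0                      # shared cursor: next unclaimed block
    cursor_lock = threading.Lock()      # protects next_block
    matches = [[] for _ in range(nworkers)]   # per-worker results
    claimed = [[] for _ in range(nworkers)]   # per-worker access pattern

    def worker(wid):
        nonlocal next_block
        while True:
            # Claim exactly one block under the lock, then scan it unlocked.
            with cursor_lock:
                blkno = next_block
                next_block += 1
            if blkno >= len(blocks):
                return                  # relation exhausted
            claimed[wid].append(blkno)
            # Apply the qual to every row in the claimed block.
            matches[wid].extend(row for row in blocks[blkno] if qual(row))

    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(nworkers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    rows = [row for per_worker in matches for row in per_worker]
    return rows, claimed
```

Every block is claimed by exactly one worker, so the combined result matches a serial scan apart from row order. Printing claimed for a given run shows the gaps in each worker's block sequence; which blocks land on which worker varies from run to run.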

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company