On 02/03/2011 04:56 PM, Greg Smith wrote:
Scott Marlowe wrote:
On Thu, Feb 3, 2011 at 8:57 AM, <gnuo...@rcn.com> wrote:
Time for my pet meme to wiggle out of its hole (next to Phil's, and a day
later). For PG to prosper in the future, it has to embrace the
multi-core/processor/SSD machine at the query level. It has to. And
I'm pretty sure multi-core query processing is in the TODO list. Not
sure anyone's working on it tho. Writing a big check might help.
Work on the exciting parts people are interested in is blocked behind completely mundane
tasks like coordinating how the multiple sessions are going to end up with a consistent
view of the database. See "Export snapshots to other sessions" at
http://wiki.postgresql.org/wiki/ClusterFeatures for details on that one.
Parallel query works well for accelerating CPU-bound operations that are
executing in RAM. The reality here is that while the feature sounds important,
these situations don't actually show up that often. There are exactly zero
clients I deal with regularly who would be helped out by this. The ones running
web applications whose workloads do fit into memory are more concerned about
supporting large numbers of users, not optimizing things for a single one. And
the ones who have so much data that single users running large reports would
seemingly benefit from this are usually disk-bound instead.
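A rough way to put numbers on the disk-bound case is Amdahl's law. That framing is an assumption added here, not something from the original post, and the 20% CPU share below is a made-up illustrative figure rather than a measurement. The sketch only shows how little extra cores help when most of the elapsed time is spent waiting on disk:

```python
def amdahl_speedup(cpu_fraction, cores):
    """Overall speedup when only cpu_fraction of the runtime scales with cores.

    cpu_fraction is the (hypothetical) share of query time spent on CPU work
    that could be spread across cores; the remainder, for example waiting on
    disk, is assumed not to get any faster.
    """
    return 1.0 / ((1.0 - cpu_fraction) + cpu_fraction / cores)


# Hypothetical report query: 20% CPU work, 80% waiting on disk.
for cores in (1, 4, 16):
    print(cores, "cores ->", round(amdahl_speedup(0.2, cores), 2), "x")
```

With those assumed numbers the speedup can never exceed 1.25x no matter how many cores are added, while a fully CPU-bound, in-memory query (cpu_fraction = 1.0) would scale with the core count.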
The same sort of situation exists with SSDs. Take out the potential users whose
data can fit in RAM instead, take out those who can't possibly get an SSD big
enough to hold all their stuff anyway, and what's left in the middle is not
very many people. In a database context I still haven't found anything better
to do with an SSD than to put mid-sized indexes on it, ones a bit too large
for RAM but not so big that only regular hard drives can hold them.
I would rather strongly disagree with the suggestion that embracing either of
these approaches, which are fancy but not really as functional as they first
appear, is critical to PostgreSQL's future. They're specialized techniques
useful to only a limited number of people.
--
Greg Smith 2ndQuadrant usg...@2ndquadrant.com Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us
"PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books
4 cores are cheap and popular now, 6 in a bit, 8 next year, 16/24 cores in 5
years. You can do 16 cores now, but it's a bit expensive. I figure hundreds of
cores will be expensive in 5 years, but possible, and available.
CPUs won't get faster, but HDs and SSDs will. To have one database
connection, which runs one query, run fast, it's going to need multi-core
support.
That's not to say we need "parallel queries", or that we need multiple backends
to work on one query. We need one backend, working on one query, using mostly
the same architecture, to just use more than one core.
You'll notice I used _mostly_ and _just_, and have no knowledge of PG
internals, so I fully expect to be wrong.
My point is, there must be levels of threading, yes? If a backend has data to
sort, has it collected, nothing locked, what would it hurt to use multi-core
sorting?
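To make the multi-core sorting idea concrete, here is a minimal sketch in Python, not PostgreSQL code. It assumes the rows are already collected in memory as a plain list and that nothing else touches them during the sort. The names parallel_sort, EXECUTORS and kind are made up for illustration. It sorts slices concurrently and then does a k-way merge:

```python
import heapq
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor

# "process" uses separate OS processes (and so can use several cores);
# "thread" is handy for quick experiments but does not give CPU parallelism
# on standard CPython builds.
EXECUTORS = {"process": ProcessPoolExecutor, "thread": ThreadPoolExecutor}


def parallel_sort(rows, workers=4, kind="process"):
    """Sort a list by sorting up to `workers` slices concurrently, then merging."""
    if len(rows) < 2 or workers < 2:
        return sorted(rows)
    # Split the already-collected data into roughly equal slices.
    size = -(-len(rows) // workers)  # ceiling division
    chunks = [rows[i:i + size] for i in range(0, len(rows), size)]
    # Each slice is sorted independently; no shared state, so no locking.
    with EXECUTORS[kind](max_workers=workers) as pool:
        sorted_chunks = list(pool.map(sorted, chunks))
    # A k-way merge of the sorted runs produces the final order.
    return list(heapq.merge(*sorted_chunks))
```

When using kind="process" from a script, call it under an `if __name__ == "__main__":` guard. Note that the slices have to be copied to the worker processes and back, so whether this actually beats a single-core sort depends on data size and is not shown here.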
-- OR --
Threading (and multicore), to me, always means queues. What if new types of backends
were created that did "simple" things, that normal backends could distribute
work to, then go off and do other things, and come back to collect the results?
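The queue idea can be sketched roughly like this, again in Python rather than anything resembling real PG internals. The names worker and run_with_helpers are hypothetical, and threads stand in for the "simple" helper backends. On standard CPython, threads will not spread CPU work across cores, so this only illustrates the message flow: tasks go out on one queue, results come back on another, and the sender is free in between.

```python
import queue
import threading


def worker(tasks, results):
    """Pull (task_id, func, arg) messages until a None sentinel arrives."""
    while True:
        msg = tasks.get()
        if msg is None:
            break
        task_id, func, arg = msg
        results.put((task_id, func(arg)))


def run_with_helpers(jobs, helpers=2):
    """Send jobs, a list of (func, arg) pairs, to helpers; return results in job order."""
    tasks, results = queue.Queue(), queue.Queue()
    pool = [threading.Thread(target=worker, args=(tasks, results))
            for _ in range(helpers)]
    for t in pool:
        t.start()
    for task_id, (func, arg) in enumerate(jobs):
        tasks.put((task_id, func, arg))
    # The "normal backend" could go off and do other work at this point.
    for _ in pool:
        tasks.put(None)  # one shutdown sentinel per helper
    for t in pool:
        t.join()
    # Come back and collect; results may arrive out of order, so key by id.
    collected = dict(results.get() for _ in jobs)
    return [collected[i] for i in range(len(jobs))]
```

For example, run_with_helpers([(sum, [1, 2, 3]), (len, "abcd")]) hands two small jobs to the helpers and returns their results in submission order. There is no error handling here: a job that raises would leave the collector waiting forever, which a real design would have to deal with.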
I thought I read a paper someplace that said shared cache (L1/L2/etc) multicore
CPUs would start getting really slow at 16/32 cores, and that message passing
was the way forward past that. If PG started aiming for 128 core support right
now, it should use some kinda message passing with queues thing, yes?
-Andy
--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)