On 02/03/2011 04:56 PM, Greg Smith wrote:
Scott Marlowe wrote:
On Thu, Feb 3, 2011 at 8:57 AM, <gnuo...@rcn.com> wrote:

Time for my pet meme to wiggle out of its hole (next to Phil's, and a day 
later).  For PG to prosper in the future, it has to embrace the 
multi-core/processor/SSD machine at the query level.  It has to.  And


I'm pretty sure multi-core query processing is in the TODO list.  Not
sure anyone's working on it tho.  Writing a big check might help.


Work on the exciting parts people are interested in is blocked behind completely mundane 
tasks like coordinating how the multiple sessions are going to end up with a consistent 
view of the database. See "Export snapshots to other sessions" at 
http://wiki.postgresql.org/wiki/ClusterFeatures for details on that one.

Parallel query works well for accelerating CPU-bound operations that are 
executing in RAM. The reality here is that while the feature sounds important, 
these situations don't actually show up that often. There are exactly zero 
clients I deal with regularly who would be helped out by this. The ones running 
web applications whose workloads do fit into memory are more concerned about 
supporting large numbers of users, not optimizing things for a single one. And 
the ones who have so much data that single users running large reports would 
seemingly benefit from this are usually disk-bound instead.

The same sort of situation exists with SSDs. Take out the potential users whose 
data can fit in RAM instead, take out those who can't possibly get an SSD big 
enough to hold all their stuff anyway, and what's left in the middle is not 
very many people. In a database context I still haven't found anything better 
to do with a SSD than to put mid-sized indexes on them, ones a bit too large 
for RAM but not so big that only regular hard drives can hold them.

I would rather strongly disagree with the suggestion that embracing either of 
these approaches (fancy, but not nearly as useful as they appear at first) is 
critical to PostgreSQL's future. They're specialized techniques that help only 
a limited number of people.

--
Greg Smith   2ndQuadrant US   g...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support   www.2ndQuadrant.us
"PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books


4 cores is cheap and popular now, 6 in a bit, 8 next year, 16/24 cores in 5 
years.  You can do 16 cores now, but it's a bit expensive.  I figure hundreds of 
cores will be expensive in 5 years, but possible, and available.

CPUs won't get faster, but HDs and SSDs will.  For one database 
connection, running one query, to run fast, it's going to need multi-core 
support.

That's not to say we need "parallel queries", or multiple backends 
working on one query.  We need one backend, working on one query, using mostly the same 
architecture, to just use more than one core.

You'll notice I used _mostly_ and _just_, and have no knowledge of PG 
internals, so I fully expect to be wrong.

My point is, there must be levels of threading, yes?  If a backend has already 
collected the data it needs to sort, with nothing locked, what would it hurt to 
use multi-core sorting?

-- OR --

Threading (and multicore), to me, always means queues.  What if new types of backends 
were created that did "simple" things, which normal backends could distribute 
work to, then go off and do other things, and come back later to collect the results?

I thought I read a paper someplace that said multicore CPUs with shared caches 
(L1/L2/etc.) would start getting really slow at 16/32 cores, and that message passing 
was the way forward past that.  If PG started aiming for 128-core support right 
now, it should use some kind of message passing with queues, yes?

-Andy

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
