On Wed, 30 May 2012, Jeff Janes wrote:
>> But the question now is whether there is a *PG* problem here or not, or is
>> it Intel's or Linux's problem? Because the slowdown was still caused by
>> locking. If there were no locking, there wouldn't be any problems (as
>> demonstrated a while ago by just cat'ting the files in multiple threads).
> You cannot have a traditional RDBMS without locking. From your
I understand the need for significant locking when there are concurrent
writes, but not when there are only reads. But I'm not an RDBMS expert, so
maybe that's a misunderstanding on my side.
> description of the problem, I probably wouldn't be using a traditional
> database system at all for this, but rather flat files and Perl.
Flat files and Perl for 25-50 TB of data over a few years is a bit extreme
;)
> Or at least, I would partition the data before loading it to the DB,
> rather than trying to do it after.
I intentionally did otherwise, because I thought that PG would be much
smarter than me at juggling the data I'm ingesting (~ tens of GB each
day), joining the appropriate bits of data and then splitting the result
into partitions. Unfortunately I see that there are some scalability
issues on the way, which I didn't expect. Those aren't fatal, but slightly
disappointing.
> But anyway, is idt_match a fairly static table? If so, I'd partition
> that into 16 tables, and then have each one of your tasks join against
> a different one of those tables. That should relieve the contention
> on the index root block, and might have some other benefits as well.
No, idt_match is getting filled by multi-threaded copy() and then joined
with 4 other big tables like idt_phot. The result is then split into
partitions. And I was trying different approaches to fully utilize the
CPUs and/or I/O and somehow parallelize the queries. That's the
reasoning behind the somewhat contrived queries in my test.
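
For illustration only, the partition-then-join idea discussed above can be
sketched outside the database. This is a toy in Python, not PG code: the
16-way split comes from the suggestion quoted above, while the column names
(objid, ra, mag), the modulo partitioning rule and the sample rows are all
made up for the example. Each worker joins one pair of buckets and shares
no lookup structure with the others, which is the point of the scheme.

```python
from concurrent.futures import ThreadPoolExecutor

NPART = 16  # number of partitions, taken from the suggestion above


def partition(rows, key, npart=NPART):
    """Split a list of dict rows into npart buckets by key modulo npart."""
    buckets = [[] for _ in range(npart)]
    for row in rows:
        buckets[row[key] % npart].append(row)
    return buckets


def join_bucket(match_rows, phot_rows, key):
    """Hash-join one bucket pair using a lookup private to this task."""
    lookup = {}
    for m in match_rows:
        lookup.setdefault(m[key], []).append(m)
    return [{**m, **p} for p in phot_rows for m in lookup.get(p[key], [])]


def parallel_join(match, phot, key, npart=NPART):
    """Partition both inputs the same way, then join bucket i with bucket i."""
    match_buckets = partition(match, key, npart)
    phot_buckets = partition(phot, key, npart)
    with ThreadPoolExecutor(max_workers=npart) as pool:
        parts = pool.map(join_bucket, match_buckets, phot_buckets, [key] * npart)
        return [row for part in parts for row in part]


# Hypothetical stand-ins for idt_match and idt_phot.
idt_match = [{"objid": i, "ra": i * 0.1} for i in range(100)]
idt_phot = [{"objid": i, "mag": 20.0 - i * 0.01} for i in range(0, 100, 2)]
joined = parallel_join(idt_match, idt_phot, "objid")
```

Python threads will not give real CPU parallelism here; the sketch only
shows the data layout, where every task touches its own bucket and nothing
else.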
Cheers,
S
*****************************************************
Sergey E. Koposov, PhD, Research Associate
Institute of Astronomy, University of Cambridge
Madingley road, CB3 0HA, Cambridge, UK
Tel: +44-1223-337-551 Web: http://www.ast.cam.ac.uk/~koposov/
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)