On Wed, 30 May 2012, Jeff Janes wrote:

But the question now is whether there is a *PG* problem here or not, or is
it Intel's or Linux's problem? Because the slowdown was still caused by
locking. If there were no locking, there wouldn't be any problems (as
demonstrated a while ago by just cat'ting the files in multiple threads).

You cannot have a traditional RDBMS without locking.  From your

I understand the need for significant locking when there are concurrent writes, but not when there are only reads. But I'm not an RDBMS expert, so maybe that's a misunderstanding on my side.

description of the problem,  I probably wouldn't be using a traditional
database system at all for this, but rather flat files and Perl.

Flat files and Perl for 25-50 TB of data over a few years is a bit extreme ;)

Or at least, I would partition the data before loading it to the DB,
rather than trying to do it after.

I intentionally did otherwise, because I thought that PG would be much smarter than me at juggling the data I'm ingesting (~ tens of gigabytes each day), joining the appropriate bits of data, and then splitting the result by partitions. Unfortunately, I see that there are some scalability issues along the way, which I didn't expect. Those aren't fatal, but slightly disappointing.

But anyway, is idt_match a fairly static table?  If so, I'd partition
that into 16 tables, and then have each one of your tasks join against
a different one of those tables.  That should relieve the contention
on the index root block, and might have some other benefits as well.

No, idt_match is getting filled by a multi-threaded copy() and then joined with 4 other big tables like idt_phot. The result is then split into partitions. I was trying different approaches to fully utilize the CPUs and/or I/O and somehow parallelize the queries. That's the reason for the somewhat contrived queries in my test.

Cheers,
        S

*****************************************************
Sergey E. Koposov, PhD, Research Associate
Institute of Astronomy, University of Cambridge
Madingley road, CB3 0HA, Cambridge, UK
Tel: +44-1223-337-551 Web: http://www.ast.cam.ac.uk/~koposov/
