Hi,
On Mon, 4 Jun 2012, Ants Aasma wrote:
On Mon, Jun 4, 2012 at 7:44 PM, Merlin Moncure mmonc...@gmail.com wrote:
I tried to keep it simple at first to find an answer to the question
if it's even worth trying before expending large effort on it. If
anyone with a multisocket machine would chip
On Wed, Jun 6, 2012 at 2:27 PM, Sergey Koposov kopo...@ast.cam.ac.uk wrote:
I've quickly tested your lockfree-getbuffer.patch patch with the test case
you provided and I barely see any improvement (2% at max)
https://docs.google.com/open?id=0B7koR68V2nM1QVBxWGpZdW4wd0U
tested with 24 core (48
On Wed, 6 Jun 2012, Ants Aasma wrote:
On Wed, Jun 6, 2012 at 2:27 PM, Sergey Koposov kopo...@ast.cam.ac.uk wrote:
I've quickly tested your lockfree-getbuffer.patch patch with the test case
you provided and I barely see any improvement (2% at max)
On Wed, Jun 6, 2012 at 2:53 PM, Sergey Koposov kopo...@ast.cam.ac.uk wrote:
On Wed, 6 Jun 2012, Ants Aasma wrote:
On Wed, Jun 6, 2012 at 2:27 PM, Sergey Koposov kopo...@ast.cam.ac.uk
wrote:
I've quickly tested your lockfree-getbuffer.patch patch with the test
case
you provided and I barely
On Wed, 6 Jun 2012, Merlin Moncure wrote:
I think this is the expected result. In the single user case the
spinlock never spins and only has to make the cpu-locking cache
instructions once. Can we see results @24 threads?
Here
https://docs.google.com/open?id=0B7koR68V2nM1NDJHLUhNSS0zbUk
On Fri, Jun 1, 2012 at 9:55 PM, Ants Aasma a...@cybertec.at wrote:
On Sat, Jun 2, 2012 at 1:48 AM, Merlin Moncure mmonc...@gmail.com wrote:
Buffer pins aren't a cache: with a cache you are trying to mask a slow
operation (like a disk i/o) with a faster one so that the number of slow
operations
On Mon, Jun 4, 2012 at 5:12 PM, Robert Haas robertmh...@gmail.com wrote:
Not sure about the rest of this patch, but this part is definitely bogus:
+#if !defined(pg_atomic_fetch_and_set)
+#define pg_atomic_fetch_and_set(dst, src, value) \
+ do { S_LOCK(dummy_spinlock); \
+ dst =
On Mon, Jun 4, 2012 at 9:20 AM, Ants Aasma a...@cybertec.at wrote:
On Mon, Jun 4, 2012 at 5:12 PM, Robert Haas robertmh...@gmail.com wrote:
Not sure about the rest of this patch, but this part is definitely bogus:
+#if !defined(pg_atomic_fetch_and_set)
+#define pg_atomic_fetch_and_set(dst,
On Mon, Jun 4, 2012 at 10:17 AM, Merlin Moncure mmonc...@gmail.com wrote:
What happens (in the very unlikely, but possible case?) if another
backend races to the buffer you've pointed at with 'victim'? It looks
like multiple backends share the clock sweep now, but don't you need
to need an
On Mon, Jun 4, 2012 at 6:38 PM, Merlin Moncure mmonc...@gmail.com wrote:
On Mon, Jun 4, 2012 at 10:17 AM, Merlin Moncure mmonc...@gmail.com wrote:
What happens (in the very unlikely, but possible case?) if another
backend races to the buffer you've pointed at with 'victim'? It looks
like
On Mon, Jun 4, 2012 at 10:42 AM, Ants Aasma a...@cybertec.at wrote:
On Mon, Jun 4, 2012 at 6:38 PM, Merlin Moncure mmonc...@gmail.com wrote:
On Mon, Jun 4, 2012 at 10:17 AM, Merlin Moncure mmonc...@gmail.com wrote:
What happens (in the very unlikely, but possible case?) if another
backend
On Thu, 31 May 2012, Jeff Janes wrote:
No, idt_match is getting filled by multi-threaded copy() and then joined
with 4 other big tables like idt_phot. The result is then split into
partitions.
That does make things more complicated. But could you partition
it at that level and then do
On May31, 2012, at 20:50 , Robert Haas wrote:
Suppose we introduce two new buffer flags,
BUF_NAILED and BUF_NAIL_REMOVAL. When we detect excessive contention
on the buffer header spinlock, we set BUF_NAILED. Once we do that,
the buffer can't be evicted until that flag is removed, and
On Fri, Jun 1, 2012 at 7:47 AM, Florian Pflug f...@phlo.org wrote:
On May31, 2012, at 20:50 , Robert Haas wrote:
Suppose we introduce two new buffer flags,
BUF_NAILED and BUF_NAIL_REMOVAL. When we detect excessive contention
on the buffer header spinlock, we set BUF_NAILED. Once we do that,
Merlin Moncure mmonc...@gmail.com writes:
A potential issue with this line of thinking is that your pin delay
queue could get highly pressured by outer portions of the query (as in
the OP's case) that will get little or no benefit from the delayed
pin. But choosing a sufficiently sized drain
On Jun1, 2012, at 15:45 , Tom Lane wrote:
Merlin Moncure mmonc...@gmail.com writes:
A potential issue with this line of thinking is that your pin delay
queue could get highly pressured by outer portions of the query (as in
the OP's case) that will get little or no benefit from the delayed
On Fri, Jun 1, 2012 at 8:45 AM, Tom Lane t...@sss.pgh.pa.us wrote:
Merlin Moncure mmonc...@gmail.com writes:
A potential issue with this line of thinking is that your pin delay
queue could get highly pressured by outer portions of the query (as in
the OP's case) that will get little or no
On Fri, Jun 1, 2012 at 5:57 PM, Florian Pflug f...@phlo.org wrote:
My proposed algorithm could be made to use exactly that criterion
by tracking a little bit more state. We'd have to tag queue entries
with a flag indicating whether they are
Unpinned (COLD)
Pinned, and unpinning should be
On Fri, Jun 1, 2012 at 8:47 AM, Florian Pflug f...@phlo.org wrote:
A simpler idea would be to collapse UnpinBuffer() / PinBuffer() pairs
by queing UnpinBuffer() requests for a while before actually updating
shared state.
So, what happens when somebody wants a cleanup lock on the buffer
you've
On Fri, Jun 1, 2012 at 10:51 AM, Robert Haas robertmh...@gmail.com wrote:
On Fri, Jun 1, 2012 at 8:47 AM, Florian Pflug f...@phlo.org wrote:
We'd drain the unpin queue whenever we don't expect a PinBuffer() request
to happen for a while. Returning to the main loop is an obvious such place,
On Jun1, 2012, at 19:51 , Robert Haas wrote:
On Fri, Jun 1, 2012 at 8:47 AM, Florian Pflug f...@phlo.org wrote:
A simpler idea would be to collapse UnpinBuffer() / PinBuffer() pairs
by queing UnpinBuffer() requests for a while before actually updating
shared state.
So, what happens when
On Fri, Jun 1, 2012 at 2:06 PM, Jeff Janes jeff.ja...@gmail.com wrote:
On Fri, Jun 1, 2012 at 10:51 AM, Robert Haas robertmh...@gmail.com wrote:
On Fri, Jun 1, 2012 at 8:47 AM, Florian Pflug f...@phlo.org wrote:
We'd drain the unpin queue whenever we don't expect a PinBuffer() request
to
On Fri, Jun 1, 2012 at 2:54 PM, Florian Pflug f...@phlo.org wrote:
On Jun1, 2012, at 19:51 , Robert Haas wrote:
On Fri, Jun 1, 2012 at 8:47 AM, Florian Pflug f...@phlo.org wrote:
A simpler idea would be to collapse UnpinBuffer() / PinBuffer() pairs
by queing UnpinBuffer() requests for a while
Robert Haas robertmh...@gmail.com writes:
Another thought is that if the problem is limited to the root index
block, then we could consider less general solutions, like allowing
backends to cache the root index block and sending some kind of
invalidation when it gets split.
Possibly worth
On Jun1, 2012, at 21:07 , Robert Haas wrote:
On Fri, Jun 1, 2012 at 2:54 PM, Florian Pflug f...@phlo.org wrote:
On Jun1, 2012, at 19:51 , Robert Haas wrote:
On Fri, Jun 1, 2012 at 8:47 AM, Florian Pflug f...@phlo.org wrote:
We'd drain the unpin queue whenever we don't expect a PinBuffer()
On Fri, Jun 1, 2012 at 3:16 PM, Florian Pflug f...@phlo.org wrote:
Ok, now you've lost me. If the read() blocks, how on earth can a few
additional pins/unpins ever account for any meaningful execution time?
It seems to me that once read() blocks we're talking about a delay in the
order of the
On Fri, Jun 1, 2012 at 3:40 PM, Robert Haas robertmh...@gmail.com wrote:
On Fri, Jun 1, 2012 at 3:16 PM, Florian Pflug f...@phlo.org wrote:
Ok, now you've lost me. If the read() blocks, how on earth can a few
additional pins/unpins ever account for any meaningful execution time?
It seems to
On Sat, Jun 2, 2012 at 1:48 AM, Merlin Moncure mmonc...@gmail.com wrote:
Buffer pins aren't a cache: with a cache you are trying to mask a slow
operation (like a disk i/o) with a faster one so that the number of slow
operations is minimized. Buffer pins however are very different in
that we
On Thu, May 31, 2012 at 7:31 AM, Sergey Koposov kopo...@ast.cam.ac.uk wrote:
On Wed, 30 May 2012, Robert Haas wrote:
I'd really like to find out exactly where all those s_lock calls are
coming from. Is there any way you can get oprofile to output a
partial stack backtrace? If you have perf
On Thu, May 31, 2012 at 11:23 AM, Sergey Koposov kopo...@ast.cam.ac.uk wrote:
On Thu, 31 May 2012, Robert Haas wrote:
Thanks. How did you generate this perf report? It's cool, because I
haven't figured out how to make perf generate a report that is easily
email-able, and it seems you have.
On Thu, May 31, 2012 at 10:23 AM, Sergey Koposov kopo...@ast.cam.ac.uk wrote:
On Thu, 31 May 2012, Robert Haas wrote:
Thanks. How did you generate this perf report? It's cool, because I
haven't figured out how to make perf generate a report that is easily
email-able, and it seems you have.
On Thu, 31 May 2012, Robert Haas wrote:
Oh, ho. So from this we can see that the problem is that we're
getting huge amounts of spinlock contention when pinning and unpinning
index pages.
It would be nice to have a self-contained reproducible test case for
this, so that we could experiment
On Thu, May 31, 2012 at 11:54 AM, Sergey Koposov kopo...@ast.cam.ac.uk wrote:
On Thu, 31 May 2012, Robert Haas wrote:
Oh, ho. So from this we can see that the problem is that we're
getting huge amounts of spinlock contention when pinning and unpinning
index pages.
It would be nice to have
On Thu, May 31, 2012 at 2:03 PM, Merlin Moncure mmonc...@gmail.com wrote:
On Thu, May 31, 2012 at 11:54 AM, Sergey Koposov kopo...@ast.cam.ac.uk
wrote:
On Thu, 31 May 2012, Robert Haas wrote:
Oh, ho. So from this we can see that the problem is that we're
getting huge amounts of spinlock
On Thu, May 31, 2012 at 1:50 PM, Robert Haas robertmh...@gmail.com wrote:
On Thu, May 31, 2012 at 2:03 PM, Merlin Moncure mmonc...@gmail.com wrote:
On Thu, May 31, 2012 at 11:54 AM, Sergey Koposov kopo...@ast.cam.ac.uk
wrote:
On Thu, 31 May 2012, Robert Haas wrote:
Oh, ho. So from this we
On Thu, May 31, 2012 at 3:25 PM, Merlin Moncure mmonc...@gmail.com wrote:
Hm, couple questions: how do you determine if/when to un-nail a
buffer, and who makes that decision (bgwriter?)
Well, I think some experimentation might be required, but my first
thought is to tie it into buffer eviction.
On Sun, May 27, 2012 at 11:45 AM, Sergey Koposov kopo...@ast.cam.ac.uk wrote:
Hi,
I did another test using the same data and the same code, which I've
provided before and the performance of the single thread seems to be
degrading quadratically with the number of threads.
Here are the
On Thu, May 31, 2012 at 9:17 AM, Robert Haas robertmh...@gmail.com wrote:
Oh, ho. So from this we can see that the problem is that we're
getting huge amounts of spinlock contention when pinning and unpinning
index pages.
It would be nice to have a self-contained reproducible test case for
On Thu, May 31, 2012 at 11:50 AM, Robert Haas robertmh...@gmail.com wrote:
This test case is unusual because it hits a whole series of buffers
very hard. However, there are other cases where this happens on a
single buffer that is just very, very hot, like the root block of a
btree index,
On Wed, May 30, 2012 at 6:10 PM, Sergey Koposov kopo...@ast.cam.ac.uk wrote:
On Wed, 30 May 2012, Jeff Janes wrote:
But anyway, is idt_match a fairly static table? If so, I'd partition
that into 16 tables, and then have each one of your tasks join against
a different one of those tables.
On Sun, May 27, 2012 at 1:45 PM, Sergey Koposov kopo...@ast.cam.ac.uk wrote:
Hi,
I did another test using the same data and the same code, which I've
provided before and the performance of the single thread seems to be
degrading quadratically with the number of threads.
Here are the
Here is the actual explain analyze of the query on the smaller dataset
which I have been using for the recent testing.
test=# explain analyze create table _tmp0 as select * from
( select *,
(select healpixid from idt_match as m where m.transitid=o.transitid)
as x from
On Wed, May 30, 2012 at 10:42 AM, Sergey Koposov kopo...@ast.cam.ac.uk wrote:
Here is the actual explain analyze of the query on the smaller dataset
which I have been using for the recent testing.
test=# explain analyze create table _tmp0 as select * from
( select *,
(select
On Wed, 30 May 2012, Merlin Moncure wrote:
1. Can we see an explain analyze during a 'bogged' case?
Here is the one to one comparison of the 'bogged'
**
QUERY PLAN
On Wed, May 30, 2012 at 12:58 PM, Sergey Koposov kopo...@ast.cam.ac.uk wrote:
Here is the one to one comparison of the 'bogged' **
QUERY PLAN
On Wed, May 30, 2012 at 12:11 PM, Robert Haas robertmh...@gmail.com wrote:
On Wed, May 30, 2012 at 12:58 PM, Sergey Koposov kopo...@ast.cam.ac.uk
wrote:
Here is the one to one comparison of the 'bogged' **
QUERY PLAN
On Wed, 30 May 2012, Merlin Moncure wrote:
Hm, why aren't we getting an IOS? Just for kicks (assuming this is
test data), can we drop the index on just transitid, leaving the index
on transitid, healpixid? Is enable_indexonlyscan on? Has idt_match
been vacuumed? What kind of plan do you
On Wed, May 30, 2012 at 1:45 PM, Sergey Koposov kopo...@ast.cam.ac.uk wrote:
On Wed, 30 May 2012, Merlin Moncure wrote:
Hm, why aren't we getting an IOS? Just for kicks (assuming this is
test data), can we drop the index on just transitid, leaving the index
on transitid, healpixid? Is
On Wed, 30 May 2012, Merlin Moncure wrote:
hurk -- ISTM that since IOS is masking the heap lookups, there must
be contention on the index itself? Does this working set fit in
shared memory? If so, what happens when you do a database restart and
repeat the IOS test?
The dataset fits well in
On Wed, May 30, 2012 at 11:45 AM, Sergey Koposov kopo...@ast.cam.ac.uk wrote:
On Wed, 30 May 2012, Merlin Moncure wrote:
Hm, why aren't we getting an IOS? Just for kicks (assuming this is
test data), can we drop the index on just transitid, leaving the index
on transitid, healpixid? Is
On Wed, May 30, 2012 at 3:15 PM, Jeff Janes jeff.ja...@gmail.com wrote:
On Wed, May 30, 2012 at 11:45 AM, Sergey Koposov kopo...@ast.cam.ac.uk
wrote:
On Wed, 30 May 2012, Merlin Moncure wrote:
Hm, why aren't we getting an IOS? Just for kicks (assuming this is
test data), can we drop the
On Wed, 30 May 2012, Merlin Moncure wrote:
How big is idt_match? What if you drop all indexes on idt_match,
encouraging all the backends to do hash joins against it, which occur
in local memory and so don't have contention?
You just missed his post -- it's only 3G. Can you run your 'small'
On May30, 2012, at 22:07 , Sergey Koposov wrote:
If I restart the db the timings do not change significantly. There is always
some variation which I don't really understand, e.g. the parallel runs
sometimes
take 18s, or 25 seconds, or 30 seconds per thread. So there is something else
On Wed, 30 May 2012, Florian Pflug wrote:
I wonder if the huge variance could be caused by non-uniform
synchronization costs across different cores. That's not all that
unlikely, because at least some cache levels (L2 and/or L3, I think) are
usually shared between all cores on a single die.
On May31, 2012, at 01:16 , Sergey Koposov wrote:
On Wed, 30 May 2012, Florian Pflug wrote:
I wonder if the huge variance could be caused by non-uniform synchronization
costs across different cores. That's not all that unlikely, because at least
some cache levels (L2 and/or L3, I think) are
On Thu, 31 May 2012, Florian Pflug wrote:
Wait, so performance *increased* by spreading the backends out over as
many dies as possible, not by using as few as possible? That'd be
exactly the opposite of what I'd have expected. (I'm assuming that cores
on one die have ascending ids on Linux.)
On Wed, May 30, 2012 at 4:16 PM, Sergey Koposov kopo...@ast.cam.ac.uk wrote:
But the question now is whether there is a *PG* problem here or not, or is
it Intel's or Linux's problem? Because the slowdown was still caused by
locking. If there were no locking there wouldn't be any problems
Sergey, all,
* Sergey Koposov (kopo...@ast.cam.ac.uk) wrote:
I did a specific test with just 6 threads (== number of cores per cpu)
and ran it on a single phys cpu, it took ~ 12 seconds for each thread,
and when I tried to spread it across 4 cpus it took 7-9 seconds per
thread. But all these
* Sergey Koposov (kopo...@ast.cam.ac.uk) wrote:
I did a specific test with just 6 threads (== number of cores per cpu)
and ran it on a single phys cpu, it took ~ 12 seconds for each thread,
and when I tried to spread it across 4 cpus it took 7-9 seconds per
thread. But all these numbers are
On Wed, 30 May 2012, Jeff Janes wrote:
But the question now is whether there is a *PG* problem here or not, or is
it Intel's or Linux's problem? Because the slowdown was still caused by
locking. If there were no locking there wouldn't be any problems (as
demonstrated a while ago by just
On Wed, May 30, 2012 at 9:10 PM, Sergey Koposov kopo...@ast.cam.ac.uk wrote:
I understand the need of significant locking when there are concurrent writes,
but not when there are only reads. But I'm not a RDBMS expert, so maybe
that's a misunderstanding on my side.
If we knew in advance that no
On May31, 2012, at 02:26 , Sergey Koposov wrote:
On Thu, 31 May 2012, Florian Pflug wrote:
Wait, so performance *increased* by spreading the backends out over as many
dies as possible, not by using as few as possible? That'd be exactly the
opposite of what I'd have expected. (I'm assuming
Robert,
* Robert Haas (robertmh...@gmail.com) wrote:
On Wed, May 30, 2012 at 9:10 PM, Sergey Koposov kopo...@ast.cam.ac.uk wrote:
I understand the need of significant locking when there are concurrent writes,
but not when there are only reads. But I'm not a RDBMS expert, so maybe
that's
On Wed, May 30, 2012 at 7:00 PM, Stephen Frost sfr...@snowman.net wrote:
Robert,
* Robert Haas (robertmh...@gmail.com) wrote:
On Wed, May 30, 2012 at 9:10 PM, Sergey Koposov kopo...@ast.cam.ac.uk
wrote:
I understand the need of significant locking when there are concurrent writes,
but not
Hi,
I did another test using the same data and the same code, which I've
provided before and the performance of the single thread seems to be
degrading quadratically with the number of threads.
Here are the results:
Nthreads Time_to_execute_one_thread
1 8.1
2 7.8
3 8.1
4 9.0
5 10.2
6 11.4
7
On Fri, May 25, 2012 at 10:30 AM, Merlin Moncure mmonc...@gmail.com wrote:
I think what's happening here is that the buffer partitions don't help
(in fact, they hurt) in the presence of multiple concurrent scans that
are operating on approximately the same data. Sooner or later the
scans line
On Sat, 26 May 2012, Robert Haas wrote:
This theory is seeming fairly plausible - how can we test it?
How about trying it with synchronize_seqscans = off? If the
synchronized-seqscan logic is causing contention on the buf mapping
locks and individual buffer locks, that should fix it.
Turning
* Sergey Koposov (kopo...@ast.cam.ac.uk) wrote:
Turning off synch seq scans doesn't help either. 18 sec
multithreaded run vs 7 sec single threaded.
Alright, can you just time 'cat' when they're started a few seconds or
whatever apart from each other? I can't imagine it being affected in
the
On Sat, 26 May 2012, Stephen Frost wrote:
Alright, can you just time 'cat' when they're started a few seconds or
whatever apart from each other? I can't imagine it being affected in
the same way as these, but seems like it wouldn't hurt to check.
I've tried cat'ting a created in advance 8gig
On Thu, May 24, 2012 at 6:26 PM, Sergey Koposov kopo...@ast.cam.ac.uk wrote:
On Thu, 24 May 2012, Jeff Janes wrote:
Add
#define LWLOCK_STATS
near the top of:
src/backend/storage/lmgr/lwlock.c
and recompile and run a reduced-size workload. When the processes
exits, they will dump a lot of
On Fri, 25 May 2012, Merlin Moncure wrote:
These are all on the buffer partition locks. That makes sense...I was
wrong earlier: this case was in fact 'create table as', not 'insert
select' which rules out both the freelist lock and the wal insert lock
because create table as gets to use both a
On Fri, May 25, 2012 at 8:06 AM, Merlin Moncure mmonc...@gmail.com wrote:
On Thu, May 24, 2012 at 6:26 PM, Sergey Koposov kopo...@ast.cam.ac.uk wrote:
On Thu, 24 May 2012, Jeff Janes wrote:
Add
#define LWLOCK_STATS
near the top of:
src/backend/storage/lmgr/lwlock.c
and recompile and run a
Merlin Moncure mmonc...@gmail.com writes:
Hm, what if BufTableHashPartition() was pseudo randomized so that
different backends would not get the same buffer partition for a
particular tag?
Huh? You have to make sure that different backends will find the same
buffer for the same page, so I
On Fri, May 25, 2012 at 10:22 AM, Tom Lane t...@sss.pgh.pa.us wrote:
Merlin Moncure mmonc...@gmail.com writes:
Hm, what if BufTableHashPartition() was pseudo randomized so that
different backends would not get the same buffer partition for a
particular tag?
Huh? You have to make sure that
* Merlin Moncure (mmonc...@gmail.com) wrote:
Right -- duh. Well, hm. Is this worth fixing? ISTM there's a bit of
'optimizing for pgbench-itis' in the buffer partitions -- they seem
optimized to leverage the mostly random access behavior of pgbench. But
how likely is it to see multiple
On Thu, May 24, 2012 at 4:26 PM, Sergey Koposov kopo...@ast.cam.ac.uk wrote:
On Thu, 24 May 2012, Jeff Janes wrote:
Add
#define LWLOCK_STATS
near the top of:
src/backend/storage/lmgr/lwlock.c
and recompile and run a reduced-size workload. When the processes
exits, they will dump a lot of
On Fri, May 25, 2012 at 11:17 AM, Stephen Frost sfr...@snowman.net wrote:
* Merlin Moncure (mmonc...@gmail.com) wrote:
Right -- duh. Well, hm. Is this worth fixing? ISTM there's a bit of
'optimizing for pgbench-itis' in the buffer partitions -- they seem
optimized to leverage the mostly random
Merlin Moncure mmonc...@gmail.com writes:
On Fri, May 25, 2012 at 11:17 AM, Stephen Frost sfr...@snowman.net wrote:
Didn't we implement a system whereby this is exactly what we intend to
happen on the read side- that is, everyone doing a SeqScan gangs up on
one ring buffer and follows it,
On Fri, 25 May 2012, Merlin Moncure wrote:
how likely is it to see multiple simultaneous scans in the real world?
Interleaving scans like that is not a very effective optimization --
if it was me, it'd be trying to organize something around a
partitioned tid scan for parallel sequential access.
On Thu, May 24, 2012 at 7:26 PM, Sergey Koposov kopo...@ast.cam.ac.uk wrote:
Here is the output from a multi-threaded run (8 threads, 22 seconds) sorted
by blk. Not sure whether that's of much use or not:
What are the top dozen or so entries if you sort by shacq?
--
Robert Haas
EnterpriseDB:
On Fri, 25 May 2012, Robert Haas wrote:
On Thu, May 24, 2012 at 7:26 PM, Sergey Koposov kopo...@ast.cam.ac.uk wrote:
Here is the output from a multi-threaded run (8 threads, 22 seconds) sorted
by blk. Not sure whether that's of much use or not:
What are the top dozen or so entries if you
On Fri, May 25, 2012 at 11:38 AM, Tom Lane t...@sss.pgh.pa.us wrote:
Merlin Moncure mmonc...@gmail.com writes:
On Fri, May 25, 2012 at 11:17 AM, Stephen Frost sfr...@snowman.net wrote:
Didn't we implement a system whereby this is exactly what we intend to
happen on the read side- that is,
On Fri, 25 May 2012, Merlin Moncure wrote:
can you hack this in heapam.c and see if it helps?
line 131-ish:
if (!RelationUsesLocalBuffers(scan->rs_rd) &&
    scan->rs_nblocks > NBuffers / 4)
becomes
if (!RelationUsesLocalBuffers(scan->rs_rd))
(also you can set the
Hi,
I've been running some tests on pg 9.2beta1 and in particular a set
of queries like
create table _tmp0 as select * from (
select *, (select healpixid from idt_match as m where
m.transitid=o.transitid)
as x from
On Thu, May 24, 2012 at 8:24 AM, Sergey Koposov kopo...@ast.cam.ac.uk wrote:
Hi,
I've been running some tests on pg 9.2beta1 and in particular a set
of queries like
create table _tmp0 as select * from (
select *, (select healpixid from idt_match as m where
On Thu, 24 May 2012, Merlin Moncure wrote:
Are you sure? I looked at all the ReleasePredicateLocks calls and
they appear to be guarded by:
/* Nothing to do if this is not a serializable transaction */
if (MySerializableXact == InvalidSerializableXact)
return
On Thu, May 24, 2012 at 6:24 AM, Sergey Koposov kopo...@ast.cam.ac.uk wrote:
Hi,
I've been running some tests on pg 9.2beta1 and in particular a set
of queries like
...
And I noticed than when I run the query like the one shown above in parallel
(in multiple connections for ZZZ=0...8) the
On Thu, May 24, 2012 at 1:42 PM, Sergey Koposov kopo...@ast.cam.ac.uk wrote:
I guess there is nothing catastrophically wrong with that, but still I'm
very surprised that you get severe locking problems (factor of two slow-down)
when running parallel read-only transactions.
Me, too. How many
On Thu, 24 May 2012, Robert Haas wrote:
On Thu, May 24, 2012 at 1:42 PM, Sergey Koposov kopo...@ast.cam.ac.uk wrote:
I guess there is nothing catastrophically wrong with that, but still I'm
very surprised that you get severe locking problems (factor of two slow-down)
when running parallel
On Thu, May 24, 2012 at 2:19 PM, Sergey Koposov kopo...@ast.cam.ac.uk wrote:
On Thu, 24 May 2012, Robert Haas wrote:
On Thu, May 24, 2012 at 1:42 PM, Sergey Koposov kopo...@ast.cam.ac.uk
wrote:
I guess there is nothing catastrophically wrong with that, but still I'm
very suprised that you
On Thu, May 24, 2012 at 1:43 PM, Robert Haas robertmh...@gmail.com wrote:
On Thu, May 24, 2012 at 2:19 PM, Sergey Koposov kopo...@ast.cam.ac.uk wrote:
On Thu, 24 May 2012, Robert Haas wrote:
On Thu, May 24, 2012 at 1:42 PM, Sergey Koposov kopo...@ast.cam.ac.uk
wrote:
I guess there is nothing
On Thu, May 24, 2012 at 12:46 PM, Merlin Moncure mmonc...@gmail.com wrote:
On Thu, May 24, 2012 at 2:24 PM, Merlin Moncure mmonc...@gmail.com wrote:
As you can see, raw performance isn't much worse with the larger data
sets, but scalability at high connection counts is severely degraded
once
On Thu, May 24, 2012 at 3:46 PM, Merlin Moncure mmonc...@gmail.com wrote:
On Thu, May 24, 2012 at 2:24 PM, Merlin Moncure mmonc...@gmail.com wrote:
As you can see, raw performance isn't much worse with the larger data
sets, but scalability at high connection counts is severely degraded
once
On Thu, May 24, 2012 at 3:35 PM, Robert Haas robertmh...@gmail.com wrote:
On Thu, May 24, 2012 at 3:46 PM, Merlin Moncure mmonc...@gmail.com wrote:
hm, looking at the code some more, it looks like the whole point of
the strategy system is to do that. ISTM bulk insert type queries
would be
On Thu, May 24, 2012 at 4:46 PM, Merlin Moncure mmonc...@gmail.com wrote:
Wait -- OP's gripe this isn't regarding standard pgbench, but multiple
large concurrent 'insert into foo select...'. I looked in the code
and it appears that the only bulk insert strategy using operations are
copy,
Robert Haas robertmh...@gmail.com writes:
On Thu, May 24, 2012 at 4:46 PM, Merlin Moncure mmonc...@gmail.com wrote:
We don't get to skip wal of course, but we should be able to use a
bulk insert strategy, especially if there was some way of predicting
that a large number of tuples were going
Hi,
On Thu, 24 May 2012, Robert Haas wrote:
Not sure. It might be some other LWLock, but it's hard to tell which
one from the information provided.
If you could tell what's the best way to find out the info that you need,
then I could run it reasonably quickly.
S
On Thu, May 24, 2012 at 3:36 PM, Sergey Koposov kopo...@ast.cam.ac.uk wrote:
Hi,
On Thu, 24 May 2012, Robert Haas wrote:
Not sure. It might be some other LWLock, but it's hard to tell which
one from the information provided.
If you could tell what's the best way to find out the info
On Thu, 24 May 2012, Jeff Janes wrote:
Add
#define LWLOCK_STATS
near the top of:
src/backend/storage/lmgr/lwlock.c
and recompile and run a reduced-size workload. When the processes
exits, they will dump a lot of data about LWLock usage to the logfile.
Generally the LWLock with the most blocks
99 matches