Re: [HACKERS] Curious buildfarm failures (fwd)

2013-01-16 Thread Sergey Koposov

On Wed, 16 Jan 2013, Andres Freund wrote:

So unless somebody suggests otherwise, I'm going to switch to gcc on this
buildfarm.


What about switching to -O1 of 11.1?


I don't know. It is up to -hackers to decide. I think that icc on ia64 
has shown bugginess time after time. But if you think that a buildfarm member 
running icc 11.1 -O1 carries more information than, say, one running gcc, I can 
still run icc.


S

*
Sergey E. Koposov, PhD, Research Associate
Institute of Astronomy, University of Cambridge
Madingley road, CB3 0HA, Cambridge, UK
Tel: +44-1223-337-551 Web: http://www.ast.cam.ac.uk/~koposov/


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Curious buildfarm failures (fwd)

2013-01-16 Thread Sergey Koposov

Hi,

On Wed, 16 Jan 2013, Andres Freund wrote:

On 2013-01-16 01:28:09 -0500, Tom Lane wrote:

It's a compiler bug.


Thanks for investigating. But I'm not sure there is any other way for 
me than switching to gcc, because Intel stopped providing their 
IA64 version of the compilers free of charge, even for 
non-commercial/educational users:

http://software.intel.com/en-us/intel-software-development-tools-for-intel-itanium-processors
(their website is also a bit of a maze, so I don't see anywhere 
the updated versions of the 10.0, 10.1, 11.1 compilers that I have)


So unless somebody suggests otherwise, I'm going to switch to gcc on this 
buildfarm.


Cheers,
Sergey

*
Sergey E. Koposov, PhD, Research Associate
Institute of Astronomy, University of Cambridge
Madingley road, CB3 0HA, Cambridge, UK
Tel: +44-1223-337-551 Web: http://www.ast.cam.ac.uk/~koposov/


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Curious buildfarm failures (fwd)

2013-01-15 Thread Sergey Koposov

On Tue, 15 Jan 2013, Tom Lane wrote:


BTW, I just finished trying to reproduce this on an IA64 machine
belonging to Red Hat, without success.  So that seems to eliminate
any possibility of the machine architecture being the trigger issue.
The compiler's still a likely cause though.
Anybody have a similar ICC version (dugong's says it is 10.0 20070809)
to try?  Also, Sergey, can you find a non-dot-zero release to try?


I think the compiler is indeed the main issue.
I've tried 10.1 (10.1.011) and it doesn't fail.

When I tried 11.1 (icc (ICC) 11.1 20100401) it failed in a quite strange 
way (I don't remember it happening before):


test tablespace   ... ok
parallel group (18 tests):  txid int2 text name oid varchar int4 char money float8 uuid float4 int8 boolean bit enum numeric rangetypes
 boolean  ... ok
 char ... ok
 name ... ok
 varchar  ... ok
 text ... ok
 int2 ... ok
 int4 ... ok
 int8 ... ok
 oid  ... ok
 float4   ... ok
 float8   ... ok
 bit  ... ok
 numeric  ... ok
 txid ... ok
 uuid ... ok
 enum ... ok
 money... ok
 rangetypes   ... FAILED (test process exited with exit code 2)
test strings  ... FAILED (test process exited with exit code 2)
test numerology   ... FAILED (test process exited with exit code 2)
parallel group (19 tests):  path interval time inet circle macaddr comments timestamp timestamptz reltime date tstypes tinterval abstime timetz lseg box polygon point
 point... FAILED (test process exited with exit code 2)
 lseg ... FAILED (test process exited with exit code 2)
 box  ... FAILED (test process exited with exit code 2)
 path ... FAILED (test process exited with exit code 2)
 polygon  ... FAILED (test process exited with exit code 2)
 circle   ... FAILED (test process exited with exit code 2)
 date ... FAILED (test process exited with exit code 2)
 time ... FAILED (test process exited with exit code 2)
 timetz   ... FAILED (test process exited with exit code 2)
 timestamp... FAILED (test process exited with exit code 2)
 timestamptz  ... FAILED (test process exited with exit code 2)
 interval ... FAILED (test process exited with exit code 2)
 abstime  ... FAILED (test process exited with exit code 2)
 reltime  ... FAILED (test process exited with exit code 2)
 tinterval... FAILED (test process exited with exit code 2)
 inet ... FAILED (test process exited with exit code 2)
 macaddr  ... FAILED (test process exited with exit code 2)
 tstypes  ... FAILED (test process exited with exit code 2)
 comments ... FAILED (test process exited with exit code 2)
parallel group (6 tests):  geometry regex horology type_sanity oidjoins opr_sanity
 geometry ... FAILED
 horology ... FAILED
 regex... ok
 oidjoins ... ok
 type_sanity  ... ok
 opr_sanity   ... ok
test insert   ... ok
test create_function_1... ok
test create_type  ... ok
test create_table ... ok
test create_function_2... ok
parallel group (2 tests):  copyselect copy
 copy ... ok
 copyselect   ... ok
parallel group (2 tests):  create_operator create_misc
 create_misc  ... ok
 create_operator  ... ok
parallel group (2 tests):  create_view create_index
 create_index ... FAILED (test process exited with exit code 2)
 create_view  ... ok
parallel group (11 tests):  constraints triggers create_cast create_function_3 updatable_views inherit drop_if_exists create_aggregate create_table_like typed_table vacuum
 create_aggregate ... FAILED (test process exited with exit code 2)
 create_function_3... FAILED (test process exited with exit code 2)
 create_cast  ... FAILED (test process exited with exit code 2)
 constraints  ... FAILED (test process exited with exit code 2)
 triggers ... FAILED (test process exited with exit code 2)
 inherit  ... FAILED (test process exited with exit code 2)
 create_table_like... FAILED (test process exite

Re: [HACKERS] Curious buildfarm failures (fwd)

2013-01-15 Thread Sergey Koposov

On Tue, 15 Jan 2013, Andres Freund wrote:

Any chance you could run make check again but with log_statement=all and
log_min_messages=debug2? That might tell us a bit more, whether the
dependency code doesn't work right or whether the checkpoint is doing
strange things.


Here it is :


2013-01-15 23:06:18 MSK [50f5a8aa.1162:1] DEBUG:  SlruScanDirectory invoking 
callback on pg_notify/
2013-01-15 23:06:18 MSK [50f5a8aa.1162:2] DEBUG:  removing file "pg_notify/"
2013-01-15 23:06:18 MSK [50f5a8aa.1162:3] DEBUG:  max_safe_fds = 985, 
usable_fds = 1000, already_open = 5
2013-01-15 23:06:18 MSK [50f5a8aa.1167:1] LOG:  database system was shut down 
at 2013-01-15 23:06:18 MSK
2013-01-15 23:06:18 MSK [50f5a8aa.1167:2] DEBUG:  checkpoint record is at 
0/17700E0
2013-01-15 23:06:18 MSK [50f5a8aa.1167:3] DEBUG:  redo record is at 0/17700E0; 
shutdown TRUE
2013-01-15 23:06:18 MSK [50f5a8aa.1167:4] DEBUG:  next transaction ID: 0/686; 
next OID: 12031
2013-01-15 23:06:18 MSK [50f5a8aa.1167:5] DEBUG:  next MultiXactId: 1; next 
MultiXactOffset: 0
2013-01-15 23:06:18 MSK [50f5a8aa.1167:6] DEBUG:  oldest unfrozen transaction 
ID: 676, in database 1
2013-01-15 23:06:18 MSK [50f5a8aa.1167:7] DEBUG:  transaction ID wrap limit is 
2147484323, limited by database with OID 1
2013-01-15 23:06:18 MSK [50f5a8aa.1168:1] DEBUG:  checkpointer updated shared 
memory configuration values
2013-01-15 23:06:18 MSK [50f5a8aa.116b:1] LOG:  autovacuum launcher started
2013-01-15 23:06:18 MSK [50f5a8aa.1162:4] LOG:  database system is ready to 
accept connections
2013-01-15 23:06:19 MSK [50f5a8aa.1162:5] DEBUG:  forked new backend, pid=4463 
socket=8
2013-01-15 23:06:19 MSK [50f5a8aa.1162:6] DEBUG:  server process (PID 4463) 
exited with exit code 0
2013-01-15 23:06:19 MSK [50f5a8aa.1162:7] DEBUG:  forked new backend, pid=4465 
socket=8
2013-01-15 23:06:19 MSK [50f5a8ab.1171:1] LOG:  statement: CREATE DATABASE 
"regression" TEMPLATE=template0
2013-01-15 23:06:19 MSK [50f5a8aa.1168:2] LOG:  checkpoint starting: immediate 
force wait
2013-01-15 23:06:19 MSK [50f5a8aa.1168:3] DEBUG:  SlruScanDirectory invoking 
callback on pg_multixact/offsets/
2013-01-15 23:06:19 MSK [50f5a8aa.1168:4] DEBUG:  SlruScanDirectory invoking 
callback on pg_multixact/members/
2013-01-15 23:06:19 MSK [50f5a8aa.1168:5] DEBUG:  attempting to remove WAL 
segments older than log file 
2013-01-15 23:06:19 MSK [50f5a8aa.1168:6] DEBUG:  SlruScanDirectory invoking 
callback on pg_subtrans/
2013-01-15 23:06:19 MSK [50f5a8aa.1168:7] LOG:  checkpoint complete: wrote 3 
buffers (0.0%); 0 transaction log file(s) added, 0 removed, 0 recycled; 
write=0.001 s, sync=0.000 s, total=0.001 s; sync files=0, longest=0.000 s, 
average=0.000 s
2013-01-15 23:06:19 MSK [50f5a8aa.1168:8] LOG:  checkpoint starting: immediate 
force wait
2013-01-15 23:06:19 MSK [50f5a8aa.1168:9] DEBUG:  attempting to remove WAL 
segments older than log file 
2013-01-15 23:06:19 MSK [50f5a8aa.1168:10] DEBUG:  SlruScanDirectory invoking 
callback on pg_subtrans/
2013-01-15 23:06:19 MSK [50f5a8aa.1168:11] LOG:  checkpoint complete: wrote 0 
buffers (0.0%); 0 transaction log file(s) added, 0 removed, 0 recycled; 
write=0.001 s, sync=0.000 s, total=0.001 s; sync files=0, longest=0.000 s, 
average=0.000 s
2013-01-15 23:06:19 MSK [50f5a8aa.1162:8] DEBUG:  server process (PID 4465) 
exited with exit code 0
2013-01-15 23:06:19 MSK [50f5a8aa.1162:9] DEBUG:  forked new backend, pid=4467 
socket=8
2013-01-15 23:06:19 MSK [50f5a8ab.1173:1] LOG:  statement: ALTER DATABASE "regression" SET lc_messages TO 'C';ALTER 
DATABASE "regression" SET lc_monetary TO 'C';ALTER DATABASE "regression" SET lc_numeric TO 'C';ALTER DATABASE 
"regression" SET lc_time TO 'C';ALTER DATABASE "regression" SET timezone_abbreviations TO 'Default';
2013-01-15 23:06:19 MSK [50f5a8aa.1162:10] DEBUG:  server process (PID 4467) 
exited with exit code 0
2013-01-15 23:06:19 MSK [50f5a8aa.1162:11] DEBUG:  forked new backend, pid=4469 
socket=8
2013-01-15 23:06:19 MSK [50f5a8ab.1175:1] LOG:  statement: CREATE TABLESPACE 
testspace LOCATION '/home/math/pg_git/src/test/regress/testtablespace';
2013-01-15 23:06:19 MSK [50f5a8ab.1175:2] LOG:  statement: ALTER TABLESPACE 
testspace SET (random_page_cost = 1.0);
2013-01-15 23:06:19 MSK [50f5a8ab.1175:3] LOG:  statement: ALTER TABLESPACE 
testspace SET (some_nonexistent_parameter = true);
2013-01-15 23:06:19 MSK [50f5a8ab.1175:4] ERROR:  unrecognized parameter 
"some_nonexistent_parameter"
2013-01-15 23:06:19 MSK [50f5a8ab.1175:5] STATEMENT:  ALTER TABLESPACE 
testspace SET (some_nonexistent_parameter = true);
2013-01-15 23:06:19 MSK [50f5a8ab.1175:6] LOG:  statement: ALTER TABLESPACE 
testspace RESET (random_page_cost = 2.0);
2013-01-15 23:06:19 MSK [50f5a8ab.1175:7] ERROR:  RESET must not include values 
for parameters
2013-01-15 23:06:19 MSK [50f5a8ab.1175:8] STATEMENT:  ALTER TABLESPACE 
testspace RESET (random_page_cost = 2.0);
2013-01-15 23:

Re: [HACKERS] Curious buildfarm failures (fwd)

2013-01-15 Thread Sergey Koposov

Hi,


Date: Tue, 15 Jan 2013 11:57:07 -0500
From: Tom Lane 
To: Andres Freund 
Cc: m...@sai.msu.ru, pgsql-hackers@postgreSQL.org,
   Andrew Dunstan 
Subject: Re: Curious buildfarm failures

Well, it could be quite reproducible, if for example what's happening is
that the DROP is failing to wait for the checkpointer at all.

Is there a way to enable log_checkpoints during a buildfarm run?
It'd be good to get timestamps added to the postmaster log entries, too.


Here is the log output from the failing pg_regress after enabling checkpoints 
and timestamps:

2013-01-15 21:20:19 MSK [50f58fd3.589e:1] LOG:  database system was shut down at 2013-01-15 21:20:19 MSK
2013-01-15 21:20:19 MSK [50f58fd3.58a2:1] LOG:  autovacuum launcher started
2013-01-15 21:20:19 MSK [50f58fd3.5899:1] LOG:  database system is ready to accept connections
2013-01-15 21:20:20 MSK [50f58fd3.589f:1] LOG:  checkpoint starting: immediate force wait
2013-01-15 21:20:21 MSK [50f58fd3.589f:2] LOG:  checkpoint complete: wrote 3 buffers (0.0%); 0 transaction log file(s) added, 0 removed, 0 recycled; write=0.604 s, sync=0.000 s, total=0.605 s; sync files=0, longest=0.000 s, average=0.000 s
2013-01-15 21:20:21 MSK [50f58fd3.589f:3] LOG:  checkpoint starting: immediate force wait
2013-01-15 21:20:21 MSK [50f58fd3.589f:4] LOG:  checkpoint complete: wrote 0 buffers (0.0%); 0 transaction log file(s) added, 0 removed, 0 recycled; write=0.000 s, sync=0.000 s, total=0.000 s; sync files=0, longest=0.000 s, average=0.000 s
2013-01-15 21:20:21 MSK [50f58fd5.58ac:1] ERROR:  unrecognized parameter "some_nonexistent_parameter"
2013-01-15 21:20:21 MSK [50f58fd5.58ac:2] STATEMENT:  ALTER TABLESPACE testspace SET (some_nonexistent_parameter = true);
2013-01-15 21:20:21 MSK [50f58fd5.58ac:3] ERROR:  RESET must not include values for parameters
2013-01-15 21:20:21 MSK [50f58fd5.58ac:4] STATEMENT:  ALTER TABLESPACE testspace RESET (random_page_cost = 2.0);
2013-01-15 21:20:21 MSK [50f58fd5.58ac:5] ERROR:  duplicate key value violates unique constraint "anindex"
2013-01-15 21:20:21 MSK [50f58fd5.58ac:6] DETAIL:  Key (column1)=(1) already exists.
2013-01-15 21:20:21 MSK [50f58fd5.58ac:7] STATEMENT:  INSERT INTO testschema.atable VALUES(1);
2013-01-15 21:20:21 MSK [50f58fd5.58ac:8] ERROR:  directory "/no/such/location" does not exist
2013-01-15 21:20:21 MSK [50f58fd5.58ac:9] STATEMENT:  CREATE TABLESPACE badspace LOCATION '/no/such/location';
2013-01-15 21:20:21 MSK [50f58fd5.58ac:10] ERROR:  tablespace "nosuchspace" does not exist
2013-01-15 21:20:21 MSK [50f58fd5.58ac:11] STATEMENT:  CREATE TABLE bar (i int) TABLESPACE nosuchspace;
2013-01-15 21:20:21 MSK [50f58fd3.589f:5] LOG:  checkpoint starting: immediate force wait
2013-01-15 21:20:21 MSK [50f58fd3.589f:6] LOG:  checkpoint complete: wrote 37 buffers (0.2%); 0 transaction log file(s) added, 0 removed, 0 recycled; write=0.001 s, sync=0.000 s, total=0.001 s; sync files=0, longest=0.000 s, average=0.000 s
2013-01-15 21:20:21 MSK [50f58fd5.58ac:12] ERROR:  tablespace "testspace" is not empty
2013-01-15 21:20:21 MSK [50f58fd5.58ac:13] STATEMENT:  DROP TABLESPACE testspace;
2013-01-15 21:20:21 MSK [50f58fd3.589f:7] LOG:  checkpoint starting: immediate force wait
2013-01-15 21:20:21 MSK [50f58fd3.589f:8] LOG:  checkpoint complete: wrote 9 buffers (0.1%); 0 transaction log file(s) added, 0 removed, 0 recycled; write=0.001 s, sync=0.000 s, total=0.001 s; sync files=0, longest=0.000 s, average=0.000 s
2013-01-15 21:20:21 MSK [50f58fd5.58ac:14] ERROR:  tablespace "testspace" is not empty
2013-01-15 21:20:21 MSK [50f58fd5.58ac:15] STATEMENT:  DROP TABLESPACE testspace;


And I do see the tblspc file left after the finish of "make check":
tmp_check/data/pg_tblspc/16385/PG_9.3_201212081/16384/16387
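
For context, the tail of the tablespace regression test runs roughly the following sequence (a sketch from memory, not the exact script); the second DROP is the one that should succeed but keeps failing above:

DROP TABLESPACE testspace;       -- expected failure: testschema.atable still lives there
DROP SCHEMA testschema CASCADE;  -- removes the objects stored in the tablespace
DROP TABLESPACE testspace;       -- should now succeed, but here fails again with "not empty"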

Cheers,
S

PS I wouldn't be surprised if it is a compiler bug, though. But I did see the 
failure with a newer icc as well.


*
Sergey E. Koposov, PhD, Research Associate
Institute of Astronomy, University of Cambridge
Madingley road, CB3 0HA, Cambridge, UK
Tel: +44-1223-337-551 Web: http://www.ast.cam.ac.uk/~koposov/


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] bitmap scan much slower than index scan, hash_search_with_hash_value

2012-09-02 Thread Sergey Koposov

On Sun, 2 Sep 2012, Peter Geoghegan wrote:


On 2 September 2012 16:26, Sergey Koposov  wrote:

That's why we support altering that value with an ALTER TABLE...ALTER
COLUMN DDL statement. You might at least consider increasing the
statistics target for the column first though, to see if that will
make the n_distinct value better accord with reality.


Thanks, I'd forgotten about that possibility.

After fixing n_distinct to the correct value the index scan became the 
preferred option, and the row-count estimate for the index scan became 
more realistic.
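
For reference, a minimal sketch of the two options Peter mentioned, using the column from this thread (the statistics target and the n_distinct value are purely illustrative):

-- raise the per-column statistics target first and re-analyze
ALTER TABLE crts.data ALTER COLUMN objid SET STATISTICS 1000;
ANALYZE crts.data (objid);
-- if the estimate is still far off, pin n_distinct explicitly; it takes effect at the next ANALYZE
ALTER TABLE crts.data ALTER COLUMN objid SET (n_distinct = 50000000);
ANALYZE crts.data (objid);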


Sorry for the noise.
Sergey

*
Sergey E. Koposov, PhD, Research Associate
Institute of Astronomy, University of Cambridge
Madingley road, CB3 0HA, Cambridge, UK
Tel: +44-1223-337-551 Web: http://www.ast.cam.ac.uk/~koposov/


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] bitmap scan much slower than index scan, hash_search_with_hash_value

2012-09-02 Thread Sergey Koposov

On Sun, 2 Sep 2012, Tom Lane wrote:


Sergey Koposov  writes:
The problem is definitely the misestimation here:

 ->  Index Scan using data_objid_idx on data d1  (cost=0.00..26603.32 
rows=415080 width=40) (actual time=0.010..0.050 rows=248 loops=11456)
   Index Cond: (objid = t.objid)

The planner thinks that indexscan will be 2000 times more expensive than
it really is (assuming that the cost per retrieved row is linear, which
it isn't entirely, but close enough for the moment).  Of course, it's
also thinking the bitmap component scan on the same index will be 2000
times more expensive than reality, but that has only perhaps a 4X impact
on the total cost of the bitmap scan, since the use of the other index
is what dominates there.  With a more accurate idea of this join
condition's selectivity, I'm pretty certain it would have gone for a
plain indexscan, or else a bitmap scan using only this index.
So if there's a bug here, it's in eqjoinsel().  Can you pinpoint
anything unusual about the stats of the objid columns?


Here are the pg_stats rows for two tables.

 schemaname | tablename | attname | inherited | null_frac | avg_width | n_distinct | most_common_vals | most_common_freqs

[the remainder of the wide pg_stats output was mangled in the archive and is omitted]

Re: [HACKERS] bitmap scan much slower than index scan, hash_search_with_hash_value

2012-09-02 Thread Sergey Koposov
Thanks for your comments. 
On Sun, 2 Sep 2012, Peter Geoghegan wrote:

On 2 September 2012 06:21, Sergey Koposov  wrote:

I think that this kind of question is better suited to the
pgsql-performance list. Granted, it was presented as a bug report
(though they're generally sent to -bugs rather than -hackers), but I
don't think that this is a valid bug.


The reason I was inclined to think that it is a bug is that I
encountered a similar bug before with bitmap scans and very big tables:

http://archives.postgresql.org/pgsql-hackers/2011-08/msg00958.php
Furthermore, two orders of magnitude more CPU time for bitmap scans 
compared to the index scan didn't sound right to me (although obviously
I'm not in a position to claim that it's 100% a bug).


One obvious red flag from your query plans is that there is a
misestimation of the row return count by a few orders of magnitude in
the Bitmap Index Scan node. Did you try performing an ANALYZE to
see if that helped? It may also be helpful to show pg_stats entries
for both the data.mjd and test.mjd columns. You may find, prior to
doing an ANALYZE, that there are no entries for one of those tables.


The main large table is static and was analyzed. The test table was as 
well. But as mentioned in another recent email, the query is a join, so 
column correlation is a problem.
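
A minimal sketch of the checks Peter suggests, using the table and column names from this thread:

ANALYZE test;
ANALYZE crts.data;
SELECT schemaname, tablename, attname, n_distinct, correlation
  FROM pg_stats
 WHERE attname IN ('mjd', 'objid') AND tablename IN ('test', 'data');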



Turning off the enable_* planner options in production is generally
discouraged. Certainly, you'd be crazy to do that on a server-wide
basis.


I'm using PG for data mining and data analysis purposes, with very few clients 
connected and very large tables, so enable_* is used quite often to fix 
incorrect plans due to incorrect selectivity estimates.
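
A minimal sketch of how such a per-session override looks, using the query from this thread:

SET enable_bitmapscan = off;   -- affects only the current session
EXPLAIN ANALYZE
SELECT * FROM test AS t, crts.data AS d1
 WHERE d1.objid = t.objid AND d1.mjd = t.mjd LIMIT 1;
RESET enable_bitmapscan;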


Regards,
Sergey

*
Sergey E. Koposov, PhD, Research Associate
Institute of Astronomy, University of Cambridge
Madingley road, CB3 0HA, Cambridge, UK
Tel: +44-1223-337-551 Web: http://www.ast.cam.ac.uk/~koposov/


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] bitmap scan much slower than index scan, hash_search_with_hash_value

2012-09-02 Thread Sergey Koposov

On Sun, 2 Sep 2012, Pavel Stehule wrote:



statistics on the data_objid_idx table are absolutely out - so the planner
cannot find an optimal plan


That doesn't have anything to do with the problem, AFAIU.
First, the data table is static and was analyzed.
Second, the query in question is a join, and AFAIK the estimation of the 
number of rows is known to be incorrect in the case of column 
correlation.
Third, at least according to my understanding, in the fully cached regime 
a bitmap scan should not take two orders of magnitude more CPU time than 
an index scan.


Sergey


Regards,

Pavel Stehule


   Index Cond: (objid = t.objid)
 Total runtime: 66622.026 ms
(11 rows)

Here is the output when bitmap scans are disabled:
  QUERY PLAN

 Limit  (cost=0.00..329631941.65 rows=1 width=68) (actual
time=0.082..906.876 rows=1 loops=1)
   ->  Nested Loop  (cost=0.00..4979486036.95 rows=151062 width=68) (actual
time=0.081..905.683 rows=1 loops=1)
 Join Filter: (t.mjd = d1.mjd)
 ->  Seq Scan on test t  (cost=0.00..2632.77 rows=151677 width=28)
(actual time=0.009..1.679 rows=11456 loops=1)
 ->  Index Scan using data_objid_idx on data d1
(cost=0.00..26603.32 rows=415080 width=40) (actual time=0.010..0.050
rows=248 loops=11456)
   Index Cond: (objid = t.objid)
 Total runtime: 907.462 ms

When the bitmap scans are enabled the "prof" of postgres shows
47.10%  postmaster  postgres   [.] hash_search_with_hash_value
|
--- hash_search_with_hash_value

11.06%  postmaster  postgres   [.] hash_seq_search
|
--- hash_seq_search

 6.95%  postmaster  postgres   [.] hash_any
|
--- hash_any

 5.17%  postmaster  postgres   [.] _bt_checkkeys
|
--- _bt_checkkeys

 4.07%  postmaster  postgres   [.] tbm_add_tuples
|
--- tbm_add_tuples

 3.41%  postmaster  postgres   [.] hash_search
|
--- hash_search


And the last note is that the crts.data table which is being bitmap scanned
is a 1.1Tb table with ~ 20e9 rows. My feeling is that the bitmap index scan
code
is somehow unprepared to combine two bitmaps for such a big table, and this
leads to the terrible performance.

Regards,
Sergey

PS Here are the schemas of the tables, just in case:
wsdb=> \d test
  Table "koposov.test"
 Column  |   Type   | Modifiers
-+--+---
 mjd | double precision |
 fieldid | bigint   |
 intmag  | integer  |
 objid   | bigint   |

wsdb=> \d crts.data
   Table "crts.data"
 Column |   Type   | Modifiers
+--+---
 objid  | bigint   |
 mjd| double precision |
 mag| real |
 emag   | real |
 ra | double precision |
 dec| double precision |
Indexes:
"data_mjd_idx" btree (mjd) WITH (fillfactor=100)
"data_objid_idx" btree (objid) WITH (fillfactor=100)
"data_q3c_ang2ipix_idx" btree (q3c_ang2ipix(ra, "dec")) WITH
(fillfactor=100)

PPS shared_buffers=10GB, work_mem=1GB.
All the tests shown here were done in a fully cached regime.

PPPS I can believe that what I'm seeing is a feature, not a bug, of bitmap
scans, and I can live with disabling them, but I still thought it was worth
reporting.


*
Sergey E. Koposov, PhD, Research Associate
Institute of Astronomy, University of Cambridge
Madingley road, CB3 0HA, Cambridge, UK
Tel: +44-1223-337-551


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers




*
Sergey E. Koposov, PhD, Research Associate
Institute of Astronomy, University of Cambridge
Madingley road, CB3 0HA, Cambridge, UK
Tel: +44-1223-337-551 Web: http://www.ast.cam.ac.uk/~koposov/


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] bitmap scan much slower than index scan, hash_search_with_hash_value

2012-09-01 Thread Sergey Koposov

Hi,

I'm experiencing a case where a bitmap scan is ~70 times slower than an 
index scan, which seems to be caused by (1) a very big table and (2) the hash 
search logic (hash_search_with_hash_value).


Here is the explain analyze of the query with bitmap scans allowed:

wsdb=> explain analyze select * from test as t, crts.data as d1
where d1.objid=t.objid and d1.mjd=t.mjd limit 1;
  QUERY PLAN
---
 Limit  (cost=11514.04..115493165.44 rows=1 width=68) (actual 
time=27.512..66620.231 rows=1 loops=1)
   ->  Nested Loop  (cost=11514.04..1799585184.18 rows=155832 width=68) (actual 
time=27.511..66616.807 rows=1 loops=1)
 ->  Seq Scan on test t  (cost=0.00..2678.40 rows=156240 width=28) 
(actual time=0.010..4.685 rows=11456 loops=1)
 ->  Bitmap Heap Scan on data d1  (cost=11514.04..11518.05 rows=1 
width=40) (actual time=5.807..5.807 rows=1 loops=11456)
   Recheck Cond: ((mjd = t.mjd) AND (objid = t.objid))
   ->  BitmapAnd  (cost=11514.04..11514.04 rows=1 width=0) (actual 
time=5.777..5.777 rows=0 loops=11456)
 ->  Bitmap Index Scan on data_mjd_idx  (cost=0.00..2501.40 
rows=42872 width=0) (actual time=3.920..3.920 rows=22241 loops=11456)
   Index Cond: (mjd = t.mjd)
 ->  Bitmap Index Scan on data_objid_idx  
(cost=0.00..8897.90 rows=415080 width=0) (actual time=0.025..0.025 rows=248 
loops=11456)
   Index Cond: (objid = t.objid)
 Total runtime: 66622.026 ms
(11 rows)

Here is the output when bitmap scans are disabled:  
  QUERY PLAN


 Limit  (cost=0.00..329631941.65 rows=1 width=68) (actual 
time=0.082..906.876 rows=1 loops=1)
   ->  Nested Loop  (cost=0.00..4979486036.95 rows=151062 width=68) (actual 
time=0.081..905.683 rows=1 loops=1)
 Join Filter: (t.mjd = d1.mjd)
 ->  Seq Scan on test t  (cost=0.00..2632.77 rows=151677 width=28) 
(actual time=0.009..1.679 rows=11456 loops=1)
 ->  Index Scan using data_objid_idx on data d1  (cost=0.00..26603.32 
rows=415080 width=40) (actual time=0.010..0.050 rows=248 loops=11456)
   Index Cond: (objid = t.objid)
 Total runtime: 907.462 ms

When the bitmap scans are enabled the "prof" of postgres shows
47.10%  postmaster  postgres   [.] hash_search_with_hash_value
|
--- hash_search_with_hash_value

11.06%  postmaster  postgres   [.] hash_seq_search
|
--- hash_seq_search

 6.95%  postmaster  postgres   [.] hash_any
|
--- hash_any

 5.17%  postmaster  postgres   [.] _bt_checkkeys
|
--- _bt_checkkeys

 4.07%  postmaster  postgres   [.] tbm_add_tuples
|
--- tbm_add_tuples

 3.41%  postmaster  postgres   [.] hash_search
|
--- hash_search


And the last note is that the crts.data table which is being bitmap scanned is 
a 1.1 TB table with ~20e9 rows. My feeling is that the bitmap index scan code
is somehow unprepared to combine two bitmaps for such a big table, and this
leads to the terrible performance.

Regards,
Sergey

PS Here are the schemas of the tables, just in case:
wsdb=> \d test
  Table "koposov.test"
 Column  |   Type   | Modifiers
-+--+---
 mjd | double precision |
 fieldid | bigint   |
 intmag  | integer  |
 objid   | bigint   |

wsdb=> \d crts.data
   Table "crts.data"
 Column |   Type   | Modifiers
+--+---
 objid  | bigint   |
 mjd| double precision |
 mag| real |
 emag   | real |
 ra | double precision |
 dec| double precision |
Indexes:
"data_mjd_idx" btree (mjd) WITH (fillfactor=100)
"data_objid_idx" btree (objid) WITH (fillfactor=100)
"data_q3c_ang2ipix_idx" btree (q3c_ang2ipix(ra, "dec")) WITH 
(fillfactor=100)

PPS shared_buffers=10GB, work_mem=1GB.
All the tests shown here were done in a fully cached regime.

PPPS I can believe that what I'm seeing is a feature, not a bug, of bitmap scans,
and I can live with disabling them, but I still thought it was worth reporting.


*
Sergey E. Koposov, PhD, Research Associate
Institute of Astronomy, University of Cambridge
Madingley road, CB3 0HA, Cambridge, UK
Tel

Re: [HACKERS] slow dropping of tables, DropRelFileNodeBuffers, tas

2012-06-07 Thread Sergey Koposov

On Thu, 7 Jun 2012, Tom Lane wrote:

I extended the patch to also cover DropDatabaseBuffers,
FlushRelationBuffers, and FlushDatabaseBuffers, which have got the exact
same type of full-pool scan loop, and committed it.


Thanks everybody for the patches/commits.

Sergey

*
Sergey E. Koposov, PhD, Research Associate
Institute of Astronomy, University of Cambridge
Madingley road, CB3 0HA, Cambridge, UK
Tel: +44-1223-337-551 Web: http://www.ast.cam.ac.uk/~koposov/

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] slow dropping of tables, DropRelFileNodeBuffers, tas

2012-06-07 Thread Sergey Koposov

On Tue, 5 Jun 2012, Simon Riggs wrote:


Sounds less good and we'd need reasonable proof it actually did
anything useful without being dangerous.


Doing an initial unlocked test speeds things up another 2.69 fold (on
top of 3.55 for your patch) for me, with 1GB of shared buffers.  That
seems like it should be worthwhile.

How do we go about getting reasonable proof that it is safe?


That's enough for me.


So is it planned to apply that patch to 9.2?

Thanks,
S

*
Sergey E. Koposov, PhD, Research Associate
Institute of Astronomy, University of Cambridge
Madingley road, CB3 0HA, Cambridge, UK
Tel: +44-1223-337-551 Web: http://www.ast.cam.ac.uk/~koposov/
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.2beta1, parallel queries, ReleasePredicateLocks, CheckForSerializableConflictIn in the oprofile

2012-06-06 Thread Sergey Koposov

On Wed, 6 Jun 2012, Merlin Moncure wrote:

I think this is the expected result.  In the single-user case the
spinlock never spins and only has to issue the CPU-locking cache
instructions once.  Can we see results @24 threads?


Here 
https://docs.google.com/open?id=0B7koR68V2nM1NDJHLUhNSS0zbUk


*
Sergey E. Koposov, PhD, Research Associate
Institute of Astronomy, University of Cambridge
Madingley road, CB3 0HA, Cambridge, UK
Tel: +44-1223-337-551 Web: http://www.ast.cam.ac.uk/~koposov/

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.2beta1, parallel queries, ReleasePredicateLocks, CheckForSerializableConflictIn in the oprofile

2012-06-06 Thread Sergey Koposov

On Wed, 6 Jun 2012, Ants Aasma wrote:


On Wed, Jun 6, 2012 at 2:27 PM, Sergey Koposov  wrote:

I've quickly tested your lockfree-getbuffer.patch with the test case
you provided and I barely see any improvement (2% at max):
https://docs.google.com/open?id=0B7koR68V2nM1QVBxWGpZdW4wd0U
Tested with 24 cores (48 HT cores, Xeon E7-4807).
Although the tps vs. number of threads looks weird.


Was this the range scan on the test table? (sorry about the error in
the query, the x should really be id) In that case the results look
really suspicious.


Yes, partially my fault, because without much thought I had put "value" 
instead of "x" in the script. Now after replacing it with "id" the tps 
numbers are much smaller.


Here is the tps vs. nthreads; I tested up to 10 threads on my 24-CPU 
system (I disabled HT though):

https://docs.google.com/open?id=0B7koR68V2nM1Nk9OcWNJOTRrYVE

Your patch clearly improves the situation (the peak tps is ~10% higher), 
but the general picture is the same: flattening of tps vs. nthreads.


Cheers,
S

*
Sergey E. Koposov, PhD, Research Associate
Institute of Astronomy, University of Cambridge
Madingley road, CB3 0HA, Cambridge, UK
Tel: +44-1223-337-551 Web: http://www.ast.cam.ac.uk/~koposov/

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.2beta1, parallel queries, ReleasePredicateLocks, CheckForSerializableConflictIn in the oprofile

2012-06-06 Thread Sergey Koposov

Hi,

On Mon, 4 Jun 2012, Ants Aasma wrote:

On Mon, Jun 4, 2012 at 7:44 PM, Merlin Moncure  wrote:
I tried to keep it simple at first to find an answer to the question
if it's even worth trying before expending large effort on it. If
anyone with a multisocket machine would chip in, I'd love to know how
this patch handles on larger machines. I think the most interesting
workloads are read only loads with heavy buffer trashing but inside OS
memory. Select only pgbench with 32MB shared_buffers was within error
margin, although slightly faster on my machine (Intel i2500K). The
workload that I used to demonstrate gain was an aggregated index scan
to minimise other overheads.


I've quickly tested your lockfree-getbuffer.patch with 
the test case you provided and I barely see any improvement (2% at max):

https://docs.google.com/open?id=0B7koR68V2nM1QVBxWGpZdW4wd0U
Tested with 24 cores (48 HT cores, Xeon E7-4807).
Although the tps vs. number of threads looks weird.

Cheers,
S

*
Sergey E. Koposov, PhD, Research Associate
Institute of Astronomy, University of Cambridge
Madingley road, CB3 0HA, Cambridge, UK
Tel: +44-1223-337-551 Web: http://www.ast.cam.ac.uk/~koposov/
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.2beta1, parallel queries, ReleasePredicateLocks, CheckForSerializableConflictIn in the oprofile

2012-06-01 Thread Sergey Koposov

On Thu, 31 May 2012, Jeff Janes wrote:



No, idt_match is getting filled by multi-threaded copy() and then joined
with 4 other big tables like idt_phot. The result is then split into
partitions.


That does make things more complicated.  But could you partition
it at that level and then do the joins partition-wise?


Unfortunately the information needed to determine which partition the data 
should go into is only available from the idt_match table. So I have 
to do at least one join with idt_match. But I will experiment further 
with ways to perform the queries; I just stopped doing that because I saw 
these problems with scalability.



I don't have much experience at data partitioning (well, I do, but the
experience is with partitioning in Perl with terabytes of flat files,
not in PG :) ) but I think that once you have your partitioning keys
you want to apply them the same way up and down the data set.


I'm not sure what you mean by "the same way up and down the data set".

Cheers,
S

*
Sergey E. Koposov, PhD, Research Associate
Institute of Astronomy, University of Cambridge
Madingley road, CB3 0HA, Cambridge, UK
Tel: +44-1223-337-551 Web: http://www.ast.cam.ac.uk/~koposov/

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] slow dropping of tables, DropRelFileNodeBuffers, tas

2012-06-01 Thread Sergey Koposov

On Fri, 1 Jun 2012, Simon Riggs wrote:



Why do you have 10,000 tables and why is it important to drop them so quickly?


10,000 tables are there because that's the number of partitions. And I'm 
dropping them at the moment because I'm doing testing, so it won't be
really crucial for production. But I still thought it was worth reporting, 
especially when dropping each table took ~0.5 sec.


The problem is that when I set shared_buffers to, say, 48G, the 
timings of the table drops rise significantly again.



If it's that important, why not run the drop in parallel sessions?


Yes, before the patch there was a strong reason to do that; now the timings are 
more manageable, but maybe I'll implement that.


Cheers,
S

*
Sergey E. Koposov, PhD, Research Associate
Institute of Astronomy, University of Cambridge
Madingley road, CB3 0HA, Cambridge, UK
Tel: +44-1223-337-551 Web: http://www.ast.cam.ac.uk/~koposov/

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] slow dropping of tables, DropRelFileNodeBuffers, tas

2012-05-31 Thread Sergey Koposov

On Thu, 31 May 2012, Simon Riggs wrote:



That struck me as a safe and easy optimisation. This was a problem I'd
been trying to optimise for 9.2, so I've written a patch that appears
simple and clean enough to be applied directly.


Thanks! The patch indeed improved the timings. 
Dropping 100 tables in a single commit took ~50 seconds before the patch; 
now it takes ~5 sec (it would be nice to reduce it further 
though, because dropping 10,000 tables still takes ~10 min).


Cheers,
S

*
Sergey E. Koposov, PhD, Research Associate
Institute of Astronomy, University of Cambridge
Madingley road, CB3 0HA, Cambridge, UK
Tel: +44-1223-337-551 Web: http://www.ast.cam.ac.uk/~koposov/

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.2beta1, parallel queries, ReleasePredicateLocks, CheckForSerializableConflictIn in the oprofile

2012-05-31 Thread Sergey Koposov

On Thu, 31 May 2012, Robert Haas wrote:


Oh, ho.  So from this we can see that the problem is that we're
getting huge amounts of spinlock contention when pinning and unpinning
index pages.

It would be nice to have a self-contained reproducible test case for
this, so that we could experiment with it on other systems.


I have created it a few days ago:
http://archives.postgresql.org/pgsql-hackers/2012-05/msg01143.php

It is still valid, and it is exactly what I'm using to test. The only thing to 
change is to create a two-column index and drop another index.

The scripts are precisely the ones I'm using now.
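
The index change described above, as a sketch (the index names are taken from elsewhere in this thread):

CREATE INDEX idt_match_transitid_healpixid_idx ON idt_match (transitid, healpixid);
DROP INDEX idt_match_transitid_idx;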

The problem is that in order to see a really big slowdown (10 times slower 
than a single thread) I've had to raise shared_buffers to 48 GB, but it was 
slow for smaller shared_buffers settings as well.


But I'm not sure how sensitive the test is to the hardware.

Cheers,
S

*
Sergey E. Koposov,  PhD, Research Associate
Institute of Astronomy, University of Cambridge 
Madingley road, CB3 0HA, Cambridge, UK Tel: +44-1223-337-551

Web: http://www.ast.cam.ac.uk/~koposov/

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.2beta1, parallel queries, ReleasePredicateLocks, CheckForSerializableConflictIn in the oprofile

2012-05-30 Thread Sergey Koposov

On Wed, 30 May 2012, Jeff Janes wrote:


But the question now is whether there is a *PG* problem here or not, or whether
it is Intel's or Linux's problem, because the slowdown was still caused by
locking. If there were no locking there would be no problems (as
demonstrated a while ago by just cat'ting the files in multiple threads).


You cannot have a traditional RDBMS without locking.  From your


I understand the need for significant locking when there are concurrent writes, 
but not when there are only reads. But I'm not an RDBMS expert, so 
maybe that's a misunderstanding on my side.



description of the problem,  I probably wouldn't be using a traditional
database system at all for this, but rather flat files and Perl.


Flat files and Perl for 25-50 TB of data over a few years is a bit extreme 
;)



Or
at least, I would partition the data before loading it to the DB,
rather than trying to do it after.


I intentionally did otherwise, because I thought that PG would be 
much smarter than me at juggling the data I'm ingesting (tens of 
GB each day), joining the appropriate bits of data and then splitting it into 
partitions. Unfortunately I see that there are some scalability 
issues along the way, which I didn't expect. They aren't fatal, but are slightly 
disappointing.



But anyway, is idt_match a fairly static table?  If so, I'd partition
that into 16 tables, and then have each one of your tasks join against
a different one of those tables.  That should relieve the contention
on the index root block, and might have some other benefits as well.


No, idt_match is getting filled by a multi-threaded COPY and then joined 
with 4 other big tables like idt_phot. The result is then split into 
partitions. And I was trying different approaches to fully utilize the 
CPUs and/or I/O and somehow parallelize the queries. That's the 
reasoning behind the somewhat contrived queries in my test.


Cheers,
S

*
Sergey E. Koposov, PhD, Research Associate
Institute of Astronomy, University of Cambridge
Madingley road, CB3 0HA, Cambridge, UK
Tel: +44-1223-337-551 Web: http://www.ast.cam.ac.uk/~koposov/

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.2beta1, parallel queries, ReleasePredicateLocks, CheckForSerializableConflictIn in the oprofile

2012-05-30 Thread Sergey Koposov

On Thu, 31 May 2012, Florian Pflug wrote:

Wait, so performance *increased* by spreading the backends out over as 
many dies as possible, not by using as few as possible? That'd be 
exactly the opposite of what I'd have expected. (I'm assuming that cores 
on one die have ascending ids on linux. If you could post the contents 
of /proc/cpuinfo, we could verify that)


Yes, you are correct. And I can also confirm that the CPUs in cpuinfo 
are ordered by "physical id", e.g. they go like

0 0 0 0 0 0 1 1 1 1 1 1 2 2 2 2 2 2 3 3 3 3 3 3

I did a specific test with just 6 threads (== number of cores per CPU)
and ran it on a single physical CPU; it took ~12 seconds for each thread,
and when I tried to spread it across 4 CPUs it took 7-9 seconds per 
thread. But all these numbers are significantly better than when I 
didn't use taskset. Which probably means that without it the processes were 
jumping from core to core? ...


The slowdown was still caused by locking; if there were no 
locking there would be no problems (as demonstrated a while ago by 
just cat'ting the files in multiple threads).


Yup, we'll have to figure out a way to reduce the locking overhead. 9.2 
already scales much better to a large number of cores than previous 
versions did, but your test case shows that there's still room for 
improvement.


Yes, and unfortunately these scaling problems are in read-only, CPU-bound 
queries, where naively I wouldn't expect any.


Thanks,

Sergey

*
Sergey E. Koposov, PhD, Research Associate
Institute of Astronomy, University of Cambridge
Madingley road, CB3 0HA, Cambridge, UK
Tel: +44-1223-337-551 Web: http://www.ast.cam.ac.uk/~koposov/

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.2beta1, parallel queries, ReleasePredicateLocks, CheckForSerializableConflictIn in the oprofile

2012-05-30 Thread Sergey Koposov

On Wed, 30 May 2012, Florian Pflug wrote:


I wonder if the huge variance could be caused by non-uniform 
synchronization costs across different cores. That's not all that 
unlikely, because at least some cache levels (L2 and/or L3, I think) are 
usually shared between all cores on a single die. Thus, a cache bouncing 
line between cores on the same die might very well be faster then it 
bouncing between cores on different dies.


On linux, you can use the taskset command to explicitly assign processes 
to cores. The easiest way to check if that makes a difference is to 
assign one core for each connection to the postmaster before launching 
your test. Assuming that cpu assignment are inherited to child 
processes, that should then spread your backends out over exactly the 
cores you specify.


Wow, thanks! This seems to be working to some extent. I've found that 
distributing each thread x (0 <= x < Nthreads) across the different physical 
CPUs (reminder: I have HT disabled and in total I have 4 CPUs with 6 
proper cores each) gives quite good results. And after a few runs I seem 
to be getting more or less stable results for the multiple threads, 
with the performance of multithreaded runs going from 6 to 11 seconds for 
the various threads (another reminder: 5-6 seconds is roughly the 
timing of my queries running in a single thread).


So to some extent one can say that the problem is partially solved (i.e. 
it is probably understood).
But the question now is whether there is a *PG* problem here or 
not, or whether it is Intel's or Linux's problem, 
because the slowdown was still caused by locking. If there were no 
locking there would be no problems (as demonstrated a while ago by 
just cat'ting the files in multiple threads).


Cheers,
S

*
Sergey E. Koposov, PhD, Research Associate
Institute of Astronomy, University of Cambridge
Madingley road, CB3 0HA, Cambridge, UK
Tel: +44-1223-337-551 Web: http://www.ast.cam.ac.uk/~koposov/

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.2beta1, parallel queries, ReleasePredicateLocks, CheckForSerializableConflictIn in the oprofile

2012-05-30 Thread Sergey Koposov

On Wed, 30 May 2012, Merlin Moncure wrote:


How big is idt_match?  What if you drop all indexes on idt_match,
encouraging all the backends to do hash joins against it, which occur
in local memory and so don't have contention?


You just missed his post -- it's only 3G.   can you run your 'small'
working set against 48gb shared buffers?


Just tried it 3 times and it actually got much worse: ~70-80 seconds per 
query in the parallel setup (i.e. >10 times slower than the single-threaded run).


The oprofile then looks like this:

CPU: Intel Architectural Perfmon, speed 1862 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask 
of 0x00 (No unit mask) count 10
samples  %linenr info symbol name
---
883523   46.3676  (no location information)   s_lock
  883523   100.000  (no location information)   s_lock [self]
---
303984   15.9532  (no location information)   PinBuffer
  303984   100.000  (no location information)   PinBuffer [self]

The problem is that since there is that variability in the times, I don't 
really know 100% whether this trend of slowing down with increasing 
shared memory is genuine or not.


I've also tried shared_buffers=1G just in case, and it is still very slow 
(50 sec).


After that I changed the shared buffers back to 10G and the timings got 
back to 25 sec.


Weird...

I still wonder whether there is a problem with the way the locking is done 
(as referenced in the recent "dropping tables slowness" thread).


Cheers,
S

*
Sergey E. Koposov, PhD, Research Associate
Institute of Astronomy, University of Cambridge
Madingley road, CB3 0HA, Cambridge, UK
Tel: +44-1223-337-551 Web: http://www.ast.cam.ac.uk/~koposov/
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.2beta1, parallel queries, ReleasePredicateLocks, CheckForSerializableConflictIn in the oprofile

2012-05-30 Thread Sergey Koposov

On Wed, 30 May 2012, Merlin Moncure wrote:


hurk --  ISTM that since the IOS is masking the heap lookups, there must
be contention on the index itself?  Does this working set fit in
shared memory?  If so, what happens when you do a database restart and
repeat the IOS test?


The dataset fits well in shared buffers

Here are the data sizes
 public.idt_photoobservation_small        |  521 MB |  528756
 public.idt_match_idx                     | 1084 MB |  528762
 public.idt_match_transitid_healpixid_idx | 1522 MB | 1955140
 public.idt_match                         | 2906 MB |  528753

And shared buffers are 10G
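
For reference, a sketch of one way to list such sizes (the query actually used is not shown here, so this is only an assumption):

SELECT relname,
       pg_size_pretty(pg_relation_size(oid)) AS size,
       relfilenode
  FROM pg_class
 WHERE relname LIKE 'idt_%'
 ORDER BY pg_relation_size(oid) DESC;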

If I restart the db the timings do not change significantly. There is always some 
variation which I don't really understand, e.g. the parallel runs sometimes
take 18, 25, or 30 seconds per thread. So there is something else affecting
the runs -- I don't know, maybe it's related to which thread starts first,
or what the starting point of the seq scan is... (there is no other 
activity on the machine, btw).


Cheers,
S

*
Sergey E. Koposov, PhD, Research Associate
Institute of Astronomy, University of Cambridge
Madingley road, CB3 0HA, Cambridge, UK
Tel: +44-1223-337-551 Web: http://www.ast.cam.ac.uk/~koposov/

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.2beta1, parallel queries, ReleasePredicateLocks, CheckForSerializableConflictIn in the oprofile

2012-05-30 Thread Sergey Koposov

On Wed, 30 May 2012, Merlin Moncure wrote:


Hm, why aren't we getting an IOS?  Just for kicks (assuming this is
test data), can we drop the index on just transitid, leaving the index
on (transitid, healpixid)? Is enable_indexonlyscan on?  Has idt_match
been vacuumed?  What kind of plan do you get when you do:


OK, dropping the index on transitid solved the issue with the index-only scan 
but didn't solve the original problem. Actually the index-only scan made the 
sequential queries faster but not the parallel ones.


Here is a side by side explain analyze again for indexonly scans

Bogged:


 QUERY PLAN
-
 Sort  (cost=82262274.19..82262286.69 rows=5000 width=498) (actual 
time=19635.318..19635.318 rows=0 loops=1)
   Sort Key: y.x
   Sort Method: quicksort  Memory: 25kB
   ->  Subquery Scan on y  (cost=0.00..82261967.00 rows=5000 width=498) (actual 
time=19635.282..19635.282 rows=0 loops=1)
 Filter: ((y.x % 16::bigint) = 7)
 Rows Removed by Filter: 100
 ->  Limit  (cost=0.00..82246967.00 rows=100 width=490) (actual 
time=0.076..19373.922 rows=100 loops=1)
   ->  Seq Scan on idt_photoobservation_small o  
(cost=0.00..82246967.00 rows=100 width=490) (actual time=0.074..19179.336 
rows=100 loops=1)
 SubPlan 1
   ->  Index Only Scan using 
idt_match_transitid_healpixid_idx on idt_match m  (cost=0.00..82.17 rows=1 
width=8) (actual time=0.016..0.017 rows=1 loops=100)
 Index Cond: (transitid = o.transitid)
 Heap Fetches: 0
 Total runtime: 19702.727 ms
(13 rows)
***
vs non-bogged
***

 QUERY PLAN
-
 Sort  (cost=82262274.19..82262286.69 rows=5000 width=498) (actual 
time=6198.334..6198.334 rows=0 loops=1)
   Sort Key: y.x
   Sort Method: quicksort  Memory: 25kB
   ->  Subquery Scan on y  (cost=0.00..82261967.00 rows=5000 width=498) (actual 
time=6198.285..6198.285 rows=0 loops=1)
 Filter: ((y.x % 16::bigint) = 7)
 Rows Removed by Filter: 100
 ->  Limit  (cost=0.00..82246967.00 rows=100 width=490) (actual 
time=0.116..5932.766 rows=100 loops=1)
   ->  Seq Scan on idt_photoobservation_small o  
(cost=0.00..82246967.00 rows=100 width=490) (actual time=0.113..5730.012 
rows=100 loops=1)
 SubPlan 1
   ->  Index Only Scan using 
idt_match_transitid_healpixid_idx on idt_match m  (cost=0.00..82.17 rows=1 
width=8) (actual time=0.003..0.003 rows=1 loops=100)
 Index Cond: (transitid = o.transitid)
 Heap Fetches: 0
 Total runtime: 6239.512 ms
(13 rows)
**

Cheers,
S

PS I didn't know that an index on two columns together with index-only scans can help 
queries based on just one of those columns. Very helpful!
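
A minimal sketch of what that looks like, with the index from this thread (the literal transitid value is just an example):

-- the composite index can satisfy an index-only scan even when the query
-- filters only on the leading column, since healpixid is returned from the index
CREATE INDEX idt_match_transitid_healpixid_idx ON idt_match (transitid, healpixid);
VACUUM idt_match;  -- keep the visibility map fresh so heap fetches stay at 0
EXPLAIN ANALYZE
SELECT healpixid FROM idt_match WHERE transitid = 12345;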


*
Sergey E. Koposov, PhD, Research Associate
Institute of Astronomy, University of Cambridge
Madingley road, CB3 0HA, Cambridge, UK
Tel: +44-1223-337-551 Web: http://www.ast.cam.ac.uk/~koposov/

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.2beta1, parallel queries, ReleasePredicateLocks, CheckForSerializableConflictIn in the oprofile

2012-05-30 Thread Sergey Koposov

On Wed, 30 May 2012, Merlin Moncure wrote:



1. Can we see an explain analyze during a 'bogged' case?


Here is the one-to-one comparison of the 'bogged' case 
**

  
QUERY PLAN
--
 Sort  (cost=63835201.73..63835214.23 rows=5000 width=498) (actual 
time=18007.500..18007.500 rows=0 loops=1)
   Sort Key: y.x
   Sort Method: quicksort  Memory: 25kB
   ->  Subquery Scan on y  (cost=0.00..63834894.54 rows=5000 width=498) (actual 
time=18007.454..18007.454 rows=0 loops=1)
 Filter: ((y.x % 16::bigint) = 7)
 Rows Removed by Filter: 100
 ->  Limit  (cost=0.00..63819894.51 rows=102 width=490) (actual 
time=0.047..17734.570 rows=100 loops=1)
   ->  Seq Scan on idt_photoobservation_small o  
(cost=0.00..63819894.51 rows=102 width=490) (actual time=0.045..17543.902 
rows=100 loops=1)
 SubPlan 1
   ->  Index Scan using idt_match_transitid_idx on 
idt_match m  (cost=0.00..63.74 rows=1 width=8) (actual time=0.015..0.015 rows=1 
loops=100)
 Index Cond: (transitid = o.transitid)
 Total runtime: 18056.866 ms
(12 rows)

Time: 18067.929 ms
*

vs  non-bogged:

**
  
QUERY PLAN
--
 Sort  (cost=63835201.73..63835214.23 rows=5000 width=498) (actual 
time=6635.133..6635.133 rows=0 loops=1)
   Sort Key: y.x
   Sort Method: quicksort  Memory: 25kB
   ->  Subquery Scan on y  (cost=0.00..63834894.54 rows=5000 width=498) (actual 
time=6635.091..6635.091 rows=0 loops=1)
 Filter: ((y.x % 16::bigint) = 7)
 Rows Removed by Filter: 100
 ->  Limit  (cost=0.00..63819894.51 rows=102 width=490) (actual 
time=0.059..6344.683 rows=100 loops=1)
   ->  Seq Scan on idt_photoobservation_small o  
(cost=0.00..63819894.51 rows=102 width=490) (actual time=0.056..6149.429 
rows=100 loops=1)
 SubPlan 1
   ->  Index Scan using idt_match_transitid_idx on 
idt_match m  (cost=0.00..63.74 rows=1 width=8) (actual time=0.003..0.004 rows=1 
loops=100)
 Index Cond: (transitid = o.transitid)
 Total runtime: 6669.215 ms
(12 rows)

Time: 6673.991 ms
**



2. Can we try to get 'index only scan' working over idt_match?  That's
should be a matter of adjusting the index so that it's:
create index on idt_match (transitid, healpixid);


I don't think I could do that. I created the index you asked for (although 
it is unclear why it would help...), tried EXPLAIN, and it still used a 
simple index scan. After disabling index scans it decided to use a bitmap scan 
(which isn't actually faster in the multithreaded setup), and after disabling 
bitmap scans it switched to a seq scan.


Cheers,
Sergey

PS Just for the reference the current indices on idt_match are
"idt_match_idx" btree (healpixid)
"idt_match_transitid_healpixid_idx" btree (transitid, healpixid)
"idt_match_transitid_idx" btree (transitid)



*
Sergey E. Koposov, PhD, Research Associate
Institute of Astronomy, University of Cambridge
Madingley road, CB3 0HA, Cambridge, UK
Tel: +44-1223-337-551 Web: http://www.ast.cam.ac.uk/~koposov/

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.2beta1, parallel queries, ReleasePredicateLocks, CheckForSerializableConflictIn in the oprofile

2012-05-30 Thread Sergey Koposov

Here is the actual explain analyze of the query on the smaller dataset
which I have been using for the recent testing.

test=# explain analyze create table _tmp0 as select * from
  ( select *,
(select healpixid from idt_match as m where m.transitid=o.transitid)
as x from idt_photoobservation_small as o offset 0
  ) as y where x%16=0 order by x;
  
QUERY PLAN

--
 Sort  (cost=63835201.73..63835214.23 rows=5000 width=498) (actual 
time=8203.041..8252.216 rows=173696 loops=1)
   Sort Key: y.x
   Sort Method: quicksort  Memory: 182555kB
   ->  Subquery Scan on y  (cost=0.00..63834894.54 rows=5000 width=498) (actual 
time=0.102..7602.947 rows=173696 loops=1)
 Filter: ((y.x % 16::bigint) = 0)
 Rows Removed by Filter: 826304
 ->  Limit  (cost=0.00..63819894.51 rows=102 width=490) (actual 
time=0.041..7296.401 rows=100 loops=1)
   ->  Seq Scan on idt_photoobservation_small o  
(cost=0.00..63819894.51 rows=102 width=490) (actual time=0.038..7094.555 
rows=100 loops=1)
 SubPlan 1
   ->  Index Scan using idt_match_transitid_idx on 
idt_match m  (cost=0.00..63.74 rows=1 width=8) (actual time=0.003..0.004 rows=1 
loops=100)
 Index Cond: (transitid = o.transitid)
 Total runtime: 8908.631 ms

*
Sergey E. Koposov, PhD, Research Associate
Institute of Astronomy, University of Cambridge
Madingley road, CB3 0HA, Cambridge, UK
Tel: +44-1223-337-551 Web: http://www.ast.cam.ac.uk/~koposov/

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] slow dropping of tables, DropRelFileNodeBuffers, tas

2012-05-29 Thread Sergey Koposov

Hello,

I was running some tests on PG 9.2beta where I'm creating and dropping 
a large number of tables (~20,000).


And I noticed that table dropping was extremely slow -- e.g. like half a 
second per table.


I tried to move the table dropping into bigger transactions (100 drops per 
transaction), at the same time increasing max_locks_per_transaction to 128.

And still, the commits took ~50-60 seconds.
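
A minimal sketch of the kind of batched drop being timed (the table names are illustrative; the real run dropped the partition tables 100 per transaction):

-- with max_locks_per_transaction = 128 in postgresql.conf
BEGIN;
DROP TABLE part_00001;
DROP TABLE part_00002;
-- ... 100 DROP TABLE statements per transaction ...
COMMIT;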

I tried to oprofile it, and I saw that 99% of the time is spent in 
DropRelFileNodeBuffers, and when compiled with symbols (CFLAGS=-g)
I saw that actually most of the time is spent in the tas() function, see 
below:


CPU: Intel Architectural Perfmon, speed 1862 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask 
of 0x00 (No unit mask) count 10
samples  %linenr info symbol name
---
831665   86.6756  s_lock.h:204    tas
  831665   100.000  s_lock.h:204    tas [self]
---
122573   12.7745  bufmgr.c:2048   DropRelFileNodeBuffers
  122573   100.000  bufmgr.c:2048   DropRelFileNodeBuffers [self]
---
849   0.0885  xlog.c:697  XLogInsert
  849  100.000  xlog.c:697  XLogInsert [self]
---
269   0.0280  catcache.c:444  CatalogCacheIdInvalidate
  269  100.000  catcache.c:444  CatalogCacheIdInvalidate [self]
---
232   0.0242  catcache.c:1787 PrepareToInvalidateCacheTuple
  232  100.000  catcache.c:1787 PrepareToInvalidateCacheTuple 
[self]
---
202   0.0211  dynahash.c:807  hash_search_with_hash_value
  202  100.000  dynahash.c:807  hash_search_with_hash_value 
[self]
---
199   0.0207  nbtsearch.c:344 _bt_compare
  199  100.000  nbtsearch.c:344 _bt_compare [self]
---
198   0.0206  list.c:506  list_member_oid
  198  100.000  list.c:506  list_member_oid [self]


Is the current behaviour expected?
I ask because I saw the comment around DropRelFileNodeBuffers saying
"XXX currently it sequentially searches the buffer pool, should be
changed to more clever ways of searching.  However, this routine is used
only in code paths that aren't very performance-critical, and we shouldn't
slow down the hot paths to make it faster".


Maybe it is a stupid idea, but I also wonder whether the root cause of what
I'm seeing can also be responsible for the scaling and locking problems I
recently reported:

http://archives.postgresql.org/pgsql-hackers/2012-05/msg01118.php

Some additional info:
The database is accessed from a single thread, and shared_buffers is 10G.
The tables themselves are essentially empty. The CPU, if it matters, is
4x Xeon E7-4807 (Westmere architecture).

I did a vacuum full of everything just in case, and it didn't help.
Another possibly important factor I noticed is that
pg_catalog.pg_attribute is quite large (238 meg), because of the large
number of tables times columns.
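
(That catalog size can be checked directly with, e.g.,
SELECT pg_size_pretty(pg_total_relation_size('pg_catalog.pg_attribute'));)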


I also stopped PG with gdb a few times and it was always at this backtrace:

(gdb) bt
#0  tas (lock=0x7fa4e3007538 "\001") at 
../../../../src/include/storage/s_lock.h:218
#1  0x006e6956 in DropRelFileNodeBuffers (rnode=..., 
forkNum=VISIBILITYMAP_FORKNUM, firstDelBlock=0) at bufmgr.c:2062
#2  0x0070c014 in smgrdounlink (reln=0x1618210, 
forknum=VISIBILITYMAP_FORKNUM, isRedo=0 '\000') at smgr.c:354
#3  0x0051ecf6 in smgrDoPendingDeletes (isCommit=1 '\001') at 
storage.c:364
#4  0x004a7b33 in CommitTransaction () at xact.c:1925
#5  0x004a8479 in CommitTransactionCommand () at xact.c:2524
#6  0x00710b3f in finish_xact_command () at postgres.c:2419
#7  0x0070ff4e in exec_execute_message (portal_name=0x1608990 "", 
max_rows=1) at postgres.c:1956
#8  0x00712988 in PostgresMain (argc=2, argv=0x1548568, username=0x1548390 
with only forknum changing to MAIN_FORKNUM or FSM_FORKNUM


Thanks in advance,
Sergey


*
Sergey E. Koposov, PhD, Research Associate
Institute of Astronomy, University of Cambridge
Madingley road, CB3 0HA, Cambridge, UK
Tel: +44-1223-337-551 Web: http://www.ast.cam.ac.uk/~koposov/

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] 9.2beta1, parallel queries, ReleasePredicateLocks, CheckForSerializableConflictIn in the oprofile

2012-05-27 Thread Sergey Koposov

Hi,

I did another test using the same data and the same code which I've provided
before, and the performance of a single thread seems to be degrading
quadratically with the number of threads.


Here are the results:
Nthreads Time_to_execute_one_thread (sec)
1 8.1
2 7.8
3 8.1
4 9.0
5 10.2
6 11.4
7 13.3
8 16.1
9 19.0
10 21.4
11 23.8
12 27.3
13 30.2
14 32.0
15 34.1
16 37.5

Regards,
Sergey



On Sat, 26 May 2012, Stephen Frost wrote:


* Sergey Koposov (kopo...@ast.cam.ac.uk) wrote:

Turning off synch seq scans doesn't help either. 18 sec
multithreaded run vs 7 sec single threaded.


Alright, can you just time 'cat' when they're started a few seconds or
whatever apart from each other?  I can't imagine it being affected in
the same way as these, but seems like it wouldn't hurt to check.

Thanks,

Stephen



*
Sergey E. Koposov, PhD, Research Associate
Institute of Astronomy, University of Cambridge
Madingley road, CB3 0HA, Cambridge, UK
Tel: +44-1223-337-551 Web: http://www.ast.cam.ac.uk/~koposov/

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.2beta1, parallel queries, ReleasePredicateLocks, CheckForSerializableConflictIn in the oprofile

2012-05-26 Thread Sergey Koposov

On Sat, 26 May 2012, Stephen Frost wrote:


Alright, can you just time 'cat' when they're started a few seconds or
whatever apart from each other?  I can't imagine it being affected in
the same way as these, but seems like it wouldn't hurt to check.


I've tried cat'ting an 8 gig file, created in advance, on the same filesystem.
During the multi-threaded run it takes ~4 sec, roughly the same as
when run separately.



*
Sergey E. Koposov, PhD, Research Associate
Institute of Astronomy, University of Cambridge
Madingley road, CB3 0HA, Cambridge, UK
Tel: +44-1223-337-551 Web: http://www.ast.cam.ac.uk/~koposov/

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.2beta1, parallel queries, ReleasePredicateLocks, CheckForSerializableConflictIn in the oprofile

2012-05-26 Thread Sergey Koposov

On Sat, 26 May 2012, Robert Haas wrote:

This theory is seeming fairly plausible - how can we test it?

How about trying it with synchronize_seqscans = off?  If the
synchronized-seqscan logic is causing contention on the buf mapping
locks and individual buffer locks, that should fix it.


Turning off synch seq scans doesn't help either. 18 sec multithreaded run 
vs 7 sec single threaded.
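
(The switch in question is the synchronize_seqscans GUC; turning it off for a
session is just "SET synchronize_seqscans = off;" before re-running the test.)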


S

*
Sergey E. Koposov, PhD, Research Associate
Institute of Astronomy, University of Cambridge
Madingley road, CB3 0HA, Cambridge, UK
Tel: +44-1223-337-551 Web: http://www.ast.cam.ac.uk/~koposov/

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.2beta1, parallel queries, ReleasePredicateLocks, CheckForSerializableConflictIn in the oprofile

2012-05-25 Thread Sergey Koposov

On Fri, 25 May 2012, Merlin Moncure wrote:

can you hack this in heapam.c and see if it helps?
line 131-ish:

if (!RelationUsesLocalBuffers(scan->rs_rd) &&
scan->rs_nblocks > NBuffers / 4)
becomes
if (!RelationUsesLocalBuffers(scan->rs_rd))

(also you can set the partition count back).


The first few runs of my test with the suggested change made the
multithreaded queries even slower, by a factor of two.

E.g. 44 seconds for the multithreaded run vs ~ 7 sec single threaded
(with the default 9.2beta I get ~ 14-22 sec for the multithreaded run).
But now, after another few reruns of my test, I see this variability in the
multithreaded run times again. It went down to 18-22 seconds, then back up
to 44 sec. So overall the change you suggested either doesn't improve
the situation or makes it worse.


Regards,
S

*
Sergey E. Koposov, PhD, Research Associate
Institute of Astronomy, University of Cambridge
Madingley road, CB3 0HA, Cambridge, UK
Tel: +44-1223-337-551 Web: http://www.ast.cam.ac.uk/~koposov/

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.2beta1, parallel queries, ReleasePredicateLocks, CheckForSerializableConflictIn in the oprofile

2012-05-25 Thread Sergey Koposov

On Fri, 25 May 2012, Robert Haas wrote:


On Thu, May 24, 2012 at 7:26 PM, Sergey Koposov  wrote:

Here is the output from a multi-threaded run (8 threads, 22 seconds) sorted
by blk. Not sure whether that's of much use or not:


What are the top dozen or so entries if you sort by shacq?


Here it is:

PID 7108 lwlock 44: shacq 1127902 exacq 1548 blk 511
PID 7115 lwlock 44: shacq 1127495 exacq 633 blk 401
PID 7114 lwlock 44: shacq 1127418 exacq 1024 blk 702
PID 7111 lwlock 44: shacq 1127324 exacq 957 blk 651
PID 7109 lwlock 44: shacq 1127261 exacq 388 blk 466
PID 7110 lwlock 44: shacq 1127256 exacq 1132 blk 774
PID 7113 lwlock 44: shacq 1127179 exacq 920 blk 665
PID 7112 lwlock 44: shacq 1127148 exacq 1323 blk 838
PID 7108 lwlock 48: shacq 1125039 exacq 1592 blk 546
PID 7115 lwlock 48: shacq 1124666 exacq 559 blk 397
PID 7111 lwlock 48: shacq 1124523 exacq 1009 blk 955
PID 7114 lwlock 48: shacq 1124502 exacq 1041 blk 976
PID 7110 lwlock 48: shacq 1124460 exacq 1128 blk 1110
PID 7109 lwlock 48: shacq 1124402 exacq 384 blk 602
PID 7112 lwlock 48: shacq 1124394 exacq 1350 blk 1373
PID 7113 lwlock 48: shacq 1124383 exacq 868 blk 871
PID 7115 lwlock 1020: shacq 100 exacq 0 blk 0
PID 7115 lwlock 1018: shacq 100 exacq 0 blk 0
PID 7114 lwlock 1020: shacq 100 exacq 0 blk 0
PID 7114 lwlock 1018: shacq 100 exacq 0 blk 0
PID 7113 lwlock 1020: shacq 100 exacq 0 blk 0
PID 7113 lwlock 1018: shacq 100 exacq 0 blk 0
PID 7112 lwlock 1020: shacq 100 exacq 0 blk 0
PID 7112 lwlock 1018: shacq 100 exacq 0 blk 0
PID 7111 lwlock 1020: shacq 100 exacq 0 blk 0
PID 7111 lwlock 1018: shacq 100 exacq 0 blk 0
PID 7110 lwlock 1020: shacq 100 exacq 0 blk 0
PID 7110 lwlock 1018: shacq 100 exacq 0 blk 0
PID 7109 lwlock 1020: shacq 100 exacq 0 blk 0
PID 7109 lwlock 1018: shacq 100 exacq 0 blk 0
PID 7108 lwlock 1020: shacq 100 exacq 0 blk 0
PID 7108 lwlock 1018: shacq 100 exacq 0 blk 0
PID 7108 lwlock 43: shacq 439454 exacq 1614 blk 151
PID 7115 lwlock 43: shacq 438993 exacq 558 blk 126
PID 7111 lwlock 43: shacq 438817 exacq 899 blk 200
PID 7114 lwlock 43: shacq 438795 exacq 898 blk 230
PID 7109 lwlock 43: shacq 438787 exacq 314 blk 153
PID 7110 lwlock 43: shacq 438768 exacq 1026 blk 231
PID 7113 lwlock 43: shacq 438720 exacq 833 blk 178
PID 7112 lwlock 43: shacq 438691 exacq 1123 blk 215
PID 7108 lwlock 34: shacq 343982 exacq 1595 blk 177
PID 7115 lwlock 34: shacq 343582 exacq 600 blk 139
PID 7109 lwlock 34: shacq 343410 exacq 348 blk 167
PID 7111 lwlock 34: shacq 343398 exacq 926 blk 255
PID 7114 lwlock 34: shacq 343362 exacq 939 blk 214
PID 7110 lwlock 34: shacq 343338 exacq 1074 blk 226
PID 7113 lwlock 34: shacq 343283 exacq 832 blk 226
PID 7112 lwlock 34: shacq 343247 exacq 1255 blk 271
PID 7108 lwlock 38: shacq 333989 exacq 1704 blk 137
PID 7115 lwlock 38: shacq 333512 exacq 581 blk 106
PID 7109 lwlock 38: shacq 40 exacq 322 blk 115
PID 7111 lwlock 38: shacq 06 exacq 882 blk 152
PID 7114 lwlock 38: shacq 333266 exacq 977 blk 183
PID 7110 lwlock 38: shacq 333249 exacq 1013 blk 165
PID 7113 lwlock 38: shacq 333226 exacq 828 blk 163
PID 7112 lwlock 38: shacq 333175 exacq 1246 blk 183
PID 7108 lwlock 42: shacq 233215 exacq 1539 blk 237
PID 7115 lwlock 42: shacq 232846 exacq 517 blk 138
PID 7111 lwlock 42: shacq 232645 exacq 885 blk 215
PID 7109 lwlock 42: shacq 232611 exacq 363 blk 140
PID 7113 lwlock 42: shacq 232600 exacq 860 blk 192
PID 7110 lwlock 42: shacq 232599 exacq 1049 blk 262
PID 7114 lwlock 42: shacq 232591 exacq 935 blk 318
PID 7112 lwlock 42: shacq 232505 exacq 1262 blk 240
PID 7108 lwlock 47: shacq 228682 exacq 1495 blk 177
PID 7115 lwlock 47: shacq 228397 exacq 516 blk 109
PID 7110 lwlock 47: shacq 228218 exacq 1082 blk 378
PID 7109 lwlock 47: shacq 228097 exacq 386 blk 158
PID 7111 lwlock 47: shacq 228093 exacq 907 blk 321
PID 7113 lwlock 47: shacq 228085 exacq 909 blk 306
PID 7112 lwlock 47: shacq 227993 exacq 1248 blk 387
PID 7114 lwlock 47: shacq 227986 exacq 929 blk 449
PID 7108 lwlock 36: shacq 205287 exacq 1592 blk 78
PID 7115 lwlock 36: shacq 204845 exacq 535 blk 86
PID 7111 lwlock 36: shacq 204675 exacq 830 blk 108
PID 7114 lwlock 36: shacq 204634 exacq 930 blk 109
PID 7109 lwlock 36: shacq 204581 exacq 363 blk 83
PID 7113 lwlock 36: shacq 204566 exacq 833 blk 112
PID 7110 lwlock 36: shacq 204546 exacq 1014 blk 106
PID 7112 lwlock 36: shacq 204508 exacq 1180 blk 140
PID 7108 lwlock 35: shacq 136067 exacq 1537 blk 56
PID 7115 lwlock 35: shacq 135532 exacq 494 blk 47
PID 7111 lwlock 35: shacq 135404 exacq 820 blk 76
PID 7109 lwlock 35: shacq 135362 exacq 326 blk 56
PID 7114 lwlock 35: shacq 135342 exacq 879 blk 102
PID 7110 lwlock 35: shacq 135333 exacq 1016 blk 88
PID 7113 lwlock 35: shacq 135318 exacq 829 blk 75
PID 7112 lwlock 35: shacq 135257 exacq 1229 blk 98
PID 7108 lwlock 41: shacq 134338 exacq 1540 blk 55
PID 7115 lwlock 41: shacq 133929 exacq 542 blk 50
PID 7109 lwlock 41: shacq 133781 exacq 360 blk 49
PID 7111 lwlock 41: shacq 1

Re: [HACKERS] 9.2beta1, parallel queries, ReleasePredicateLocks, CheckForSerializableConflictIn in the oprofile

2012-05-25 Thread Sergey Koposov

On Fri, 25 May 2012, Merlin Moncure wrote:

how likely is it to see multiple simultaneous scans in the real world?
Interleaving scans like that is not a very effective optimization --
if it was me, it'd be trying to organize something around a
partitioned tid scan for parallel sequential access.


Regarding the "real world use": I can say for myself that the reason I'm
running queries like the ones in my test is that I have a large inflow of
data every day, tens of GB, which has to be processed. Since PG doesn't have
any parallelization of queries built in, and running the processing in a
single thread would take too long, I'm trying to work around that by manually
running multiple threads in PG and splitting the work among them. So not
having scalability here is going to hurt us (though I admit this is probably
not a very popular use-case for PG).
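
The splitting itself is nothing fancier than giving each session its own
residue class of the match key, as in the test script posted earlier --
schematically, session k out of 16 runs (the table name _tmp_k is a
placeholder):

create table _tmp_k as
  select * from (
    select *,
      (select healpixid from idt_match as m where m.transitid = o.transitid) as x
    from idt_photoobservation as o offset 0
  ) as y
  where x % 16 = k      -- k = 0 .. 15, a different constant in each session
  order by x;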


Regards,
S

*
Sergey E. Koposov, PhD, Research Associate
Institute of Astronomy, University of Cambridge
Madingley road, CB3 0HA, Cambridge, UK
Tel: +44-1223-337-551 Web: http://www.ast.cam.ac.uk/~koposov/

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.2beta1, parallel queries, ReleasePredicateLocks, CheckForSerializableConflictIn in the oprofile

2012-05-25 Thread Sergey Koposov

On Fri, 25 May 2012, Merlin Moncure wrote:


These are all on the buffer partition locks.  That makes sense...I was
wrong earlier: this case was in fact 'create table as', not 'insert
select' which rules out both the freelist lock and the wal insert lock
because create table as gets to use both a bulk insert strategy and
wal avoiding logic (assuming wal log level is not 'archive or higher'.
So, why aren't the lock partitions helping here?


FYI the WAL level is the default, and here are the relevant changed params
from the .conf (autovacuum was disabled but the data was vacuumed
manually).


max_connections = 100           # (change requires restart)
shared_buffers = 10GB           # min 128kB
work_mem = 500MB                # min 64kB
maintenance_work_mem = 500MB    # min 1MB
synchronous_commit = off        # synchronization level;
checkpoint_segments = 20        # in logfile segments, min 1, 16MB each
enable_hashjoin = off
enable_mergejoin = off
effective_cache_size = 10GB
autovacuum = off                # Enable autovacuum subprocess?  'on'


Regards,
S

*
Sergey E. Koposov, PhD, Research Associate
Institute of Astronomy, University of Cambridge
Madingley road, CB3 0HA, Cambridge, UK
Tel: +44-1223-337-551 Web: http://www.ast.cam.ac.uk/~koposov/

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.2beta1, parallel queries, ReleasePredicateLocks, CheckForSerializableConflictIn in the oprofile

2012-05-24 Thread Sergey Koposov

On Thu, 24 May 2012, Jeff Janes wrote:

Add
#define LWLOCK_STATS
near the top of:
src/backend/storage/lmgr/lwlock.c

and recompile and run a reduced-size workload.  When the processes
exits, they will dump a lot of data about LWLock usage to the logfile.
Generally the LWLock with the most blocks on it will be the main
culprit.


Here is the output from a multi-threaded run (8 threads, 22 seconds)
sorted by blk. Not sure whether that's of much use or not:


PID 7112 lwlock 48: shacq 1124394 exacq 1350 blk 1373
PID 7110 lwlock 48: shacq 1124460 exacq 1128 blk 1110
PID 7114 lwlock 48: shacq 1124502 exacq 1041 blk 976
PID 7111 lwlock 48: shacq 1124523 exacq 1009 blk 955
PID 7113 lwlock 48: shacq 1124383 exacq 868 blk 871
PID 7112 lwlock 44: shacq 1127148 exacq 1323 blk 838
PID 7110 lwlock 44: shacq 1127256 exacq 1132 blk 774
PID 7114 lwlock 44: shacq 1127418 exacq 1024 blk 702
PID 7113 lwlock 44: shacq 1127179 exacq 920 blk 665
PID 7111 lwlock 44: shacq 1127324 exacq 957 blk 651
PID 7109 lwlock 48: shacq 1124402 exacq 384 blk 602
PID 7108 lwlock 48: shacq 1125039 exacq 1592 blk 546
PID 7108 lwlock 44: shacq 1127902 exacq 1548 blk 511
PID 7109 lwlock 44: shacq 1127261 exacq 388 blk 466
PID 7114 lwlock 47: shacq 227986 exacq 929 blk 449
PID 7115 lwlock 44: shacq 1127495 exacq 633 blk 401
PID 7115 lwlock 48: shacq 1124666 exacq 559 blk 397
PID 7112 lwlock 47: shacq 227993 exacq 1248 blk 387
PID 7110 lwlock 47: shacq 228218 exacq 1082 blk 378
PID 7111 lwlock 47: shacq 228093 exacq 907 blk 321
PID 7114 lwlock 42: shacq 232591 exacq 935 blk 318
PID 7113 lwlock 47: shacq 228085 exacq 909 blk 306
PID 7112 lwlock 34: shacq 343247 exacq 1255 blk 271
PID 7110 lwlock 42: shacq 232599 exacq 1049 blk 262
PID 7111 lwlock 34: shacq 343398 exacq 926 blk 255
PID 7112 lwlock 42: shacq 232505 exacq 1262 blk 240
PID 7108 lwlock 42: shacq 233215 exacq 1539 blk 237
PID 7110 lwlock 43: shacq 438768 exacq 1026 blk 231
PID 7114 lwlock 43: shacq 438795 exacq 898 blk 230
PID 7113 lwlock 34: shacq 343283 exacq 832 blk 226
PID 7110 lwlock 34: shacq 343338 exacq 1074 blk 226
PID 7114 lwlock 0: shacq 0 exacq 14864 blk 219
PID 7112 lwlock 43: shacq 438691 exacq 1123 blk 215
PID 7111 lwlock 42: shacq 232645 exacq 885 blk 215
PID 7114 lwlock 34: shacq 343362 exacq 939 blk 214
PID 7111 lwlock 0: shacq 0 exacq 13638 blk 205
PID 7111 lwlock 43: shacq 438817 exacq 899 blk 200
PID 7114 lwlock 7: shacq 0 exacq 547 blk 193
PID 7113 lwlock 42: shacq 232600 exacq 860 blk 192
PID 7110 lwlock 0: shacq 0 exacq 16862 blk 191
PID 7114 lwlock 38: shacq 333266 exacq 977 blk 183
PID 7112 lwlock 38: shacq 333175 exacq 1246 blk 183
PID 7113 lwlock 7: shacq 0 exacq 548 blk 178
PID 7113 lwlock 43: shacq 438720 exacq 833 blk 178
PID 7115 lwlock 7: shacq 0 exacq 549 blk 177
PID 7108 lwlock 47: shacq 228682 exacq 1495 blk 177
PID 7108 lwlock 34: shacq 343982 exacq 1595 blk 177
PID 7112 lwlock 0: shacq 0 exacq 19538 blk 172
PID 7111 lwlock 7: shacq 0 exacq 549 blk 172
PID 7115 lwlock 17: shacq 0 exacq 9927 blk 167
PID 7109 lwlock 34: shacq 343410 exacq 348 blk 167
PID 7112 lwlock 7: shacq 0 exacq 548 blk 166
PID 7110 lwlock 38: shacq 333249 exacq 1013 blk 165
PID 7113 lwlock 38: shacq 333226 exacq 828 blk 163
PID 7110 lwlock 7: shacq 0 exacq 548 blk 162
PID 7109 lwlock 7: shacq 0 exacq 548 blk 161
PID 7109 lwlock 47: shacq 228097 exacq 386 blk 158
PID 7112 lwlock 45: shacq 130843 exacq 1201 blk 154
PID 7109 lwlock 43: shacq 438787 exacq 314 blk 153
PID 7111 lwlock 38: shacq 06 exacq 882 blk 152
PID 7108 lwlock 43: shacq 439454 exacq 1614 blk 151
PID 7108 lwlock 17: shacq 0 exacq 23165 blk 147
PID 7108 lwlock 7: shacq 0 exacq 549 blk 145
PID 7113 lwlock 0: shacq 0 exacq 13394 blk 142
PID 7112 lwlock 46: shacq 132972 exacq 1196 blk 142
PID 7112 lwlock 36: shacq 204508 exacq 1180 blk 140
PID 7109 lwlock 42: shacq 232611 exacq 363 blk 140
PID 7115 lwlock 34: shacq 343582 exacq 600 blk 139
PID 7114 lwlock 45: shacq 130818 exacq 903 blk 139
PID 7115 lwlock 42: shacq 232846 exacq 517 blk 138
PID 7108 lwlock 38: shacq 333989 exacq 1704 blk 137
PID 7115 lwlock 43: shacq 438993 exacq 558 blk 126
PID 7115 lwlock 0: shacq 0 exacq 3161 blk 125
PID 7110 lwlock 40: shacq 129577 exacq 1021 blk 122
PID 7113 lwlock 45: shacq 130833 exacq 814 blk 120
PID 7111 lwlock 17: shacq 0 exacq 4607 blk 119
PID 7109 lwlock 0: shacq 0 exacq 5711 blk 119
PID 7110 lwlock 45: shacq 130865 exacq 1001 blk 117
PID 7109 lwlock 38: shacq 40 exacq 322 blk 115
PID 7114 lwlock 46: shacq 132997 exacq 860 blk 112
PID 7113 lwlock 36: shacq 204566 exacq 833 blk 112
PID 7112 lwlock 40: shacq 129528 exacq 1234 blk 111
PID 7115 lwlock 47: shacq 228397 exacq 516 blk 109
PID 7114 lwlock 36: shacq 204634 exacq 930 blk 109
PID 7112 lwlock 41: shacq 133600 exacq 1136 blk 109
PID 7111 lwlock 36: shacq 204675 exacq 830 blk 108
PID 7108 lwlock 0: shacq 0 exacq 6504 blk 108
PID 7115 lwlock 38: shacq 333512 exacq 581 blk 106
PID 7110 lwlock 36: shacq 204546 exacq 1014 blk 106
PID 7113 lwlock 37: shacq 127040 exacq 79

Re: [HACKERS] 9.2beta1, parallel queries, ReleasePredicateLocks, CheckForSerializableConflictIn in the oprofile

2012-05-24 Thread Sergey Koposov

Hi,

On Thu, 24 May 2012, Robert Haas wrote:

Not sure.  It might be some other LWLock, but it's hard to tell which
one from the information provided.


If you could tell what's the best way to find out the info that you need, 
then I could run it reasonably quickly.


S

*
Sergey E. Koposov, PhD, Research Associate
Institute of Astronomy, University of Cambridge
Madingley road, CB3 0HA, Cambridge, UK
Tel: +44-1223-337-551 Web: http://www.ast.cam.ac.uk/~koposov/

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.2beta1, parallel queries, ReleasePredicateLocks, CheckForSerializableConflictIn in the oprofile

2012-05-24 Thread Sergey Koposov

On Thu, 24 May 2012, Robert Haas wrote:


As you can see, raw performance isn't much worse with the larger data
sets, but scalability at high connection counts is severely degraded
once the working set no longer fits in shared_buffers.


Actually the problem persists even when I trim the dataset size to fit
within shared_buffers.
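
(A quick way to compare the trimmed working set against shared_buffers, using
the table names from the dump:

SELECT current_setting('shared_buffers') AS shared_buffers,
       pg_size_pretty(pg_total_relation_size('idt_photoobservation_small')) AS photo_size,
       pg_size_pretty(pg_total_relation_size('idt_match')) AS match_size;
)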

Here is the dump (0.5 gig in size, tested with shared_buffers=10G,
work_mem=500Mb):
http://www.ast.cam.ac.uk/~koposov/files/dump.gz
And I attach the script

For my toy dataset the single-thread time goes from ~6.4 to 18 seconds
(~ 3 times worse).


And actually, while running the script repeatedly on my main machine, for
some reason I saw some variation in how much slower the threaded execution
is than a single thread.


Now I see 25 seconds for the multithreaded run vs the same ~ 6 seconds for a
single thread.


The oprofile shows
 782355   21.5269  s_lock
  782355   100.000  s_lock [self]
---
709801   19.5305  PinBuffer
  709801   100.000  PinBuffer [self]
---
326457    8.9826  LWLockAcquire
  326457   100.000  LWLockAcquire [self]
---
309437    8.5143  UnpinBuffer
  309437   100.000  UnpinBuffer [self]
---
252972    6.9606  ReadBuffer_common
  252972   100.000  ReadBuffer_common [self]
---
201558    5.5460  LockBuffer
  201558   100.000  LockBuffer [self]


It is interesting that on another machine with much smaller shared memory
(3G), less RAM (12G), fewer CPUs and PG 9.1 running, I was
getting consistently ~ 7.2 vs 4.5 sec (for multi vs single thread).


PS Just in case it matters: the CPU on the main machine I'm testing on is a
Xeon(R) E7-4807 (the total number of real cores is 24).






*
Sergey E. Koposov, PhD, Research Associate
Institute of Astronomy, University of Cambridge
Madingley road, CB3 0HA, Cambridge, UK
Tel: +44-1223-337-551 Web: http://www.ast.cam.ac.uk/~koposov/

script.sh
Description: Bourne shell script
drop table _tmpXX ;
\timing
create table _tmpXX as select * from 
  ( select *, 
  (select healpixid from idt_match as m where m.transitid=o.transitid) 
as x from idt_photoobservation_small as o offset 0
  ) as y where x%16=XX order by x;

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.2beta1, parallel queries, ReleasePredicateLocks, CheckForSerializableConflictIn in the oprofile

2012-05-24 Thread Sergey Koposov

On Thu, 24 May 2012, Robert Haas wrote:


On Thu, May 24, 2012 at 1:42 PM, Sergey Koposov  wrote:

I guess there is nothing catastrophically wrong with that, but still I'm
very suprised that you get severe locking problems (factor of two slow-down)
when running parallel read-only transactions.


Me, too.  How many concurrent connections are you running, and does
your working set exceed shared_buffers?  Can you provide a
self-contained reproducible test case?


The last tests I've been doing were with 8 connections.
And the working set is roughly 30 gig, which is ~ 3x the shared buffers
(but ~ 50% of RAM).


Regarding the test case, I'll try to see whether I can still observe the
same slowdown if I chop the main table down by a factor of a few, so that I
can put the data somewhere for download.


S

*
Sergey E. Koposov, PhD, Research Associate
Institute of Astronomy, University of Cambridge
Madingley road, CB3 0HA, Cambridge, UK
Tel: +44-1223-337-551 Web: http://www.ast.cam.ac.uk/~koposov/

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.2beta1, parallel queries, ReleasePredicateLocks, CheckForSerializableConflictIn in the oprofile

2012-05-24 Thread Sergey Koposov

On Thu, 24 May 2012, Merlin Moncure wrote:


Are you sure?  I looked at all the ReleasePredicateLocks calls and
they appear to be guarded by:

   /* Nothing to do if this is not a serializable transaction */
   if (MySerializableXact == InvalidSerializableXact)
   return false;
What's the default isolation mode set to?


Sorry, it was indeed my mistake. I was pointing opreport at the
binary which had been recompiled while the old version was still running, so
I guess oprofile was picking up the wrong function names.


Here is the correct oprofile: when multiple queries are running :
samples  %symbol name
---
952457   13.8715  LWLockAcquire
  952457   100.000  LWLockAcquire [self]
---
779077   11.3464  PinBuffer
  779077   100.000  PinBuffer [self]
---
759898   11.0671  s_lock
  759898   100.000  s_lock [self]
---
689753    6.7178  slot_deform_tuple
  689753   100.000  slot_deform_tuple [self]
---
526002    5.1230  UnpinBuffer
  526002   100.000  UnpinBuffer [self]

When only one is running:
samples  %symbol name
---
163268   14.0343  slot_deform_tuple
  163268   100.000  slot_deform_tuple [self]
---
126018   10.8324  _bt_compare
  126018   100.000  _bt_compare [self]
---
113975    9.7972  ExecProject
  113975   100.000  ExecProject [self]
---
49760 4.2773  FunctionCall2Coll
  49760100.000  FunctionCall2Coll [self]
---
49164 4.2261  LWLockAcquire
  49164100.000  LWLockAcquire [self]
---
43526 3.7414  hash_search_with_hash_value
  43526100.000  hash_search_with_hash_value [self]



I guess there is nothing catastrophically wrong with that, but I'm still
very surprised to see severe locking problems (a factor of two
slow-down) when running parallel read-only transactions.


Sergey

*
Sergey E. Koposov, PhD, Research Associate
Institute of Astronomy, University of Cambridge
Madingley road, CB3 0HA, Cambridge, UK

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] 9.2beta1, parallel queries, ReleasePredicateLocks, CheckForSerializableConflictIn in the oprofile

2012-05-24 Thread Sergey Koposov

Hi,

I've been running some tests on pg 9.2beta1 and in particular a set
of queries like

create table _tmp0 as select * from (
select *, (select healpixid from idt_match as m where
m.transitid=o.transitid)
as x from idt_photoobservation as o offset 0
  ) as y where x%16=ZZZ order by x;

(where ZZZ is some integer number, 0 <= ZZZ < 16)

With the following plan:
--
 Sort  (cost=3228814504.96..3228815137.21 rows=252902 width=498)
   Sort Key: y.x
   ->  Subquery Scan on y  (cost=0.00..3228791809.25 rows=252902 width=498)
 Filter: ((y.x % 16::bigint) = 0)
 ->  Limit  (cost=0.00..3228033102.41 rows=50580456 width=490)
   ->  Seq Scan on idt_photoobservation o  
(cost=0.00..3228033102.41 rows=50580456 width=490)
 SubPlan 1
   ->  Index Scan using idt_match_transitid_idx on 
idt_match m  (cost=0.00..63.74 rows=1 width=8)
 Index Cond: (transitid = o.transitid)

The schema of the tables are:

e2es2=> \d idt_match
 Table "public.idt_match"
  Column   |   Type   | Modifiers
---+--+---
 sourceid  | bigint   |
 transitid | bigint   |
 healpixid | bigint   |
 flagsxm   | smallint |
Indexes:
"idt_match_idx" btree (healpixid)
"idt_match_transitid_idx" btree (transitid)

Table "public.idt_photoobservation"
  Column   |   Type   | Modifiers
---+--+---
 transitid | bigint   |
 fluxbp0   | real |

more columns

Indexes:
"idt_photoobservation_idx" btree (transitid)

And I noticed that when I run queries like the one shown above in parallel
(in multiple connections, for ZZZ=0...8) the performance of each query
drops significantly (a factor of 2), despite the fact that during the
execution of the query postgres is mostly CPU bound.


So I tried to oprofile it and strace it, and this is what I saw:

The strace -c of each process shows something like this:
#
Process 18660 detached
% time seconds  usecs/call callserrors syscall
-- --- --- - - 
 76.250.001342   0268987   semop
 23.750.000418   0 61694   read
  0.000.00   0   138   lseek
  0.000.00   0   355   select
  0.000.00   0 3   kill
-- --- --- - - 
100.000.001760331177   total
###

And the oprofile shows this on top:
---
2863981  25.7117  ReleasePredicateLocks
  2863981  100.000  ReleasePredicateLocks [self]
---
1246629  11.1917  LocalBufferAlloc
  1246629  100.000  LocalBufferAlloc [self]
---
1135393  10.1931  CheckForSerializableConflictIn
  1135393  100.000  CheckForSerializableConflictIn [self]


So there is a lot of locking going on for some reason, and I was surprised
to see anything related to SSI in the profile at all, because I'm not running
serializable transactions (I was just running my queries from multiple
psql sessions).
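
(Just to rule out the obvious, the isolation level actually in effect can be
double-checked with

SHOW default_transaction_isolation;
SELECT current_setting('transaction_isolation');   -- from within a session

and neither reports serializable in my case.)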


Is there a reasonable explanation for what I'm seeing?

Thanks in advance,
Sergey

More info: there is no other concurrent activity on the machine.
shared_buffers is 10G. The machine has ~ 60G of RAM, 24 cores and proper
RAID.


Another note is that the oprofile and strace output were obtained during the
stage when the idt_photo.. table was being scanned and data was being stored
in pgsql_tmp (not during the last bit when the data in pgsql_tmp were
actually being sorted).


*
Sergey E. Koposov, PhD, Research Associate
Institute of Astronomy, University of Cambridge
Madingley road, CB3 0HA, Cambridge, UK

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers