[PERFORM] BUG: endless lseek(.., SEEK_END) from select queries on x64 builds

2013-02-22 Thread Markus Schulz
hello,

i have a strange and reproducible bug with some select queries and 64bit 
postgresql builds (works fine on 32bit builds).
The postgres process will run with 100% cpu-load (no io-wait) and strace 
will show endless lseek(..., SEEK_END) calls on one table for minutes.
lseek(28, 0, SEEK_END)  = 26697728
lseek(28, 0, SEEK_END)  = 26697728
lseek(28, 0, SEEK_END)  = 26697728
...
the file-descriptor 28 points to the file for the webapps_base.Acquisition 
table (see query/plan below).

Now the details:

The Query:
select count(ac.ID) as col_0_0_ from 
webapps_base.Acquisition ac, 
webapps_base.SalesPartnerStructure struc
where 
struc.fk_SalesPartner_child=ac.fk_SalesPartner_ID 
and struc.fk_SalesPartner_parent=200
and (ac.CreationDate between '2012-02-01' and '2013-01-31') 
and ac.acquisitiondepot='STANDARD' 
and ('2013-01-31' between struc.ValidFrom 
and coalesce(struc.ValidTo, '2013-01-31'))

The Plan:
Aggregate  (cost=32617.11..32617.12 rows=1 width=8) (actual time=204.180..204.180 rows=1 loops=1)
  ->  Merge Join  (cost=32232.01..32598.26 rows=7543 width=8) (actual time=172.882..202.218 rows=2 loops=1)
        Merge Cond: (ac.fk_salespartner_id = struc.fk_salespartner_child)
        ->  Sort  (cost=5582.60..5635.69 rows=21235 width=16) (actual time=28.920..31.121 rows=21204 loops=1)
              Sort Key: ac.fk_salespartner_id
              Sort Method: quicksort  Memory: 1763kB
              ->  Bitmap Heap Scan on acquisition ac  (cost=395.26..4056.43 rows=21235 width=16) (actual time=3.064..15.868 rows=21223 loops=1)
                    Recheck Cond: ((creationdate >= '2012-02-01'::date) AND (creationdate <= '2013-01-31'::date))
                    Filter: ((acquisitiondepot)::text = 'STANDARD'::text)
                    ->  Bitmap Index Scan on index_acquistion_creationdate  (cost=0.00..389.95 rows=21267 width=0) (actual time=2.890..2.890 rows=21514 loops=1)
                          Index Cond: ((creationdate >= '2012-02-01'::date) AND (creationdate <= '2013-01-31'::date))
        ->  Sort  (cost=26648.60..26742.61 rows=37606 width=8) (actual time=143.952..152.808 rows=131713 loops=1)
              Sort Key: struc.fk_salespartner_child
              Sort Method: quicksort  Memory: 8452kB
              ->  Bitmap Heap Scan on salespartnerstructure struc  (cost=3976.80..23790.79 rows=37606 width=8) (actual time=13.279..64.681 rows=114772 loops=1)
                    Recheck Cond: (fk_salespartner_parent = 200)
                    Filter: (('2013-01-31'::date >= validfrom) AND ('2013-01-31'::date <= COALESCE(validto, '2013-01-31'::date)))
                    ->  Bitmap Index Scan on index_parent_salespartner  (cost=0.00..3967.39 rows=114514 width=0) (actual time=13.065..13.065 rows=116479 loops=1)
                          Index Cond: (fk_salespartner_parent = 200)
Total runtime: 205.397 ms

as you can see the query runs fine. 
I can run this query from a bash-psql-while-loop/jdbc-cli-tool 
endlessly without any problems. 
so far so good.

But now i run the same query from:

JBoss EAP 5.1.2 with connection pooling and xa-datasource/two-phase-commits 
(transactions on multiple datasources needed)
*and* <prepared-statement-cache-size>1000</prepared-statement-cache-size>

i can run the query four times with good performance and after that postgresql 
starts with the strange lseek() behavior. 
The query then needs more than a minute to complete, and during execution the 
postgres process runs at 100% cpu load making lseek calls (straced).
If i flush the connection pool (close all open connections from the jboss 
jmx-console) it works again for four calls.
This problem applies only to 64bit builds. If i run a 32bit postgresql 
server it works fine.

We have tested the following environments:

- Debian Squeeze 64bit with Postgresql 9.1.[5,6,7] - Bad behavior
- Debian Wheezy 64bit with Postgresql 9.1.8 64bit - Bad behavior
- Ubuntu 12.04 LTS 64bit with Postgresql 9.1.8 64bit - Bad behavior
- Windows 7 x64 with Postgresql 9.1.8 64bit - Bad behavior
- Debian Wheezy 64bit with EnterpriseDB 9.2 64bit - Bad behavior

- Debian Wheezy 64bit with Postgresql 9.1.8 32bit - Good behavior
- Debian Wheezy 32bit with Postgresql 9.1.8 32bit - Good behavior

as you can see, all 64bit builds of postgresql are affected (independent of 
the OS architecture).

If i disable the prepared statement cache (remove prepared-statement-cache-size 
from the -ds.xml) it works on the 64bit build too.

regards,
msc


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] BUG: endless lseek(.., SEEK_END) from select queries on x64 builds

2013-02-22 Thread Heikki Linnakangas

On 22.02.2013 10:25, Markus Schulz wrote:

i can run the query four times with good performance and after that postgresql
starts with the strange lseek() behavior.


By default, the JDBC driver re-plans the prepared statement for the 
first 4 invocations of the query. On the fifth invocation, it switches 
to using a generic plan, which will be reused on subsequent invocations. 
See http://jdbc.postgresql.org/documentation/head/server-prepare.html. 
The generic plan seems to perform much worse in this case. You can 
disable that mechanism and force re-planning the query every time by 
setting the prepareThreshold=0 parameter on the data source.
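For the JBoss setup above, one way to do that is through the driver connection 
URL (a sketch; host, port, and database name are placeholders, and the same 
parameter can be set as a connection property in the -ds.xml instead):

```
jdbc:postgresql://dbhost:5432/mydb?prepareThreshold=0
```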


You could check what the generic plan looks like by taking the query 
used in the java program, with the parameter markers, and running 
EXPLAIN on that.
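In psql that would look roughly like this (a sketch against the query from the 
first message; the statement name and parameter types are my assumptions; on 
9.1 PREPARE plans the statement once with the parameter markers, so EXPLAIN 
EXECUTE shows that generic plan):

```sql
PREPARE q(integer, date, date, text, date) AS
  SELECT count(ac.ID)
  FROM webapps_base.Acquisition ac,
       webapps_base.SalesPartnerStructure struc
  WHERE struc.fk_SalesPartner_child = ac.fk_SalesPartner_ID
    AND struc.fk_SalesPartner_parent = $1
    AND ac.CreationDate BETWEEN $2 AND $3
    AND ac.acquisitiondepot = $4
    AND ($5 BETWEEN struc.ValidFrom
              AND coalesce(struc.ValidTo, $5));

EXPLAIN EXECUTE q(200, '2012-02-01', '2013-01-31', 'STANDARD', '2013-01-31');
```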


PostgreSQL version 9.2 might work better in this case. It has some 
smarts in the server to generate parameter-specific plans even when 
prepared statements are used, if the planner thinks a specific plan will 
be faster.


- Heikki




Re: [PERFORM] BUG: endless lseek(.., SEEK_END) from select queries on x64 builds

2013-02-22 Thread Stephen Frost
Markus,

* Markus Schulz (m...@antzsystem.de) wrote:
 as you can see the query runs fine. 
 I can run this query from a bash-psql-while-loop/jdbc-cli-tool 
 endless without any problems. 
 so far so good.
[...]
 JBoss EAP 5.1.2 with connection pooling and xa-datasource/two-phase-commits 
 (transactions on multiple datasources needed)
 *and* *prepared-statement-cache-size1000/prepared-statement-cache-size*
 
 i can run the query four times with good performance and after that 
 postgresql 
 starts with the strange lseek() behavior. 

It sounds like your bash script and JBoss are doing something different.
Would it be possible for you to turn on log_statement = 'all' in PG,
see what's different, and then update the bash/psql script to do exactly
what JBoss does, and see if you can reproduce it that way?

It certainly looks like a PG bug, but it'd be a lot easier to debug with
a simple, well-defined test case which shows the failure.

Thanks!

Stephen




[PERFORM] Avoiding Recheck Cond when using Select Distinct

2013-02-22 Thread jackrg
The following query produces a Recheck Cond and a costly Bitmap Heap Scan
even though I have a composite index that covers both columns being filtered
and selected.  I believe this is because the initial bitmap scan produces
2912 rows, which is too many for the available bitmap space.  I've tried
rewriting the command as Select ... group by but it still uses the BHS. Is
there a way to rewrite this command that would improve performance by
avoiding the costly Bitmap Heap Scan?


SELECT distinct store_id, book_id FROM sales_points  WHERE
sales_points.store_id IN (1, 2, 3, 4, 5, 6, 199, 201, 202) AND
sales_points.book_id IN (421, 422, 419, 420)

Here is the explain/analyze output:


HashAggregate  (cost=5938.72..5939.01 rows=97 width=8) (actual time=10.837..10.854 rows=32 loops=1)
  ->  Bitmap Heap Scan on sales_points  (cost=47.03..5936.53 rows=2191 width=8) (actual time=0.547..5.296 rows=4233 loops=1)
        Recheck Cond: (book_id = ANY ('{421,422,419,420}'::integer[]))
        Filter: (store_id = ANY ('{1,2,3,4,5,6,199,201,202}'::integer[]))
        ->  Bitmap Index Scan on index_sales_points_on_book_id  (cost=0.00..46.92 rows=4430 width=0) (actual time=0.469..0.469 rows=4233 loops=1)
              Index Cond: (book_id = ANY ('{421,422,419,420}'::integer[]))
Total runtime: 10.935 ms


Actual runtime is more like 15 ms when tested against a development database
(which gave an estimated total runtime of 6 ms).  Under load in production, the
command takes 10,158 ms.  Tuning Postgres is not an option, as the instance
is provided by Heroku and as far as I know cannot be tuned by me.







Re: [PERFORM] Are bitmap index scans slow to start?

2013-02-22 Thread Carlo Stonebanks
Hi Jeff, thanks for the reply.

 

 

What is going on during the interregnum?  Whatever it is, it seems to be
driving the log_2013_01_session_idx index out of the cache, but not the
log_2013_01 table.  (Or perhaps the table visit is getting the benefit of
effective_io_concurrency?)

Rebuilding the index might help, as it would put all the leaf pages holding
values for session_id=27 adjacent to each other, so they would read from
disk faster.  But with a name like session_id, I don't know how long such
clustering would last though.

 

 

Technically, nothing should be happening. We used to keep one massive audit
log, and it was impossible to manage due to its size. We then changed to a
strategy where every month a new audit log is spawned, and since
log_2013_01 represents January, the log should be closed and nothing should
have changed (it is technically possible that a long-running process would
spill over into February, but not by this much). So, assuming that it's
stable, it should be a very good candidate for reindexing, no?

 

Our effective_io_concurrency is 1, and last I heard the PG host was a Linux
4-drive RAID10, so I don't know if there is any benefit to raising this
number - and if there were any benefit, it would be to the Bitmap Scan, and
the problem is the data building before the fact.

 

 the bitmap itself doesn't get cached.  But the data needed to construct
the bitmap does get cached.  It gets cached by the generic caching methods
of PG and the OS, not through something specific to bitmaps.


 

This has always been a problem for me. I spend hours trying different
strategies and think I've solved the problem, when in fact it seems like a
cache has spun up, and then something else expires it and the problem is
back. Is there a way around this problem, can I force the expiration of a
cache?

 

Carlo

 



Re: [PERFORM] Are bitmap index scans slow to start?

2013-02-22 Thread Carlo Stonebanks
A cool idea, but if I understand it correctly, very specific and fussy. New
DBs are spawned on this model, and all the developers would have to be
aware of this non-standard behaviour, and DBAs would have to create these
indexes every month, for every DB (as the log tables are created every
month). There are 89 session_id values in the January log (log_2013_01), so
this would quickly get out of control. But - like I said - an interesting
idea for more specific challenges.

 

From: Marc Mamin [mailto:m.ma...@intershop.de] 
Sent: February 21, 2013 2:41 PM
To: Jeff Janes; Carlo Stonebanks
Cc: pgsql-performance@postgresql.org
Subject: AW: [PERFORM] Are bitmap index scans slow to start?

 

 

Rebuilding the index might help, as it would put all the leaf pages holding
values for session_id=27 adjacent to each other, so they would read from
disk faster.  But with a name like session_id, I don't know how long such
clustering would last though.

If I'm right about the index disk-read time, then switching to a plain
index scan rather than a bitmap index scan would make no difference--either
way the data has to come off the disk.  


 

I'd prefer a strategy that allowed fast performance the first time, rather
than slow the first time and extremely fast subsequently.

Hello,

if the index is only used to locate rows for single session_id, you may
consider split it in a set of partial indexes.

e.g. 
create index i_0 on foo (session_id) where session_id%4 = 0;
create index i_1 on foo (session_id) where session_id%4 = 1;
create index i_2 on foo (session_id) where session_id%4 = 2;
create index i_3 on foo (session_id) where session_id%4 = 3;

(can be built in parallel using separate threads)

Then you will have to ensure that all your WHERE clauses also contain the
index condition:

WHERE session_id = 27 AND session_id%4 =27%4

regards,

Marc Mamin



Re: [PERFORM] Avoiding Recheck Cond when using Select Distinct

2013-02-22 Thread Vitalii Tymchyshyn
2013/2/22 jackrg j...@groundbreakingsoftware.com

 Tuning Postgre is not an option, as the instance
 is provided by Heroku and as far as I know cannot be tuned by me.

Most tuning parameters can be set on a per-query basis, so you can issue
ALTER DATABASE ... SET param = value
to get the same effect as if it were set through postgresql.conf.
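For example (a sketch; the database name is a placeholder, and on Heroku you
would use the name of your provisioned database):

```sql
-- applies to all new connections to this database:
ALTER DATABASE mydb SET random_page_cost = 2.0;

-- or only for the current session:
SET work_mem = '64MB';
```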


Re: [PERFORM] BUG: endless lseek(.., SEEK_END) from select queries on x64 builds

2013-02-22 Thread Markus Schulz
Am Freitag, 22. Februar 2013, 14:35:25 schrieb Heikki Linnakangas:
 On 22.02.2013 10:25, Markus Schulz wrote:
  i can run the query four times with good performance and after that
  postgresql starts with the strange lseek() behavior.
 
 By default, the JDBC driver re-plans the prepared statement for the
 first 4 invocations of the query. On the fifth invocation, it switches
 to using a generic plan, which will be reused on subsequent invocations.

that sounds really interesting and i would try to change my java-jdbc-test-cli 
program according to that, but ...

 See http://jdbc.postgresql.org/documentation/head/server-prepare.html.
 The generic plan seems to perform much worse in this case. You can
 disable that mechanism and force re-planning the query every time by
 setting the prepareThreshold=0 parameter on the data source.

it wouldn't explain why the same jboss runs fine with a 32bit postgresql 
server (i switched only the datasource to another server with exactly the same 
database).

 You could check what the generic plan looks like by taking the query
 used in the java program, with the parameter markers, and running
 EXPLAIN on that.

how can i do this?
I've tried the following in my ejb-test-function:

String query = ...
entitymanager.createNativeQuery(query)...;
entitymanager.createNativeQuery("EXPLAIN ANALYZE " + query)...;

but the second createNativeQuery call runs fast every time and will show the 
same plan, while the first hangs after the fourth call to this function.

 PostgreSQL version 9.2 might work better in this case. It has some
 smarts in the server to generate parameter-specific plans even when
 prepared statements are used, if the planner thinks a specific plan will
 be faster.

this wouldn't help:
 - Debian Wheezy 64bit with EnterpriseDB 9.2 64bit - Bad behavior

we tried postgresql 9.2 too

 - Heikki

regards,
msc




[PERFORM] Bad query plan with high-cardinality column

2013-02-22 Thread Alexander Staubo
I have a problem with a query that is planned wrong. This is my schema:

   create table comments (
 id serial primary key,
 conversation_id integer,
 created_at timestamp
   );
   create index comments_conversation_id_index on comments (conversation_id);
   create index comments_created_at_index on comments (created_at);

The table has 3.5M rows, and 650k unique values for conversation_id, where 
the histogram goes up to 54000 rows for the most frequent ID, with a long tail. 
There are only 20 values with a frequency of 1000 or higher. The created_at 
column has 3.5M distinct values.

Now, I have this query:

   select comments.id from comments where
 conversation_id = 3975979 order by created_at limit 13

This filters about 5000 rows and returns the oldest 13 rows. But the query is 
consistently planned wrong:

   Limit  (cost=0.00..830.53 rows=13 width=12) (actual time=3174.862..3179.525 rows=13 loops=1)
     Buffers: shared hit=2400709 read=338923 written=21
     ->  Index Scan using comments_created_at_index on comments  (cost=0.00..359938.52 rows=5634 width=12) (actual time=3174.860..3179.518 rows=13 loops=1)
           Filter: (conversation_id = 3975979)
           Rows Removed by Filter: 2817751
           Buffers: shared hit=2400709 read=338923 written=21
   Total runtime: 3179.553 ms

It takes anywhere between 3 seconds and several minutes to run, depending on 
how warm the disk cache is. This is the correct plan and index usage:

   Limit  (cost=6214.34..6214.38 rows=13 width=12) (actual time=25.471..25.473 rows=13 loops=1)
     Buffers: shared hit=197 read=4510
     ->  Sort  (cost=6214.34..6228.02 rows=5471 width=12) (actual time=25.469..25.470 rows=13 loops=1)
           Sort Key: created_at
           Sort Method: top-N heapsort  Memory: 25kB
           Buffers: shared hit=197 read=4510
           ->  Index Scan using comments_conversation_id_index on comments  (cost=0.00..6085.76 rows=5471 width=12) (actual time=1.163..23.955 rows=5834 loops=1)
                 Index Cond: (conversation_id = 3975979)
                 Buffers: shared hit=197 read=4510
   Total runtime: 25.500 ms


The problem for Postgres is obviously to estimate how many rows have a given 
conversation_id value, but I have confirmed that the value is correctly 
tracked in the histogram.

I'm at a loss to explain why the planner thinks scanning a huge index that 
covers the entire table will ever beat scanning a small index that has 17% of 
the table's values. Even if the entire database were in RAM, this would touch 
far too many buffers unnecessarily. (I have determined that the planner will 
consistently use the bad plan for higher-frequency values, and the good plan 
for lower-frequency values.) It will *always* be better to branch off the 
comments_conversation_id_index index.

Another curious thing: if I run ANALYZE repeatedly, the planner will sometimes, 
oddly enough, choose the correct plan. This behaviour actually seems related to 
effective_cache_size; if it's set small (128MB), the planner will sometimes 
favour the good plan, but if it's large (>= 2GB) it will consistently use the bad 
plan. Not sure if ANALYZE is changing anything or if it's just bad timing.

Things I have tried: I have bumped the statistics target up to 1, but it 
does not help. I have also tried setting n_distinct for the column manually, 
since Postgres guesses it's 285k instead of 650k, but 

Re: [PERFORM] Are bitmap index scans slow to start?

2013-02-22 Thread Nikolas Everett
I can't really help, but I can make it more clear why postgres is choosing
a _bitmap_ index scan rather than a regular index scan.  With a regular
index scan it pumps the index for the locations of the rows that it points
to and loads those rows as it finds them.  This works great if the rows in
the index are sorta sorted - that way it isn't jumping around the table
randomly.  Random io is slow.  In a bitmap index scan pg pumps the index
and buffers the row locations by shoving them into a big bitmap.  Then, it
walks the bitmap in order to produce in-order io.  PG makes the choice based
on a measure of the index's correlation.

The problem comes down to you inserting the sessions concurrently with one
another.  My instinct would be to lower the FILLFACTOR on newly created
indexes so they can keep their entries more in order.  I'm not sure why I
have that instinct but it feels right.  Also, you might try
clustering newly created tables on session_id and setting the fillfactor
down so rows with the same session id will stick together on disk.

Now that I look stuff up on the internet I'm not sure where I saw that pg
tries to maintain a cluster using empty space from FILLFACTOR but I _think_
it does.  I'm not sure what is going on with my google foo today.

Nik


On Fri, Feb 22, 2013 at 12:50 PM, Carlo Stonebanks 
stonec.regis...@sympatico.ca wrote:

 A cool idea, but if I understand it correctly, very specific and fussy. New
 DBs are spawned on this model, and all the developers would have to be
 aware of this non-standard behaviour, and DBAs would have to create these
 indexes every month, for every DB (as the log tables are created every
 month). There are 89 session_id values in the January log (log_2013_01), so
 this would quickly get out of control. But - like I said - an interesting
 idea for more specific challenges.


 *From:* Marc Mamin [mailto:m.ma...@intershop.de]
 *Sent:* February 21, 2013 2:41 PM
 *To:* Jeff Janes; Carlo Stonebanks
 *Cc:* pgsql-performance@postgresql.org
 *Subject:* AW: [PERFORM] Are bitmap index scans slow to start?


 Rebuilding the index might help, as it would put all the leaf pages
 holding values for session_id=27 adjacent to each other, so they would read
 from disk faster.  But with a name like session_id, I don't know how
 long such clustering would last though.

 If I'm right about the index disk-read time, then switching to a plain
 index scan rather than a bitmap index scan would make no difference--either
 way the data has to come off the disk.


  

 I'd prefer a strategy that allowed fast performance the first time, rather
 than slow the first time and extremely fast subsequently.

 Hello,

 if the index is only used to locate rows for single session_id, you may
 consider split it in a set of partial indexes.

 e.g.
 create index i_0 on foo (session_id) where session_id%4 = 0;
 create index i_1 on foo (session_id) where session_id%4 = 1;
 create index i_2 on foo (session_id) where session_id%4 = 2;
 create index i_3 on foo (session_id) where session_id%4 = 3;

 (can be built in parallel using separate threads)

 Then you will have to ensure that all your WHERE clauses also contain the
 index condition:

 WHERE session_id = 27 AND session_id%4 =27%4

 regards,

 Marc Mamin



Re: [PERFORM] Avoiding Recheck Cond when using Select Distinct

2013-02-22 Thread Maciek Sakrejda
On Fri, Feb 22, 2013 at 9:59 AM, Vitalii Tymchyshyn tiv...@gmail.com wrote:
 Tuning Postgre is not an option, as the instance
 is provided by Heroku and as far as I know cannot be tuned by me.

 Most tuning parameters can be set at per-query basis, so you can issue
 alter database set param=value
 to have same effect as if it was set through postgresql.conf.

Jack,

Jeff brought up some great points, and what Vitalii suggested should
let you tweak most knobs, but if you're running into limitations of
the platform or you find default settings which seem outright
incorrect, please file a support ticket and we'll be happy to work
with you.

Thanks,
Maciek
Heroku Postgres




Re: [PERFORM] Are bitmap index scans slow to start?

2013-02-22 Thread Carlo Stonebanks
 Also, you might could try clustering newly created tables on session_id
and setting the fillfactor down so rows with the same session id will stick
together on disk.

 

 

My understanding of PG's cluster is that this is a one-time command that
creates a re-ordered table and doesn't maintain the clustered order until
the command is issued again. During the CLUSTER, the table is read and write
locked. So, in order for me to use this I would need to set up a timed event
to CLUSTER occasionally.
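For completeness, that one-time command would look something like this (a
sketch using the table and index names from this thread):

```sql
-- rewrites the table in index order, holding an exclusive lock:
CLUSTER log_2013_01 USING log_2013_01_session_idx;

-- later, a bare CLUSTER re-clusters on the remembered index:
CLUSTER log_2013_01;
```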

 

 I can't really help, but I can make it more clear why postgres is
choosing a _bitmap_ index scan rather than a regular index scan

 

 

The EXPLAIN ANALYZE is showing that it is taking a long time to prepare the
bitmap, i.e.:

->  Bitmap Index Scan on log_2013_01_session_idx  (cost=0.00..63186.52
rows=2947664 width=0) (actual time=32611.918..32611.918 rows=2772042 loops=1)
      Index Cond: (session_id = 27)

while the bitmap scan itself is actually very fast. Jeff says that the bitmap
is not cached, so I will assume the PG general caches being created are of
general use. 

 

I think what I need to do is figure out is:

 

1)  Why does it take 36 seconds to set up the general index caches?

2)  What can I do about it (what stats do I need to look at)?

3)  How can I force these caches to expire so I can tell if the strategy
worked?

 

 

 

 

From: Nikolas Everett [mailto:nik9...@gmail.com] 
Sent: February 22, 2013 2:05 PM
To: Carlo Stonebanks
Cc: Marc Mamin; Jeff Janes; pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Are bitmap index scans slow to start?

 

I can't really help, but I can make it more clear why postgres is choosing a
_bitmap_ index scan rather than a regular index scan.  With a regular index
scan it pumps the index for the locations of the rows that it points to and
loads those rows as it finds them.  This works great if the rows in the
index are sorta sorted - that way it isn't jumping around the table
randomly.  Random io is slow.  In a bitmap index scan pg pumps the index and
buffers the row locations by shoving them into a big bitmap.  Then, it walks
the bitmap in order to produce in-order io.  PG makes the choice based on a
measure of the index's correlation.

 

The problem comes down to you inserting the sessions concurrently with one
another.  My instinct would be to lower the FILLFACTOR on newly created
indexes so they can keep their entries more in order.  I'm not sure why I
have that instinct but it feels right.  Also, you might try clustering
newly created tables on session_id and setting the fillfactor down so rows
with the same session id will stick together on disk.

 

Now that I look stuff up on the internet I'm not sure where I saw that pg
tries to maintain a cluster using empty space from FILLFACTOR but I _think_
it does.  I'm not sure what is going on with my google foo today.

 

Nik

 

On Fri, Feb 22, 2013 at 12:50 PM, Carlo Stonebanks
stonec.regis...@sympatico.ca wrote:

A cool idea, but if I understand it correctly, very specific and fussy. New
DBs are spawned on this model, and all the developers would have to be
aware of this non-standard behaviour, and DBAs would have to create these
indexes every month, for every DB (as the log tables are created every
month). There are 89 session_id values in the January log (log_2013_01), so
this would quickly get out of control. But - like I said - an interesting
idea for more specific challenges.

 

From: Marc Mamin [mailto:m.ma...@intershop.de] 
Sent: February 21, 2013 2:41 PM
To: Jeff Janes; Carlo Stonebanks
Cc: pgsql-performance@postgresql.org
Subject: AW: [PERFORM] Are bitmap index scans slow to start?

 

 

Rebuilding the index might help, as it would put all the leaf pages holding
values for session_id=27 adjacent to each other, so they would read from
disk faster.  But with a name like session_id, I don't know how long such
clustering would last though.

If I'm right about the index disk-read time, then switching to a plain
index scan rather than a bitmap index scan would make no difference--either
way the data has to come off the disk.  


 

I'd prefer a strategy that allowed fast performance the first time, rather
than slow the first time and extremely fast subsequently.

Hello,

if the index is only used to locate rows for single session_id, you may
consider split it in a set of partial indexes.

e.g. 
create index i_0 on foo (session_id) where session_id%4 = 0;
create index i_1 on foo (session_id) where session_id%4 = 1;
create index i_2 on foo (session_id) where session_id%4 = 2;
create index i_3 on foo (session_id) where session_id%4 = 3;

(can be built in parallel using separate threads)

Then you will have to ensure that all your WHERE clauses also contain the
index condition:

WHERE session_id = 27 AND session_id%4 =27%4

regards,

Marc Mamin

 



Re: [PERFORM] Bad query plan with high-cardinality column

2013-02-22 Thread Tom Lane
Alexander Staubo a...@bengler.no writes:
select comments.id from comments where
  conversation_id = 3975979 order by created_at limit 13

 I'm at a loss how to explain why the planner thinks scanning a huge
 index that covers the entire table will ever beat scanning a small index
 that has 17% of the table's values.

The reason is that the LIMIT may stop the query before it's scanned all
of the index.  The planner estimates on the assumption that the desired
rows are roughly uniformly distributed within the created_at index, and
on that assumption, it looks like this query will stop fairly soon ...
but evidently, that's wrong.  On the other hand, it knows quite well
that the other plan will require pulling out 5000-some rows and then
sorting them before it can return anything, so that's not going to be
exactly instantaneous either.

In this example, I'll bet that conversation_id and created_at are pretty
strongly correlated, and that most or all of the rows with that specific
conversation_id are quite far down the created_at ordering, so that the
search through the index takes a long time to run.  OTOH, with another
conversation_id the same plan might run almost instantaneously.

If you're concerned mostly with this type of query then a 2-column index
on (conversation_id, created_at) would serve your purposes nicely.  You
could likely even dispense with the separate index on conversation_id
alone.
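Concretely, something like (a sketch; the index name is arbitrary):

```sql
CREATE INDEX comments_conversation_id_created_at_index
    ON comments (conversation_id, created_at);

-- the single-column index is then likely redundant:
DROP INDEX comments_conversation_id_index;
```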

regards, tom lane




Re: [PERFORM] Bad query plan with high-cardinality column

2013-02-22 Thread Kevin Grittner
Alexander Staubo a...@bengler.no wrote:

 This is my schema:

   create table comments (
 id serial primary key,
 conversation_id integer,
 created_at timestamp
   );
   create index comments_conversation_id_index on comments (conversation_id);
   create index comments_created_at_index on comments (created_at);

I suspect you would be better off without those two indexes, and
instead having an index on (conversation_id, created_at).  Not just
for the query you show, but in general.

   select comments.id from comments where
 conversation_id = 3975979 order by created_at limit 13

 This filters about 5000 rows and returns the oldest 13 rows. But
 the query is consistently planned wrong:

 [planner thinks it will be cheaper to read index in ORDER BY
 sequence and filter rows until it has 13 than to read 5471 rows
 and sort them to pick the top 13 after the sort.]

In my experience these problems come largely from the planner not
knowing the cost of dealing with each tuple.  I see a lot less of
this if I raise cpu_tuple_cost to something in the 0.03 to 0.05
range.
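It can be tried per-session first, without touching the server configuration
(a sketch, using the query from earlier in this thread):

```sql
SET cpu_tuple_cost = 0.04;   -- default is 0.01
EXPLAIN ANALYZE
  SELECT comments.id FROM comments
  WHERE conversation_id = 3975979
  ORDER BY created_at LIMIT 13;
```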

-- 
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [PERFORM] Bad query plan with high-cardinality column

2013-02-22 Thread Alexander Staubo
On Friday, February 22, 2013 at 21:33, Tom Lane wrote:
 The reason is that the LIMIT may stop the query before it's scanned all
 of the index. The planner estimates on the assumption that the desired
 rows are roughly uniformly distributed within the created_at index, and
 on that assumption, it looks like this query will stop fairly soon ...
 but evidently, that's wrong. On the other hand, it knows quite well
 that the other plan will require pulling out 5000-some rows and then
 sorting them before it can return anything, so that's not going to be
 exactly instantaneous either.
 
 In this example, I'll bet that conversation_id and created_at are pretty
 strongly correlated, and that most or all of the rows with that specific
 conversation_id are quite far down the created_at ordering, so that the
 search through the index takes a long time to run. OTOH, with another
 conversation_id the same plan might run almost instantaneously.


That's right. So I created the composite index, and not only does the
planner now estimate correctly, it also chooses a much more efficient plan
than it did with the previous index on conversation_id alone:

Limit  (cost=0.00..30.80 rows=13 width=12) (actual time=0.042..0.058 rows=13 loops=1)
  Buffers: shared hit=8
  ->  Index Scan using index_comments_on_conversation_id_and_created_at on comments  (cost=0.00..14127.83 rows=5964 width=12) (actual time=0.039..0.054 rows=13 loops=1)
        Index Cond: (conversation_id = 3975979)
        Buffers: shared hit=8
Total runtime: 0.094 ms

Is this because it can get the value of created_at from the index, or is it 
because it can know that the index is pre-sorted, or both?

Very impressed that Postgres can use a multi-column index for this. I just 
assumed, wrongly, that it couldn't. I will have to go review my other tables 
now and see if they can benefit from multi-column indexes.

Thanks!




Re: [PERFORM] Bad query plan with high-cardinality column

2013-02-22 Thread Alexander Staubo
On Friday, February 22, 2013 at 21:47, Kevin Grittner wrote:
> I suspect you would be better off without those two indexes, and
> instead having an index on (conversation_id, created_at). Not just
> for the query you show, but in general.


Indeed, that solved it, thanks!
 


> In my experience these problems come largely from the planner not
> knowing the cost of dealing with each tuple. I see a lot less of
> this if I raise cpu_tuple_cost to something in the 0.03 to 0.05
> range.


Is this something I can just frob a bit without worrying about it adversely 
impacting database performance across the board, or should I be very careful 
and do lots of testing on a staging box first?




Re: [PERFORM] Bad query plan with high-cardinality column

2013-02-22 Thread Kevin Grittner
Alexander Staubo a...@bengler.no wrote:
> On Friday, February 22, 2013 at 21:47, Kevin Grittner wrote:
>
>> In my experience these problems come largely from the planner
>> not knowing the cost of dealing with each tuple. I see a lot
>> less of this if I raise cpu_tuple_cost to something in the 0.03
>> to 0.05 range.
>
> Is this something I can just frob a bit without worrying about it
> adversely impacting database performance across the board, or
> should I be very careful and do lots of testing on a staging box
> first?

If possible, I would recommend trying it with the old indexes and
seeing whether it causes it to choose the better plan.  (Of course,
you're not going to beat the plan you get with the two-column index
for this query, but it might help it better cost the other
alternatives, which would be a clue that it makes your overall
costing model more accurate and would have a more general benefit.)
You can play with settings like this in a single session without
affecting any other sessions.
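
A sketch of such a single-session experiment, using the query from earlier in the thread (0.03 is the low end of the suggested range; RESET restores the server default for this session):

```sql
-- Affects only the current session, not other connections.
SET cpu_tuple_cost = 0.03;

EXPLAIN (ANALYZE, BUFFERS)
SELECT comments.id
FROM comments
WHERE conversation_id = 3975979
ORDER BY created_at
LIMIT 13;

-- Back to the configured default.
RESET cpu_tuple_cost;
```

If the higher per-tuple cost makes the planner prefer the good plan even with the old single-column indexes, that is evidence the setting improves the overall costing model rather than just this one query.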

I always recommend testing a change like this in staging and
closely monitoring after deploying to production, to confirm the
overall benefit and look for any odd cases which might suffer a
performance regression.  For this particular change, I have never
seen a negative effect, but I'm sure that it's possible to have a
scenario where it isn't helpful.

Personally, I have changed this setting many times and have often
noted that 0.02 was not enough to cause choice of an optimal plan,
0.03 was always enough to do it if adjusting this setting was going
to help at all, and boosting it to 0.05 never caused further plan
changes in the cases I tested.  I have never seen such increases
cause less optimal plan choice.

If you try this, please post your results.

--
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

