Re: [PERFORM] Planner selects different execution plans depending on limit

2012-09-11 Thread Bill Martin
On 10/09/12 16:24, bill_mar...@freenet.de wrote:

Hi All

I've created the following index:

CREATE INDEX ft_simple_core_content_content_idx
  ON core_content
  USING gin
  (to_tsvector('simple'::regconfig, content));


If I'm searching for a word which is NOT in the column content, the query plan
and the execution time differ with the given limit.
If I choose 3927 or any higher number, the query execution takes only a few
milliseconds.

select * from core_content content where
to_tsvector('simple', content.content) @@ tsquery(plainto_tsquery('simple',
'asdasdadas')::varchar || ':*') = true
limit 3927

Limit  (cost=0.00..19302.23 rows=3926 width=621) (actual time=52147.149..52147.149 rows=0 loops=1)
  ->  Seq Scan on core_content content  (cost=0.00..98384.34 rows=20011 width=621) (actual time=52147.147..52147.147 rows=0 loops=1)
        Filter: (to_tsvector('simple'::regconfig, content) @@ '''asdasdadas'':*'::tsquery)
Total runtime: 52147.173 ms

Is there any possibility to improve the performance even if the limit is only
10? Is it possible to make the query optimizer use the fast bitmap heap scan
instead of the slow seq scan?

The big hammer is: set enable_seqscan = off, but if you tell us which PG version
you're on there may be something else to do. I suggest you start by bumping the
statistics target for the column to 10000 and running analyze to see what that
changes.

--
Jesper
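
(A minimal sketch of Jesper's two suggestions, with illustrative values. Since
the tsvector here exists only inside an expression index, the statistics target
is assumed to attach to the index's expression column, which by default is
named after the indexed function:)

-- raise the statistics target on the expression column, then re-gather stats;
-- 10000 is the maximum target, used here purely for illustration
ALTER TABLE ft_simple_core_content_content_idx
  ALTER COLUMN to_tsvector SET STATISTICS 10000;
ANALYZE core_content;

-- the big hammer, scoped to a single session
SET enable_seqscan = off;
-- ... run the problem query ...
RESET enable_seqscan;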

Hi,
my email client deleted a lot of the content of the original thread message.
Here is the full content:

Hi All

I've created the following table, which contains one million records.

CREATE TABLE core_content
(
  id bigint NOT NULL,
  content text NOT NULL,
  short_content text,
  CONSTRAINT core_content_pkey PRIMARY KEY (id)
);

CREATE INDEX ft_simple_core_content_content_idx
  ON core_content
  USING gin
  (to_tsvector('simple'::regconfig, content));


If I'm searching for a word which is not in the column content, the query plan
and the execution time differ with the given limit.
If I choose 3927 or any higher number, the query execution takes only a few
milliseconds.

select * from core_content content where
to_tsvector('simple', content.content) @@ tsquery(plainto_tsquery('simple',
'asdasdadas')::varchar || ':*') = true
limit 3927

Limit  (cost=10091.09..19305.68 rows=3927 width=621) (actual time=0.255..0.255 rows=0 loops=1)
  ->  Bitmap Heap Scan on core_content content  (cost=10091.09..57046.32 rows=20011 width=621) (actual time=0.254..0.254 rows=0 loops=1)
        Recheck Cond: (to_tsvector('simple'::regconfig, content) @@ '''asdasdadas'':*'::tsquery)
        ->  Bitmap Index Scan on ft_simple_core_content_content_idx  (cost=0.00..10086.09 rows=20011 width=0) (actual time=0.251..0.251 rows=0 loops=1)
              Index Cond: (to_tsvector('simple'::regconfig, content) @@ '''asdasdadas'':*'::tsquery)
Total runtime: 0.277 ms
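
(For reference, the prefix tsquery that the cast-and-concatenate construct
builds can be inspected on its own; a hypothetical check, matching the Filter
and Index Cond shown in the plans:)

select tsquery(plainto_tsquery('simple', 'asdasdadas')::varchar || ':*');
-- result: 'asdasdadas':*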

If I choose 3926 or any lower number (e.g. 10), the query execution takes more
than fifty seconds.

select * from core_content content where
to_tsvector('simple', content.content) @@ tsquery(plainto_tsquery('simple',
'asdasdadas')::varchar || ':*') = true
limit 3926

Limit  (cost=0.00..19302.23 rows=3926 width=621) (actual time=52147.149..52147.149 rows=0 loops=1)
  ->  Seq Scan on core_content content  (cost=0.00..98384.34 rows=20011 width=621) (actual time=52147.147..52147.147 rows=0 loops=1)
        Filter: (to_tsvector('simple'::regconfig, content) @@ '''asdasdadas'':*'::tsquery)
Total runtime: 52147.173 ms

Is there any possibility to tune the performance even if the limit is only
10? Is it possible to make the query optimizer use the fast bitmap heap scan
instead of the slow seq scan?

I use PostgreSQL 9.1.5 on an Intel i5-2400 @ 3.1 GHz with 16 GB RAM, Windows 7 64-bit.

Regards,
Bill Martin



[PERFORM] add column with default value is very slow

2012-09-11 Thread AI Rumman
I have a table as follows:
\d entity
          Table "public.entity"
    Column    |            Type             |     Modifiers
--------------+-----------------------------+--------------------
 crmid        | integer                     | not null
 smcreatorid  | integer                     | not null default 0
 smownerid    | integer                     | not null default 0
 modifiedby   | integer                     | not null default 0
 setype       | character varying(30)       | not null
 description  | text                        |
 createdtime  | timestamp without time zone | not null
 modifiedtime | timestamp without time zone | not null
 viewedtime   | timestamp without time zone |
 status       | character varying(50)       |
 version      | integer                     | not null default 0
 presence     | integer                     | default 1
 deleted      | integer                     | not null default 0
Indexes:
    "entity_pkey" PRIMARY KEY, btree (crmid)
    "entity_createdtime_idx" btree (createdtime)
    "entity_modifiedby_idx" btree (modifiedby)
    "entity_modifiedtime_idx" btree (modifiedtime)
    "entity_setype_idx" btree (setype) WHERE deleted = 0
    "entity_smcreatorid_idx" btree (smcreatorid)
    "entity_smownerid_idx" btree (smownerid)
    "ftx_en_entity_description" gin (to_tsvector('vcrm_en'::regconfig, for_fts(description)))
    "entity_deleted_idx" btree (deleted)
Referenced by:
    TABLE "service" CONSTRAINT "fk_1_service" FOREIGN KEY (serviceid) REFERENCES entity(crmid) ON DELETE CASCADE
    TABLE "servicecontracts" CONSTRAINT "fk_1_servicecontracts" FOREIGN KEY (servicecontractsid) REFERENCES entity(crmid) ON DELETE CASCADE
    TABLE "vantage_cc2entity" CONSTRAINT "fk_vantage_cc2entity_entity" FOREIGN KEY (crm_id) REFERENCES entity(crmid) ON UPDATE CASCADE ON DELETE CASCADE
    TABLE "vantage_emails_optout_history" CONSTRAINT "fk_vantage_emails_optout_history_crmid" FOREIGN KEY (crmid) REFERENCES entity(crmid) ON DELETE CASCADE
    TABLE "vantage_emails_optout_history" CONSTRAINT "fk_vantage_emails_optout_history_emailid" FOREIGN KEY (emailid) REFERENCES entity(crmid) ON DELETE CASCADE

I executed the query:
ALTER TABLE entity ADD COLUMN owner_type char(1) NOT NULL default 'U';

The db is stuck. The entity table has 2064740 records.

Watching locks:
select pg_stat_activity.datname, pg_class.relname, pg_locks.mode, pg_locks.granted,
       pg_stat_activity.usename, substr(pg_stat_activity.current_query, 1, 10),
       pg_stat_activity.query_start,
       age(now(), pg_stat_activity.query_start) as age, pg_stat_activity.procpid
from pg_stat_activity, pg_locks
  left outer join pg_class on (pg_locks.relation = pg_class.oid)
where pg_locks.pid = pg_stat_activity.procpid
order by query_start;


 datname |          relname           |        mode         | granted | usename |   substr   |          query_start          |       age       | procpid
---------+----------------------------+---------------------+---------+---------+------------+-------------------------------+-----------------+---------
 db_test | entity_modifiedtime_idx    | AccessExclusiveLock | t       | user    | ALTER TABL | 2012-09-11 12:26:20.269965+06 | 00:45:46.101971 |   13574
 db_test |                            | ExclusiveLock       | t       | user    | ALTER TABL | 2012-09-11 12:26:20.269965+06 | 00:45:46.101971 |   13574
 db_test | entity_modifiedby_idx      | AccessExclusiveLock | t       | user    | ALTER TABL | 2012-09-11 12:26:20.269965+06 | 00:45:46.101971 |   13574
 db_test | entity_createdtime_idx     | AccessExclusiveLock | t       | user    | ALTER TABL | 2012-09-11 12:26:20.269965+06 | 00:45:46.101971 |   13574
 db_test | entity                     | ShareLock           | t       | user    | ALTER TABL | 2012-09-11 12:26:20.269965+06 | 00:45:46.101971 |   13574
 db_test | entity                     | AccessExclusiveLock | t       | user    | ALTER TABL | 2012-09-11 12:26:20.269965+06 | 00:45:46.101971 |   13574
 db_test |                            | AccessExclusiveLock | t       | user    | ALTER TABL | 2012-09-11 12:26:20.269965+06 | 00:45:46.101971 |   13574
 db_test |                            | AccessExclusiveLock | t       | user    | ALTER TABL | 2012-09-11 12:26:20.269965+06 | 00:45:46.101971 |   13574
 db_test |                            | ExclusiveLock       | t       | user    | ALTER TABL | 2012-09-11 12:26:20.269965+06 | 00:45:46.101971 |   13574
 db_test | entity_pkey                | AccessExclusiveLock | t       | user    | ALTER TABL | 2012-09-11 12:26:20.269965+06 | 00:45:46.101971 |   13574
 db_test |                            | ShareLock           | t       | user    | ALTER TABL | 2012-09-11 12:26:20.269965+06 | 00:45:46.101971 |   13574
 db_test | ftx_en_entity_description  | AccessExclusiveLock | t       | user    | ALTER TABL | 2012-09-11 12:26:20.269965+06 | 00:45:46.101971 |   13574
 db_test |                            | AccessShareLock     |

Re: [PERFORM] add column with default value is very slow

2012-09-11 Thread Albe Laurenz
AI Rumman wrote:
 I executed the query:
 ALTER TABLE entity ADD COLUMN owner_type char(1) NOT NULL default 'U';

 The db is stuck. The entity table has 2064740 records.

 Watching locks:
[all locks are granted]

 Any idea why the db is stuck?

To add the column, PostgreSQL has to modify all rows in the table.

But then 2064740 records is not very much, so it shouldn't take forever.

Do you see processor or I/O activity?

Yours,
Laurenz Albe




Re: [PERFORM] add column with default value is very slow

2012-09-11 Thread hubert depesz lubaczewski
On Tue, Sep 11, 2012 at 07:20:28PM +0600, AI Rumman wrote:
 I have a table as follows:
 I executed the query:
 ALTER TABLE entity ADD COLUMN owner_type char(1) NOT NULL default 'U';

 The db is stuck. The entity table has 2064740 records.

such alter table has to rewrite the whole table, so it will take a while.

 Watching locks:

the output of this was perfectly unreadable, because your email client
wrapped lines at some random places.

In the future - please put such dumps on some paste site, or just attach them
to the mail, instead of copy/pasting them into the body of the message.

Best regards,

depesz

-- 
The best thing about modern society is how easy it is to avoid contact with it.
 http://depesz.com/




Re: [PERFORM] add column with default value is very slow

2012-09-11 Thread AI Rumman
I added the excel file for locks data.
I was surprised to see that while I was updating a single column value for
all records in a table, all indexes were locked by the server.

On Tue, Sep 11, 2012 at 7:44 PM, hubert depesz lubaczewski 
dep...@depesz.com wrote:

 On Tue, Sep 11, 2012 at 07:20:28PM +0600, AI Rumman wrote:
  I have a table as follows:
  I executed the query:
  ALTER TABLE entity ADD COLUMN owner_type char(1) NOT NULL default 'U';

  The db is stuck. The entity table has 2064740 records.

 such alter table has to rewrite the whole table, so it will take a while.

  Watching locks:

 the output of this was perfectly unreadable, because your email client
 wrapped lines at some random places.

 In the future - please put such dumps on some paste site, or just attach them
 to the mail, instead of copy/pasting them into the body of the message.

 Best regards,

 depesz

 --
 The best thing about modern society is how easy it is to avoid contact
 with it.

 http://depesz.com/



pglocks.xls
Description: MS-Excel spreadsheet



Re: [PERFORM] add column with default value is very slow

2012-09-11 Thread AI Rumman
Table size is 1186 MB.
I split the command into three steps as you said, but the result was the same
during the update operation.
One more thing: I have just restored the db from a dump and analyzed it, and
I am using PostgreSQL 9.1 with 3 GB RAM on a dual-core machine.


On Tue, Sep 11, 2012 at 7:59 PM, hubert depesz lubaczewski 
dep...@depesz.com wrote:

 On Tue, Sep 11, 2012 at 07:55:24PM +0600, AI Rumman wrote:
  I added the excel file for locks data.

 well, it worked, but why didn't you just make it a text file, in notepad or
 something like this?

  I was surprised to see that while I was updating a single column value for
  all records in a table, all indexes were locked by the server.

 your alter table is not blocked (at least looking at the pg_locks data you
 showed).

 this means - it just takes a long time.

 Please do:
 select pg_total_relation_size('entity');
 to see how much data it has to rewrite.

 for the future - just don't do alter table with default and not null in one
 step. doing it via: add column; set default; batch-backfill data; set not
 null - will take longer but will be done with much shorter locks.

 Best regards,

 depesz

 --
 The best thing about modern society is how easy it is to avoid contact
 with it.

 http://depesz.com/



Re: [PERFORM] add column with default value is very slow

2012-09-11 Thread hubert depesz lubaczewski
On Tue, Sep 11, 2012 at 07:55:24PM +0600, AI Rumman wrote:
 I added the excel file for locks data.

well, it worked, but why didn't you just make it a text file, in notepad or
something like this?

 I was surprised to see that while I was updating a single column value for
 all records in a table, all indexes were locked by the server.

your alter table is not blocked (at least looking at the pg_locks data you
showed).

this means - it just takes a long time.

Please do:
select pg_total_relation_size('entity');
to see how much data it has to rewrite.

for the future - just don't do alter table with default and not null in one
step. doing it via: add column; set default; batch-backfill data; set not
null - will take longer but will be done with much shorter locks.
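
(A sketch of that four-step recipe against this thread's table; the batch
size, the "owner_type IS NULL" batching predicate, and committing between
batches are assumptions, not depesz's literal commands:)

ALTER TABLE entity ADD COLUMN owner_type char(1);            -- step 1: no rewrite, fast
ALTER TABLE entity ALTER COLUMN owner_type SET DEFAULT 'U';  -- step 2: new rows only
-- step 3: backfill in batches, committing after each one;
-- rerun until it reports UPDATE 0
UPDATE entity SET owner_type = 'U'
 WHERE crmid IN (SELECT crmid FROM entity
                  WHERE owner_type IS NULL LIMIT 10000);
ALTER TABLE entity ALTER COLUMN owner_type SET NOT NULL;     -- step 4: one validation scan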

Best regards,

depesz

-- 
The best thing about modern society is how easy it is to avoid contact with it.
 http://depesz.com/




Re: [PERFORM] add column with default value is very slow

2012-09-11 Thread hubert depesz lubaczewski
On Tue, Sep 11, 2012 at 08:04:06PM +0600, AI Rumman wrote:
 Table size is 1186 MB.

if it takes long, it just means that your IO is slow.

 I split the command into three steps as you said, but the result was the
 same during the update operation.

three? I was showing four steps, and one of them usually consists of
hundreds, if not thousands, of queries.

 One more thing: I have just restored the db from a dump and analyzed it,
 and I am using PostgreSQL 9.1 with 3 GB RAM on a dual-core machine.

so it looks like your IO channel is slow.

Best regards,

depesz

-- 
The best thing about modern society is how easy it is to avoid contact with it.
 http://depesz.com/




Re: [PERFORM] add column with default value is very slow

2012-09-11 Thread Andrew Dunstan


On 09/11/2012 09:55 AM, AI Rumman wrote:

I added the excel file for locks data.
I was surprised to see that while I was updating a single column value
for all records in a table, all indexes were locked by the server.



Any ALTER TABLE command locks the whole table in ACCESS EXCLUSIVE mode, 
indexes included. See the description of ACCESS EXCLUSIVE lock at 
http://www.postgresql.org/docs/current/static/explicit-locking.html
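
A hypothetical two-session illustration (not from the thread) of why
everything queues up behind it:

-- session 1: takes ACCESS EXCLUSIVE on entity and holds it until commit
BEGIN;
ALTER TABLE entity ADD COLUMN owner_type char(1) NOT NULL DEFAULT 'U';

-- session 2: even a plain read now waits, because ACCESS EXCLUSIVE
-- conflicts with every other mode, including the ACCESS SHARE a SELECT takes
SELECT count(*) FROM entity;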


cheers

andrew






Re: [PERFORM] Planner selects different execution plans depending on limit

2012-09-11 Thread Tom Lane
Bill Martin bill.mar...@communote.com writes:
 I've created the following table, which contains one million records.
 ...

 Limit  (cost=10091.09..19305.68 rows=3927 width=621) (actual time=0.255..0.255 rows=0 loops=1)
   ->  Bitmap Heap Scan on core_content content  (cost=10091.09..57046.32 rows=20011 width=621) (actual time=0.254..0.254 rows=0 loops=1)
         Recheck Cond: (to_tsvector('simple'::regconfig, content) @@ '''asdasdadas'':*'::tsquery)
         ->  Bitmap Index Scan on ft_simple_core_content_content_idx  (cost=0.00..10086.09 rows=20011 width=0) (actual time=0.251..0.251 rows=0 loops=1)
               Index Cond: (to_tsvector('simple'::regconfig, content) @@ '''asdasdadas'':*'::tsquery)
 Total runtime: 0.277 ms

 Is there any possibility to tune the performance even if the limit is only
 10?

The problem is the way-off rowcount estimate (20011 rows when it's
really none); with a smaller estimate there, the planner wouldn't decide
to switch to a seqscan.

Did you take the advice to increase the column's statistics target?
Because 20011 looks suspiciously close to the default estimate that
tsquery_opr_selec will fall back on if it hasn't got enough stats
to come up with a trustworthy estimate for a *-pattern query.

(I think there are probably some bugs in tsquery_opr_selec's estimate
for this, as I just posted about on pgsql-hackers.  But this number
looks like you're not even getting to the estimation code, for lack
of enough statistics entries.)

The other thing that seems kind of weird here is that the cost estimate
for the bitmap index scan seems out of line even given the
20011-entries-to-fetch estimate.  I'd have expected a cost estimate of a
few hundred for that, not 10000.  Perhaps this index is really bloated,
and it's time to REINDEX it?
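
(One way to check that, as a sketch: compare the index size before and after
a rebuild. Note REINDEX blocks writes to the table while it runs, so this is
for a quiet moment.)

SELECT pg_size_pretty(pg_relation_size('ft_simple_core_content_content_idx'));
REINDEX INDEX ft_simple_core_content_content_idx;
SELECT pg_size_pretty(pg_relation_size('ft_simple_core_content_content_idx'));
-- a large drop in size after the rebuild would confirm the bloat theory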

regards, tom lane

