Re: Slow query on a one-tuple table

2019-09-19 Thread Tom Lane
Luís Roberto Weck writes:
> As far as autovacuum options go, this is what I'm using:

> autovacuum_vacuum_scale_factor=0,

Ugh ... maybe I'm misremembering, but I *think* that has the effect
of disabling autovac completely.  You don't want zero.

Check in pg_stat_all_tables.last_autovacuum to see if anything
is happening.  If the dates seem reasonably current, then I'm wrong.
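
For example, something like

    SELECT relname, last_vacuum, last_autovacuum, n_dead_tup
      FROM pg_stat_all_tables
     WHERE relname = 'assessoria';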

regards, tom lane




Re: Slow query on a one-tuple table

2019-09-19 Thread Luís Roberto Weck

On 19/09/2019 19:32, Michael Lewis wrote:


I have about 6 bigint fields in this table that are very frequently
updated, but none of them are indexed. I thought that by not having an
index on them, all updates would be HOT and therefore not bloat the
primary key index. Seems I was wrong?


HOT update is only possible if there is room in the page. How wide is
your single tuple?


Have you tuned autovacuum or are you running defaults? Not sure of
your perception of "very frequently" updated values, but if you have a
bloat issue, vacuum early and often. Not sure how the math works out
on a table with a single tuple in terms of calculating when it is time
to vacuum, but it certainly needs to be tuned differently than a table
with millions of rows, which is what I would be more used to.


I'm not sure how to measure how wide the tuple is. Can you point me in
the right direction?


As far as autovacuum options go, this is what I'm using:

autovacuum_enabled=true,
fillfactor=50,
autovacuum_vacuum_threshold=25,
autovacuum_vacuum_scale_factor=0,
autovacuum_analyze_threshold=10,
autovacuum_analyze_scale_factor=0.05,
autovacuum_vacuum_cost_delay=10,
autovacuum_vacuum_cost_limit=1000,
toast.autovacuum_enabled=true
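
For reference, these are per-table storage parameters; they would have
been applied with something roughly like:

ALTER TABLE assessoria SET (
    autovacuum_enabled = true,
    fillfactor = 50,
    autovacuum_vacuum_threshold = 25,
    autovacuum_vacuum_scale_factor = 0,
    autovacuum_analyze_threshold = 10,
    autovacuum_analyze_scale_factor = 0.05,
    autovacuum_vacuum_cost_delay = 10,
    autovacuum_vacuum_cost_limit = 1000,
    toast.autovacuum_enabled = true
);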

By "very frequently" I mean I can update it up to 80 times a day. 
Usually this number is closer to 10.


Re: Slow query on a one-tuple table

2019-09-19 Thread Michael Lewis
>
> I have about 6 bigint fields in this table that are very frequently
> updated, but none of them are indexed. I thought that by not having an
> index on them, all updates would be HOT and therefore not bloat the
> primary key index. Seems I was wrong?
>

HOT update is only possible if there is room in the page. How wide is your
single tuple?

Have you tuned autovacuum or are you running defaults? Not sure of your
perception of "very frequently" updated values, but if you have a bloat
issue, vacuum early and often. Not sure how the math works out on a table
with a single tuple in terms of calculating when it is time to vacuum, but
it certainly needs to be tuned differently than a table with millions of
rows, which is what I would be more used to.
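
A rough way to check the row width, just as a sketch using the table
name from this thread, would be either of:

-- approximate on-disk size of the single row, in bytes
SELECT pg_column_size(a) AS approx_row_bytes FROM assessoria a;

-- or sum the per-column average widths from the planner statistics
SELECT sum(avg_width) AS approx_row_bytes
  FROM pg_stats
 WHERE schemaname = 'public' AND tablename = 'assessoria';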


Re: Slow query on a one-tuple table

2019-09-19 Thread Luís Roberto Weck

On 19/09/2019 17:41, Luís Roberto Weck wrote:

On 19/09/2019 17:24, Luís Roberto Weck wrote:

On 19/09/2019 17:11, Igor Neyman wrote:

With LIMIT 1, I get 3 shared buffers hit, pretty much always.

 



Check if assessoria_pkey index is bloated.

Regards,
Igor Neyman




With this query[1] it shows:

current_database|schemaname|tblname   |idxname        |real_size|extra_size|extra_ratio|fillfactor|bloat_size|bloat_ratio|is_na|
----------------|----------|----------|---------------|---------|----------|-----------|----------|----------|-----------|-----|
database_name   |public    |assessoria|assessoria_pkey|    16384|         0|        0.0|        90|       0.0|        0.0|false|


[1]https://github.com/ioguix/pgsql-bloat-estimation/blob/master/btree/btree_bloat-superuser.sql 






Using the query provided here[1], I see this comment:

  /*
   * distinct_real_item_keys is how many distinct "data" fields on page
   * (excludes highkey).
   *
   * If this is less than distinct_block_pointers on an internal page, that
   * means that there are so many duplicates in its children that there are
   * duplicate high keys in children, so the index is probably pretty bloated.
   *
   * Even unique indexes can have duplicates.  It's sometimes interesting to
   * watch out for how many distinct real items there are within leaf pages,
   * compared to the number of live items, or total number of items.  Ideally,
   * these will all be exactly the same for unique indexes.
   */

In my case, I'm seeing:

distinct_real_item_keys|distinct_block_pointers|
-----------------------|-----------------------|
                      1|                     63|

This is about half an hour after running VACUUM FULL ANALYZE on the 
table.


What can I do to reduce this?


[1]
https://wiki.postgresql.org/wiki/Index_Maintenance#Summarize_keyspace_of_a_B-Tree_index

As Igor suggested, index bloat seems to be at fault here. After
dropping the PK, I'm getting these plans:


First run (SELECT asscod, asscambol FROM ASSESSORIA WHERE asscod = 1 
ORDER BY asscod):


 Seq Scan on public.assessoria  (cost=0.00..88.01 rows=1 width=62) (actual time=0.242..0.810 rows=1 loops=1)
   Output: asscod, asscambol
   Filter: (assessoria.asscod = 1)
   Buffers: shared hit=88
 Planning Time: 0.312 ms
 Execution Time: 0.876 ms
(6 rows)

Subsequent runs get increasingly faster, down to execution times of about 0.080 ms.

Using LIMIT 1, I get on the first run:

 Limit  (cost=0.00..88.01 rows=1 width=62) (actual time=0.252..0.254 rows=1 loops=1)
   Output: asscod, asscambol
   Buffers: shared hit=17
   ->  Seq Scan on public.assessoria  (cost=0.00..88.01 rows=1 width=62) (actual time=0.250..0.250 rows=1 loops=1)
         Output: asscod, asscambol
         Filter: (assessoria.asscod = 1)
         Buffers: shared hit=17
 Planning Time: 0.334 ms
 Execution Time: 0.296 ms


Subsequent runs look more like this:

 Limit  (cost=0.00..88.01 rows=1 width=62) (actual time=0.057..0.057 rows=1 loops=1)
   Output: asscod, asscambol
   Buffers: shared hit=17
   ->  Seq Scan on public.assessoria  (cost=0.00..88.01 rows=1 width=62) (actual time=0.056..0.056 rows=1 loops=1)
         Output: asscod, asscambol
         Filter: (assessoria.asscod = 1)
         Buffers: shared hit=17
 Planning Time: 0.082 ms
 Execution Time: 0.068 ms

I have about 6 bigint fields in this table that are very frequently
updated, but none of them are indexed. I thought that by not having an
index on them, all updates would be HOT and therefore not bloat the
primary key index. Seems I was wrong?





Re: Surprising benchmark count(1) vs. count(*)

2019-09-19 Thread Tom Lane
Adam Brusselback  writes:
> It would be nice if Postgres optimized this case, though, because it is
> really, really common from what I've seen.

Since the introduction of the "planner support function" infrastructure,
it'd be possible to do this without it being a completely ugly kluge:
we could put the logic for it into a planner support function attached
to count(any).  Currently planner support functions are only called for
regular functions, but we could certainly envision adding the ability to
do it for aggregates (and window functions too, why not).

I'm not particularly planning to do that myself, but if someone else
wants to write a patch, have at it.

regards, tom lane




Re: Surprising benchmark count(1) vs. count(*)

2019-09-19 Thread Adam Brusselback
I will say I've seen count(1) in the wild a ton, as well as at my own
company from developers who were used to it not making a difference.

There have been a couple of queries in the hot path that I have had to change
from count(1) to count(*) as part of performance tuning, but in general
it's not worth me worrying about. There are usually larger performance
issues to track down in complex queries.

It would be nice if Postgres optimized this case, though, because it is
really, really common from what I've seen.

Thanks,
-Adam


Re: Slow query on a one-tuple table

2019-09-19 Thread Luís Roberto Weck

On 19/09/2019 17:24, Luís Roberto Weck wrote:

On 19/09/2019 17:11, Igor Neyman wrote:

With LIMIT 1, I get 3 shared buffers hit, pretty much always.

 



Check if assessoria_pkey index is bloated.

Regards,
Igor Neyman




With this query[1] it shows:

current_database|schemaname|tblname   |idxname        |real_size|extra_size|extra_ratio|fillfactor|bloat_size|bloat_ratio|is_na|
----------------|----------|----------|---------------|---------|----------|-----------|----------|----------|-----------|-----|
database_name   |public    |assessoria|assessoria_pkey|    16384|         0|        0.0|        90|       0.0|        0.0|false|


[1]https://github.com/ioguix/pgsql-bloat-estimation/blob/master/btree/btree_bloat-superuser.sql 






Using the query provided here[1], I see this comment:

  /*
   * distinct_real_item_keys is how many distinct "data" fields on page
   * (excludes highkey).
   *
   * If this is less than distinct_block_pointers on an internal page, that
   * means that there are so many duplicates in its children that there are
   * duplicate high keys in children, so the index is probably pretty bloated.
   *
   * Even unique indexes can have duplicates.  It's sometimes interesting to
   * watch out for how many distinct real items there are within leaf pages,
   * compared to the number of live items, or total number of items.  Ideally,
   * these will all be exactly the same for unique indexes.
   */

In my case, I'm seeing:

distinct_real_item_keys|distinct_block_pointers|
-----------------------|-----------------------|
                      1|                     63|

This is about half an hour after running VACUUM FULL ANALYZE on the table.

What can I do to reduce this?


[1] 
https://wiki.postgresql.org/wiki/Index_Maintenance#Summarize_keyspace_of_a_B-Tree_index





Re: Slow query on a one-tuple table

2019-09-19 Thread Luís Roberto Weck

On 19/09/2019 17:11, Igor Neyman wrote:

With LIMIT 1, I get 3 shared buffers hit, pretty much always.



Check if assessoria_pkey index is bloated.

Regards,
Igor Neyman




With this query[1] it shows:

current_database|schemaname|tblname   |idxname        |real_size|extra_size|extra_ratio|fillfactor|bloat_size|bloat_ratio|is_na|
----------------|----------|----------|---------------|---------|----------|-----------|----------|----------|-----------|-----|
database_name   |public    |assessoria|assessoria_pkey|    16384|         0|        0.0|        90|       0.0|        0.0|false|


[1]https://github.com/ioguix/pgsql-bloat-estimation/blob/master/btree/btree_bloat-superuser.sql 






RE: Slow query on a one-tuple table

2019-09-19 Thread Igor Neyman
With LIMIT 1, I get 3 shared buffers hit, pretty much always.



Check if assessoria_pkey index is bloated.
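
One way to check, assuming the pgstattuple extension is available (just
a sketch):

CREATE EXTENSION IF NOT EXISTS pgstattuple;

SELECT avg_leaf_density, leaf_fragmentation
  FROM pgstatindex('assessoria_pkey');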

Regards,
Igor Neyman




Re: Slow query on a one-tuple table

2019-09-19 Thread Luís Roberto Weck

On 19/09/2019 15:34, Igor Neyman wrote:

-Original Message-
From: Luís Roberto Weck [mailto:luisrobe...@siscobra.com.br]
Sent: Thursday, September 19, 2019 2:30 PM
To: Michael Lewis 
Cc: pgsql-performance@lists.postgresql.org
Subject: Re: Slow query on a one-tuple table


On 19/09/2019 14:21, Michael Lewis wrote:

Is this result able to be repeated?

Yes, I  can consistently repeat it.

Postgres version is 11.1.

Other executions:

Index Scan using assessoria_pkey on public.assessoria (cost=0.25..2.47 rows=1 width=62) (actual time=1.591..4.035 rows=1 loops=1)
   Output: asscod, asscambol
   Index Cond: (assessoria.asscod = 1)
   Buffers: shared hit=1187
Planning Time: 0.053 ms
Execution Time: 4.055 ms

Index Scan using assessoria_pkey on public.assessoria (cost=0.25..2.47 rows=1 width=62) (actual time=1.369..3.838 rows=1 loops=1)
   Output: asscod, asscambol
   Index Cond: (assessoria.asscod = 1)
   Buffers: shared hit=1187
Planning Time: 0.033 ms
Execution Time: 3.851 ms



But can you repeat it with "LIMIT 1"?
Notice huge difference in "buffers hit" while doing (the same) Index Scan in 
two plans.

Regards,
Igor Neyman

With LIMIT 1, I get 3 shared buffers hit, pretty much always.




RE: Slow query on a one-tuple table

2019-09-19 Thread Igor Neyman

-Original Message-
From: Luís Roberto Weck [mailto:luisrobe...@siscobra.com.br] 
Sent: Thursday, September 19, 2019 2:30 PM
To: Michael Lewis 
Cc: pgsql-performance@lists.postgresql.org
Subject: Re: Slow query on a one-tuple table


On 19/09/2019 14:21, Michael Lewis wrote:
> Is this result able to be repeated?

Yes, I  can consistently repeat it.

Postgres version is 11.1.

Other executions:

Index Scan using assessoria_pkey on public.assessoria (cost=0.25..2.47 rows=1 width=62) (actual time=1.591..4.035 rows=1 loops=1)
   Output: asscod, asscambol
   Index Cond: (assessoria.asscod = 1)
   Buffers: shared hit=1187
Planning Time: 0.053 ms
Execution Time: 4.055 ms

Index Scan using assessoria_pkey on public.assessoria (cost=0.25..2.47 rows=1 width=62) (actual time=1.369..3.838 rows=1 loops=1)
   Output: asscod, asscambol
   Index Cond: (assessoria.asscod = 1)
   Buffers: shared hit=1187
Planning Time: 0.033 ms
Execution Time: 3.851 ms



But can you repeat it with "LIMIT 1"?
Notice huge difference in "buffers hit" while doing (the same) Index Scan in 
two plans.

Regards,
Igor Neyman


Re: Slow query on a one-tuple table

2019-09-19 Thread Luís Roberto Weck



On 19/09/2019 14:21, Michael Lewis wrote:

Is this result able to be repeated?


Yes, I  can consistently repeat it.

Postgres version is 11.1.

Other executions:

Index Scan using assessoria_pkey on public.assessoria (cost=0.25..2.47 rows=1 width=62) (actual time=1.591..4.035 rows=1 loops=1)
   Output: asscod, asscambol
   Index Cond: (assessoria.asscod = 1)
   Buffers: shared hit=1187
Planning Time: 0.053 ms
Execution Time: 4.055 ms

Index Scan using assessoria_pkey on public.assessoria (cost=0.25..2.47 rows=1 width=62) (actual time=1.369..3.838 rows=1 loops=1)
   Output: asscod, asscambol
   Index Cond: (assessoria.asscod = 1)
   Buffers: shared hit=1187
Planning Time: 0.033 ms
Execution Time: 3.851 ms




Re: Delete huge Table under XFS

2019-09-19 Thread Christoph Berg
Re: Joao Junior 2019-09-19 

> A table of 800 GB means 800 files of 1 GB each. When I use TRUNCATE or DROP
> TABLE, XFS, which is a log-based filesystem, will write lots of data to its
> log, and this is the problem. The problem is not Postgres; it is the way
> that XFS works with big files, or to be more clear, the way that it handles
> lots of files.

Why is the runtime of a DROP TABLE command important? Is anything
waiting for it?

Christoph




Re: Slow query on a one-tuple table

2019-09-19 Thread Michael Lewis
Is this result able to be repeated?


Re: Delete huge Table under XFS

2019-09-19 Thread Joao Junior
A table of 800 GB means 800 files of 1 GB each. When I use TRUNCATE or DROP
TABLE, XFS, which is a log-based filesystem, will write lots of data to its
log, and this is the problem. The problem is not Postgres; it is the way
that XFS works with big files, or to be more clear, the way that it handles
lots of files.

Regards,
Joao

On Thu, Sep 19, 2019, 18:50 Andreas Kretschmer 
wrote:

>
>
> On 19.09.19 at 17:59, Joao Junior wrote:
> >
> >
> > I have a table that is not being used anymore and I want to drop it.
> > The table is huge, around 800 GB, and it has some indexes on it.
> >
> > When I execute the DROP TABLE command it goes very slowly, and I realised
> > that the problem is the filesystem.
> > It seems that XFS doesn't handle big files well; there is some
> > discussion about it on some lists.
>
> PG doesn't create one big file for this table, but about 800 files with
> 1GB size each.
>
> >
> > I have to find a way to delete the table in chunks.
>
> Why? If you want to delete all rows, just use TRUNCATE.
>
>
> Regards, Andreas
>
> --
> 2ndQuadrant - The PostgreSQL Support Company.
> www.2ndQuadrant.com
>
>
>
>


Slow query on a one-tuple table

2019-09-19 Thread Luís Roberto Weck

Hi!

I have a query that SELECTs only one tuple using a PK
(https://explain.depesz.com/s/Hskt).


The fields I am selecting are a bigint and a text. Why does it read 1095
shared buffers?


If I add a LIMIT 1 clause, the query runs much faster:
https://explain.depesz.com/s/bSZn


This table has only one tuple anyway, so I can't understand why it
takes so long without the LIMIT 1.
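
For reference, the plans behind the two links above were gathered with
queries roughly like the following (the exact EXPLAIN options are my
reconstruction from later messages in this thread):

EXPLAIN (ANALYZE, VERBOSE, BUFFERS)
SELECT asscod, asscambol FROM assessoria WHERE asscod = 1 ORDER BY asscod;

EXPLAIN (ANALYZE, VERBOSE, BUFFERS)
SELECT asscod, asscambol FROM assessoria WHERE asscod = 1 ORDER BY asscod LIMIT 1;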




Re: Delete huge Table under XFS

2019-09-19 Thread Andreas Kretschmer




On 19.09.19 at 17:59, Joao Junior wrote:



I have a table that is not being used anymore and I want to drop it.
The table is huge, around 800 GB, and it has some indexes on it.

When I execute the DROP TABLE command it goes very slowly, and I realised
that the problem is the filesystem.
It seems that XFS doesn't handle big files well; there is some
discussion about it on some lists.


PG doesn't create one big file for this table, but about 800 files with 
1GB size each.




I have to find a way to delete the table in chunks.


Why? If you want to delete all rows, just use TRUNCATE.


Regards, Andreas

--
2ndQuadrant - The PostgreSQL Support Company.
www.2ndQuadrant.com





Re: comparing output of internal pg tables of referenced tables

2019-09-19 Thread Tom Lane
Mariel Cherkassky  writes:
> 2) select conname,
>        (select r.relname from pg_class r where r.oid = c.confrelid) as orig_table,
>        (select array_agg(attname) from pg_attribute
>          where attrelid = c.confrelid and ARRAY[attnum] <@ c.conkey) as orig_cols,
>        (select r.relname from pg_class r where r.oid = c.conrelid) as foreign_table,
>        (select array_agg(attname) from pg_attribute
>          where attrelid = c.conrelid and ARRAY[attnum] <@ c.conkey) as foreign_cols
>    from pg_constraint c
>    where c.confrelid = (select oid from pg_class where relname = '*my_table*')
>      and c.contype = 'f'

> On the second output, in the orig_cols I got a few weird outputs like
> {pg.dropped.5}, or even columns that don't have a
> unique index (just a random column from the orig_table).

You need to be looking at confkey not conkey for the columns in the
confrelid table.
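
That is, the orig_cols subquery would become something like

    (select array_agg(attname)
       from pg_attribute
      where attrelid = c.confrelid
        and ARRAY[attnum] <@ c.confkey) as orig_cols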

regards, tom lane




Delete huge Table under XFS

2019-09-19 Thread Joao Junior
Hi,
I am running PostgreSQL 9.6 with XFS as the filesystem, on Linux kernel 2.6.32.

I have a table that is not being used anymore and I want to drop it.
The table is huge, around 800 GB, and it has some indexes on it.

When I execute the DROP TABLE command it goes very slowly, and I realised
that the problem is the filesystem.
It seems that XFS doesn't handle big files well; there is some
discussion about it on some lists.

I have to find a way to delete the table in chunks.

My first attempt was:

Iterate from the tail of the table to the beginning.
Delete some blocks of the table.
Run VACUUM on it.
Iterate again.

The plan is to delete some number of blocks at the end of the table, in
chunks of some size, and vacuum it, waiting for VACUUM to shrink the table.
It seems to work; the table has been shrunk, but each VACUUM takes a huge
amount of time, I suppose because of the index. There is another point:
the index is still huge and will stay that way.

I am thinking of another way of doing this.
I can get the relfilenode of the table; that way I can find the files
that belong to the table and simply delete batches of files in a way that
doesn't put so much load on the disk.
Do the same for the index.
Once I have deleted all of the table's and the index's files, I could simply
execute the DROP TABLE command and the entries would be removed from the
catalog.
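
For reference, the file path could be looked up with something like the
following (the table name here is just a placeholder):

SELECT pg_relation_filepath('my_big_table');
-- returns e.g. base/16384/16385; further 1 GB segments are 16385.1, 16385.2, ...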

I would appreciate any kind of comments.
thanks!


Re: Surprising benchmark count(1) vs. count(*)

2019-09-19 Thread Tom Lane
Thomas Kellerer  writes:
> Laurenz Albe schrieb am 19.09.2019 um 12:22:
>> "count(1)" has to check if 1 IS NULL for each row, because NULL
>> values are not counted.  "count(*)" doesn't have to do that.

> But 1 is a constant; why does it need to check it for each row?

[ shrug... ]  There's no special optimization for that case.
And I can't say that it seems attractive to add one.

regards, tom lane




Re: Surprising benchmark count(1) vs. count(*)

2019-09-19 Thread Laurenz Albe
On Thu, 2019-09-19 at 12:09 +0200, Thomas Kellerer wrote:
> https://blog.jooq.org/2019/09/19/whats-faster-count-or-count1/
> 
> Is there a reason why count(*) seems to be faster?

"count(*)" is just the SQL standard's way of saying what you'd
normally call "count()", that is, an aggregate without arguments.

"count(1)" has to check if 1 IS NULL for each row, because NULL
values are not counted.  "count(*)" doesn't have to do that.
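
A tiny illustration of that per-row NULL check; count(expression) skips
NULLs, while count(*) counts rows:

SELECT count(*) AS all_rows, count(col) AS non_null_rows
  FROM (VALUES (1), (NULL::int)) AS t(col);
-- all_rows = 2, non_null_rows = 1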

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com





Surprising benchmark count(1) vs. count(*)

2019-09-19 Thread Thomas Kellerer
https://blog.jooq.org/2019/09/19/whats-faster-count-or-count1/

Is there a reason why count(*) seems to be faster?