Re: [PERFORM] Plan differences

2016-01-04 Thread Anton Melser
Hi,


> Changing the locale to anything other than C or POSIX will
> have a performance overhead.  I'm pretty sure that just declaring the
> locale on the indexes is just plastering over the cracks.
>
>
>
> Is it possible to reload the database with the same locale as the original
> database server?
>

Sorry, I wasn't clear - I did end up recreating the DB with lc_collate =
"C" and lc_ctype = "C" and loading all the data, and the plans are for this
situation (i.e., both servers are now the same, "C" everywhere). Maybe it
is just a case of optimisations from the RC having been removed?
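For anyone following along, the rebuild described above looks something like
this (the database name and encoding are illustrative; template0 is needed
because template1's collation can't be overridden):

```sql
-- Recreate the database with C locale, then reload the pg_dump output.
-- Names and encoding here are illustrative, not from the original post.
CREATE DATABASE mydb_c
    TEMPLATE   template0
    LC_COLLATE 'C'
    LC_CTYPE   'C'
    ENCODING   'UTF8';
```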

Cheers,
Anton


Re: [PERFORM] Plan differences

2016-01-01 Thread Anton Melser
>
> Declaring new indexes with COLLATE "C" and removing the old indexes fixed
>> the LIKE problem but it created another - the > and < queries need a sort
>> before being passed off to the new index. Having two indexes seems to give me
>> the best of both worlds, though obviously it's taking up (much) more space.
>> As space isn't ever likely to be a problem, and there are no updates (only
>> COPY) to these tables, I'll keep it like this to avoid having to reload the
>> entire DB.
>
>
I spoke a little too soon - while many of the simple queries are now hitting
the indexes, some of the more complicated ones are still producing
substantially inferior plans, even after reloading the whole DB with an
identical lc_collate and lc_ctype. Here are the plans on the original
server and the new server (identical collations, lc_ctypes and index types -
btree, "C"). I have been experimenting (accepted = accepted2,
idx_accepted2_mid = idx_accepted_mid, etc.) and the tables no longer have
exactly the same data, but there is nothing substantially different (a few
days more of data out of about a year total). The old-server query is
actually working on about 3x the amount of data - I tried reducing the
amounts on the new server so the work would fit in memory, but it didn't
seem to help the plan.

 HashAggregate  (cost=3488512.43..3496556.16 rows=536249 width=143) (actual
time=228467.924..229026.799 rows=1426351 loops=1)
   Group Key: to_char(timezone('UTC'::text, a.tstamp), 'YYYY-MM-DD'::text),
a.column1, CASE WHEN (d.column1 IS NOT NULL) THEN d.column1 ELSE
fff.column1 END, a.column2
   ->  Merge Left Join  (cost=110018.15..3072358.66 rows=23780215
width=143) (actual time=3281.993..200563.177 rows=23554638 loops=1)
 Merge Cond: ((a.message_id)::text = (fff.message_id)::text)
 ->  Merge Left Join  (cost=110017.58..2781199.04 rows=23780215
width=136) (actual time=3281.942..157385.338 rows=23554636 loops=1)
   Merge Cond: ((a.message_id)::text = (d.message_id)::text)
   ->  Index Scan using idx_accepted2_mid on accepted a
 (cost=0.70..2226690.13 rows=23780215 width=83) (actual
time=3.690..73048.662 rows=23554632 loops=1)
 Index Cond: ((message_id)::text > '20151130'::text)
 Filter: (((mrid)::text <>
'----'::text) AND ((mrid)::text <>
'BAT'::text) AND ((column2)::text <> 'text1'::text) AND ((column2)::text
!~~ 'text2.%'::text))
 Rows Removed by Filter: 342947
   ->  Index Scan using idx_delivered2_mid on delivered d
 (cost=110016.89..482842.01 rows=3459461 width=53) (actual
time=3278.245..64031.033 rows=23666434 loops=1)
 Index Cond: ((message_id)::text > '20151130'::text)
 Filter: (NOT (hashed SubPlan 1))
 Rows Removed by Filter: 443
 SubPlan 1
   ->  Index Scan using idx_failed2_mid on failed ff
 (cost=0.57..109953.48 rows=25083 width=46) (actual time=0.041..3124.642
rows=237026 loops=1)
 Index Cond: ((message_id)::text >
'20151130'::text)
 Filter: ((severity)::text = 'permanent'::text)
 Rows Removed by Filter: 5080519
 ->  Index Scan using idx_failed2_mid on failed fff
 (cost=0.57..112718.27 rows=25083 width=53) (actual time=0.034..4861.762
rows=236676 loops=1)
   Index Cond: ((message_id)::text > '20151130'::text)
   Filter: ((severity)::text = 'permanent'::text)
   Rows Removed by Filter: 5080519
 Planning time: 2.039 ms
 Execution time: 229076.361 ms


 HashAggregate  (cost=7636055.05..7640148.23 rows=272879 width=143) (actual
time=488739.376..488915.545 rows=403741 loops=1)
   Group Key: to_char(timezone('UTC'::text, a.tstamp), 'YYYY-MM-DD'::text),
a.column1, CASE WHEN (d.column1 IS NOT NULL) THEN d.column1 ELSE
fff.column1 END, a.column2
   ->  Hash Right Join  (cost=5119277.32..7528101.45 rows=6168777
width=143) (actual time=271256.212..480958.460 rows=6516196 loops=1)
 Hash Cond: ((d.message_id)::text = (a.message_id)::text)
 ->  Bitmap Heap Scan on delivered2 d  (cost=808012.86..3063311.98
rows=3117499 width=53) (actual time=7012.487..194557.307 rows=6604970
loops=1)
   Recheck Cond: ((message_id)::text > '20151225'::text)
   Rows Removed by Index Recheck: 113028616
   Filter: (NOT (hashed SubPlan 1))
   Rows Removed by Filter: 88
   Heap Blocks: exact=1146550 lossy=2543948
   ->  Bitmap Index Scan on idx_delivered_mid
 (cost=0.00..100075.17 rows=6234997 width=0) (actual
time=4414.860..4414.860 rows=6605058 loops=1)
 Index Cond: ((message_id)::text > '20151225'::text)
   SubPlan 1
 ->  Bitmap Heap Scan on failed2 ff
 (cost=19778.06..707046.73 rows=44634 width=46) (actual
time=828.164..1949.687 rows=71500 loops=1)
   

Re: [PERFORM] Plan differences

2015-12-31 Thread Anton Melser
Hi,

> It looks like some optimization that was present in the RC was removed
> from the final release.
>

Thanks. Is there some discussion of why these optimisations were removed? I
started looking at some of the more complicated queries I run, and there are
many cases with 10-30x performance degradations compared with the RC. Not
what I was hoping for from a much more powerful machine! Were these
optimisations really dangerous? Is there any (easy and safe) way to get them
back, or would I need to reinstall an RC version?

Thanks again,
Anton


[PERFORM] Plan differences

2015-12-31 Thread Anton Melser
Hi,

I moved a DB between two "somewhat" similar Postgres installs and am
getting much worse plans on the second. The DB was dumped via pg_dump
(keeping indexes, etc.) and loaded to the new server.

The first (installed via emerge):

select version();
 PostgreSQL 9.4rc1 on x86_64-pc-linux-gnu, compiled by
x86_64-pc-linux-gnu-gcc (Gentoo 4.7.3-r1 p1.4, pie-0.5.5) 4.7.3, 64-bit

The second (installed from the Postgres CentOS repo):

select version();
 PostgreSQL 9.4.4 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.4.7
20120313 (Red Hat 4.4.7-11), 64-bit

SHOW ALL; gives identical results on both - I increased several values on
both servers:

max_connections = 300

shared_buffers = 16GB
temp_buffers = 128MB
work_mem = 128MB

seq_page_cost = 0.5
random_page_cost = 1.0
effective_cache_size = 16GB

The first machine has 32GB of RAM and 16 cores (Intel(R) Xeon(R) CPU
E5-2650 v2 @ 2.60GHz) and the second 96GB of RAM and 24 cores (Intel(R)
Xeon(R) CPU E5-2430L v2 @ 2.40GHz). I also run a series of Python scripts
(including a Django site) on the machine, but I did before as well, so the
load shouldn't have changed (the old machine additionally ran some external
backups; the new machine runs only my DB + scripts).

dd performance is similar for sizes under the RAM size:

oldserver:~$ dd if=/dev/zero of=output.img bs=8k count=256k
262144+0 records in
262144+0 records out
2147483648 bytes (2.1 GB) copied, 2.04997 s, 1.0 GB/s
oldserver:~$ dd if=/dev/zero of=output.img bs=8k count=1M
1048576+0 records in
1048576+0 records out
8589934592 bytes (8.6 GB) copied, 13.7105 s, 627 MB/s

[newserver ~]$ dd if=/dev/zero of=output.img bs=8k count=256k
262144+0 records in
262144+0 records out
2147483648 bytes (2.1 GB) copied, 2.03452 s, 1.1 GB/s
[newserver ~]$ dd if=/dev/zero of=output.img bs=8k count=1M
1048576+0 records in
1048576+0 records out
8589934592 bytes (8.6 GB) copied, 21.4785 s, 400 MB/s

But significantly better on the new machine for sizes over the RAM size:

oldserver:~$ dd if=/dev/zero of=output.img bs=8k count=5M
5242880+0 records in
5242880+0 records out
42949672960 bytes (43 GB) copied, 478.037 s, 89.8 MB/s

[newserver ~]$ dd if=/dev/zero of=output.img bs=8k count=15M
15728640+0 records in
15728640+0 records out
128849018880 bytes (129 GB) copied, 256.748 s, 502 MB/s

I get the following plan on the old machine for a query:

oldserver=# explain analyze select count(0) from (select message_id,
count(0) from accepted where message_id like '20151213%' group by
message_id having count(0) > 1) as toto;

QUERY PLAN

---
 Aggregate  (cost=734.26..734.27 rows=1 width=0) (actual
time=2519.545..2519.546 rows=1 loops=1)
   ->  GroupAggregate  (cost=0.70..452.90 rows=22509 width=46) (actual
time=2519.542..2519.542 rows=0 loops=1)
 Group Key: accepted.message_id
 Filter: (count(0) > 1)
 Rows Removed by Filter: 1289815
 ->  Index Only Scan using idx_accepted2_mid on accepted
 (cost=0.70..2.72 rows=22509 width=46) (actual time=0.037..1613.982
rows=1289815 loops=1)
   Index Cond: ((message_id >= '20151213'::text) AND
(message_id < '20151214'::text))
   Filter: ((message_id)::text ~~ '20151213%'::text)
   Heap Fetches: 1289815
 Planning time: 0.325 ms
 Execution time: 2519.610 ms
(11 rows)

Time: 2520.534 ms

On the new machine, I was originally getting:

newserver=# explain analyze select count(0) from (select message_id,
count(0) from accepted where message_id like '20151213%' group by
message_id having count(0) > 1) as toto;
  QUERY
PLAN
---
 Aggregate  (cost=8018044.22..8018044.23 rows=1 width=0) (actual
time=123964.197..123964.197 rows=1 loops=1)
   ->  GroupAggregate  (cost=7935128.17..7988431.35 rows=2369030 width=46)
(actual time=123964.195..123964.195 rows=0 loops=1)
 Group Key: accepted.message_id
 Filter: (count(0) > 1)
 Rows Removed by Filter: 1289817
 ->  Sort  (cost=7935128.17..7941050.75 rows=2369030 width=46)
(actual time=123112.260..123572.412 rows=1289817 loops=1)
   Sort Key: accepted.message_id
   Sort Method: external merge  Disk: 70920kB
   ->  Seq Scan on accepted  (cost=0.00..7658269.38
rows=2369030 width=46) (actual time=4450.097..105171.191 rows=1289817
loops=1)
 Filter: ((message_id)::text ~~ '20151213%'::text)
 Rows Removed by Filter: 232872643
 Planning time: 0.145 ms
 Execution time: 123995.671 ms

But after a VACUUM ANALYZE I got:

newserver=# explain analyze select count(0) from (select message_id,
count(0) from accepted where message_id like '20151213%' group by
message_id having count(0) > 

Re: [PERFORM] Plan differences

2015-12-31 Thread Anton Melser
>
> I'd bet your old database is in C locale and the new one is not.
>

Remind me never to bet against you :-).


> The LIKE optimization requires an index that's sorted according to plain
> C (strcmp) rules.  A regular text index will be that way only if the
> database's LC_COLLATE is C.
>
> If you don't want to rebuild the whole database, you can create indexes to
> support this by declaring them with COLLATE "C", or the older way is to
> declare them with text_pattern_ops as the index opclass.
>
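As a sketch of the two options just described (the table and column names
are borrowed from the plans earlier in the thread; treat them as
illustrative):

```sql
-- Option 1: a "C"-collated index (9.1+). It is sorted in plain strcmp
-- order, so the planner can rewrite LIKE 'prefix%' as a range scan.
CREATE INDEX idx_accepted_mid_c
    ON accepted (message_id COLLATE "C");

-- Option 2: the older text_pattern_ops opclass. It also supports the
-- LIKE optimization, but not ordinary <, > or ORDER BY comparisons.
CREATE INDEX idx_accepted_mid_tpo
    ON accepted (message_id text_pattern_ops);
```

That limitation of text_pattern_ops is the same reason a pattern index in a
non-C database can't also serve plain range queries.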

Declaring new indexes with COLLATE "C" and removing the old indexes fixed
the LIKE problem but it created another - the > and < queries need a sort
before being passed off to the new index. Having two indexes seems to give me
the best of both worlds, though obviously it's taking up (much) more space.
As space isn't ever likely to be a problem, and there are no updates (only
COPY) to these tables, I'll keep it like this to avoid having to reload the
entire DB.

Thanks very much for your help.
Cheers,
Anton


[PERFORM] Reinitialising stats once only without restarting

2007-12-20 Thread Anton Melser
Hi,
Sorry but I couldn't find the answer to this...

I would like to empty all stats (pg_stat_all_tables probably mostly)
so I can get an idea of what's going on now. Is this possible? I
didn't want to just go deleting without knowing what it would do...
Thanks
Anton

-- 
echo '16i[q]sa[ln0=aln100%Pln100/snlbx]sbA0D4D465452snlbxq' | dc
This will help you for 99.9% of your problems ...

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] Reinitialising stats once only without restarting

2007-12-20 Thread Anton Melser
On 20/12/2007, Alvaro Herrera [EMAIL PROTECTED] wrote:
 Anton Melser wrote:
  Hi,
  Sorry but I couldn't find the answer to this...
 
  I would like to empty all stats (pg_stat_all_tables probably mostly)
  so I can get an idea of what's going on now. Is this possible? I
  didn't want to just go deleting without knowing what it would do...

 Sure, use pg_stat_reset();

Pura vida, thanks.
A
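For reference, the reset suggested above is just a function call (it clears
the counters in views such as pg_stat_all_tables for the current database
and requires superuser privileges):

```sql
-- Reset all statistics counters for the current database.
SELECT pg_stat_reset();
```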

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] Postgres performance problem

2007-08-28 Thread Anton Melser
Just a random thought/question...
Are you running anything else on the machine? When you say resource usage, do
you mean HD space, memory, processor, ...?
What are your values in top?
More info...
Cheers
Anton


On 27/08/2007, Bill Moran [EMAIL PROTECTED] wrote:
 In response to Chris Mair [EMAIL PROTECTED]:

   Hi,
  
   Note: I have already vacumm full. It does not solve the problem.

 To jump in here in Chris' defense, regular vacuum is not at all the same
 as vacuum full.  Periodic vacuum is _much_ preferable to an occasional
 vacuum full.

 The output of vacuum verbose would have useful information ... are you
 exceeding your FSM limits?

 Try a reindex on the database.  There may be some obscure corner
 cases where reindex makes a notable improvement in performance.

   I have a postgres 8.1 database. In the last few days I have had half the
   traffic of 4 weeks ago, yet resource usage is double. The resource
   monitor graphs also show high peaks (usually there are no peaks)

 Resource monitor graphs?  That statement means nothing to me, therefore
 I don't know if the information they're providing is useful or accurate,
 or even _what_ it is.  What, exactly, are these graphs monitoring?

 You might want to provide your postgresql.conf.

 Have you considered the possibility that the database has simply got more
 records and therefore access takes more IO and CPU?

   The performarce is getting poor with the time.
  
   I'm not able to find the problem; there seem to be no slow queries (I have
   log_min_duration_statement = 5000 right now; tomorrow I'll decrease it)
  
   Server is HP, and no hardware problems seem to have been detected.
  
   Any ideas to debug it?
 
  Hi,
 
  first of all: let us know the exact version of PG and the OS.
 
  If performance is getting worse, there usually is some bloat
  involved. Not vacuuming aggressively enough might be the most
  common cause. Do you autovacuum or vacuum manually?
  Tell us more...
 
 
  Bye,
  Chris.
 
 
 


 --
 Bill Moran
 Collaborative Fusion Inc.
 http://people.collaborativefusion.com/~wmoran/

 [EMAIL PROTECTED]
 Phone: 412-422-3463x4023

 
 





---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] PostgreSQL 8.2.3 VACUUM Timings/Performance

2007-03-13 Thread Anton Melser

In fact, getting rid of vacuum full, or changing it to work like
cluster, has been proposed in the past. The use case really is pretty
narrow; cluster is a lot faster if there's a lot of unused space in the
table, and if there's not, vacuum full isn't going to do much so there's
not much point running it in the first place. The reason it exists is
largely historical; there hasn't been a pressing reason to remove it either.


I can assure you it is a great way to get back gigabytes when someone
has put no vacuum strategy in place and your 200K-row table (with
about 200 bytes per row) is taking up 1.7GB!!!
Vive le truncate table, and vive le vacuum full!
:-)
Anton
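As a sketch of the two approaches discussed above (table and index names are
illustrative; both commands take an exclusive lock on the table, and the
CLUSTER ... USING syntax shown is the modern form):

```sql
-- Compact a heavily bloated table in place. On old versions this is
-- slow and can bloat the indexes, as discussed above.
VACUUM FULL bloated_table;

-- Or rewrite the table in index order; often much faster when most of
-- the table is dead space, and it rebuilds the indexes as a side effect.
CLUSTER bloated_table USING idx_bloated_pk;
```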

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings