Re: [PERFORM] Slow query after upgrade from 9.0 to 9.2

2013-01-11 Thread Andrzej Zawadzki
On 10.01.2013 19:17, Jeff Janes wrote:
 On Thu, Jan 10, 2013 at 5:32 AM, Andrzej Zawadzki zawa...@wp.pl wrote:
 Why does that happen? All configurations are identical. Only the engine is
 different.
 Could you post explain (analyze, buffers) instead of just explain?
Impossible: after 1h of waiting I killed it.
 Also, if you temporarily set enable_seqscan=off on 9.2, what plan do
 you then get?

The plan is different.

Index Scan using sygma_arrear_credit_id on sygma_arrear sar  (cost=11.07..390.66 rows=1 width=265)
  Index Cond: (credit_id = 3102309)
  Filter: ((arrear_import_id = $0) AND (arrear_flag_id = 2))
  InitPlan 1 (returns $0)
    ->  Limit  (cost=0.00..11.07 rows=1 width=8)
          ->  Nested Loop  (cost=0.00..54961299.49 rows=4963314 width=8)
                Join Filter: (sa.arrear_import_id = ai.id)
                ->  Index Scan Backward using report_date_bank_id_key on arrear_import ai  (cost=0.00..62.81 rows=469 width=8)
                      Filter: (import_type_id = 1)
                ->  Materialize  (cost=0.00..574515.68 rows=6138000 width=4)
                      ->  Index Scan using sygma_arrear_arrear_import_id_idx on sygma_arrear sa  (cost=0.00..519848.68 rows=6138000 width=4)
                            Filter: (arrear_flag_id = 2)

The real query is still slow.
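
For reference, a minimal sketch of the session-scoped test Jeff suggested (run
in psql; enable_seqscan only discourages sequential scans for the current
session, it does not forbid them):

    SET enable_seqscan = off;
    EXPLAIN (ANALYZE, BUFFERS) SELECT ...;  -- the query under test
    RESET enable_seqscan;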

-- 
Andrzej Zawadzki




Re: [PERFORM] Slow query after upgrade from 9.0 to 9.2

2013-01-11 Thread Andrzej Zawadzki
On 10.01.2013 19:48, Matheus de Oliveira wrote:


 On Thu, Jan 10, 2013 at 11:32 AM, Andrzej Zawadzki zawa...@wp.pl wrote:

 Hi!

 A small query that runs very fast on 9.0:

 SELECT * from sygma_arrear sar where sar.arrear_import_id = (
 select sa.arrear_import_id from sygma_arrear sa,
 arrear_import ai
 where sa.arrear_flag_id = 2
 AND sa.arrear_import_id = ai.id
 AND ai.import_type_id = 1
 order by report_date desc limit 1)
 AND sar.arrear_flag_id = 2
 AND sar.credit_id = 3102309

 Index Scan using sygma_arrear_credit_id on sygma_arrear sar  (cost=0.66..362.03 rows=1 width=265)
   Index Cond: (credit_id = 3102309)
   Filter: ((arrear_import_id = $0) AND (arrear_flag_id = 2))
   InitPlan 1 (returns $0)
     ->  Limit  (cost=0.00..0.66 rows=1 width=8)
           ->  Nested Loop  (cost=0.00..3270923.14 rows=4930923 width=8)
                 ->  Index Scan Backward using report_date_bank_id_key on arrear_import ai  (cost=0.00..936.87 rows=444 width=8)
                       Filter: (import_type_id = 1)
                 ->  Index Scan using sygma_arrear_arrear_import_id_idx on sygma_arrear sa  (cost=0.00..6971.15 rows=31495 width=4)
                       Index Cond: (sa.arrear_import_id = ai.id)
                       Filter: (sa.arrear_flag_id = 2)
 The engine uses the index - great.

 On 9.2

 Index Scan using sygma_arrear_credit_id on sygma_arrear sar  (cost=11.05..381.12 rows=1 width=265)
   Index Cond: (credit_id = 3102309)
   Filter: ((arrear_import_id = $0) AND (arrear_flag_id = 2))
   InitPlan 1 (returns $0)
     ->  Limit  (cost=0.00..11.05 rows=1 width=8)
           ->  Nested Loop  (cost=0.00..54731485.84 rows=4953899 width=8)
                 Join Filter: (sa.arrear_import_id = ai.id)
                 ->  Index Scan Backward using report_date_bank_id_key on arrear_import ai  (cost=0.00..62.81 rows=469 width=8)
                       Filter: (import_type_id = 1)
                 ->  Materialize  (cost=0.00..447641.42 rows=6126357 width=4)
                       ->  Seq Scan on sygma_arrear sa  (cost=0.00..393077.64 rows=6126357 width=4)
                             Filter: (arrear_flag_id = 2)
 Seq scan... slooow.

 Why does that happen? All configurations are identical. Only the engine is
 different.



 How did you do the upgrade?
pg_upgrade, and I think that this is the source of the problem.
I have a test database from a dump/restore process and it works properly.
 Have you tried to run a VACUUM ANALYZE on sygma_arrear?
Yes, I did - after the upgrade all databases were vacuumed.

vacuumdb -azv

I'll try reindexing all indexes at the weekend.
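
A minimal sketch of that step (the database name is illustrative; REINDEX
takes exclusive locks on the tables involved, so a quiet weekend window fits
well):

    REINDEX DATABASE sygma;  -- rebuilds every index in the current database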

-- 
Andrzej Zawadzki




Re: [PERFORM] Partition insert trigger using C language

2013-01-11 Thread Heikki Linnakangas

On 10.01.2013 21:48, Matheus de Oliveira wrote:

I have made a small modification to keep the plans, and it got from
33957.768ms to 43782.376ms.


If I'm reading results.txt correctly, the avg runtimes are:

C and SPI_execute_with_args: 58567.708 ms
C and SPI_(prepare/keepplan/execute_plan): 43782.376 ms
C and heap_insert: 33957.768 ms

So switching to prepared plans helped quite a lot, but it's still slower 
than direct heap_inserts.


One thing that caught my eye:


CREATE OR REPLACE FUNCTION partition_insert_trigger_spi()
RETURNS trigger
LANGUAGE C
VOLATILE STRICT
AS 'partition_insert_trigger_spi','partition_insert_trigger_spi'
SET DateStyle TO 'ISO';


Calling a function with SET options has a fair amount of overhead, to 
set/restore the GUC on every invocation. That should be avoided in a 
performance-critical function like this.
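
A minimal sketch of the same definition without the per-call GUC, assuming the
date handling is made DateStyle-independent inside the C code instead:

    CREATE OR REPLACE FUNCTION partition_insert_trigger_spi()
    RETURNS trigger
    LANGUAGE C
    VOLATILE STRICT
    AS 'partition_insert_trigger_spi', 'partition_insert_trigger_spi';
    -- no SET DateStyle clause: nothing to set/restore on each call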


- Heikki




Re: [PERFORM] Partition insert trigger using C language

2013-01-11 Thread Matheus de Oliveira
On Fri, Jan 11, 2013 at 8:19 AM, Heikki Linnakangas hlinnakan...@vmware.com
 wrote:

 On 10.01.2013 21:48, Matheus de Oliveira wrote:

 I have made a small modification to keep the plans, and it got from
 33957.768ms to 43782.376ms.


 If I'm reading results.txt correctly, the avg runtimes are:

 C and SPI_execute_with_args: 58567.708 ms
 C and SPI_(prepare/keepplan/execute_plan): 43782.376 ms
 C and heap_insert: 33957.768 ms

 So switching to prepared plans helped quite a lot, but it's still slower
 than direct heap_inserts.


Hmm... you are right, I misread it before, sorry. The 33957.768 ms was with
heap_insert.



 One thing that caught my eye:

  CREATE OR REPLACE FUNCTION partition_insert_trigger_spi()
 RETURNS trigger
 LANGUAGE C
 VOLATILE STRICT
 AS 'partition_insert_trigger_spi','partition_insert_trigger_spi'
 SET DateStyle TO 'ISO';


 Calling a function with SET options has a fair amount of overhead, to
 set/restore the GUC on every invocation. That should be avoided in a
 performance critical function like this.


I (stupidly) used SPI_getvalue [1] and expected it to always return the date as
YYYY-MM-DD, but then I remembered it would do that only with DateStyle=ISO.

But the truth is that I couldn't see any overhead: the function had no SET
clause in my first tests, and after adding it I saw no difference in the tests.
I think I should use SPI_getbinvalue instead, but I don't know how to parse the
result to get year and month; any help on that?

[1]
https://github.com/matheusoliveira/pg_partitioning_tests/blob/master/src/spi/partition_insert_trigger_spi.c#L103

Regards,
-- 
Matheus de Oliveira
Database Analyst
Dextra Sistemas - MPS.Br level F!
www.dextra.com.br/postgres


Re: [PERFORM] Partition insert trigger using C language

2013-01-11 Thread Matheus de Oliveira
On Thu, Jan 10, 2013 at 5:51 PM, Charles Gomes 
charles.go...@benchmarksolutions.com wrote:


 From: pgsql-performance-ow...@postgresql.org [mailto:pgsql-performance-ow...@postgresql.org] On Behalf Of Matheus de Oliveira
 Sent: Thursday, January 10, 2013 2:12 PM
 To: Heikki Linnakangas
 Cc: pgsql-performance; Charles Gomes
 Subject: Re: [PERFORM] Partition insert trigger using C language


 On Thu, Jan 10, 2013 at 4:54 PM, Heikki Linnakangas 
 hlinnakan...@vmware.com wrote:

 On 10.01.2013 20:45, Matheus de Oliveira wrote:

 Inspired by Charles' thread and the work of Emmanuel [1], I have made some
 experiments trying to create a trigger that does partitioning in C.

 The first attempt was not good: I tried to use SPI [2] to build a query to
 insert into the correct child table, but it brought almost no improvement
 compared with the PL/pgSQL code.



 The right way to do this with SPI is to prepare each insert-statement on
 first invocation (SPI_prepare + SPI_keepplan), and reuse the plan after
 that (SPI_execute_plan).

 If you construct and plan the query on every invocation, it's not
 surprising that it's no different from PL/pgSQL performance.


 Yeah. I thought about that, but the problem was that I assumed the
 INSERTs came with random dates, so in the worst case I would have to
 keep the plans of all of the child partitions. Am I wrong?

 But thinking about it more, even with hundreds of partitions, it wouldn't
 use too much memory/resources, would it?

 In fact, I didn't pay too much attention to the SPI method, because the
 other one is where we can have more fun, =P.

 Anyway, I'll change the code (maybe now), and see if it gets closer to
 the other method (that uses heap_insert), and will post back the results
 here.
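
 A minimal sketch of the prepare-once pattern Heikki describes, with one saved
 plan per child partition (everything here is illustrative: the monthly naming
 scheme, the single int4 column, and the helper name are assumptions, not the
 actual trigger code):

    #include "postgres.h"
    #include "executor/spi.h"
    #include "catalog/pg_type.h"

    /* One saved plan per monthly partition; slot 0 is unused. */
    static SPIPlanPtr month_plans[13];

    static SPIPlanPtr
    get_month_plan(int month)
    {
        if (month_plans[month] == NULL)
        {
            char sql[128];
            Oid  argtypes[1] = { INT4OID };

            /* hypothetical child-table naming scheme: parent_mMM */
            snprintf(sql, sizeof(sql),
                     "INSERT INTO parent_m%02d VALUES ($1)", month);

            /* Plan once on first use ... */
            month_plans[month] = SPI_prepare(sql, 1, argtypes);
            if (month_plans[month] == NULL)
                elog(ERROR, "SPI_prepare failed: %d", SPI_result);

            /* ... and keep the plan valid across SPI_finish() calls. */
            SPI_keepplan(month_plans[month]);
        }
        return month_plans[month];
    }

    /* Later, inside the trigger body, reuse the saved plan: */
    /*   SPI_execute_plan(get_month_plan(month), values, nulls, false, 0); */

 Even with hundreds of partitions this stays cheap: a saved plan is only a few
 kilobytes, so a simple array (or hash table) of plans should be fine without
 an LRU.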





 

 Interesting that you got an improvement. In my case I get almost no
 improvement at all:


 PL/SQL - Dynamic Trigger          4:15:54
 PL/SQL - CASE / WHEN Statements   4:12:29
 PL/SQL - IF Statements            4:12:39
 C Trigger                         4:10:49

 Here is my code; I'm using heap_insert and updating the indexes, with an
 approach similar to yours.

 The trigger is aware of 

 http://www.charlesrg.com/~charles/pgsql/partition2.c



Hmm... Looking at your code, I see no big difference from mine. The only
thing I noticed is that you don't fire triggers, but that should only make it
faster. Another thing that could explain the difference is the number of
partitions; I tried with only 12.

Could you put together a test suite, or try running my function in your
scenario? It would be easy to make it pick the partitions by day [1].

[1] https://gist.github.com/4509782

Regards,
-- 
Matheus de Oliveira
Database Analyst
Dextra Sistemas - MPS.Br level F!
www.dextra.com.br/postgres


Re: [PERFORM] Partition insert trigger using C language

2013-01-11 Thread Matheus de Oliveira
On Fri, Jan 11, 2013 at 9:02 AM, Heikki Linnakangas hlinnakan...@vmware.com
 wrote:

 On 11.01.2013 12:36, Matheus de Oliveira wrote:

 On Fri, Jan 11, 2013 at 8:19 AM, Heikki Linnakangas hlinnakan...@vmware.com wrote:


  One thing that caught my eye:

   CREATE OR REPLACE FUNCTION partition_insert_trigger_spi()

 RETURNS trigger
 LANGUAGE C
 VOLATILE STRICT
  AS 'partition_insert_trigger_spi','partition_insert_trigger_spi'
  SET DateStyle TO 'ISO';


  Calling a function with SET options has a fair amount of overhead, to
  set/restore the GUC on every invocation. That should be avoided in a
  performance-critical function like this.


  I (stupidly) used SPI_getvalue [1] and expected it to always return the date
  as YYYY-MM-DD, but then I remembered it would do that only with
  DateStyle=ISO.

  But the truth is that I couldn't see any overhead: the function had no SET
  clause in my first tests, and after adding it I saw no difference in the
  tests.


 Oh, ok then. I would've expected it to make a measurable difference.


  I think I should use SPI_getbinvalue instead, but I don't know how
  to parse the result to get year and month; any help on that?


 The fastest way is probably to use j2date like date_out does:

   DateADT date = DatumGetDateADT(x);
   int year, month, mday;

   if (DATE_NOT_FINITE(date))
       elog(ERROR, "date must be finite");
   j2date(date + POSTGRES_EPOCH_JDATE, &year, &month, &mday);

 - Heikki


Nice. With the modifications you suggested I did see a good improvement in the
function using SPI (and a little one with heap_insert). So I was wrong to think
that changing the GUC would not make much difference; the SPI code now runs
almost as fast as the heap_insert:

heap_insert: 31896.098 ms
SPI: 36558.564 ms

Of course I could still make some improvements, like using an LRU to keep the
plans, or something like that.

The new code is on GitHub.

Regards,
-- 
Matheus de Oliveira
Database Analyst
Dextra Sistemas - MPS.Br level F!
www.dextra.com.br/postgres


[PERFORM] Insert performance for large transaction with multiple COPY FROM

2013-01-11 Thread Horst Dehmer
Hi!

I see a massive performance drop when writing a large transaction. I'm writing
data for 33 tables with COPY FROM, directly from streams in Scala/Java. Across
all tables there are 2.2M records, unevenly distributed from 1 record up to
315k records per table.

For comparison I ran a test with UNLOGGED tables, no PK/FK constraints, no
other constraints or indexes, and no triggers on any table. The insert rate in
this scenario is well above 105k records/second across all tables (which I
think is really cool!).

Turning everything on (but still with UNLOGGED tables), i.e. PK/FK, additional
indexes, some column check constraints and a trigger for each table which
basically inserts one additional record into another table, the rates dropped,
as expected, to around 6k to 7k records/second.

Except - and that's the wall I'm hitting - for one table, which yielded just 75
records/second.
The main 'problem' seems to be the FK constraints. Dropping just them restored
insert performance for this table to 6k records/s. The table in question has a
composite PK (3 columns), 3 foreign keys and a bunch of indexes (see table
obj_item_loc at the end of the mail). Compared to the other 32 tables, nothing
unusual.
I'd gladly supply more information if necessary.

Dropping and recreating constraints/indexes is (for now) not a viable
alternative, since I have to write such transactions into an already populated
database.
What I'm trying to understand is which limit I'm hitting here. I need some
advice on how to 'profile' this situation.
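
One way to get per-constraint numbers (a sketch; the column values are
illustrative placeholders and must satisfy the FKs): EXPLAIN ANALYZE on a
single INSERT reports the time spent in each trigger, including the internal
triggers that implement the FK checks.

    BEGIN;
    EXPLAIN (ANALYZE, BUFFERS)
    INSERT INTO obj_item_loc (obj_item_id, loc_id, obj_item_loc_ix,
                              rptd_id, creator_id, update_seqnr)
    VALUES (1, 1, 1, 1, 1, 1);   -- illustrative values only
    ROLLBACK;

The 'Trigger for constraint ...' lines at the end of the output show how much
of the time goes into each FK lookup.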

Configuration is more or less standard, except WAL settings (which should not 
be relevant here).
EnterpriseDB One Click installer.

Any hint is really appreciated. 
Thanks!

--
Horst Dehmer



PostgreSQL 9.2.1 on x86_64-apple-darwin, compiled by 
i686-apple-darwin11-llvm-gcc-4.2 (GCC) 4.2.1 (Based on Apple Inc. build 5658) 
(LLVM build 2336.9.00), 64-bit
OS X 10.8.2
Mid-2012 MacBook Pro 16 GB, 512 GB SSD

bytea_output;escape
checkpoint_completion_target;0.9
checkpoint_segments;32
client_encoding;UNICODE
client_min_messages;notice
lc_collate;en_US.UTF-8
lc_ctype;en_US.UTF-8
listen_addresses;*
log_checkpoints;on
log_destination;stderr
log_line_prefix;%t 
logging_collector;on
max_connections;100
max_stack_depth;2MB
port;5432
server_encoding;UTF8
shared_buffers;24MB
TimeZone;Europe/Vienna
wal_buffers;768kB

/etc/sysctl.conf
kern.sysv.shmmax=1610612736
kern.sysv.shmall=393216
kern.sysv.shmmin=1
kern.sysv.shmmni=256
kern.sysv.shmseg=64
kern.maxprocperuid=512
kern.maxproc=2048

CREATE TABLE obj_item_loc
(
  obj_item_id numeric(20,0) NOT NULL,
  loc_id numeric(20,0) NOT NULL,
  obj_item_loc_ix numeric(20,0) NOT NULL, 
  ver_acc_dim numeric(12,3), 
  horz_acc_dim numeric(12,3), 
  brng_angle numeric(7,4), 
  brng_acc_angle numeric(7,4), 
  brng_precision_code character varying(6), 
  incl_angle numeric(7,4), 
  incl_acc_angle numeric(7,4), 
  incl_precision_code character varying(6), 
  speed_rate numeric(8,4), 
  speed_acc_rate numeric(8,4), 
  speed_precision_code character varying(6), 
  meaning_code character varying(6), 
  rel_speed_code character varying(6), 
  rptd_id numeric(20,0) NOT NULL,
  creator_id numeric(20,0) NOT NULL, 
  update_seqnr numeric(15,0) NOT NULL, 
  rec_id bigint DEFAULT nextval('rec_seq'::regclass),
  CONSTRAINT obj_item_loc_pkey PRIMARY KEY (obj_item_id, loc_id, obj_item_loc_ix),
  CONSTRAINT obj_item_loc_4fbc75641175ef1757ca310dd34e34ee_fkey FOREIGN KEY (obj_item_id)
      REFERENCES obj_item (obj_item_id) MATCH SIMPLE
      ON UPDATE NO ACTION ON DELETE NO ACTION,
  CONSTRAINT obj_item_loc_7895d64f5557b1e382c36d41212a3696_fkey FOREIGN KEY (rptd_id)
      REFERENCES rptd (rptd_id) MATCH SIMPLE
      ON UPDATE NO ACTION ON DELETE NO ACTION,
  CONSTRAINT obj_item_loc_8d919243f69bcc599873caca07ac9888_fkey FOREIGN KEY (loc_id)
      REFERENCES loc (loc_id) MATCH SIMPLE
      ON UPDATE NO ACTION ON DELETE NO ACTION,
  CONSTRAINT obj_item_loc_brng_acc_angle_ra_check CHECK
      (br_check_305(brng_acc_angle >= 0::numeric AND brng_acc_angle <= 359.9999,
       'obj_item_loc'::text, 'brng_acc_angle'::text, brng_acc_angle::text)),
  CONSTRAINT obj_item_loc_brng_angle_ra_check CHECK
      (br_check_305(brng_angle >= 0::numeric AND brng_angle <= 359.9999,
       'obj_item_loc'::text, 'brng_angle'::text, brng_angle::text)),
  CONSTRAINT obj_item_loc_brng_precision_code_check CHECK
      (br_check_305(brng_precision_code::text = ANY (ARRAY['1000MN'::text,
       '100MN'::text, '100SEC'::text, '10DEG'::text, '10MN'::text,
       '10SEC'::text, 'DEGREE'::text, 'MIL'::text, 'MINUTE'::text,
       'SECOND'::text]),
       'obj_item_loc'::text, 'brng_precision_code'::text, brng_precision_code::text)),
  CONSTRAINT obj_item_loc_incl_acc_angle_ra_check CHECK
      (br_check_305(incl_acc_angle >= 0::numeric AND incl_acc_angle <= 359.9999,
       'obj_item_loc'::text, 'incl_acc_angle'::text, incl_acc_angle::text)),
  CONSTRAINT obj_item_loc_incl_angle_ra_check CHECK (br_check_305(incl_angle >=
      0::numeric AND incl_angle <= 359.9999,

Re: [PERFORM] Insert performance for large transaction with multiple COPY FROM

2013-01-11 Thread Claudio Freire
On Fri, Jan 11, 2013 at 8:55 PM, Horst Dehmer horst.deh...@gmail.com wrote:
 Except - and that's the wall I'm hitting - for one table which yielded just
 75 records/second.
 The main 'problem' seems to be the FK constraints. Dropping just them
 restored insert performance for this table to 6k records/s. The table in
 question has a composite PK (3 columns), 3 foreign keys and a bunch of
 indexes (see table obj_item_loc at the end of the mail). Compared to the
 other 32 tables nothing unusual.
 I'd gladly supply more information if necessary.
...
 CREATE TABLE obj_item_loc
 (
   obj_item_id numeric(20,0) NOT NULL,
   loc_id numeric(20,0) NOT NULL,
   obj_item_loc_ix numeric(20,0) NOT NULL,

That sounds a lot like a missing index on the target relations (or
indices that are unusable).

Those numeric ids look really unusual. Why not bigint? It's close to
the same precision, but native, faster, more compact, and quite
unambiguous when indices are involved. If the types don't match on
both tables, it's quite likely indices won't be used when checking the
FK, and that spells trouble.
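
A quick way to check for such a mismatch (a sketch against the schema above;
adjust the table and column lists as needed):

    SELECT a.attrelid::regclass AS tbl,
           a.attname,
           format_type(a.atttypid, a.atttypmod) AS coltype
    FROM pg_attribute a
    WHERE a.attrelid IN ('obj_item_loc'::regclass, 'obj_item'::regclass,
                         'loc'::regclass, 'rptd'::regclass)
      AND a.attname IN ('obj_item_id', 'loc_id', 'rptd_id');

If the referencing and referenced types differ, an ALTER TABLE ... ALTER COLUMN
... TYPE bigint on both sides would align them (and bigint is cheaper than
numeric(20,0) regardless).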

