[PERFORM] Query Performance Problem

2014-10-21 Thread john
Hi all,I'm experimenting with table partitioning though inheritance. I'm testing a query as follows:explain (analyze, buffers)selectresponse.idfrom claim.responsewhere response.account_id = 4766and response.expire_timestamp is nulland response.create_timestamp = DATE '2014-08-01'order by create_timestamp;The response table looks like this:"account_id";"integer""file_type_id";"integer""receiver_inbound_detail_id";"integer""processing_status_id";"integer""processing";"boolean""expire_timestamp";"timestamp without time zone""last_mod_timestamp";"timestamp without time zone""create_timestamp";"timestamp without time zone""response_trace_nbr";"character varying""posted_timestamp";"timestamp without time zone""need_to_post";"boolean""response_message";"text""worked";"boolean""response_status_id";"integer""response_type_id";"integer""outbound_claim_detail_id";"bigint""id";"bigint"Here are some rowcounts:SELECT count(*) from claim_response.response_201408; count-4585746(1 row)Time: 7271.054 msSELECT count(*) from claim_response.response_201409; count-3523370(1 row)Time: 4341.116 msSELECT count(*) from claim_response.response_201410;count--- 154(1 row)Time: 0.258 msThe entire table has225,665,512 rows. I read that a partitioning rule of thumb is that benefits of partitioning occur starting around 100 million rows.SELECT count(*) from claim.response; count---225665512(1 row)Time: 685064.637 msThe partitioning is on the create_timestamp field.The server is Red Hat Enterprise Linux Server release 6.2 (Santiago) on a VM machine - 8 GB RAM with 2 CPUs:Architecture: x86_64CPU op-mode(s):32-bit, 64-bitByte Order:  Little EndianCPU(s):2On-line CPU(s) list:  0,1Thread(s) per core:  1Core(s) per socket:  2CPU socket(s): 1NUMA node(s): 1Vendor ID:   GenuineIntelCPU family:  6Model: 44Stepping:   2CPU MHz:2660.000BogoMIPS:   5320.00L1d cache:   32KL1i cache:   32KL2 cache:   256KL3 cache:   12288KNUMA node0 CPU(s):   0,12 users, load average: 0.00, 0.12, 0.37Please see the following for the explain analysis :http://explain.depesz.com/s/I3SLI'm trying to understand why I'm getting the yellow, orange, and red on the inclusive, and the yellow on the exclusive. (referring to theexplain.depesz.com/s/I3SLpage.)I'm relatively new to PostgreSQL, but I've been an Oracle DBA for some time. I suspect the I/O may be dragging but I don't know how to dig that information out from here. Please point out anything else you can decipher from this.Thanks,John



Re: [PERFORM] Query Performance Problem

2014-10-21 Thread Felipe Santos
2014-10-21 10:57 GMT-02:00 j...@jpm-cola.com:



 Hi all,

 I'm experimenting with table partitioning though inheritance. I'm testing
 a query as follows:

 explain (analyze, buffers)
 select response.id
 from claim.response
 where response.account_id = 4766
 and response.expire_timestamp is null
 and response.create_timestamp = DATE '2014-08-01'
 order by create_timestamp;

 The response table looks like this:
 account_id;integer
 file_type_id;integer
 receiver_inbound_detail_id;integer
 processing_status_id;integer
 processing;boolean
 expire_timestamp;timestamp without time zone
 last_mod_timestamp;timestamp without time zone
 create_timestamp;timestamp without time zone
 response_trace_nbr;character varying
 posted_timestamp;timestamp without time zone
 need_to_post;boolean
 response_message;text
 worked;boolean
 response_status_id;integer
 response_type_id;integer
 outbound_claim_detail_id;bigint
 id;bigint

 Here are some rowcounts:

 SELECT count(*) from claim_response.response_201408;
   count
 -
  4585746
 (1 row)

 Time: 7271.054 ms
 SELECT count(*) from claim_response.response_201409;
   count
 -
  3523370
 (1 row)

 Time: 4341.116 ms
 SELECT count(*) from claim_response.response_201410;
  count
 ---
154
 (1 row)

 Time: 0.258 ms

 The entire table has 225,665,512 rows. I read that a partitioning rule of
 thumb is that benefits of partitioning occur starting around 100 million
 rows.

 SELECT count(*) from claim.response;
count
 ---
  225665512
 (1 row)

 Time: 685064.637 ms


 The partitioning is on the create_timestamp field.

 The server is Red Hat Enterprise Linux Server release 6.2 (Santiago) on a
 VM machine - 8 GB RAM with 2 CPUs:

 Architecture:  x86_64
 CPU op-mode(s):32-bit, 64-bit
 Byte Order:Little Endian
 CPU(s):2
 On-line CPU(s) list:   0,1
 Thread(s) per core:1
 Core(s) per socket:2
 CPU socket(s): 1
 NUMA node(s):  1
 Vendor ID: GenuineIntel
 CPU family:6
 Model: 44
 Stepping:  2
 CPU MHz:   2660.000
 BogoMIPS:  5320.00
 L1d cache: 32K
 L1i cache: 32K
 L2 cache:  256K
 L3 cache:  12288K
 NUMA node0 CPU(s): 0,1



 2 users,  load average: 0.00, 0.12, 0.37


 Please see the following for the explain analysis :

 http://explain.depesz.com/s/I3SL

 I'm trying to understand why I'm getting the yellow, orange, and red on
 the inclusive, and the yellow on the exclusive. (referring to the
 explain.depesz.com/s/I3SL page.)
 I'm relatively new to PostgreSQL, but I've been an Oracle DBA for some
 time. I suspect the I/O may be dragging but I don't know how to dig that
 information out from here. Please point out anything else you can decipher
 from this.

 Thanks,

 John



Hi John,

Dont know about the colors, but the Stats tab looks fine. You've got
yourself 5 Index Scans, which are a very fast way to dig data.

 I noticed you've also cast your filter field (create_timestamp =
'2014-08-01'::date). As far as I know, Postgresql doesn't need this kind
of explicit conversion. You would be fine with just (create_timestamp =
'2014-08-01').

Regards,

Felipe


[PERFORM] Query performance problem in 8.0.0beta1

2004-08-29 Thread Stefano Bonnin



Hi, I have just installed 8.0.0beta1 and I noticed 
that some query are slower than 7.4.2 queries.

After a FULL VACUUM ANALYZE

***With 7.4.2***

explain analyze SELECT count(*) FROM "SNS_DATA" 
WHERE "Data_Arrivo_Campione" BETWEEN '2004-01-01 00:00:00' AND '2004-01-31 
23:59:59' AND "Cod_Par" = '17476'

gives

Aggregate (cost=46817.89..46817.89 
rows=1 width=0) (actual time=401.216..401.217 rows=1 loops=1) 
- Index Scan using snsdata_codpar on "SNS_DATA" 
(cost=0.00..46817.22 rows=268 width=0) (actual time=165.948..400.258 rows=744 
loops=1) Index Cond: 
(("Cod_Par")::text = 
'17476'::text) Filter: 
(("Data_Arrivo_Campione" = '2004-01-01 00:00:00'::timestamp without time 
zone) AND ("Data_Arrivo_Campione" = '2004-01-31 23:59:59'::timestamp without 
time zone))Total runtime: 401.302 ms

***while on 8.0.0***

the same query gives

-Aggregate 
(cost=93932.91..93932.91 rows=1 width=0) (actual time=14916.371..14916.371 
rows=1 loops=1) - Seq Scan on "SNS_DATA" 
(cost=0.00..93930.14 rows=1108 width=0) (actual time=6297.152..14915.330 
rows=744 loops=1) Filter: 
(("Data_Arrivo_Campione" = '2004-01-01 00:00:00'::timestamp without time 
zone) AND ("Data_Arrivo_Campione" = '2004-01-31 23:59:59'::timestamp without 
time zone) AND (("Cod_Par")::text = '17476'::text))Total runtime: 
14916.935 ms
And I if disable the seqscan
SET enable_seqscan = false;

I get the following Aggregate 
(cost=158603.19..158603.19 rows=1 width=0) (actual time=4605.862..4605.863 
rows=1 loops=1) - Index Scan using snsdata_codpar on 
"SNS_DATA" (cost=0.00..158600.41 rows=1108 width=0) (actual 
time=2534.422..4604.865 rows=744 
loops=1) Index Cond: 
(("Cod_Par")::text = 
'17476'::text) Filter: 
(("Data_Arrivo_Campione" = '2004-01-01 00:00:00'::timestamp without time 
zone) AND ("Data_Arrivo_Campione" = '2004-01-31 23:59:59'::timestamp without 
time zone))Total runtime: 4605.965 ms
The total runtime is bigger (x10 !!) than the old 
one.

The memory runtime parameters are 
shared_buffer = 2048
work_mem = sort_mem = 2048

SNS_DATA shema is the following:

 
Table "public.SNS_DATA" 
Column 
| 
Type 
| 
Modifiers--+-+Ordine 
| 
integer 
| not null default 
0Cod_Par 
| character varying(100) | not 
nullCod_Ana 
| character varying(100) | not 
nullValore 
| character varying(255) 
|Descriz 
| character varying(512) 
|Un_Mis 
| character varying(70) 
|hash 
| 
integer 
|valid 
| 
boolean 
| default 
truealarm 
| 
boolean 
| default 
falseCod_Luogo 
| character varying(30) 
|Data_Arrivo_Campione | timestamp without time zone 
|site_id 
| 
integer 
|Cod_Luogo_v | 
character varying(30) 
|repeated_val | 
boolean 
| default falseIndexes: "sns_data2_pkey" PRIMARY KEY, 
btree ("Ordine", "Cod_Ana", "Cod_Par") "sns_datacodluogo2" 
btree ("Cod_Luogo") "sns_datatimefield2" btree 
("Data_Arrivo_Campione") "sns_siteid2" btree 
(site_id) "sns_valid2" btree 
("valid") "snsdata_codana" btree 
("Cod_Ana") "snsdata_codpar" btree 
("Cod_Par")Foreign-key constraints: "$2" FOREIGN KEY 
("Cod_Ana") REFERENCES "SNS_ANA"("Cod_Ana") ON DELETE 
CASCADETriggers: sns_action_tr BEFORE INSERT OR UPDATE 
ON "SNS_DATA" FOR EACH ROW EXECUTE PROCEDURE sns_action()


Can it be a datatype conversion 
problem?
Thanks in advance!
Reds







[PERFORM] Query performance problem

2004-08-20 Thread Danilo Mota








HI All,



I have a big performance issue concerning a PostgreSQL
database.



I have the following server configuration:



Pentium 4 2.4 GHz

1 GB RAM

36 GB SCSI



And the following tables:

 TABLES

--

== r_cliente: 75816 records


CREATE TABLE public.r_cliente
(

 pkcliente INTEGER NOT NULL, 

 cpfcnpj VARCHAR(20) NOT
NULL, 

 PRIMARY KEY(pkcliente)

) WITH OIDS;



CREATE UNIQUE INDEX un_cliente_cpfcnpj ON public.r_cliente

USING btree (cpfcnpj);



== sav_cliente_lg: 65671 records
===

CREATE TABLE public.sav_cliente_lg
(

 codigo INTEGER NOT NULL, 

 cpfcnpj VARCHAR(15) NOT
NULL, 

 PRIMARY KEY(codigo)

) WITH OIDS;



CREATE INDEX ix_savclientelg_cpfcnpj ON public.sav_cliente_lg

USING btree (cpfcnpj);







Which I would like to run the following query:



 QUERY

--

SELECT

rc.pkcliente

FROM r_cliente AS rc

INNER JOIN sav_cliente_lg AS sc ON
sc.cpfcnpj = sc.cpfcnpj;









The problem is, it takes a long time to run, I wait up to
half an hour and I get no result.



So, I executed the explain on the
query and got the following results:









 QUERY PLAN

--

Nested Loop (cost=0.00..16696.87
rows=75816 width=4)

 - Seq Scan on sav_cliente_cf sc
(cost=0.00..3047.55 rows=1 width=0)

 Filter: ((cpfcnpj)::text = (cpfcnpj)::text)

 - Seq Scan on r_cliente rc (cost=0.00..12891.16 rows=75816 width=4)









And made the following modifications
on my POSTGRESQL.CONF file:



 POSTGRESQL.CONF

--

### VERSION: Postgresql 7.4.2 ###

shared_buffers = 7800

sort_mem = 4096

checkpoint_segments = 5 

effective_cache_size = 12000

cpu_operator_cost = 0.0015

stats_start_collector = false





Hope you can help me, I really need to get this running
faster, and I am out of ideas.



Since now, thanks a lot for your attention,



Danilo Mota








Re: [PERFORM] Query performance problem

2004-08-20 Thread Tom Lane
Danilo Mota [EMAIL PROTECTED] writes:
 SELECT
  rc.pkcliente
 FROM r_cliente AS rc
 INNER JOIN sav_cliente_lg AS sc ON sc.cpfcnpj = sc.cpfcnpj;

Surely you meant
  INNER JOIN sav_cliente_lg AS sc ON rc.cpfcnpj = sc.cpfcnpj;

I would also venture that your statistics are desperately out of date,
because if the planner's estimates are close to reality, even this
unconstrained-cross-product join shouldn't have taken that long.
 
regards, tom lane

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]