[PERFORM] Query Performance Problem
Hi all,I'm experimenting with table partitioning though inheritance. I'm testing a query as follows:explain (analyze, buffers)selectresponse.idfrom claim.responsewhere response.account_id = 4766and response.expire_timestamp is nulland response.create_timestamp = DATE '2014-08-01'order by create_timestamp;The response table looks like this:"account_id";"integer""file_type_id";"integer""receiver_inbound_detail_id";"integer""processing_status_id";"integer""processing";"boolean""expire_timestamp";"timestamp without time zone""last_mod_timestamp";"timestamp without time zone""create_timestamp";"timestamp without time zone""response_trace_nbr";"character varying""posted_timestamp";"timestamp without time zone""need_to_post";"boolean""response_message";"text""worked";"boolean""response_status_id";"integer""response_type_id";"integer""outbound_claim_detail_id";"bigint""id";"bigint"Here are some rowcounts:SELECT count(*) from claim_response.response_201408; count-4585746(1 row)Time: 7271.054 msSELECT count(*) from claim_response.response_201409; count-3523370(1 row)Time: 4341.116 msSELECT count(*) from claim_response.response_201410;count--- 154(1 row)Time: 0.258 msThe entire table has225,665,512 rows. I read that a partitioning rule of thumb is that benefits of partitioning occur starting around 100 million rows.SELECT count(*) from claim.response; count---225665512(1 row)Time: 685064.637 msThe partitioning is on the create_timestamp field.The server is Red Hat Enterprise Linux Server release 6.2 (Santiago) on a VM machine - 8 GB RAM with 2 CPUs:Architecture: x86_64CPU op-mode(s):32-bit, 64-bitByte Order: Little EndianCPU(s):2On-line CPU(s) list: 0,1Thread(s) per core: 1Core(s) per socket: 2CPU socket(s): 1NUMA node(s): 1Vendor ID: GenuineIntelCPU family: 6Model: 44Stepping: 2CPU MHz:2660.000BogoMIPS: 5320.00L1d cache: 32KL1i cache: 32KL2 cache: 256KL3 cache: 12288KNUMA node0 CPU(s): 0,12 users, load average: 0.00, 0.12, 0.37Please see the following for the explain analysis :http://explain.depesz.com/s/I3SLI'm trying to understand why I'm getting the yellow, orange, and red on the inclusive, and the yellow on the exclusive. (referring to theexplain.depesz.com/s/I3SLpage.)I'm relatively new to PostgreSQL, but I've been an Oracle DBA for some time. I suspect the I/O may be dragging but I don't know how to dig that information out from here. Please point out anything else you can decipher from this.Thanks,John
Re: [PERFORM] Query Performance Problem
2014-10-21 10:57 GMT-02:00 j...@jpm-cola.com: Hi all, I'm experimenting with table partitioning though inheritance. I'm testing a query as follows: explain (analyze, buffers) select response.id from claim.response where response.account_id = 4766 and response.expire_timestamp is null and response.create_timestamp = DATE '2014-08-01' order by create_timestamp; The response table looks like this: account_id;integer file_type_id;integer receiver_inbound_detail_id;integer processing_status_id;integer processing;boolean expire_timestamp;timestamp without time zone last_mod_timestamp;timestamp without time zone create_timestamp;timestamp without time zone response_trace_nbr;character varying posted_timestamp;timestamp without time zone need_to_post;boolean response_message;text worked;boolean response_status_id;integer response_type_id;integer outbound_claim_detail_id;bigint id;bigint Here are some rowcounts: SELECT count(*) from claim_response.response_201408; count - 4585746 (1 row) Time: 7271.054 ms SELECT count(*) from claim_response.response_201409; count - 3523370 (1 row) Time: 4341.116 ms SELECT count(*) from claim_response.response_201410; count --- 154 (1 row) Time: 0.258 ms The entire table has 225,665,512 rows. I read that a partitioning rule of thumb is that benefits of partitioning occur starting around 100 million rows. SELECT count(*) from claim.response; count --- 225665512 (1 row) Time: 685064.637 ms The partitioning is on the create_timestamp field. The server is Red Hat Enterprise Linux Server release 6.2 (Santiago) on a VM machine - 8 GB RAM with 2 CPUs: Architecture: x86_64 CPU op-mode(s):32-bit, 64-bit Byte Order:Little Endian CPU(s):2 On-line CPU(s) list: 0,1 Thread(s) per core:1 Core(s) per socket:2 CPU socket(s): 1 NUMA node(s): 1 Vendor ID: GenuineIntel CPU family:6 Model: 44 Stepping: 2 CPU MHz: 2660.000 BogoMIPS: 5320.00 L1d cache: 32K L1i cache: 32K L2 cache: 256K L3 cache: 12288K NUMA node0 CPU(s): 0,1 2 users, load average: 0.00, 0.12, 0.37 Please see the following for the explain analysis : http://explain.depesz.com/s/I3SL I'm trying to understand why I'm getting the yellow, orange, and red on the inclusive, and the yellow on the exclusive. (referring to the explain.depesz.com/s/I3SL page.) I'm relatively new to PostgreSQL, but I've been an Oracle DBA for some time. I suspect the I/O may be dragging but I don't know how to dig that information out from here. Please point out anything else you can decipher from this. Thanks, John Hi John, Dont know about the colors, but the Stats tab looks fine. You've got yourself 5 Index Scans, which are a very fast way to dig data. I noticed you've also cast your filter field (create_timestamp = '2014-08-01'::date). As far as I know, Postgresql doesn't need this kind of explicit conversion. You would be fine with just (create_timestamp = '2014-08-01'). Regards, Felipe
[PERFORM] Query performance problem in 8.0.0beta1
Hi, I have just installed 8.0.0beta1 and I noticed that some query are slower than 7.4.2 queries. After a FULL VACUUM ANALYZE ***With 7.4.2*** explain analyze SELECT count(*) FROM "SNS_DATA" WHERE "Data_Arrivo_Campione" BETWEEN '2004-01-01 00:00:00' AND '2004-01-31 23:59:59' AND "Cod_Par" = '17476' gives Aggregate (cost=46817.89..46817.89 rows=1 width=0) (actual time=401.216..401.217 rows=1 loops=1) - Index Scan using snsdata_codpar on "SNS_DATA" (cost=0.00..46817.22 rows=268 width=0) (actual time=165.948..400.258 rows=744 loops=1) Index Cond: (("Cod_Par")::text = '17476'::text) Filter: (("Data_Arrivo_Campione" = '2004-01-01 00:00:00'::timestamp without time zone) AND ("Data_Arrivo_Campione" = '2004-01-31 23:59:59'::timestamp without time zone))Total runtime: 401.302 ms ***while on 8.0.0*** the same query gives -Aggregate (cost=93932.91..93932.91 rows=1 width=0) (actual time=14916.371..14916.371 rows=1 loops=1) - Seq Scan on "SNS_DATA" (cost=0.00..93930.14 rows=1108 width=0) (actual time=6297.152..14915.330 rows=744 loops=1) Filter: (("Data_Arrivo_Campione" = '2004-01-01 00:00:00'::timestamp without time zone) AND ("Data_Arrivo_Campione" = '2004-01-31 23:59:59'::timestamp without time zone) AND (("Cod_Par")::text = '17476'::text))Total runtime: 14916.935 ms And I if disable the seqscan SET enable_seqscan = false; I get the following Aggregate (cost=158603.19..158603.19 rows=1 width=0) (actual time=4605.862..4605.863 rows=1 loops=1) - Index Scan using snsdata_codpar on "SNS_DATA" (cost=0.00..158600.41 rows=1108 width=0) (actual time=2534.422..4604.865 rows=744 loops=1) Index Cond: (("Cod_Par")::text = '17476'::text) Filter: (("Data_Arrivo_Campione" = '2004-01-01 00:00:00'::timestamp without time zone) AND ("Data_Arrivo_Campione" = '2004-01-31 23:59:59'::timestamp without time zone))Total runtime: 4605.965 ms The total runtime is bigger (x10 !!) than the old one. The memory runtime parameters are shared_buffer = 2048 work_mem = sort_mem = 2048 SNS_DATA shema is the following: Table "public.SNS_DATA" Column | Type | Modifiers--+-+Ordine | integer | not null default 0Cod_Par | character varying(100) | not nullCod_Ana | character varying(100) | not nullValore | character varying(255) |Descriz | character varying(512) |Un_Mis | character varying(70) |hash | integer |valid | boolean | default truealarm | boolean | default falseCod_Luogo | character varying(30) |Data_Arrivo_Campione | timestamp without time zone |site_id | integer |Cod_Luogo_v | character varying(30) |repeated_val | boolean | default falseIndexes: "sns_data2_pkey" PRIMARY KEY, btree ("Ordine", "Cod_Ana", "Cod_Par") "sns_datacodluogo2" btree ("Cod_Luogo") "sns_datatimefield2" btree ("Data_Arrivo_Campione") "sns_siteid2" btree (site_id) "sns_valid2" btree ("valid") "snsdata_codana" btree ("Cod_Ana") "snsdata_codpar" btree ("Cod_Par")Foreign-key constraints: "$2" FOREIGN KEY ("Cod_Ana") REFERENCES "SNS_ANA"("Cod_Ana") ON DELETE CASCADETriggers: sns_action_tr BEFORE INSERT OR UPDATE ON "SNS_DATA" FOR EACH ROW EXECUTE PROCEDURE sns_action() Can it be a datatype conversion problem? Thanks in advance! Reds
[PERFORM] Query performance problem
HI All, I have a big performance issue concerning a PostgreSQL database. I have the following server configuration: Pentium 4 2.4 GHz 1 GB RAM 36 GB SCSI And the following tables: TABLES -- == r_cliente: 75816 records CREATE TABLE public.r_cliente ( pkcliente INTEGER NOT NULL, cpfcnpj VARCHAR(20) NOT NULL, PRIMARY KEY(pkcliente) ) WITH OIDS; CREATE UNIQUE INDEX un_cliente_cpfcnpj ON public.r_cliente USING btree (cpfcnpj); == sav_cliente_lg: 65671 records === CREATE TABLE public.sav_cliente_lg ( codigo INTEGER NOT NULL, cpfcnpj VARCHAR(15) NOT NULL, PRIMARY KEY(codigo) ) WITH OIDS; CREATE INDEX ix_savclientelg_cpfcnpj ON public.sav_cliente_lg USING btree (cpfcnpj); Which I would like to run the following query: QUERY -- SELECT rc.pkcliente FROM r_cliente AS rc INNER JOIN sav_cliente_lg AS sc ON sc.cpfcnpj = sc.cpfcnpj; The problem is, it takes a long time to run, I wait up to half an hour and I get no result. So, I executed the explain on the query and got the following results: QUERY PLAN -- Nested Loop (cost=0.00..16696.87 rows=75816 width=4) - Seq Scan on sav_cliente_cf sc (cost=0.00..3047.55 rows=1 width=0) Filter: ((cpfcnpj)::text = (cpfcnpj)::text) - Seq Scan on r_cliente rc (cost=0.00..12891.16 rows=75816 width=4) And made the following modifications on my POSTGRESQL.CONF file: POSTGRESQL.CONF -- ### VERSION: Postgresql 7.4.2 ### shared_buffers = 7800 sort_mem = 4096 checkpoint_segments = 5 effective_cache_size = 12000 cpu_operator_cost = 0.0015 stats_start_collector = false Hope you can help me, I really need to get this running faster, and I am out of ideas. Since now, thanks a lot for your attention, Danilo Mota
Re: [PERFORM] Query performance problem
Danilo Mota [EMAIL PROTECTED] writes: SELECT rc.pkcliente FROM r_cliente AS rc INNER JOIN sav_cliente_lg AS sc ON sc.cpfcnpj = sc.cpfcnpj; Surely you meant INNER JOIN sav_cliente_lg AS sc ON rc.cpfcnpj = sc.cpfcnpj; I would also venture that your statistics are desperately out of date, because if the planner's estimates are close to reality, even this unconstrained-cross-product join shouldn't have taken that long. regards, tom lane ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]