Re: [PERFORM] Performance Optimization for Dummies 2 - the SQL
Sorry, I didn'tpoint it out because an earlier post included the query with documentation - that post got lost... or at least *I* can't see it. The other half of the union renders the facilities that DO have addresses, and because of the performance problem (which I have finally sorted out by creating indexes which are more explicit - my oversight, really!) The original query was a slightly more complex outer join, which I then decomposed to an explicit union with two halves - one half handling the explicit facility_address_id is null portion, the other half handling the is not null portion (implicitly because of the normal join between facility and facility_address). I hadn't considered the not exists option - it's obvious when you look at the sub-query by itself, but didn't strike me before I broke it out of the union and you mentioned it. I was just under th eimpression that getting this sub-query to work would have produced the most clear, straightforward ANALYZE results. Carlo Shaun Thomas [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED] On Monday 16 October 2006 16:37, Carlo Stonebanks wrote: The facility_address_id is null statement is necessary, as this is a sub-query from a union clause and I want to optimise the query with the original logic intact. The value is not hard coded to true but rather to null. Heh, you neglect to mention that this query is discovering faculty who do *not* have an address entry, which makes the is null a major necessity. With that, how did a not exists (blabla faculty_address blabla) subquery to get the same effect treat you? How about an IN (blabla LIMIT 1) ? -- Shaun Thomas Database Administrator Leapfrog Online 807 Greenwood Street Evanston, IL 60201 Tel. 847-440-8253 Fax. 847-570-5750 www.leapfrogonline.com ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] Performance Optimization for Dummies 2 - the SQL
you have a two part part key on facility(country code, postal code), right? Well, I'm glad you pointed it out, because I THOUGhT I had created it, but apparently I haven't -- I only noticed that it was missing after I listed all the other indexes. Looks like this query is one of the victims of a db structure corruption I suffered when transferring the schema over from development into production. (Well, that's my excuse and I'm sticking to it!) Thanks for all the help - I've reduced the execution time to 1/10 of its original time. Carlo ---(end of broadcast)--- TIP 6: explain analyze is your friend
[PERFORM] Optimization of this SQL sentence
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 This SQL sentence is very simple. I need to get better results. I have tried some posibilities and I didn't get good results. SELECT max(idcomment) FROM ficha vf INNER JOIN comment c ON (vf.idficha=c.idfile AND (idestado=3 OR idestado=4)) WHERE idstatus=3 AND ctype=1 QUERY PLAN Aggregate (cost=2730.75..2730.76 rows=1 width=4) (actual time=188.463..188.469 rows=1 loops=1) - Hash Join (cost=1403.44..2730.72 rows=11 width=4) (actual time=141.464..185.404 rows=513 loops=1) Hash Cond: (outer.idfile = inner.idficha) - Seq Scan on comment c (cost=0.00..1321.75 rows=1083 width=8) (actual time=0.291..36.112 rows=642 loops=1) Filter: ((idstatus = 3) AND (ctype = 1)) - Hash (cost=1403.00..1403.00 rows=178 width=4) (actual time=141.004..141.004 rows=6282 loops=1) - Seq Scan on ficha vf (cost=0.00..1403.00 rows=178 width=4) (actual time=0.071..97.885 rows=6282 loops=1) Filter: (((idestado)::text = '3'::text) OR ((idestado)::text = '4'::text)) Total runtime: 188.809 ms Thanks in advance, Ruben Rubio -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.2.2 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD4DBQFFNJzfIo1XmbAXRboRAgPRAJ99+S9wL21b+JN14bQbAoREFXYUcQCYpfEZ p1MCcDMWqTxzSdtssUFWOw== =rUHB -END PGP SIGNATURE- ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] Optimization of this SQL sentence
Off hanbd I can't recommend anything, bur perhaps you could post the details of the tables (columns, indexes),and some info on what version of postgres you are using. Are the tables recently analyzed ? How many rows in them ? Greg Williamson DBA GlobeXplorer LLC -Original Message- From: [EMAIL PROTECTED] on behalf of Ruben Rubio Sent: Tue 10/17/2006 2:05 AM To: pgsql-performance@postgresql.org Cc: Subject:[PERFORM] Optimization of this SQL sentence -BEGIN PGP SIGNED MESSAGE- Hash: SHA1 This SQL sentence is very simple. I need to get better results. I have tried some posibilities and I didn't get good results. SELECT max(idcomment) FROM ficha vf INNER JOIN comment c ON (vf.idficha=c.idfile AND (idestado=3 OR idestado=4)) WHERE idstatus=3 AND ctype=1 QUERY PLAN Aggregate (cost=2730.75..2730.76 rows=1 width=4) (actual time=188.463..188.469 rows=1 loops=1) - Hash Join (cost=1403.44..2730.72 rows=11 width=4) (actual time=141.464..185.404 rows=513 loops=1) Hash Cond: (outer.idfile = inner.idficha) - Seq Scan on comment c (cost=0.00..1321.75 rows=1083 width=8) (actual time=0.291..36.112 rows=642 loops=1) Filter: ((idstatus = 3) AND (ctype = 1)) - Hash (cost=1403.00..1403.00 rows=178 width=4) (actual time=141.004..141.004 rows=6282 loops=1) - Seq Scan on ficha vf (cost=0.00..1403.00 rows=178 width=4) (actual time=0.071..97.885 rows=6282 loops=1) Filter: (((idestado)::text = '3'::text) OR ((idestado)::text = '4'::text)) Total runtime: 188.809 ms Thanks in advance, Ruben Rubio -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.2.2 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD4DBQFFNJzfIo1XmbAXRboRAgPRAJ99+S9wL21b+JN14bQbAoREFXYUcQCYpfEZ p1MCcDMWqTxzSdtssUFWOw== =rUHB -END PGP SIGNATURE- ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq --- Click link below if it is SPAM [EMAIL PROTECTED] https://mailscanner.globexplorer.com/dspam/dspam.cgi?signatureID=45349c86275246672479766[EMAIL PROTECTED]retrain=spamtemplate=historyhistory_page=1 !DSPAM:45349c86275246672479766! --- ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] Optimization of this SQL sentence
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 Indexes in comment Comment rows: 17.250 CREATE INDEX usuariofichaoncommnet ON comment USING btree (idusuarioficha); Comment structure: CREATE TABLE comment ( idcomment int4 NOT NULL DEFAULT nextval('comment_idcomment_seq'::regclass), score int4, title varchar, ctext varchar, idusuarioficha int4, galleta varchar, navlang int4, cdate timestamp DEFAULT now(), idstatus int4, ctype int4 NOT NULL, idfile int4 NOT NULL, nick varchar, nombre varchar, apellidos varchar, dni varchar, nacionalidad varchar, email varchar, telefono varchar, code varchar, memo varchar, c_ip varchar(30), codpais char(2), replay varchar, replaydate timestamp, advsent int4, usrwarn int4, nouserlink int4, aviso_confirmacion_15 timestamp, aviso_confirmacion_60 timestamp, CONSTRAINT comment_pkey PRIMARY KEY (idcomment) ) Ficha structure: No indexes in ficha Ficha rows: 17.850 CREATE TABLE ficha ( idficha int4 NOT NULL DEFAULT nextval('ficha_idficha_seq'::regclass), email varchar(255), web varchar(255), capacidadmin int4, capacidadmax int4, preciotb float4, preciota float4, cp varchar(20), telefono1 varchar(50), telefono2 varchar(50), fax varchar(50), uprecio varchar, udireccion varchar(512), comentarios varchar, ucapacidad varchar(512), upresentacion varchar, utipoaloj varchar(50), ulugares varchar, ucaracteristica varchar, idusuario int4, idlocacion int4, contacto varchar(255), fuente varchar(512), prefijopais varchar(10), idestado char(1), nombre varchar(255), idtipoalojamiento int4, ulocalidad varchar(255), creado timestamp DEFAULT now(), cachefault int4 DEFAULT 0, idpromotiontype_pc int4 NOT NULL DEFAULT 0, idpromotiontype_ant_pc int4, promostartdate_pc timestamp, promoenddate_pc timestamp, localidadruta varchar(255), urlsufix varchar(32), searchengine1 int4, searchengine2 int4, searchengine3 int4, searchengine4 int4, searchengine5 int4, searchengine6 int4, deseo1 int4, deseo2 int4, deseo3 int4, deseo4 int4, deseo5 int4, deseo6 int4, otherspecs varchar(510), lastchange timestamp, idsubestado int4, environment int4, prefijopais2 varchar, web_agencia varchar(255), lat varchar(25), long varchar(25), zoom int4, swzoombloq bool DEFAULT true, titulomapa_l0 varchar(255), titulomapa_l1 varchar(255), titulomapa_l2 varchar(255), titulomapa_l3 varchar(255), titulomapa_l4 varchar(255), titulomapa_l5 varchar(255), titulomapa_l6 varchar(255), titulomapa_l7 varchar(255), titulomapa_l8 varchar(255), titulomapa_l9 varchar(255), CONSTRAINT pk_ficha PRIMARY KEY (idficha), CONSTRAINT fk_ficha_geonivel6 FOREIGN KEY (idlocacion) REFERENCES geonivel6 (idgeonivel6) ON UPDATE NO ACTION ON DELETE NO ACTION ) Gregory S. Williamson escribió: Off hanbd I can't recommend anything, bur perhaps you could post the details of the tables (columns, indexes),and some info on what version of postgres you are using. Are the tables recently analyzed ? How many rows in them ? Greg Williamson DBA GlobeXplorer LLC -Original Message- From: [EMAIL PROTECTED] on behalf of Ruben Rubio Sent: Tue 10/17/2006 2:05 AM To: pgsql-performance@postgresql.org Cc: Subject: [PERFORM] Optimization of this SQL sentence This SQL sentence is very simple. I need to get better results. I have tried some posibilities and I didn't get good results. SELECT max(idcomment) FROM ficha vf INNER JOIN comment c ON (vf.idficha=c.idfile AND (idestado=3 OR idestado=4)) WHERE idstatus=3 AND ctype=1 QUERY PLAN Aggregate (cost=2730.75..2730.76 rows=1 width=4) (actual time=188.463..188.469 rows=1 loops=1) - Hash Join (cost=1403.44..2730.72 rows=11 width=4) (actual time=141.464..185.404 rows=513 loops=1) Hash Cond: (outer.idfile = inner.idficha) - Seq Scan on comment c (cost=0.00..1321.75 rows=1083 width=8) (actual time=0.291..36.112 rows=642 loops=1) Filter: ((idstatus = 3) AND (ctype = 1)) - Hash (cost=1403.00..1403.00 rows=178 width=4) (actual time=141.004..141.004 rows=6282 loops=1) - Seq Scan on ficha vf (cost=0.00..1403.00 rows=178 width=4) (actual time=0.071..97.885 rows=6282 loops=1) Filter: (((idestado)::text = '3'::text) OR ((idestado)::text = '4'::text)) Total runtime: 188.809 ms Thanks in advance, Ruben Rubio - ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq - --- Click link below if it is SPAM [EMAIL PROTECTED] https://mailscanner.globexplorer.com/dspam/dspam.cgi?signatureID=45349c86275246672479766[EMAIL PROTECTED]retrain=spamtemplate=historyhistory_page=1 !DSPAM:45349c86275246672479766! -
Re: [PERFORM] Optimization of this SQL sentence
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 If just just realized that is a litlle faster (15% faster) with this: CREATE INDEX idx_statustype ON comment USING btree (idstatus, ctype); Any other ideas? Gregory S. Williamson escribió: Off hanbd I can't recommend anything, bur perhaps you could post the details of the tables (columns, indexes),and some info on what version of postgres you are using. Are the tables recently analyzed ? How many rows in them ? Greg Williamson DBA GlobeXplorer LLC -Original Message- From: [EMAIL PROTECTED] on behalf of Ruben Rubio Sent: Tue 10/17/2006 2:05 AM To: pgsql-performance@postgresql.org Cc: Subject: [PERFORM] Optimization of this SQL sentence This SQL sentence is very simple. I need to get better results. I have tried some posibilities and I didn't get good results. SELECT max(idcomment) FROM ficha vf INNER JOIN comment c ON (vf.idficha=c.idfile AND (idestado=3 OR idestado=4)) WHERE idstatus=3 AND ctype=1 QUERY PLAN Aggregate (cost=2730.75..2730.76 rows=1 width=4) (actual time=188.463..188.469 rows=1 loops=1) - Hash Join (cost=1403.44..2730.72 rows=11 width=4) (actual time=141.464..185.404 rows=513 loops=1) Hash Cond: (outer.idfile = inner.idficha) - Seq Scan on comment c (cost=0.00..1321.75 rows=1083 width=8) (actual time=0.291..36.112 rows=642 loops=1) Filter: ((idstatus = 3) AND (ctype = 1)) - Hash (cost=1403.00..1403.00 rows=178 width=4) (actual time=141.004..141.004 rows=6282 loops=1) - Seq Scan on ficha vf (cost=0.00..1403.00 rows=178 width=4) (actual time=0.071..97.885 rows=6282 loops=1) Filter: (((idestado)::text = '3'::text) OR ((idestado)::text = '4'::text)) Total runtime: 188.809 ms Thanks in advance, Ruben Rubio - ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq - --- Click link below if it is SPAM [EMAIL PROTECTED] https://mailscanner.globexplorer.com/dspam/dspam.cgi?signatureID=45349c86275246672479766[EMAIL PROTECTED]retrain=spamtemplate=historyhistory_page=1 !DSPAM:45349c86275246672479766! - --- - ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.2.2 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFFNKT4Io1XmbAXRboRAurtAKC8YWjgzytaqkPjLfrohZ1aceZivwCgpDii wzxc4fktzIHTZRhPuJLi2Wc= =Korn -END PGP SIGNATURE- ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] Optimization of this SQL sentence
am Tue, dem 17.10.2006, um 11:33:18 +0200 mailte Ruben Rubio folgendes: SELECT max(idcomment) FROM ficha vf INNER JOIN comment c ON (vf.idficha=c.idfile AND (idestado=3 OR idestado=4)) WHERE idstatus=3 AND ctype=1 check for indexes on vf.idficha, c.idfile, idstatus and ctype. Andreas -- Andreas Kretschmer Kontakt: Heynitz: 035242/47215, D1: 0160/7141639 (mehr: - Header) GnuPG-ID: 0x3FFF606C, privat 0x7F4584DA http://wwwkeys.de.pgp.net ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] Optimization of this SQL sentence
On Oct 17, 2006, at 11:33 , Ruben Rubio wrote: CREATE TABLE comment ( idcomment int4 NOT NULL DEFAULT nextval('comment_idcomment_seq'::regclass), [snip 28 columns] CONSTRAINT comment_pkey PRIMARY KEY (idcomment) ) Ficha structure: No indexes in ficha Ficha rows: 17.850 CREATE TABLE ficha ( idficha int4 NOT NULL DEFAULT nextval ('ficha_idficha_seq'::regclass), [snip 67 (!) columns] CONSTRAINT pk_ficha PRIMARY KEY (idficha), CONSTRAINT fk_ficha_geonivel6 FOREIGN KEY (idlocacion) REFERENCES geonivel6 (idgeonivel6) ON UPDATE NO ACTION ON DELETE NO ACTION ) These tables are particularly egregious examples of ignorant database design. You need to understand the relational model (http:// en.wikipedia.org/wiki/Relational_model), specifically data normalization (http://en.wikipedia.org/wiki/Database_normalization) and 3NF (http://en.wikipedia.org/wiki/3NF). These columns are particularly telling: searchengine1 int4, searchengine2 int4, searchengine3 int4, searchengine4 int4, searchengine5 int4, searchengine6 int4, deseo1 int4, deseo2 int4, deseo3 int4, deseo4 int4, deseo5 int4, deseo6 int4, titulomapa_l0 varchar(255), titulomapa_l1 varchar(255), titulomapa_l2 varchar(255), titulomapa_l3 varchar(255), titulomapa_l4 varchar(255), titulomapa_l5 varchar(255), titulomapa_l6 varchar(255), titulomapa_l7 varchar(255), titulomapa_l8 varchar(255), titulomapa_l9 varchar(255), Refactor into three separate tables: create table searchengine ( idficha int references ficha (idficha), searchengine int, primary key (idficha, searchengine) ); create table deseo ( idficha int references ficha (idficha), deseo int, primary key (idficha, deseo) ); create table titulomapa ( idficha int references ficha (idficha), titulomapa int, primary key (idficha, titulomapa) ); Now you can find all search engines for a single ficha row: select searchengine from searchengine where idficha = n This design allows for more than 5 search engines per ficha row, and allows expressive joins such as: select ficha.idficha, searchengine.searchengine inner join searchengine on searchengine.idfciha = ficha.idficha Also, most of your columns are nullable. This alone shows that you don't understand your own data. Lastly, note that in PostgreSQL these length declarations are not necessary: contacto varchar(255), fuente varchar(512), prefijopais varchar(10) Instead, use: contacto text, fuente text, prefijopais text See the PostgreSQL manual for an explanation of varchar vs. text. Alexander. ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] Optimization of this SQL sentence
You could try rewriting the query like this: SELECT MAX(idcomment) FROM comment c WHERE idstatus=3 AND ctype=1 AND EXISTS (SELECT 1 FROM ficha vf WHERE idestado IN ('3', '4') AND vf.idficha = c.idfile); The planner can then try a backward scan on the comment_pkey index, which should be quicker than the seq scan assuming that there's a lot of rows that match the restrictions (idstatus=3 ctype=1 idestado IN ('3', '4')). But see comments inline below: Ruben Rubio wrote: CREATE TABLE comment ( idcomment int4 NOT NULL DEFAULT nextval('comment_idcomment_seq'::regclass), score int4, title varchar, ctext varchar, idusuarioficha int4, galleta varchar, navlang int4, cdate timestamp DEFAULT now(), idstatus int4, ctype int4 NOT NULL, idfile int4 NOT NULL, nick varchar, nombre varchar, apellidos varchar, dni varchar, nacionalidad varchar, email varchar, telefono varchar, code varchar, memo varchar, c_ip varchar(30), codpais char(2), replay varchar, replaydate timestamp, advsent int4, usrwarn int4, nouserlink int4, aviso_confirmacion_15 timestamp, aviso_confirmacion_60 timestamp, CONSTRAINT comment_pkey PRIMARY KEY (idcomment) ) Without knowing anything about you're application, it looks like there's a some fields in the comment-table that are duplicates of fields in the ficha-table. Telefono and email for example. You should consider doing some normalization. No indexes in ficha Except for the implicit idficha_pkey index. CREATE TABLE ficha ( ... idestado char(1), If idestado contains numbers (codes of some kind, I presume), you're better off using the smallint data type. searchengine1 int4, searchengine2 int4, searchengine3 int4, searchengine4 int4, searchengine5 int4, searchengine6 int4, Normalization?! deseo1 int4, deseo2 int4, deseo3 int4, deseo4 int4, deseo5 int4, deseo6 int4, For these as well... ... lat varchar(25), long varchar(25), Isn't there's a better data type for latitude and longitude? Decimal, perhaps? titulomapa_l0 varchar(255), titulomapa_l1 varchar(255), titulomapa_l2 varchar(255), titulomapa_l3 varchar(255), titulomapa_l4 varchar(255), titulomapa_l5 varchar(255), titulomapa_l6 varchar(255), titulomapa_l7 varchar(255), titulomapa_l8 varchar(255), titulomapa_l9 varchar(255), Again, normalization... - Heikki ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] Optimization of this SQL sentence (SOLVED)
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 Hi to everyone, First of all I have to say that I now the database is not ok. There was a people before me that didn't do the thinks right. I would like to normalize the database, but it takes too much time (there is is hundred of SQLs to change and there is not enough technical resources). Also, datacolumns in some places has same names, but the data that is stores has different usages. Thanks everyone for all hints, I ll try to do my best performing the database structure. By other hand, I was able to create the corrects index and with this AND EXISTS (SELECT 1 FROM ficha vf WHERE idestado IN ('3', '4') AND vf.idficha = c.idfile); it is really fast. Thanks to everybody. Regards, Ruben Rubio Heikki Linnakangas escribió: You could try rewriting the query like this: SELECT MAX(idcomment) FROM comment c WHERE idstatus=3 AND ctype=1 AND EXISTS (SELECT 1 FROM ficha vf WHERE idestado IN ('3', '4') AND vf.idficha = c.idfile); The planner can then try a backward scan on the comment_pkey index, which should be quicker than the seq scan assuming that there's a lot of rows that match the restrictions (idstatus=3 ctype=1 idestado IN ('3', '4')). But see comments inline below: Ruben Rubio wrote: CREATE TABLE comment ( idcomment int4 NOT NULL DEFAULT nextval('comment_idcomment_seq'::regclass), score int4, title varchar, ctext varchar, idusuarioficha int4, galleta varchar, navlang int4, cdate timestamp DEFAULT now(), idstatus int4, ctype int4 NOT NULL, idfile int4 NOT NULL, nick varchar, nombre varchar, apellidos varchar, dni varchar, nacionalidad varchar, email varchar, telefono varchar, code varchar, memo varchar, c_ip varchar(30), codpais char(2), replay varchar, replaydate timestamp, advsent int4, usrwarn int4, nouserlink int4, aviso_confirmacion_15 timestamp, aviso_confirmacion_60 timestamp, CONSTRAINT comment_pkey PRIMARY KEY (idcomment) ) Without knowing anything about you're application, it looks like there's a some fields in the comment-table that are duplicates of fields in the ficha-table. Telefono and email for example. You should consider doing some normalization. No indexes in ficha Except for the implicit idficha_pkey index. CREATE TABLE ficha ( ... idestado char(1), If idestado contains numbers (codes of some kind, I presume), you're better off using the smallint data type. searchengine1 int4, searchengine2 int4, searchengine3 int4, searchengine4 int4, searchengine5 int4, searchengine6 int4, Normalization?! deseo1 int4, deseo2 int4, deseo3 int4, deseo4 int4, deseo5 int4, deseo6 int4, For these as well... ... lat varchar(25), long varchar(25), Isn't there's a better data type for latitude and longitude? Decimal, perhaps? titulomapa_l0 varchar(255), titulomapa_l1 varchar(255), titulomapa_l2 varchar(255), titulomapa_l3 varchar(255), titulomapa_l4 varchar(255), titulomapa_l5 varchar(255), titulomapa_l6 varchar(255), titulomapa_l7 varchar(255), titulomapa_l8 varchar(255), titulomapa_l9 varchar(255), Again, normalization... - Heikki ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.2.2 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFFNK+jIo1XmbAXRboRAu6cAKCMUWHjcAYwN4DhVl1tSjMirgRAawCgvk8c gSB/4p1ZBOrDEwU9EW/yxw8= =yFoD -END PGP SIGNATURE- ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] Regarding Bitmap Scan
am Tue, dem 17.10.2006, um 17:09:29 +0530 mailte soni de folgendes: I didn't understand the Bitmap Scan and the sentence indexes will be dynamically converted to bitmaps in memory. What does mean by Bitmap Scan in database? For instance, you have a large table with 5 indexes on this and a query that checks conditions on this 5 columns. PG is now able to combine this 5 indexes and performs only 1 bitmap index scan on this table, and not 5 independet nested bitmap scans. A realy very great performance-boost! HTH, Andreas -- Andreas Kretschmer Kontakt: Heynitz: 035242/47215, D1: 0160/7141639 (mehr: - Header) GnuPG-ID: 0x3FFF606C, privat 0x7F4584DA http://wwwkeys.de.pgp.net ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] Regarding Bitmap Scan
On 10/17/06, soni de [EMAIL PROTECTED] wrote: I didn't understand theBitmap Scan andthe sentence indexes will be dynamically converted to bitmaps in memory. What does mean by Bitmap Scan in database? Can anybody help us regarding above query? Assume you have a table: CREATE TABLE foo ( some_key int, some_time timestamp with time zone, some_data text ); And two indexes: CREATE INDEX foo_key ON foo (some_key); CREATE INDEX foo_time ON foo (some_time); Now, you make a query: SELECT * from foo WHERE some_key 10 AND some_time '2006-10-01'::timestamptz; ...originally planner would choose only one index to use -- and would use the one which it think its best. The 8.1 version does differently: It will scan foo_key index -- make a bitmap out of it, scan foo_time index -- make another bitmap out of it, binary AND these bitmaps, and will read the data from the table using such combined bitmap. It could as well use OR if you used OR in your query. Hence -- it can be faster, especially for large tables and selective queries. Regards, DAwid
Re: [PERFORM] Optimization of this SQL sentence
These tables are particularly egregious examples of ignorant database design. You need to understand the relational model This email is a *particularly* egregious example of rudeness. You owe Mr. Staubo, and the Postgress community, an apology. There is absolutely no reason to insult people who come to this forum for help. That's why the forum is here, to help people who are ignorant and want to improve their knowledge. Craig ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [PERFORM] Optimization of this SQL sentence
On Oct 17, 2006, at 17:10 , Craig A. James wrote: These tables are particularly egregious examples of ignorant database design. You need to understand the relational model This email is a *particularly* egregious example of rudeness. You owe Mr. Staubo, and the Postgress community, an apology. I'm sorry you feel that way, but I don't think I was out of line. I did point to several informative sources of documentation, and described some of the problems (but by no means all) with the person's schema and how to solve them. If you think the database design in question is *not* ignorant database design, please do explain why, but on technical grounds. (Ignorance, of course, is not a sin.) Alexander. ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] Optimization of this SQL sentence
On Oct 17, 2006, at 17:29 , Mario Weilguni wrote: Am Dienstag, 17. Oktober 2006 11:52 schrieb Alexander Staubo: Lastly, note that in PostgreSQL these length declarations are not necessary: contacto varchar(255), fuente varchar(512), prefijopais varchar(10) Enforcing length constraints with varchar(xyz) is good database design, not a bad one. Using text everywhere might be tempting because it works, but it's not a good idea. Enforcing length constraints is generally a bad idea because it assumes you know the data domain as expressed in a quantity of characters. Off the top of your head, do you know the maximum length of a zip code? A street address? The name of a city? In almost all cases the limit you invent is arbitrary, and the probability of being incompatible with any given input is inversely proportional to that arbitrary limit. Encoding specific length constraints in the database makes sense when they relate explicitly to business logic, but I can think of only a few cases where it would make sense: restricting the length of passwords, user names, and so on. In a few cases you do know with 100% certainty the limit of your field, such as with standardized abbreviations: ISO 3166 country codes, for example. And sometimes you want to cap data due to storage or transmission costs. The length constraint on text fields is primarily a historical artifact stemming from the way databases have traditionally been implemented, as fixed-length fields in fixed-length row structures. The inexplicable, improbable space-padded (!) character data type in ANSI SQL is a vestige of this legacy. PostgreSQL's variable-length rows and TOAST mechanism makes the point moot. Quoth the PostgreSQL manual, section 8.3: There are no performance differences between these three types, apart from the increased storage size when using the blank-padded type. While character(n) has performance advantages in some other database systems, it has no such advantages in PostgreSQL. In most situations text or character varying should be used instead. Alexander. ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] Optimization of this SQL sentence
Am Dienstag, 17. Oktober 2006 11:52 schrieb Alexander Staubo: Lastly, note that in PostgreSQL these length declarations are not necessary: contacto varchar(255), fuente varchar(512), prefijopais varchar(10) Instead, use: contacto text, fuente text, prefijopais text See the PostgreSQL manual for an explanation of varchar vs. text. Enforcing length constraints with varchar(xyz) is good database design, not a bad one. Using text everywhere might be tempting because it works, but it's not a good idea. ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] Optimization of this SQL sentence
On 10/17/06, Mario Weilguni [EMAIL PROTECTED] wrote: Am Dienstag, 17. Oktober 2006 11:52 schrieb Alexander Staubo: Lastly, note that in PostgreSQL these length declarations are not necessary: contacto varchar(255), fuente varchar(512), prefijopais varchar(10) Instead, use: contacto text, fuente text, prefijopais text See the PostgreSQL manual for an explanation of varchar vs. text. Enforcing length constraints with varchar(xyz) is good database design, not a bad one. Using text everywhere might be tempting because it works, but it's not a good idea. while you are correct, i think the spirit of the argument is wrong becuase there is no constraint to be enforced in those fields. a length constraint of n is only valid is n + 1 characters are an error and should be rejected by the database. anything else is IMO bad form. There are practial exceptions to this rule though, for example client technology that might require a length. so, imo alexander is correct: contacto varchar(255) ...is a false constraint, why exactly 255? is that were the dart landed? specifically limiting text fields so users 'don't enter too much data' is a manifestation c programmer's disease :) note I am not picking on the OP here, just weighing in on the constraint argument. merlin ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] Optimization of this SQL sentence
[EMAIL PROTECTED] (Merlin Moncure) writes: On 10/17/06, Mario Weilguni [EMAIL PROTECTED] wrote: Am Dienstag, 17. Oktober 2006 11:52 schrieb Alexander Staubo: Lastly, note that in PostgreSQL these length declarations are not necessary: contacto varchar(255), fuente varchar(512), prefijopais varchar(10) Instead, use: contacto text, fuente text, prefijopais text See the PostgreSQL manual for an explanation of varchar vs. text. Enforcing length constraints with varchar(xyz) is good database design, not a bad one. Using text everywhere might be tempting because it works, but it's not a good idea. while you are correct, i think the spirit of the argument is wrong becuase there is no constraint to be enforced in those fields. a length constraint of n is only valid is n + 1 characters are an error and should be rejected by the database. anything else is IMO bad form. There are practial exceptions to this rule though, for example client technology that might require a length. so, imo alexander is correct: contacto varchar(255) ...is a false constraint, why exactly 255? is that were the dart landed? Yeah, 255 seems silly to me. If I'm going to be arbitrary, there are two better choices: 1. 80, because that's how many characters one can fit across a piece of paper whilst keeping things pretty readable; 2. 64, because that will fit on a screen, and leave some space for a field name/description. specifically limiting text fields so users 'don't enter too much data' is a manifestation c programmer's disease :) No, I can't agree. I'm pretty accustomed to languages that don't pinch you the ways C does, and I still dislike having over-wide columns because it makes it more difficult to generate readable reports. -- output = (cbbrowne @ linuxfinances.info) http://linuxdatabases.info/info/unix.html Instant coffee is like pouring hot water over the cremated remains of a good friend. ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] Optimization of this SQL sentence
[EMAIL PROTECTED] (Alexander Staubo) writes: On Oct 17, 2006, at 17:29 , Mario Weilguni wrote: Am Dienstag, 17. Oktober 2006 11:52 schrieb Alexander Staubo: Lastly, note that in PostgreSQL these length declarations are not necessary: contacto varchar(255), fuente varchar(512), prefijopais varchar(10) Enforcing length constraints with varchar(xyz) is good database design, not a bad one. Using text everywhere might be tempting because it works, but it's not a good idea. Enforcing length constraints is generally a bad idea because it assumes you know the data domain as expressed in a quantity of characters. Off the top of your head, do you know the maximum length of a zip code? A street address? The name of a city? In the case of a zip code? Sure. US zip codes are integer values either 5 or 9 characters long. In the case of some of our internal applications, we need to conform to some IETF and ITU standards which actually do enforce some maximum lengths on these sorts of things. In almost all cases the limit you invent is arbitrary, and the probability of being incompatible with any given input is inversely proportional to that arbitrary limit. I'd be quite inclined to limit things like addresses to somewhat smaller sizes than you might expect. If addresses are to be used to generate labels for envelopes, for instance, it's reasonably important to limit sizes to those that might fit on a label or an envelope. Encoding specific length constraints in the database makes sense when they relate explicitly to business logic, but I can think of only a few cases where it would make sense: restricting the length of passwords, user names, and so on. In a few cases you do know with 100% certainty the limit of your field, such as with standardized abbreviations: ISO 3166 country codes, for example. And sometimes you want to cap data due to storage or transmission costs. There's another reason: Open things up wide, and some people will fill the space with rubbish. -- cbbrowne,@,acm.org http://linuxfinances.info/info/internet.html The Amiga is proof that if you build a better mousetrap, the rats will gang up on you. -- Bill Roberts [EMAIL PROTECTED] ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] Optimization of this SQL sentence
Chris Browne wrote: In the case of a zip code? Sure. US zip codes are integer values either 5 or 9 characters long. So your app will only work in the US? And only for US companies that only have US clients? Sorry had to dig at that ;-P -- Shane Ambler [EMAIL PROTECTED] Get Sheeky @ http://Sheeky.Biz ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] Optimization of this SQL sentence
The world rejoiced as [EMAIL PROTECTED] (Shane Ambler) wrote: Chris Browne wrote: In the case of a zip code? Sure. US zip codes are integer values either 5 or 9 characters long. So your app will only work in the US? And only for US companies that only have US clients? Sorry had to dig at that ;-P Heh. I'm not in the US, so that's not the sort of mistake I'd be likely to make... The thing is, the only place where they call this sort of thing a zip code is the US. Elsewhere, it's called a postal code. -- (reverse (concatenate 'string gro.mca @ enworbbc)) http://linuxfinances.info/info/finances.html Rules of the Evil Overlord #159. If I burst into rebel headquarters and find it deserted except for an odd, blinking device, I will not walk up and investigate; I'll run like hell. http://www.eviloverlord.com/ ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] Optimization of this SQL sentence
Christopher Browne wrote: The world rejoiced as [EMAIL PROTECTED] (Shane Ambler) wrote: Chris Browne wrote: In the case of a zip code? Sure. US zip codes are integer values either 5 or 9 characters long. So your app will only work in the US? And only for US companies that only have US clients? Sorry had to dig at that ;-P Heh. I'm not in the US, so that's not the sort of mistake I'd be likely to make... The thing is, the only place where they call this sort of thing a zip code is the US. Elsewhere, it's called a postal code. Same meaning/use different name (that's a locale issue for the client displaying the data) they will all use the same column for that data. -- Shane Ambler [EMAIL PROTECTED] Get Sheeky @ http://Sheeky.Biz ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
[PERFORM] Jdbc/postgres performance
Hi We are facing performance problems in postgres while executing a query. When I execute this query on the server it takes 5-10 seconds. Also I get good performance while executing this query from my code in java with the hard codes values. I face severe performance problems when I run it using a prepared statement. The query is as follows: Select events.event_id, ctrl.real_name, events.tsds, events.value, events.lds, events.correction, ctrl.type, ctrl.freq from iso_midw_data_update_events events, iso_midw_control ctrl where events.obj_id = ctrl.obj_id and events.event_id 68971124 order by events.event_id limit 2000 The above query executes in 5-10 seconds. However the below query executes in 8 mins: Select events.event_id, ctrl.real_name, events.tsds, events.value, events.lds, events.correction, ctrl.type, ctrl.freq from table events, iso_midw_control ctrl where events.obj_id = ctrl.obj_id and events.event_id ?::bigint order by events.event_id limit ? setLong(1, 68971124); setInt(2, 2000); The table has close to 5 million rows. The table has the following index: iso_midw_data_update_events_event_id_key iso_midw_data_update_events_lds_idx iso_midw_data_update_events_obj_id_idx The table is described as follows: Columns_name data_type type_namecolumn_size lds 2 numeric 13 obj_id 2 numeric 6 tsds2 numeric 13 value 12 varchar 22 correction 2 numeric 1 delta_lds_tsds 2 numeric 13 event_id-5 bigserial 8 Please tell me what I am missing while setting the prepared statement. I am using postgres7.4.2. and postgresql-8.1-407.jdbc3.jar. Thanks Regards Rohit CAUTION - Disclaimer * This e-mail contains PRIVILEGED AND CONFIDENTIAL INFORMATION intended solely for the use of the addressee(s). If you are not the intended recipient, please notify the sender by e-mail and delete the original message. Further, you are not to copy, disclose, or distribute this e-mail or its contents to any other person and any such actions are unlawful. This e-mail may contain viruses. Infosys has taken every reasonable precaution to minimize this risk, but is not liable for any damage you may sustain as a result of any virus in this e-mail. You should carry out your own virus checks before opening the e-mail or attachment. Infosys reserves the right to monitor and review the content of all messages sent to or from this e-mail address. Messages sent to or from this e-mail address may be stored on the Infosys e-mail system. ***INFOSYS End of Disclaimer INFOSYS*** ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] Jdbc/postgres performance
-Original Message- From: [EMAIL PROTECTED] [mailto:pgsql-performance- [EMAIL PROTECTED] On Behalf Of Merlin Moncure Sent: Tuesday, October 17, 2006 4:29 PM To: Rohit_Behl Cc: pgsql-performance@postgresql.org Subject: Re: [PERFORM] Jdbc/postgres performance On 10/17/06, Rohit_Behl [EMAIL PROTECTED] wrote: Select events.event_id, ctrl.real_name, events.tsds, events.value, events.lds, events.correction, ctrl.type, ctrl.freq from table events, iso_midw_control ctrl where events.obj_id = ctrl.obj_id and events.event_id ?::bigint order by events.event_id limit ? unfortunately parameterized limit statements cause problems due to the fact the planner has a hard coded 'guess' of 10% of rows returned when the plan is generated. I mention this everyime query hints proposal comes up :-). I'm not sure that this has anything to do with hints (yes, I know hints are a popular topic as of late..) but from the 8.1 Manual: This is because when the statement is planned and the planner attempts to determine the optimal query plan, the actual values of any parameters specified in the statement are unavailable. After a quick search on the JDBC list, it looks like there's some recent discussion on the subject of how to give the planner better insight for prepared statements (the subject is Blind Message if you're looking...). So, I'm off to go read there and perhaps join the jdbc mailing list too. But, a more general postgres question. I assume if I want to turn prepared statements off altogether (say I'm using a jdbc abstraction layer that likes parameterized statements, and there's other benefits to parameterizing other than just saving on db parse/plan) can I set max_prepared_transactions to 0? Is there any other option outside of JDBC? (I'll be moving my other questions over to the JDBC list...) Also, others might be interested in the JDBC documentation, which is separate from the main Postgres manual and can be found at: http://jdbc.postgresql.org/documentation/ - Bucky best you can do is to try turning off seqscan and possibly bitmap scan when the plan is generated. ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [PERFORM] Hints proposal
On Thursday 12 October 2006 12:40, Bucky Jordan wrote: What is it about hinting that makes it so easily breakable with new versions? I don't have any experience with Oracle, so I'm not sure how they screwed logic like this up. I don't have a ton of experience with oracle either, mostly DB2, MSSQL and PG. So, I thought I'd do some googling, and maybe others might find this useful info. http://asktom.oracle.com/pls/ask/f?p=4950:8:2177642270773127589::NO::F4950_ P8_DISPLAYID,F4950_P8_CRITERIA:7038986332061 Interesting quote: In Oracle Applications development (11i apps - HR, CRM, etc) Hints are strictly forbidden. We find the underlying cause and fix it. and Hints -- only useful if you are in RBO and you want to make use of an access path. Maybe because I haven't had access to hints before, I've never been tempted to use them. However, I can't remember having to re-write SQL due to a PG upgrade either. When it happens it tends to look something like this: http://archives.postgresql.org/pgsql-performance/2006-01/msg00154.php Funny that for all the people who claim that improving the planner should be the primary goal that no one ever took interest in the above case. -- Robert Treat Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] [PERFORM] Hints proposal
On Friday 13 October 2006 12:46, Gregory Stark wrote: Josh Berkus josh@agliodbs.com writes: I actually think the way to attack this issue is to discuss the kinds of errors the planner makes, and what tweaks we could do to correct them. Here's the ones I'm aware of: -- Incorrect selectivity of WHERE clause -- Incorrect selectivity of JOIN -- Wrong estimate of rows returned from SRF -- Incorrect cost estimate for index use Can you think of any others? -- Incorrect estimate for result of DISTINCT or GROUP BY. Yeah, that one is bad. I also ran into one the other day where the planner did not seem to understand the distinctness of a columns values across table partitions... -- Robert Treat Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] Jdbc/postgres performance
On 10/18/06, Bucky Jordan [EMAIL PROTECTED] wrote: On 10/17/06, Rohit_Behl [EMAIL PROTECTED] wrote: Select events.event_id, ctrl.real_name, events.tsds, events.value, events.lds, events.correction, ctrl.type, ctrl.freq from table events, iso_midw_control ctrl where events.obj_id = ctrl.obj_id and events.event_id ?::bigint order by events.event_id limit ? After a quick search on the JDBC list, it looks like there's some recent discussion on the subject of how to give the planner better insight for prepared statements (the subject is Blind Message if you're looking...). So, I'm off to go read there and perhaps join the jdbc mailing list too. this is not really a jdbc issue, just a practical problem with prepared statements...except for the mechanism if any the jdbc driver allows you to choose if a statement is prepared. But, a more general postgres question. I assume if I want to turn prepared statements off altogether (say I'm using a jdbc abstraction you turn off prepared statements by not invoking sql prepare or PQprepare. (or, if jdbc implements its own protocol client, it's version of PQprepare). layer that likes parameterized statements, and there's other benefits to parameterizing other than just saving on db parse/plan) can I set max_prepared_transactions to 0? Is there any other option outside of this setting is for 2pc and is not relevent to the discussion :) even if it were, im not so sure about a setting designed to enforce a partcular method of querying. yes, you are correct this is not exactly the use case for hints being discussed in -hackers. however, imho, this is much more important and relevant so long as prepared statements continue to work the way they do. merlin ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] Jdbc/postgres performance
Merlin Moncure [EMAIL PROTECTED] writes: this is not really a jdbc issue, just a practical problem with prepared statements... Specifically, that the OP is running a 7.4 backend, which was our first venture into prepared parameterized statements. PG 8.1 will do better, 8.2 should do better yet. regards, tom lane ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match