Re: [PERFORM] Horribly slow query/ sequential scan
Forget about the IN subquery; it is horribly slow here. Try a correlated EXISTS instead:

select w.appid, w.rate, w.is_subscribed,
       sum(w.hits) AS Hits, sum(w.sius) AS IUs,
       sum(w.total_amnt) AS Total,
       sum(w.hits) * w.rate AS ByHits,
       sum(w.sius) * w.rate AS BYIUS
from bill_rpt_work w
where exists (select 1 from billing_reports b
              where b.report_s_date = '2006-09-30'
                and w.report_id = b.report_id)
  and w.client_id IN ('22741','227410')
group by 1,2,3
order by 1,2,3;

This should be faster, assuming an index on report_id in billing_reports and an index on (report_id, client_id) in bill_rpt_work (a DDL sketch follows below the quoted message). To encourage index usage for the grouping (this depends on the result size), consider extending the bill_rpt_work index with columns 1, 2 and 3.

regards, marcus

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of [EMAIL PROTECTED]
Sent: Tuesday, January 9, 2007 13:36
To: Gregory S. Williamson
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Horribly slow query/ sequential scan

I don't think I understand the idea behind this query. Do you really need billing_reports twice? The query:

explain analyze
select w.appid, w.rate, w.is_subscribed,
       sum(w.hits) AS Hits, sum(w.sius) AS IUs,
       sum(w.total_amnt) AS Total,
       sum(w.hits) * w.rate AS ByHits,
       sum(w.sius) * w.rate AS BYIUS
from bill_rpt_work w, billing_reports b
where w.report_id in
      (select b.report_id from billing_reports
       where b.report_s_date = '2006-09-30')
  and (w.client_id = '22741' or w.client_id = '227410')
group by 1,2,3
order by 1,2,3;

Maybe this is the query you want instead?

select w.appid, w.rate, w.is_subscribed,
       sum(w.hits) AS Hits, sum(w.sius) AS IUs,
       sum(w.total_amnt) AS Total,
       sum(w.hits) * w.rate AS ByHits,
       sum(w.sius) * w.rate AS BYIUS
from bill_rpt_work w
where w.report_id in
      (select b.report_id from billing_reports b
       where b.report_s_date = '2006-09-30')
  and (w.client_id = '22741' or w.client_id = '227410')
group by 1,2,3
order by 1,2,3;

/Dennis
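A minimal DDL sketch of the index suggestion above (the index names are hypothetical; only the tables and columns appear in the thread):

CREATE INDEX billing_reports_report_id_idx
    ON billing_reports (report_id);
CREATE INDEX bill_rpt_work_report_client_idx
    ON bill_rpt_work (report_id, client_id);
ANALYZE billing_reports;
ANALYZE bill_rpt_work;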
Re: [PERFORM] Read only transactions - Commit or Rollback
AFAIK, this should be completely negligible. Starting a transaction implies write access; if there is none, you do not need to think about transactions, because effectively there are none. Postgres needs to schedule the writing transactions against the reading ones anyway. But I am not that much of a performance professional anyway ;-)

regards, Marcus

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Markus Schaber
Sent: Tuesday, December 20, 2005 11:41
To: PostgreSQL Performance List
Subject: [PERFORM] Read only transactions - Commit or Rollback

Hello,

We have a database containing PostGIS map data; it is accessed mainly via JDBC. There are multiple simultaneous read-only connections taken from the JBoss connection pool, and there usually are no active writers. We use connection.setReadOnly(true).

Now my question is what is best performance-wise, if it makes any difference at all:

Having autocommit on or off? (I presume off.)

Using commit or rollback?

Committing / rolling back occasionally (e.g. when returning the connection to the pool) or not at all (until the pool closes the connection)?

Thanks,
Markus

--
Markus Schaber | Logical Tracking & Tracing International AG
Dipl. Inf.     | Software Development GIS

Fight against software patents in EU! www.ffii.org www.nosoftwarepatents.org
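For reference, a minimal sketch of what a read-only transaction looks like at the SQL level (roughly what a JDBC driver issues for connection.setReadOnly(true), though exact driver behavior varies; the table name is hypothetical):

BEGIN;
SET TRANSACTION READ ONLY;          -- statements that modify data are now rejected
SELECT count(*) FROM map_segments;  -- hypothetical read-only work
COMMIT;                             -- nothing was written, so COMMIT and ROLLBACK cost about the same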
Re: [PERFORM] Read only transactions - Commit or Rollback
Mmmm, good question. Under MVCC, writers do not block readers; using autocommit implies that each modification statement is an atomic operation. On a massive read-only table, where no data is altered, MVCC shouldn't cost anything (but this is only an assumption). Based on http://en.wikipedia.org/wiki/Mvcc, row-level locks taken by write access should leave most of the table available to read-only sessions, but this is an assumption only, too. Maybe the community knows a little more ;-)

regards, marcus

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Grega Bremec
Sent: Tuesday, December 20, 2005 12:41
To: PostgreSQL Performance List
Subject: Re: [PERFORM] Read only transactions - Commit or Rollback

Nörder-Tuitje wrote:
| We have a database containing PostGIS map data, it is accessed
| mainly via JDBC. There are multiple simultaneous read-only
| connections taken from the JBoss connection pooling, and there
| usually are no active writers. We use connection.setReadOnly(true).
|
| Now my question is what is best performance-wise, if it does make
| any difference at all:
|
| Having autocommit on or off? (I presume off)
|
| Using commit or rollback?
|
| Committing / rolling back occasionally (e.g. when returning the
| connection to the pool) or not at all (until the pool closes the
| connection)?
|
| afaik, this should be completely negligible.
|
| starting a transaction implies write access. if there is none, You do
| not need to think about transactions, because there are none.
|
| postgres needs to schedule the writing transactions with the reading
| ones, anyway.
|
| But I am not that performance professional anyway ;-)

Hello, Marcus, Nörder, list.

What about isolation? For several dependent calculations, you don't get a single consistent MVCC snapshot with autocommit turned on, right?

Cheers,
--
~Grega Bremec
~gregab at p0f dot net
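A sketch of the snapshot point raised above: wrapping several dependent read-only queries in one transaction at a stricter isolation level gives them all the same snapshot (in the PostgreSQL 8.x of this era, SERIALIZABLE meant snapshot isolation; the table and column names are hypothetical):

BEGIN ISOLATION LEVEL SERIALIZABLE;
SELECT sum(seg_length) FROM map_segments;  -- dependent calculation #1
SELECT count(*)        FROM map_segments;  -- #2 sees exactly the same snapshot
COMMIT;

With autocommit on, each statement runs in its own transaction and may see a different snapshot if writers commit in between.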
[PERFORM] INSERTs becoming slower and slower
Hi,

I am breaking up huge texts (between 25K and 250K words) into single words using PL/pgSQL. For this I am using a temp table in the first step:

LOOP
    vLeft := vRight;
    vTmp  := vLeft;

    LOOP
        vChr := SUBSTRING(pText FROM vTmp FOR 1);
        vTmp := vTmp + 1;
        EXIT WHEN (vChr = ' ' OR vChr IS NULL OR vTmp = cBorder);
    END LOOP;

    vRight := vTmp;
    vLit   := SUBSTRING(pText FROM vLeft FOR (vRight - vLeft - 1));

    IF (LENGTH(vLit) > 0) THEN
        WRDCNT := WRDCNT + 1;
        INSERT INTO DEX_TEMPDOC (TMP_DOO_ID, TMP_SEQ_ID, TMP_RAWTEXT)
        VALUES (pDOO_ID, I, vLit);
    END IF;

    I    := I + 1;
    vTmp := LENGTH(vLit);

    IF ((WRDCNT % 100) = 0) THEN
        PROGRESS := ROUND((100 * I) / DOCLEN, 0);
        RAISE NOTICE '[PROC] % WORDS -- LAST LIT % (Len %) [% PCT / % of %]',
                     WRDCNT, vLit, vTmp, PROGRESS, I, DOCLEN;
    END IF;

    EXIT WHEN vRight = cBorder;
END LOOP;

The document is preprocessed; between any two words there is only a single blank.

My problem is: the first 25K words are quite quick, but the INSERTs become slower and slower. Starting at about 1K words per second, I end up at 100 words per 10 seconds once I reach 80K-100K words. The only (non-unique) index on the temp table is on TMP_RAWTEXT.

What can I do? Should I drop the index?

Here is my config:

shared_buffers = 2000          # min 16, at least max_connections*2, 8KB each
work_mem = 32768               # min 64, size in KB
maintenance_work_mem = 16384   # min 1024, size in KB
max_stack_depth = 8192         # min 100, size in KB
enable_hashagg = true
enable_hashjoin = true
enable_indexscan = true
enable_mergejoin = true
enable_nestloop = true
enable_seqscan = false

The machine is a Xeon 3 GHz, 1 GB RAM, SATA RAID 1 array, running 8.0.4 i686 precompiled.

Thanks!

Kind regards,

Dipl. Inform. Marcus Noerder-Tuitje
Developer
software technology AG
Kortumstraße 16
44787 Bochum
Tel: 0234 / 52 99 6 26
Fax: 0234 / 52 99 6 22
E-Mail: [EMAIL PROTECTED]
Internet: www.technology.de
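One common remedy for exactly this pattern, sketched below: drop the non-unique index for the duration of the bulk load and rebuild it afterwards, so each INSERT no longer has to maintain a steadily growing btree. The index name is hypothetical; the table and column come from the post:

-- before the bulk load (hypothetical index name)
DROP INDEX dex_tempdoc_rawtext_idx;

-- ... run the word-splitting function here ...

-- rebuild in one pass and refresh planner statistics
CREATE INDEX dex_tempdoc_rawtext_idx ON dex_tempdoc (tmp_rawtext);
ANALYZE dex_tempdoc;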
Re: [PERFORM] (View and SQL) VS plpgsql
<snip>
FOR r_record IN
    SELECT count(cid) AS hits, src, bid, tid,
           NULL::int8 AS min_time, NULL::int8 AS max_time
    FROM archive_event
    WHERE inst = \'3\'
      AND (utctime BETWEEN \'111492\' AND \'1131512399\')
    GROUP BY src, bid, tid
LOOP
    SELECT INTO one_record
           MIN(utctime) AS timestart, MAX(utctime) AS timestop
    FROM archive_event
    WHERE src = r_record.src
      AND bid = r_record.bid
      AND tid = r_record.tid
      AND inst = \'3\'
      AND (utctime BETWEEN \'111492\' AND \'1131512399\');
</snip>

(It seems to me that you could combine both queries.)

1. Have you ever tried to select the min/max within the first statement? As I see it, you are reducing data in the second statement using the same key as in statement 1. (A combined-query sketch follows below the quoted message.)

2. You are querying data using two keys (inst, utctime). You could create a combined index to speed up your query.

3. The same goes for the grouping: you are grouping over three fields, so a composite index may help (8.1 supports index-based grouping).

regards, marcus

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Eric Lauzon
Sent: Wednesday, November 9, 2005 21:43
To: pgsql-performance@postgresql.org
Subject: [PERFORM] (View and SQL) VS plpgsql

Hello all,

I post this question here because I wasn't able to find an answer elsewhere; I hope someone can help.

Abstract:

The function at the end of this e-mail does two things. First it fills a record set with the needed columns from a table, plus two empty result columns, a min and a max. Those two columns are then filled by a second query on the same table that takes a MIN and a MAX over the index idx_utctime. The function loops over the first record set and returns a SETOF record that is cast by the caller.

The goal is to let the application that receives the result set minimize its work: it no longer has to merge two matching rowsets internally. We used to handle two result sets, but I am looking at improving performance, and at first glance this approach seems to speed up the process.

Questions:

1. How could this be done in a single combination of SQL and a view?

2. In a case like this, does plpgsql really add significant overhead?

3. What is the performance difference? [I would need a working pure-SQL version to compare planner and EXPLAIN results.]

STUFF:

-- TABLE
CREATE TABLE archive_event
(
    inst    int4 NOT NULL,
    cid     int8 NOT NULL,
    src     int8 NOT NULL,
    dst     int8 NOT NULL,
    bid     int8 NOT NULL,
    tid     int4 NOT NULL,
    utctime int4 NOT NULL,
    CONSTRAINT ids_archives_event_pkey PRIMARY KEY (inst, cid),
    CONSTRAINT ids_archives_event_cid_index UNIQUE (cid)
);

-- INDEXES
CREATE INDEX idx_archive_utctime ON archive_event USING btree (utctime);
CREATE INDEX idx_archive_src ON archive_event USING btree (src);
CREATE INDEX idx_archive_bid_tid ON archive_event USING btree (tid, bid);

-- FUNCTION
CREATE OR REPLACE FUNCTION console_get_source_rule_level_1() RETURNS SETOF RECORD AS '
DECLARE
    one_record record;
    r_record   record;
BEGIN
    FOR r_record IN
        SELECT count(cid) AS hits, src, bid, tid,
               NULL::int8 AS min_time, NULL::int8 AS max_time
        FROM archive_event
        WHERE inst = \'3\'
          AND (utctime BETWEEN \'111492\' AND \'1131512399\')
        GROUP BY src, bid, tid
    LOOP
        SELECT INTO one_record
               MIN(utctime) AS timestart, MAX(utctime) AS timestop
        FROM archive_event
        WHERE src = r_record.src
          AND bid = r_record.bid
          AND tid = r_record.tid
          AND inst = \'3\'
          AND (utctime BETWEEN \'111492\' AND \'1131512399\');

        r_record.min_time := one_record.timestart;
        r_record.max_time := one_record.timestop;
        RETURN NEXT r_record;
    END LOOP;
    RETURN;
END;
' LANGUAGE 'plpgsql' VOLATILE;

GRANT EXECUTE ON FUNCTION console_get_source_rule_level_1() TO console WITH GRANT OPTION;

-- FUNCTION CALLER
SELECT * FROM console_get_source_rule_level_1()
    AS (hits int8, src int8, bid int8, tid int4, min_time int8, max_time int8);

Eric Lauzon
[Recherche & Développement]
Above Sécurité / Above Security
Tél : (450) 430-8166
Fax : (450) 430-1858
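A sketch of suggestion 1 above: compute the min/max in the same pass as the grouping, which makes the inner loop (and with it the whole function) unnecessary. The literals are taken verbatim from the post:

SELECT count(cid) AS hits, src, bid, tid,
       MIN(utctime) AS min_time,
       MAX(utctime) AS max_time
FROM archive_event
WHERE inst = '3'
  AND utctime BETWEEN '111492' AND '1131512399'
GROUP BY src, bid, tid;

Being plain SQL, this could also be wrapped in a view, which answers question 1 and gives the planner a single statement to optimize.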
Re: [PERFORM] Inefficient escape codes.
I guess you should check whether a blob field and large object access are suitable for you: no escaping etc., just raw binary large objects. AFAIK, PQexecParams is not the right solution for you. Refer to the Large Objects section of the documentation:

28.3.5. Writing Data to a Large Object

The function

    int lo_write(PGconn *conn, int fd, const char *buf, size_t len);

writes len bytes from buf to large object descriptor fd. The fd argument must have been returned by a previous lo_open. The number of bytes actually written is returned. In the event of an error, the return value is negative.

regards, Marcus

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Michael Fuhr
Sent: Tuesday, October 18, 2005 22:47
To: Rodrigo Madera
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Inefficient escape codes.

[Please copy the mailing list on replies so others can participate in and learn from the discussion.]

On Tue, Oct 18, 2005 at 07:09:08PM +, Rodrigo Madera wrote:

> > What language and API are you using?
>
> I'm using libpqxx. A nice STL-style library for C++ (I am 101% C++).

I've only dabbled with libpqxx; I don't know if or how you can make it send data in binary instead of text. See the documentation or ask in a mailing list like libpqxx-general or pgsql-interfaces.

> > Binary transfer sends data in binary, not by automatically
> > converting to and from text.
>
> Uh, I'm sorry I didn't get that... If I send:
>
>     insert into foo values('\\001\\002')
>
> will libpq send 0x01, 0x02 or 001002??

If you do it that way libpq will send the string as text with escape sequences; you can use a sniffer like tcpdump or ethereal to see this for yourself. To send the data in binary you'd call PQexecParams() with a query like INSERT INTO foo VALUES ($1). The $1 is a placeholder; the other arguments to PQexecParams() provide the data itself, the data type and length, and specify whether the data is in text format or binary. See the libpq documentation for details.

--
Michael Fuhr
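To see what the large-object route looks like at the SQL level, here is a minimal server-side sketch using lo_import/lo_export (the table name and file paths are hypothetical, and server-side file access requires appropriate privileges; client code would instead use lo_open/lo_write as quoted above):

-- a table that stores only the large object's OID
CREATE TABLE foo_blobs (id serial PRIMARY KEY, data_oid oid);

-- import a server-side file into a new large object; no escaping involved
INSERT INTO foo_blobs (data_oid) VALUES (lo_import('/tmp/payload.bin'));

-- write the object back out to the server filesystem
SELECT lo_export(data_oid, '/tmp/payload_copy.bin') FROM foo_blobs WHERE id = 1;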
Re: [PERFORM] Deleting Records
What about firing a

DELETE FROM qc_session s
WHERE EXISTS (SELECT 1 FROM item_qc_doer i WHERE i.item_id = s.item_id);

followed by

DELETE FROM item_qc_doer;

(the loop empties item_qc_doer completely, so no WHERE clause is needed there). This might be faster.

Another way to speed up deletes might be disabling foreign keys. Also, a SET ENABLE_SEQSCAN=FALSE; can speed up queries (it forces the use of indexes for access).

Do you have an EXPLAIN for us? Do you have an index on item_id on your tables?

Questions upon questions ;-)

Kind regards

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Christian Paul B. Cosinas
Sent: Thursday, October 20, 2005 10:44
To: pgsql-performance@postgresql.org
Subject: [PERFORM] Deleting Records

Hi!

I'm experiencing very slow deletion of records, which I think is not right. I have a dual Xeon server with 6 GB of memory.

I am only deleting about 22,000 records, but it took more than an hour to finish. What can I do to make this fast?

Here is the code inside my function:

FOR temp_rec IN SELECT * FROM item_qc_doer LOOP
    DELETE FROM qc_session WHERE item_id = temp_rec.item_id;
    DELETE FROM item_qc_doer WHERE item_id = temp_rec.item_id;
END LOOP;

The item_qc_doer table contains 22,000 records.
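On the index question above, a small sketch: if qc_session.item_id is not indexed, every DELETE in the loop has to scan the whole table, and 22,000 iterations of that add up quickly. The index names are hypothetical:

CREATE INDEX qc_session_item_id_idx ON qc_session (item_id);
CREATE INDEX item_qc_doer_item_id_idx ON item_qc_doer (item_id);
ANALYZE qc_session;
ANALYZE item_qc_doer;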
Re: [PERFORM] Optimizer misconfigured ?
Hi,

meanwhile I have received the hint to make Postgres use the index via

SET ENABLE_SEQSCAN=FALSE;

which fits perfectly. The execution plan now indicates full use of the index. Nevertheless, this is merely a workaround; maybe the I/O costs are configured too cheap.

thanks :-)

-----Original Message-----
From: Richard Huxton [mailto:[EMAIL PROTECTED]
Sent: Thursday, October 13, 2005 12:22
To: Nörder-Tuitje, Marcus
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Optimizer misconfigured ?

Nörder-Tuitje wrote:
> Hello,
>
> I have a strange effect on the upcoming structure:

People will be wanting the output of EXPLAIN ANALYSE on that query. They'll also ask whether you've VACUUMed, ANALYSEd and configured your postgresql.conf correctly.

--
Richard Huxton
Archonet Ltd
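A gentler alternative to disabling seqscans globally, sketched here: lower the planner's estimate of random I/O relative to sequential I/O and re-check the plan. The PostgreSQL default for random_page_cost is 4.0; the 2.0 below is only illustrative, and the query is a placeholder:

-- per-session experiment (default random_page_cost is 4.0)
SET random_page_cost = 2.0;
EXPLAIN ANALYZE SELECT * FROM my_table WHERE my_col = 42;  -- re-run the problem query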