Re: [PERFORM] Horribly slow query/ sequential scan

2007-01-09 Thread Nörder-Tuitje , Marcus
Forget about IN. It's horribly slow.

try :

select w.appid,
   w.rate,
   w.is_subscribed,
   sum(w.hits) AS Hits,
   sum(w.sius) AS IUs,
   sum(w.total_amnt) AS Total,
   sum(w.hits) * w.rate AS ByHits,
   sum(w.sius) * w.rate AS BYIUS
  from bill_rpt_work w
 where exists (select 1
                 from billing_reports b
                where b.report_s_date = '2006-09-30'
                  and w.report_id = b.report_id)
   and w.client_id IN ('22741','227410')
group by 1,2,3
order by 1,2,3;



should be faster;

assuming: index on report_id in b; index on report_id, client_id in w

to enforce usage of indexes on grouping (depends on result size), consider
extending the index on w with the grouping columns (1, 2, 3).
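
For example, something along these lines (just a sketch; the index names are made up):

  CREATE INDEX billing_reports_report_id_idx
            ON billing_reports (report_id);

  CREATE INDEX bill_rpt_work_report_client_idx
            ON bill_rpt_work (report_id, client_id);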


regards, 
marcus

-----Original Message-----
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED]] On Behalf Of
[EMAIL PROTECTED]
Sent: Tuesday, 9 January 2007 13:36
To: Gregory S. Williamson
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Horribly slow query/ sequential scan


I don't think I understand the idea behind this query. Do you really need
billing_reports twice?

 The query:
 explain analyze select
 w.appid,w.rate,w.is_subscribed,sum(w.hits) AS Hits ,sum(w.sius) AS IUs,
 sum(w.total_amnt) AS Total,sum(w.hits) * w.rate AS ByHits,
 sum(w.sius) * w.rate AS BYIUS
 from bill_rpt_work w, billing_reports b
 where w.report_id in
 (select b.report_id from billing_reports where b.report_s_date =
 '2006-09-30')
 and (w.client_id = '22741' or w.client_id = '227410')
 group by 1,2,3
 order by 1,2,3;

Maybe this is the query you want instead?

select w.appid,
   w.rate,
   w.is_subscribed,
   sum(w.hits) AS Hits,
   sum(w.sius) AS IUs,
   sum(w.total_amnt) AS Total,
   sum(w.hits) * w.rate AS ByHits,
   sum(w.sius) * w.rate AS BYIUS
  from bill_rpt_work w
 where w.report_id in
   (select b.report_id from billing_reports b where b.report_s_date =
'2006-09-30')
   and (w.client_id = '22741' or w.client_id = '227410')
group by 1,2,3
order by 1,2,3;

/Dennis




Re: [PERFORM] Read only transactions - Commit or Rollback

2005-12-20 Thread Nörder-Tuitje , Marcus
AFAIK, this should be completely negligible.

Starting a transaction implies write access. If there is none, you do not need
to think about transactions, because there are none.

postgres needs to schedule the writing transactions with the reading ones, 
anyway.

But I am not that much of a performance professional anyway ;-)


regards,
Marcus

-----Original Message-----
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED]] On Behalf Of Markus
Schaber
Sent: Tuesday, 20 December 2005 11:41
To: PostgreSQL Performance List
Subject: [PERFORM] Read only transactions - Commit or Rollback


Hello,

We have a database containing PostGIS MAP data, it is accessed mainly
via JDBC. There are multiple simultaneous read-only connections taken
from the JBoss connection pooling, and there usually are no active
writers. We use connection.setReadOnly(true).

Now my question is what is best performance-wise, if it does make any
difference at all:

Having autocommit on or off? (I presume off)

Using commit or rollback?

Committing / rolling back occasionally (e.g. when returning the
connection to the pool) or not at all (until the pool closes the
connection)?

Thanks,
Markus

-- 
Markus Schaber | Logical Tracking & Tracing International AG
Dipl. Inf. | Software Development GIS

Fight against software patents in EU! www.ffii.org www.nosoftwarepatents.org



Re: [PERFORM] Read only transactions - Commit or Rollback

2005-12-20 Thread Nörder-Tuitje , Marcus
Mmmm, good question.

MVCC blocks reading processes when data is modified. Using autocommit implies
that each modification statement is an atomic operation.

On a massive read-only table, where no data is altered, MVCC shouldn't have any
effect (but this is only an assumption), based on

http://en.wikipedia.org/wiki/Mvcc

Using row-level locks for write access should keep most of the table available
to read-only sessions, but this is an assumption only, too.

maybe the community knows a little more ;-)

regards,
marcus


-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Grega Bremec
Sent: Tuesday, 20 December 2005 12:41
To: PostgreSQL Performance List
Subject: Re: [PERFORM] Read only transactions - Commit or Rollback



Nörder-Tuitje wrote:
| We have a database containing PostGIS MAP data, it is accessed
| mainly via JDBC. There are multiple simultaneous read-only
| connections taken from the JBoss connection pooling, and there
| usually are no active writers. We use connection.setReadOnly(true).
|
| Now my question is what is best performance-wise, if it does make
| any difference at all:
|
| Having autocommit on or off? (I presume off)
|
| Using commit or rollback?
|
| Committing / rolling back occasionally (e. G. when returning the
| connection to the pool) or not at all (until the pool closes the
| connection)?
|
| afaik, this should be completely neglectable.
|
| starting a transaction implies write access. if there is none, You do
| not need to think about transactions, because there are none.
|
| postgres needs to schedule the writing transactions with the reading
| ones, anyway.
|
| But I am not that performance profession anyway ;-)

Hello, Marcus, Nörder, list.

What about isolation? For several dependent calculations, MVCC doesn't
happen a bit with autocommit turned on, right?

Cheers,
--
~Grega Bremec
~gregab at p0f dot net



[PERFORM] INSERTs becoming slower and slower

2005-12-08 Thread Nörder-Tuitje , Marcus

Hi,


I am breaking up huge texts (between 25K and 250K words) into single words using PL/pgSQL.


For this I am using a temp table in the first step :


 LOOP
   vLeft := vRight;
   vTmp  := vLeft;

   LOOP
     vChr := SUBSTRING(pText FROM vTmp FOR 1);
     vTmp := vTmp + 1;
     EXIT WHEN (vChr = ' ' OR vChr IS NULL OR vTmp = cBorder);
   END LOOP;

   vRight := vTmp;

   vLit := SUBSTRING(pText FROM vLeft FOR (vRight - vLeft - 1));

   IF (LENGTH(vLit) > 0) THEN
     WRDCNT := WRDCNT + 1;
     INSERT INTO DEX_TEMPDOC(TMP_DOO_ID, TMP_SEQ_ID, TMP_RAWTEXT)
     VALUES (pDOO_ID, I, vLIT);
   END IF;

   I := I + 1;
   vTmp := LENGTH(vLIT);

   IF ((WRDCNT % 100) = 0) THEN
     PROGRESS := ROUND((100 * I) / DOCLEN, 0);
     RAISE NOTICE '[PROC] % WORDS -- LAST LIT % (Len %) [% PCT / % of %]',
                  WRDCNT, vLIT, vTMP, PROGRESS, I, DOCLEN;
   END IF;

   EXIT WHEN vRight = cBorder;
 END LOOP;



The doc is preprocessed; between the words there is only a single blank.


My problem is: the first 25K words are quite quick, but the inserts become slower and slower. Starting with 1K words per second, I end up with 100 words in 10 seconds (when I reach 80K-100K words).

The only (nonunique) index on tempdoc is on RAWTEXT.


What can I do? Should I drop the index?
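
If dropping it helps, the usual pattern would be something like this (just a sketch; the index name is made up, since it is not given above):

  DROP INDEX dex_tempdoc_rawtext_idx;   -- hypothetical name
  -- ... run the bulk of the INSERTs ...
  CREATE INDEX dex_tempdoc_rawtext_idx ON dex_tempdoc (tmp_rawtext);
  ANALYZE dex_tempdoc;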


Here is my config:


shared_buffers = 2000          # min 16, at least max_connections*2, 8KB each
work_mem = 32768               # min 64, size in KB
maintenance_work_mem = 16384   # min 1024, size in KB
max_stack_depth = 8192         # min 100, size in KB

enable_hashagg = true
enable_hashjoin = true
enable_indexscan = true
enable_mergejoin = true
enable_nestloop = true
enable_seqscan = false


The machine is a XEON 3GHz, 1GB RAM, SATA RAID 1 Array running 8.0.4 i686 precompiled



Thanks !




Kind regards,
Dipl. Inform. Marcus Noerder-Tuitje
Developer

software technology AG
Kortumstraße 16
44787 Bochum
Tel: 0234 / 52 99 6 26
Fax: 0234 / 52 99 6 22
E-Mail: [EMAIL PROTECTED]
Internet: www.technology.de






Re: [PERFORM] (View and SQL) VS plpgsql

2005-11-10 Thread Nörder-Tuitje , Marcus



<snip>
FOR r_record IN SELECT count(cid) AS hits,src, bid, tid,NULL::int8 as 
min_time,NULL::int8 as max_time FROM archive_event WHERE inst=\'3\' AND 
(utctime BETWEEN \'111492\' AND \'1131512399\') GROUP BY src, bid, tid LOOP

SELECT INTO one_record MIN(utctime) as timestart,MAX(utctime) as 
timestop from archive_event  where src =r_record.src AND bid =r_record.bid  AND 
tid = r_record.tid AND inst =\'3\' AND (utctime BETWEEN \'111492\' AND 
\'1131512399\');
</snip>


(it seems to me that you might combine both queries)

1. Have you ever tried to select the min/max within the first statement? As I
see it, you are reducing data in the second statement using the same key as in
statement 1 (see the sketch below).
2. You are querying data using two keys (inst, utctime). You may create a
combined index to speed up your query.
3. Same for grouping: you are grouping over three fields. A composite index may
help (8.1 supports index-based grouping).
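
A combined statement might look roughly like this (a sketch built from the two
statements above; the index name is made up):

  SELECT count(cid)   AS hits,
         src, bid, tid,
         MIN(utctime) AS min_time,
         MAX(utctime) AS max_time
    FROM archive_event
   WHERE inst = '3'
     AND (utctime BETWEEN '111492' AND '1131512399')
   GROUP BY src, bid, tid;

  -- for points 2 and 3:
  CREATE INDEX idx_archive_inst_utctime ON archive_event (inst, utctime);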

regards,

marcus





-----Original Message-----
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED]] On Behalf Of Eric
Lauzon
Sent: Wednesday, 9 November 2005 21:43
To: pgsql-performance@postgresql.org
Subject: [PERFORM] (View and SQL) VS plpgsql



Hello all, I post this question here because I wasn't able to find an answer
to my question elsewhere; I hope someone can answer.


Abstract:

The function that can be found at the end of the e-mail emulates two things.

First it fills a record set with the needed columns from a table, plus two
empty result columns, a min and a max.

Those two columns are then filled by a second query on the same table that
computes a min and a max over the index idx_utctime.

The function loops over the first recordset and returns a SETOF record that is
cast by the caller of the function.

The goal of this is to enable the application that receives the result set to
minimise its work by not having to group two matching rowsets internally. We
used to handle two resultsets, but I am looking toward improving performance,
and at first glance this seems to speed up the process.


Questions:

1. How could this be done in a single combination of SQL and a view?

2. In a case like that, is plpgsql really giving significant overhead?

3. Performance difference [I would need a working pure-SQL version to compare
planner and EXPLAIN results]

STUFF:

--TABLE & INDEX


CREATE TABLE archive_event
(
  inst int4 NOT NULL,
  cid int8 NOT NULL,
  src int8 NOT NULL,
  dst int8 NOT NULL,
  bid int8 NOT NULL,
  tid int4 NOT NULL,
  utctime int4 NOT NULL,
  CONSTRAINT ids_archives_event_pkey PRIMARY KEY (inst, cid),
  CONSTRAINT ids_archives_event_cid_index UNIQUE (cid)
) 

--index

CREATE INDEX idx_archive_utctime
  ON archive_event
  USING btree
  (utctime);

CREATE INDEX idx_archive_src
  ON archive_event
  USING btree
  (src);

CREATE INDEX idx_archive_bid_tid
  ON archive_event
  USING btree
  (tid, bid);




--FUNCTION
CREATE OR REPLACE FUNCTION console_get_source_rule_level_1()
  RETURNS SETOF RECORD AS
'
DECLARE

one_record record;
r_record record;

BEGIN

FOR r_record IN SELECT count(cid) AS hits,src, bid, tid,NULL::int8 as 
min_time,NULL::int8 as max_time FROM archive_event WHERE inst=\'3\' AND 
(utctime BETWEEN \'111492\' AND \'1131512399\') GROUP BY src, bid, tid LOOP

SELECT INTO one_record MIN(utctime) as timestart,MAX(utctime) as 
timestop from archive_event  where src =r_record.src AND bid =r_record.bid  AND 
tid = r_record.tid AND inst =\'3\' AND (utctime BETWEEN \'111492\' AND 
\'1131512399\');

r_record.min_time := one_record.timestart;
r_record.max_time := one_record.timestop;
  
RETURN NEXT r_record;

END LOOP;

RETURN;

END;
'
  LANGUAGE 'plpgsql' VOLATILE;
GRANT EXECUTE ON FUNCTION console_get_source_rule_level_1() TO console WITH 
GRANT OPTION;


--FUNCTION CALLER
SELECT * from get_source_rule_level_1() AS (hits int8,src int8,bid int8,tid 
int4,min_time int8,max_time int8)



Eric Lauzon
[Research & Development]
Above Sécurité / Above Security
Tél  : (450) 430-8166
Fax : (450) 430-1858 



Re: [PERFORM] Inefficient escape codes.

2005-10-20 Thread Nörder-Tuitje , Marcus
I guess you should check whether a blob field and large object access are suitable
for you - no escaping etc., just raw binary large objects.

AFAIK, PQexecParams is not the right solution for you. Refer to the large
objects section:

28.3.5. Writing Data to a Large Object

The function

  int lo_write(PGconn *conn, int fd, const char *buf, size_t len);

writes len bytes from buf to large object descriptor fd. The fd argument must
have been returned by a previous lo_open. The number of bytes actually written
is returned. In the event of an error, the return value is negative.
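
For a quick server-side test, the large object interface is also exposed as SQL
functions (a minimal sketch; the table, column and file path are made up, and
lo_import reads the file on the server, not on the client):

  CREATE TABLE blobs (id serial PRIMARY KEY, content oid);
  INSERT INTO blobs (content) VALUES (lo_import('/tmp/sample.bin'));
  -- SELECT lo_unlink(content) FROM blobs WHERE id = 1;  -- frees the large object again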

regards,
Marcus

-----Original Message-----
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED]] On Behalf Of Michael
Fuhr
Sent: Tuesday, 18 October 2005 22:47
To: Rodrigo Madera
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Inefficient escape codes.


[Please copy the mailing list on replies so others can participate
in and learn from the discussion.]

On Tue, Oct 18, 2005 at 07:09:08PM +, Rodrigo Madera wrote:
> > What language and API are you using?
>
> I'm using libpqxx. A nice STL-style library for C++ (I am 101% C++).

I've only dabbled with libpqxx; I don't know if or how you can make
it send data in binary instead of text.  See the documentation or
ask in a mailing list like libpqxx-general or pgsql-interfaces.

> > Binary transfer sends data in binary, not by automatically converting
> > to and from text.
>
> Uh, I'm sorry I didn't get that... If I send: insert into foo
> values('\\001\\002') will libpq send 0x01, 0x02 or \\001\\002??

If you do it that way libpq will send the string as text with escape
sequences; you can use a sniffer like tcpdump or ethereal to see this
for yourself.  To send the data in binary you'd call PQexecParams()
with a query like INSERT INTO foo VALUES ($1).  The $1 is a
placeholder; the other arguments to PQexecParams() provide the data
itself, the data type and length, and specify whether the data is in
text format or binary.  See the libpq documentation for details.
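
The placeholder idea can also be tried at the SQL level with a prepared
statement (a sketch only; it assumes foo has a single bytea column, and unlike
binary PQexecParams parameters it still goes through text parsing):

  PREPARE ins_foo (bytea) AS INSERT INTO foo VALUES ($1);
  EXECUTE ins_foo (decode('0102', 'hex'));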

-- 
Michael Fuhr



Re: [PERFORM] Deleting Records

2005-10-20 Thread Nörder-Tuitje , Marcus

what about firing a 

DELETE FROM qc_session
 WHERE EXISTS (SELECT *
                 FROM item_qc_doer i
                WHERE i.item_id = qc_session.item_id);

and

DELETE FROM item_qc_doer
 WHERE EXISTS (SELECT *
                 FROM item_qc_doer i
                WHERE i.item_id = item_qc_doer.item_id);


this might be faster.

Another way to speed up deletes might be disabling foreign keys.

Also, a SET ENABLE_SEQSCAN=FALSE; can speed up queries (forcing use of indices
for access).


Do you have an EXPLAIN for us? Do you have an index on item_id on your tables?

questions by questions ;-)
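
If the index is missing, something like this might help (a sketch; the index
names are made up):

  CREATE INDEX idx_qc_session_item_id   ON qc_session (item_id);
  CREATE INDEX idx_item_qc_doer_item_id ON item_qc_doer (item_id);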



Kind regards,


-----Original Message-----
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED]] On Behalf Of Christian
Paul B. Cosinas
Sent: Thursday, 20 October 2005 10:44
To: pgsql-performance@postgresql.org
Subject: [PERFORM] Deleting Records


Hi!

I'm experiencing a very slow deletion of records, which I think is not right.
I have a Dual Xeon Server with 6gig Memory.
I am only deleting about 22,000 records but it took me more than 1 hour to
finish this.

What could possibly I do so that I can make this fast?

Here is the code inside my function:

FOR temp_rec IN SELECT * FROM item_qc_doer LOOP
DELETE FROM qc_session WHERE item_id = temp_rec.item_id;
DELETE FROM item_qc_doer WHERE item_id = temp_rec.item_id;
END LOOP;

The item_qc_doer table contains 22,000 records.






Re: [PERFORM] Optimizer misconfigured ?

2005-10-14 Thread Nörder-Tuitje , Marcus
Hi,

meanwhile I have received the hint to make postgres use the index via

SET ENABLE_SEQSCAN=FALSE;

which fits perfectly. The execution plan now indicates full use of the index.

Nevertheless, this is merely a workaround. Maybe the I/O costs are configured
too cheap.
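
A less drastic knob than disabling seqscans entirely would be lowering the
planner's estimate for random I/O (a sketch; the right value depends on the
hardware and needs testing):

  SET random_page_cost = 2.0;   -- default is 4.0; lower values make index scans look cheaper
  -- or set it permanently in postgresql.conf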

thanks :-)


-----Original Message-----
From: Richard Huxton [mailto:[EMAIL PROTECTED]]
Sent: Thursday, 13 October 2005 12:22
To: Nörder-Tuitje, Marcus
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Optimizer misconfigured ?


Nörder-Tuitje wrote:
 
> Hello,
>
> I have a strange effect on the following structure:

People will be wanting the output of EXPLAIN ANALYSE on that query.

They'll also ask whether you've VACUUMed, ANALYSEd and configured your 
postgresql.conf correctly.

-- 
   Richard Huxton
   Archonet Ltd




