date:20050907

Re: [PERFORM] insert performance for win32

2005-09-07 Thread Merlin Moncure

> > One thing I did notice that in a 250k insert transaction the insert
time
> > grows with #recs inserted.  Time to insert first 50k recs is about
27
> > sec and last 50 k recs is 77 sec.  I also confimed that size of
table is
> > not playing a role here.
> >
> > Marc, can you do select timeofday() every 50k recs from linux?  Also
a
> > gprof trace from linux would be helpful.
> >
> 
> Here's the timeofday ... i'll do the gprof as soon as I can.
> Every 5 rows...
> 
Were those all in a single transaction?

Merlin

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster

Re: [PERFORM] insert performance for win32

2005-09-07 Thread Marc Cousin

On Tuesday 06 September 2005 19:11, Merlin Moncure wrote:
> > This makes me wonder if we are looking in the wrong place.  Maybe the
> > problem is coming from psql?  More results to follow.
>
> problem is not coming from psql.
>
> One thing I did notice that in a 250k insert transaction the insert time
> grows with #recs inserted.  Time to insert first 50k recs is about 27
> sec and last 50 k recs is 77 sec.  I also confimed that size of table is
> not playing a role here.
>
> Marc, can you do select timeofday() every 50k recs from linux?  Also a
> gprof trace from linux would be helpful.
>

Here's the timeofday ... i'll do the gprof as soon as I can.
Every 5 rows...

Wed Sep 07 13:58:13.860378 2005 CEST
Wed Sep 07 13:58:20.926983 2005 CEST
Wed Sep 07 13:58:27.928385 2005 CEST
Wed Sep 07 13:58:35.472813 2005 CEST
Wed Sep 07 13:58:42.825709 2005 CEST
Wed Sep 07 13:58:50.789486 2005 CEST
Wed Sep 07 13:58:57.553869 2005 CEST
Wed Sep 07 13:59:04.298136 2005 CEST
Wed Sep 07 13:59:11.066059 2005 CEST
Wed Sep 07 13:59:19.368694 2005 CEST





> Merlin
>
> ---(end of broadcast)---
> TIP 9: In versions below 8.0, the planner will ignore your desire to
>choose an index scan if your joining column's datatypes do not
>match

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly

Re: [PERFORM] insert performance for win32

2005-09-07 Thread Merlin Moncure

> On Tuesday 06 September 2005 19:11, Merlin Moncure wrote:
> Here's the timeofday ... i'll do the gprof as soon as I can.
> Every 5 rows...
> 
> Wed Sep 07 13:58:13.860378 2005 CEST
> Wed Sep 07 13:58:20.926983 2005 CEST
> Wed Sep 07 13:58:27.928385 2005 CEST
> Wed Sep 07 13:58:35.472813 2005 CEST
> Wed Sep 07 13:58:42.825709 2005 CEST
> Wed Sep 07 13:58:50.789486 2005 CEST
> Wed Sep 07 13:58:57.553869 2005 CEST
> Wed Sep 07 13:59:04.298136 2005 CEST
> Wed Sep 07 13:59:11.066059 2005 CEST
> Wed Sep 07 13:59:19.368694 2005 CEST

ok, I've been in crunching profile profile graphs, and so far have been
only been able to draw following conclusions.

For bulk, 'in-transaction' insert:
1. win32 is slower than linux.  win32 time for each insert grows with #
inserts in xact, linux does not (or grows much slower).  Win32 starts
out about 3x slower and grows to 10x slower after 250k inserts.

2. ran a 50k profile vs. 250k profile.  Nothing jumps out as being
slower or faster: most time is spent in yyparse on either side.  From
this my preliminary conclusion is that there is something going on in
the win32 api which is not showing in the profile.

3. The mingw gprof cumulative seconds does not show measurable growth in
cpu time/insert in 50k/250k profile.

I'm now talking suggestions about where to look for performance problems
:(.
Merlin

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings

Re: [pgsql-hackers-win32] [PERFORM] insert performance for win32

2005-09-07 Thread Tom Lane

"Merlin Moncure" <[EMAIL PROTECTED]> writes:
> ok, I've been in crunching profile profile graphs, and so far have been
> only been able to draw following conclusions.

> For bulk, 'in-transaction' insert:
> 1. win32 is slower than linux.  win32 time for each insert grows with #
> inserts in xact, linux does not (or grows much slower).  Win32 starts
> out about 3x slower and grows to 10x slower after 250k inserts.

Just to be clear: what you were testing was
BEGIN;
INSERT ... VALUES (...);
repeat insert many times
COMMIT;
with each statement issued as a separate PQexec() operation, correct?
Was this set up as a psql script, or specialized C code?  (If a psql
script, I wonder whether it's psql that's chewing the time.)

> 2. ran a 50k profile vs. 250k profile.  Nothing jumps out as being
> slower or faster: most time is spent in yyparse on either side.  From
> this my preliminary conclusion is that there is something going on in
> the win32 api which is not showing in the profile.

Hmm.  Client/server data transport maybe?  It would be interesting to
try inserting the same data in other ways:
* COPY from client
* COPY from disk file
* INSERT/SELECT from another table
and see whether you see a similar slowdown.

regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings

[PERFORM] Query take 101 minutes, help, please

2005-09-07 Thread Christian Compagnon

Hello,

I'm a newbie in postgresql, I've installed it on a Windows XP machine
( I can't use linux, it's a company machine ), I'm courious why this
query takes so long

SELECT "Rut Cliente"
FROM "Internet_Abril"
WHERE "Rut Cliente"  NOT IN ((SELECT "Rut Cliente"  FROM
"Internet_Enero") UNION (SELECT "Rut Cliente"  FROM
"Internet_Febrero") UNION (SELECT "Rut Cliente"  FROM
"Internet_Marzo"));

it takes about 100 minutes to complete the query.
All tables has index created ( Rut Cliente is a VarChar ), and tables
has 50.000 records each.

The explain for the query tells the following

"QUERY PLAN
 Seq Scan on "Internet_Abril"  (cost=19406.67..62126112.70 rows=24731 width=13)
  Filter: (NOT (subplan))
   SubPlan
   ->  Materialize  (cost=19406.67..21576.07 rows=136740 width=13)
->  Unique  (cost=17784.23..18467.93 rows=136740 width=13)
->  Sort  (cost=17784.23..18126.08 rows=136740 width=13)
   Sort
Key: "Rut Cliente"
  ->  Append  (cost=0.00..3741.80 rows=136740 width=13)
  ->  Subquery Scan "*SELECT* 1"  (cost=0.00..1233.38
rows=45069 width=13)
   ->  Seq Scan on "Internet_Enero"  (cost=0.00..782.69
rows=45069 width=13)
   ->  Subquery Scan "*SELECT* 2"  (cost=0.00..1104.06
rows=40353 width=13)
 ->  Seq Scan on "Internet_Febrero"  (cost=0.00..700.53
rows=40353 width=13)
   ->  Subquery Scan "*SELECT* 3"  (cost=0.00..1404.36
rows=51318 width=13)
->  Seq Scan on "Internet_Marzo"  (cost=0.00..891.18
rows=51318 width=13)

Any help will be apreciated, It's for my thesis


saludos
Christian

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly

Re: [PERFORM] Query take 101 minutes, help, please

2005-09-07 Thread Steinar H. Gunderson

On Wed, Sep 07, 2005 at 12:22:27PM -0400, Christian Compagnon wrote:
> I'm a newbie in postgresql, I've installed it on a Windows XP machine
> ( I can't use linux, it's a company machine ), I'm courious why this
> query takes so long

It sounds like you've set work_mem too low; increasing it might help. Also
try rewriting your query to

  SELECT "Rut Cliente"
  FROM "Internet_Abril"
  WHERE
"Rut Cliente" NOT IN ( SELECT "Rut Cliente" FROM "Internet_Enero" )
AND "Rut Cliente" NOT IN ( SELECT "Rut Cliente" FROM "Internet_Febrero" )
AND "Rut Cliente" NOT IN ( SELECT "Rut Cliente" FROM "Internet_Marzo" )

(I'm not sure how optimized UNION inside an IN/NOT IN is.)

/* Steinar */
-- 
Homepage: http://www.sesse.net/

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org

Re: [PERFORM] Query take 101 minutes, help, please

2005-09-07 Thread Meetesh Karia

PG is creating the union of January, February and March tables first
and that doesn't have an index on it.  If you're going to do many
queries using the union of those three tables, you might want to place
their contents into one table and create an index on it.

Otherwise, try something like this:

SELECT "Rut Cliente"
FROM "Internet_Abril"
WHERE "Rut Cliente"  NOT IN (SELECT "Rut Cliente"  FROM
"Internet_Enero")
AND "Rut Cliente" NOT IN (SELECT "Rut Cliente"  FROM
"Internet_Febrero")
AND "Rut Cliente" NOT IN (SELECT "Rut Cliente"  FROM
"Internet_Marzo");

You could also compare the performance of that to this and choose the one that works the best:

SELECT "Rut Cliente"

FROM "Internet_Abril" a
LEFT JOIN "Internet_Enero" e ON a."Rut Cliente" = e."Rut Cliente"
LEFT JOIN "Internet_Febrero" f ON a."Rut Cliente" = f."Rut Cliente"
LEFT JOIN "Internet_Marzo" m ON a."Rut Cliente" = m."Rut Cliente"
WHERE e."Rut Cliente" IS NULL AND f."Rut Cliente" IS NULL and m."Rut Cliente" IS NULL;

MeeteshOn 9/7/05, Christian Compagnon <[EMAIL PROTECTED]> wrote:
Hello,I'm a newbie in postgresql, I've installed it on a Windows XP machine( I can't use linux, it's a company machine ), I'm courious why thisquery takes so longSELECT "Rut Cliente"FROM "Internet_Abril"
WHERE "Rut Cliente"  NOT IN ((SELECT "Rut Cliente"  FROM"Internet_Enero") UNION (SELECT "Rut Cliente"  FROM"Internet_Febrero") UNION (SELECT "Rut Cliente"  FROM
"Internet_Marzo"));it takes about 100 minutes to complete the query.All tables has index created ( Rut Cliente is a VarChar ), and tableshas 50.000 records each.The explain for the query tells the following
"QUERY PLAN Seq Scan on "Internet_Abril"  (cost=19406.67..62126112.70 rows=24731 width=13)  Filter: (NOT (subplan))   SubPlan   ->  Materialize  (cost=19406.67..21576.07 rows=136740 width=13)
->  Unique  (cost=17784.23..18467.93
rows=136740 width=13)->  Sort  (cost=17784.23..18126.08 rows=136740 width=13)  
SortKey: "Rut Cliente"  ->  Append  (cost=0.00..3741.80
rows=136740 width=13)  ->  Subquery
Scan "*SELECT* 1"  (cost=0.00..1233.38rows=45069 width=13)  
->  Seq Scan on
"Internet_Enero"  (cost=0.00..782.69rows=45069 width=13)  
->  Subquery Scan "*SELECT*
2"  (cost=0.00..1104.06rows=40353 width=13)
->  Seq Scan on
"Internet_Febrero"  (cost=0.00..700.53rows=40353 width=13)  
->  Subquery Scan "*SELECT*
3"  (cost=0.00..1404.36rows=51318 width=13)->  Seq
Scan on "Internet_Marzo"  (cost=0.00..891.18rows=51318 width=13)Any help will be apreciated, It's for my thesissaludosChristian---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate   subscribe-nomail command to [EMAIL PROTECTED] so that your   message can get through to the mailing list cleanly

Re: [PERFORM] Query take 101 minutes, help, please

2005-09-07 Thread Alex Hayward

On Wed, 7 Sep 2005, Meetesh Karia wrote:

> PG is creating the union of January, February and March tables first and
> that doesn't have an index on it. If you're going to do many queries using
> the union of those three tables, you might want to place their contents into
> one table and create an index on it.
>
> Otherwise, try something like this:
>
> SELECT "Rut Cliente"
> FROM "Internet_Abril"
> WHERE "Rut Cliente" NOT IN (SELECT "Rut Cliente" FROM
> "Internet_Enero")
> AND "Rut Cliente" NOT IN (SELECT "Rut Cliente" FROM
> "Internet_Febrero")
> AND "Rut Cliente" NOT IN (SELECT "Rut Cliente" FROM
> "Internet_Marzo");

You may also wish to try:

SELECT "Rut Cliente"
FROM "Internet_Abril"
WHERE NOT EXISTS
  (SELECT 1 FROM "Internet_Enero"
   WHERE "Internet_Enero"."Rut Cliente"="Internet_Abril"."Rut Cliente")
AND NOT EXISTS
  (SELECT 1 FROM "Internet_Febrero"
   WHERE "Internet_Febrero"."Rut Cliente"="Internet_Abril"."Rut Cliente")
AND NOT EXISTS
  (SELECT 1 FROM "Internet_Marzo"
   WHERE "Internet_Marzo"."Rut Cliente"="Internet_Abril"."Rut Cliente")

which will probably scan the indexes on the January, February and March
indexes once for each row in the April table.


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match

Re: [PERFORM] Query take 101 minutes, help, please

2005-09-07 Thread Tom Lane

"Steinar H. Gunderson" <[EMAIL PROTECTED]> writes:
> (I'm not sure how optimized UNION inside an IN/NOT IN is.)

NOT IN is pretty nonoptimal, period.  It'd help a lot to boost work_mem
to the point where the planner figures it can use a hashtable (look for
EXPLAIN to say "hashed subplan" rather than just "subplan").  Of course,
if there's enough stuff in the UNION that that drives you into swapping,
it's gonna be painful anyway.

Using UNION ALL instead of UNION might save a few cycles too.

If you're willing to rewrite the query wholesale, you could try the old
trick of a LEFT JOIN where you discard rows for which there's a match,
ie, the righthand join value isn't NULL.

regards, tom lane

---(end of broadcast)---
TIP 6: explain analyze is your friend

Re: [PERFORM] Poor performance of delete by primary key

2005-09-07 Thread Jim C. Nasby

On Wed, Sep 07, 2005 at 11:07:04AM +0800, Christopher Kings-Lynne wrote:
> >Unfortunately there's no very simple way to determine which FK is the
> >problem.  (In 8.1 it'll be possible to do that with EXPLAIN ANALYZE,
> >but in existing releases EXPLAIN doesn't break out the time spent in
> >each trigger ...)  You have to just eyeball the schema :-(.
> 
> phpPgAdmin has a handy info feature where you can see all tables that 
> refer to the current one.  You can always go and steal that query to 
> find them...

You can also use pg_user_foreighn_key* from
http://pgfoundry.org/projects/newsysviews/.
-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software  http://pervasive.comwork: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly

Re: [PERFORM] insert performance for win32

Re: [PERFORM] insert performance for win32

Re: [PERFORM] insert performance for win32

Re: [pgsql-hackers-win32] [PERFORM] insert performance for win32

[PERFORM] Query take 101 minutes, help, please

Re: [PERFORM] Query take 101 minutes, help, please

Re: [PERFORM] Query take 101 minutes, help, please

Re: [PERFORM] Query take 101 minutes, help, please

Re: [PERFORM] Query take 101 minutes, help, please

Re: [PERFORM] Poor performance of delete by primary key

10 matches

Site Navigation

Mail list logo

Footer information