Re: [HACKERS] [PERFORM] insert performance for win32

2005-11-06 Thread Simon Riggs
On Fri, 2005-11-04 at 13:21 -0500, Bruce Momjian wrote:
 David Fetter wrote:
  On Fri, Nov 04, 2005 at 01:01:20PM -0500, Tom Lane wrote:
   I'm inclined to treat this as an outright bug, not just a minor
   performance issue, because it implies that a sufficiently long psql
   script would probably crash a Windows machine.
  
  Ouch.  In light of this, are we *sure* what we've got a is a candidate
  for release?
 
 Good point.  It is something we would fix in a minor release, so it
 doesn't seem worth doing another RC just for that.

Will this be documented in the release notes? If we put unimplemented
features in TODO, where do we list things we regard as bugs?

Best Regards, Simon Riggs


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] insert performance for win32

2005-11-04 Thread Merlin Moncure
 You mean something like the attached?
not quite:  attached is a file to generate test.
to do it:

psql yadda
\i timeit.sql
\t
\o dump.sql
select make_dump(5, false);
\q
cat dump.sql  | psql -q yadda

and see what pops out.  I had to do it that way because redirecting psql
to dump file caused psql sit forever waiting on more with cpu load...

Merlin


timeit.sql
Description: timeit.sql

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] insert performance for win32

2005-11-04 Thread Merlin Moncure
  You mean something like the attached?

oh, btw I ran timeit.c and performance is flat and fairly fast.  I'm
pretty sure psql is the culprit here.

Merlin

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] insert performance for win32

2005-11-04 Thread Tom Lane
Merlin Moncure [EMAIL PROTECTED] writes:
 You mean something like the attached?

 not quite:  attached is a file to generate test.

 cat dump.sql  | psql -q yadda

Ah.  Does your psql have readline support?  if so, does adding -n to
that command change anything?

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PERFORM] insert performance for win32

2005-11-04 Thread Merlin Moncure
  not quite:  attached is a file to generate test.
 
  cat dump.sql  | psql -q yadda
 
 Ah.  Does your psql have readline support?  if so, does adding -n to
 that command change anything?
 

It doesn't, and it doesn't. :/  Ok, here's where it gets interesting. I
removed all the newlines from the test output (dump.sql) and got flat
times ;).  

Merlin


---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] insert performance for win32

2005-11-04 Thread Tom Lane
Merlin Moncure [EMAIL PROTECTED] writes:
 It doesn't, and it doesn't. :/  Ok, here's where it gets interesting. I
 removed all the newlines from the test output (dump.sql) and got flat
 times ;).  

That's bizarre ... I'd have thought a very long line would be more
likely to trigger internal performance problems than the original.

What happens if you read the file with psql -f dump.sql instead
of cat/stdin?

BTW, I get flat times for your psql test case on Unix, again both with
local and remote client.  So whatever is going on here, it's
Windows-specific.

regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] insert performance for win32

2005-11-04 Thread Merlin Moncure
 That's bizarre ... I'd have thought a very long line would be more
 likely to trigger internal performance problems than the original.
 
 What happens if you read the file with psql -f dump.sql instead
 of cat/stdin?

non-flat.  Also ran via \i and got non flat times.

 BTW, I get flat times for your psql test case on Unix, again both with
 local and remote client.  So whatever is going on here, it's
 Windows-specific.

yeah.  I'm guessing problem is in the mingw flex/bison (which I really,
really hope is not the case) or some other win32 specific block of code.
I'm snooping around there...

Merlin 

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] insert performance for win32

2005-11-04 Thread Tom Lane
Merlin Moncure [EMAIL PROTECTED] writes:
 yeah.  I'm guessing problem is in the mingw flex/bison (which I really,
 really hope is not the case) or some other win32 specific block of code.
 I'm snooping around there...

Maybe I'm confused here, but I thought we had established that the local
and remote cases behave differently for you?  If so I'd suppose that it
must be a networking issue, and there's little point in looking inside
psql.

If the problem is internal to psql, gprof or similar tool would be
helpful ... got anything like that on Windows?

regards, tom lane

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] insert performance for win32

2005-11-04 Thread Merlin Moncure
 
 Merlin Moncure [EMAIL PROTECTED] writes:
  yeah.  I'm guessing problem is in the mingw flex/bison (which I
really,
  really hope is not the case) or some other win32 specific block of
code.
  I'm snooping around there...
 
 Maybe I'm confused here, but I thought we had established that the
local
 and remote cases behave differently for you?  If so I'd suppose that
it
 must be a networking issue, and there's little point in looking inside
 psql.
 
The local case is *worse*...presumably because psql is competing with
the server for cpu time...cpu load is pegged at 100%.  On the remote
case, I'm getting 50-60% cpu load which is way to high.  The problem is
definitely in psql.

Merlin


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PERFORM] insert performance for win32

2005-11-04 Thread Merlin Moncure
ok, here is gprof output from newlines/no newlines 
[newlines]
  %   cumulative   self  self total   
 time   seconds   secondscalls   s/call   s/call  name
 19.03  0.67 0.671 0.67 3.20  MainLoop
 17.61  1.29 0.62   500031 0.00 0.00  yylex
 15.63  1.84 0.55  1500094 0.00 0.00  GetVariable
 11.08  2.23 0.39   250018 0.00 0.00  SendQuery
  4.26  2.38 0.15   750051 0.00 0.00  GetVariableBool
  3.41  2.50 0.12   250024 0.00 0.00  SetVariable
  2.56  2.59 0.09   250015 0.00 0.00  gets_fromFile
  2.27  2.67 0.08   750044 0.00 0.00
yy_switch_to_buffer
  2.27  2.75 0.08   500031 0.00 0.00  psql_scan
  2.27  2.83 0.08 pg_strcasecmp
  1.70  2.89 0.06  4250078 0.00 0.00  emit
  1.70  2.95 0.06   500031 0.00 0.00  VariableEquals
  1.70  3.01 0.06   250018 0.00 0.00  AcceptResult
  1.42  3.06 0.05   250018 0.00 0.00  ResetCancelConn

[no newlines]
  %   cumulative   self  self total   
 time   seconds   secondscalls   s/call   s/call  name
 23.01  0.26 0.26   250019 0.00 0.00  yylex
 19.47  0.48 0.22   250018 0.00 0.00  SendQuery
 11.50  0.61 0.13  170 0.00 0.00  GetVariable
  9.73  0.72 0.11   250042 0.00 0.00  pg_strdup
  9.73  0.83 0.11   250024 0.00 0.00  SetVariable
  6.19  0.90 0.07   500039 0.00 0.00  GetVariableBool
  5.31  0.96 0.06 pg_strcasecmp
  4.42  1.01 0.05  4250078 0.00 0.00  emit
  2.65  1.04 0.031 0.03 1.01  MainLoop

ok, mingw gprof is claiming MainLoop is a culprit here, along with
general efficiency penalty otherwise in several things (twice many calls
to yylex, 33%more to getvariable, etc).  Just for fun I double checked
string len of query input to SendQuery and everything is the right
length.

Same # calls to SendQuery, but 2.5 times call time in newlines
case...anything jump out?  

Merlin

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] insert performance for win32

2005-11-04 Thread Tom Lane
Merlin Moncure [EMAIL PROTECTED] writes:
 ok, mingw gprof is claiming MainLoop is a culprit here,

The only thing I can see that would be different for Windows is the
SetConsoleCtrlHandler kernel call ... could that be expensive?  Why
do we have either sigsetjmp or setup_cancel_handler inside the per-line
loop, rather than just before it?

There is a lot of stuff in MainLoop that doesn't seem like it really
needs to be done on every single line, particularly not the repeated
fetching of psql variables that couldn't possibly change except inside
HandleSlashCmds.  But that all ought to be the same on Unix or Windows.

regards, tom lane

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PERFORM] insert performance for win32

2005-11-04 Thread Merlin Moncure
Nailed it.

problem is in mainloop.c - setup_cancel_handler.  Apparently you can
have multiple handlers and windows keeps track of them all, even if they
do the same thing.  Keeping track of so many system handles would
naturally slow the whole process down.  Commenting that line times are
flat as a pancake.  I am thinking keeping track of a global flag would
be appropriate.  


Merlin


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PERFORM] insert performance for win32

2005-11-04 Thread Tom Lane
Merlin Moncure [EMAIL PROTECTED] writes:
 Nailed it.

 problem is in mainloop.c - setup_cancel_handler.  Apparently you can
 have multiple handlers and windows keeps track of them all, even if they
 do the same thing.  Keeping track of so many system handles would
 naturally slow the whole process down.

Yipes.  So we really want to do that only once.

AFAICS it is appropriate to move the sigsetjmp and setup_cancel_handler
calls in front of the per-line loop inside MainLoop --- can anyone see
a reason not to?

I'm inclined to treat this as an outright bug, not just a minor
performance issue, because it implies that a sufficiently long psql
script would probably crash a Windows machine.

regards, tom lane

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] insert performance for win32

2005-11-04 Thread Merlin Moncure
 Merlin Moncure [EMAIL PROTECTED] writes:
  Nailed it.
 
  problem is in mainloop.c - setup_cancel_handler.  Apparently you
can
  have multiple handlers and windows keeps track of them all, even if
they
  do the same thing.  Keeping track of so many system handles would
  naturally slow the whole process down.
 
 Yipes.  So we really want to do that only once.
 
 AFAICS it is appropriate to move the sigsetjmp and
setup_cancel_handler
 calls in front of the per-line loop inside MainLoop --- can anyone see
 a reason not to?

hm. mainloop is re-entrant, right?  That means each \i would reset the
handler...what is downside to keeping global flag?


 I'm inclined to treat this as an outright bug, not just a minor
certainly...

 performance issue, because it implies that a sufficiently long psql
 script would probably crash a Windows machine.

actually, it's worse than that, it's more of a dos on the whole system,
as windows will eventually stop granting handles, but there is a good
chance of side effects on other applications.

Merlin

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PERFORM] insert performance for win32

2005-11-04 Thread Tom Lane
Merlin Moncure [EMAIL PROTECTED] writes:
 AFAICS it is appropriate to move the sigsetjmp and
 setup_cancel_handler
 calls in front of the per-line loop inside MainLoop --- can anyone see
 a reason not to?

 hm. mainloop is re-entrant, right?  That means each \i would reset the
 handler...what is downside to keeping global flag?

Ah, right, and in fact I'd missed the comment at line 325 pointing out
that we're relying on the sigsetjmp to be re-executed every time
through.  That could be improved on, likely, but not right before a
release.

Does the flag need to be global?  I'm thinking

  void
  setup_cancel_handler(void)
  {
+   static bool done = false;
+
+   if (!done)
SetConsoleCtrlHandler(consoleHandler, TRUE);
+   done = true;
  }


regards, tom lane

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PERFORM] insert performance for win32

2005-11-04 Thread Bruce Momjian
Tom Lane wrote:
 Merlin Moncure [EMAIL PROTECTED] writes:
  Nailed it.
 
  problem is in mainloop.c - setup_cancel_handler.  Apparently you can
  have multiple handlers and windows keeps track of them all, even if they
  do the same thing.  Keeping track of so many system handles would
  naturally slow the whole process down.
 
 Yipes.  So we really want to do that only once.
 
 AFAICS it is appropriate to move the sigsetjmp and setup_cancel_handler
 calls in front of the per-line loop inside MainLoop --- can anyone see
 a reason not to?

Nope.

 I'm inclined to treat this as an outright bug, not just a minor
 performance issue, because it implies that a sufficiently long psql
 script would probably crash a Windows machine.

Agreed.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] [PERFORM] insert performance for win32

2005-11-04 Thread Tom Lane
David Fetter [EMAIL PROTECTED] writes:
 On Fri, Nov 04, 2005 at 01:01:20PM -0500, Tom Lane wrote:
 I'm inclined to treat this as an outright bug, not just a minor
 performance issue, because it implies that a sufficiently long psql
 script would probably crash a Windows machine.

 Ouch.  In light of this, are we *sure* what we've got a is a candidate
 for release?

Sure.  This problem exists in 8.0.* too.  Pre-existing bugs don't
disqualify an RC in my mind --- we fix them and move on, same as we
would do at any other time.

regards, tom lane

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] [PERFORM] insert performance for win32

2005-11-04 Thread Bruce Momjian
David Fetter wrote:
 On Fri, Nov 04, 2005 at 01:01:20PM -0500, Tom Lane wrote:
  Merlin Moncure [EMAIL PROTECTED] writes:
   Nailed it.
  
   problem is in mainloop.c - setup_cancel_handler.  Apparently you
   can have multiple handlers and windows keeps track of them all,
   even if they do the same thing.  Keeping track of so many system
   handles would naturally slow the whole process down.
  
  Yipes.  So we really want to do that only once.
  
  AFAICS it is appropriate to move the sigsetjmp and
  setup_cancel_handler calls in front of the per-line loop inside
  MainLoop --- can anyone see a reason not to?
  
  I'm inclined to treat this as an outright bug, not just a minor
  performance issue, because it implies that a sufficiently long psql
  script would probably crash a Windows machine.
 
 Ouch.  In light of this, are we *sure* what we've got a is a candidate
 for release?

Good point.  It is something we would fix in a minor release, so it
doesn't seem worth doing another RC just for that.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] insert performance for win32

2005-11-04 Thread Magnus Hagander
  I'm inclined to treat this as an outright bug, not just a minor
 certainly...
 
  performance issue, because it implies that a sufficiently long psql 
  script would probably crash a Windows machine.
 
 actually, it's worse than that, it's more of a dos on the 
 whole system, as windows will eventually stop granting 
 handles, but there is a good chance of side effects on other 
 applications.

Does it actually use up *handles* there? I don't see anything in the
docs that says it should do that - and they usually do document when
handles are used. You should be seeing a *huge* increase in system
handles very fast if it does, right? 

That said, I definitly agree with calling it a bug :-)

//Magnus

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] insert performance for win32

2005-11-04 Thread Qingqing Zhou

Merlin Moncure [EMAIL PROTECTED] wrote
 ok, here is gprof output from newlines/no newlines
 [newlines]
  %   cumulative   self  self total
 time   seconds   secondscalls   s/call   s/call  name
 19.03  0.67 0.671 0.67 3.20  MainLoop
 17.61  1.29 0.62   500031 0.00 0.00  yylex
 15.63  1.84 0.55  1500094 0.00 0.00  GetVariable
 11.08  2.23 0.39   250018 0.00 0.00  SendQuery
  4.26  2.38 0.15   750051 0.00 0.00  GetVariableBool
  3.41  2.50 0.12   250024 0.00 0.00  SetVariable
  2.56  2.59 0.09   250015 0.00 0.00  gets_fromFile
  2.27  2.67 0.08   750044 0.00 0.00
 yy_switch_to_buffer
  2.27  2.75 0.08   500031 0.00 0.00  psql_scan
  2.27  2.83 0.08 pg_strcasecmp
  1.70  2.89 0.06  4250078 0.00 0.00  emit
  1.70  2.95 0.06   500031 0.00 0.00  VariableEquals
  1.70  3.01 0.06   250018 0.00 0.00  AcceptResult
  1.42  3.06 0.05   250018 0.00 0.00  ResetCancelConn


Maybe I missed some threads  do you think it is interesting to test the 
*absoulte* time difference of the same machine on Windows/Linux by using 
timeit.c? I wonder if windows is slower than Linux ...

Regards,
Qingqing 



---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] [PERFORM] insert performance for win32

2005-11-04 Thread Magnus Hagander
  AFAICS it is appropriate to move the sigsetjmp and 
  setup_cancel_handler calls in front of the per-line loop inside 
  MainLoop --- can anyone see a reason not to?
 
  hm. mainloop is re-entrant, right?  That means each \i 
 would reset the 
  handler...what is downside to keeping global flag?
 
 Ah, right, and in fact I'd missed the comment at line 325 
 pointing out that we're relying on the sigsetjmp to be 
 re-executed every time through.  That could be improved on, 
 likely, but not right before a release.
 
 Does the flag need to be global?  I'm thinking
 
   void
   setup_cancel_handler(void)
   {
 + static bool done = false;
 +
 + if (!done)
   SetConsoleCtrlHandler(consoleHandler, TRUE);
 + done = true;
   }
 

Seems like a simple enough solution, don't see why it shouldn't work. As
long as psql is single-threaded, which it is...
(Actually, that code seems to re-set done=true on every call which seems
unnecessary - but that might be optimised away, I guess)

//Magnus


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PERFORM] insert performance for win32

2005-11-04 Thread Qingqing Zhou

Magnus Hagander [EMAIL PROTECTED] wrote

 I'd like to use the win32 provided recv(), send() functions
 instead of redirect them to pgwin32_recv()/pgwin32_send(),
 just like libpq does. If we do this, we will lose some
 functionalities, but I'd like to see the performance
 difference first. -- do you think that will be any difference?

 I doubt there will be much performance difference, as you hav eto hit
 the kernel anyway (in the recv/send call). But that part is just a guess
 :-)


On a separate line -- I verified Magnus's doubt -- revert pgwin32_recv() to 
recv() does not improve performance visiblly.

Regards,
Qingqing 



---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PERFORM] insert performance for win32

2005-11-03 Thread Merlin Moncure
 On Wed, 2 Nov 2005, Merlin Moncure wrote:
 If you put client/server on the same machine, then we don't know how
the
 CPU is splitted. Can you take a look at the approximate number by
 observing the task manager data while running?

ok, I generated a test case which was 250k inserts to simple two column
table all in single transaction.  Every 50k inserts, time is recorded
via timeofday().  

Running from remote, Time progression is:
First  50k: 20 sec
Second: 29 sec
[...]
final:: 66 sec

so, clear upward progression of time/rec.  Initial time is 2.5k
inserts/sec which is decent but not great for such a narrow table.  CPU
time on server starts around 50% and drops in exact proportion to insert
performance.  My earlier gprof test also suggest there is no smoking gun
sucking down all the cpu time.

cpu time on the client is very volatile but with a clear increase over
time starting around 20 and ending perhaps 60. My client box is pretty
quick, 3ghz p4.

Running the script locally, from the server, cpu time is pegged at 100%
and stays...first 50k is 23 sec with a much worse decomposition to
almost three minutes for final 50k.

Merlin



 
 If communication code is the suspect, can we measure the difference if
we
 disable the redefinition of recv()/send() etc in port/win32.h (may
require
 change related code a little bit as well). In this way, the socket
will
 not be able to pickup signals, but let see if there is any performance
 difference first.
 
 Regards,
 Qingqing
 
 
 
  [OK, I'm bringing this back on-list, and bringing it to QingQing's
  attention, who I secretly hope is the right person to be looking at
this
  problem :)]
 
 P.s. You scared me ;-)

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PERFORM] insert performance for win32

2005-11-03 Thread Merlin Moncure
 Both win32 send/recv have pgwin32_poll_signals() in them.  This is
 glorified WaitForSingleObjectEx on global pgwin32_signal_event.  This
is
 probably part of the problem.  Can we work some of the same magic you
put
 into check interrupts macro?

Whoop! following a cvs update I see this is already nailed :) Back to
the drawing board...

Merlin

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PERFORM] insert performance for win32

2005-11-03 Thread Qingqing Zhou


On Thu, 3 Nov 2005, Magnus Hagander wrote:


 Sorry, I don't follow you here - what do you mean to do? Remove the
 event completely so we can't wait on it?


I'd like to use the win32 provided recv(), send() functions instead of
redirect them to pgwin32_recv()/pgwin32_send(), just like libpq does. If
we do this, we will lose some functionalities, but I'd like to see the
performance difference first. -- do you think that will be any difference?

Regards,
Qingqing

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] insert performance for win32

2005-11-03 Thread Magnus Hagander
   Both win32 send/recv have pgwin32_poll_signals() in them.
   This is glorified WaitForSingleObjectEx on global 
   pgwin32_signal_event.  This is probably part of the problem.
   Can we work some of the same magic you put into check interrupts 
   macro?
  
 
  Uh, we already do that, don't we?
  
 http://developer.postgresql.org/cvsweb.cgi/pgsql/src/backend/port/win3
  2/
  socket.c?rev=1.10
  has:
 
 
 Yeah, we did this. I am thinking of just use simple mechanism 
 of the win32 sockets, which could not pick up signals, but I 
 would like to see if there is any difference -- do you think 
 there is any point to try this?

Sorry, I don't follow you here - what do you mean to do? Remove the
event completely so we can't wait on it?

//Magnus

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] insert performance for win32

2005-11-03 Thread Qingqing Zhou


On Thu, 3 Nov 2005, Magnus Hagander wrote:

   Sorry, I don't follow you here - what do you mean to do? Remove the
   event completely so we can't wait on it?
  
 
  I'd like to use the win32 provided recv(), send() functions
  instead of redirect them to pgwin32_recv()/pgwin32_send(),
  just like libpq does. If we do this, we will lose some
  functionalities, but I'd like to see the performance
  difference first. -- do you think that will be any difference?

 Doesn't work, really. It will no longer be possible to send a signal to
 an idle backend. The idle backend will be blocking on recv(), that's how
 it works. So unless we can get around that somehow, it's a non-starter I
 think.

Yeah, agreed.  An alternative is set tiemout like 100 ms or so. When
timeout happens, check the signals. But I guess you will be strongly
against it.


 I doubt there will be much performance difference, as you hav eto hit
 the kernel anyway (in the recv/send call). But that part is just a guess
 :-)

I know what you mean ...  I will take a look -- if the patch (not
including fix signaling problem), if doesn't change much, I will give it a
try.

Regards,
Qingqing

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PERFORM] insert performance for win32

2005-11-03 Thread Magnus Hagander
   I'd like to use the win32 provided recv(), send() 
 functions instead 
   of redirect them to pgwin32_recv()/pgwin32_send(), just 
 like libpq 
   does. If we do this, we will lose some functionalities, 
 but I'd like 
   to see the performance difference first. -- do you think 
 that will 
   be any difference?
 
  Doesn't work, really. It will no longer be possible to send 
 a signal 
  to an idle backend. The idle backend will be blocking on recv(), 
  that's how it works. So unless we can get around that 
 somehow, it's a 
  non-starter I think.
 
 Yeah, agreed.  An alternative is set tiemout like 100 ms or 
 so. When timeout happens, check the signals. But I guess you 
 will be strongly against it.

Not on principle, but I don't think it'll give us enough gain for the
cost. But if it does, I'm certainly not against it.



//Magnus

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] insert performance for win32

2005-11-03 Thread Qingqing Zhou

Merlin Moncure [EMAIL PROTECTED] wrote

 Running from remote, Time progression is:
 First  50k: 20 sec
 Second: 29 sec
 [...]
 final:: 66 sec

This may due to the maintainence cost of a big transaction, I am not sure 
... Tom?

 so, clear upward progression of time/rec.  Initial time is 2.5k
 inserts/sec which is decent but not great for such a narrow table.  CPU
 time on server starts around 50% and drops in exact proportion to insert
 performance.  My earlier gprof test also suggest there is no smoking gun
 sucking down all the cpu time.


Not to 100%, so this means the server is always starving. It is waiting on 
something -- of couse not lock. That's why I think there is some problem on 
network communication. Another suspect will be the write - I knwo NTFS 
system will issue an internal log when extending a file. To remove the 
second suspect, I will try to hack the source to do a fake write ...

Regards,
Qingqing 



---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PERFORM] insert performance for win32

2005-11-03 Thread Qingqing Zhou

Qingqing Zhou [EMAIL PROTECTED] wrote

 Not to 100%, so this means the server is always starving. It is waiting on 
 something -- of couse not lock. That's why I think there is some problem 
 on network communication. Another suspect will be the write - I knwo NTFS 
 system will issue an internal log when extending a file. To remove the 
 second suspect, I will try to hack the source to do a fake write ...


To patch:
-
Here is a quite straight hack to implement fake write for both relation 
and xlog. Now the server becomes pure CPU play.

1. RelationGetBufferForTuple()/hio.c: remove line (if you do not enable 
cassert, then doesn't matter):
- Assert(PageIsNew((PageHeader) pageHeader));

2. ReadBuffer()/bufmgr.c: remove line
- smgrextend(reln-rd_smgr, blockNum, (char *) bufBlock,
- reln-rd_istemp);

3. XLogWrite()/xlog.c
   errno = 0;
+  goto fake;
   if (write(openLogFile, from, nbytes) != nbytes)
   {
/* if write didn't set errno, assume no disk space */
...
   }
+
+ fake:
   /* Update state for write */


To use it:
-
1. have several copies of a correct data;

2. patch the server;

3. when you startup postmaster, use the following parameters:
postmaster -ccheckpoint_timeout=3600 -cbgwriter_all_percent=0 -Ddata

Note now the database server is one-shoot usable -- after you shutdown, it 
won't startup again. Just run
begin;
many inserts;
end;

To observe:
-
(1) In this case, what's the remote server CPU usage -- 100%? I don't have 
several machines to test it. In my single machine, I run 35000 insert 
commands from psql by cut and paste into it and could observe that:
---
25% kernel time
75% user time

20% postgresql (--enable-debug --enable-cassert)
65% psql (as same above)
10% csrss (system process, manage graphics commands (not sure, just googled 
it), etc)
5%  system (system process)
---

(2) In this case, Linux still keeps almost 10 times faster?

After this, we may need more observations like comparison of simple select 
1; to reduce the code space we may want to explore ...

Regards,
Qingqing 



---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PERFORM] insert performance for win32

2005-11-03 Thread Qingqing Zhou


On Thu, 3 Nov 2005, Magnus Hagander wrote:

  Both win32 send/recv have pgwin32_poll_signals() in them.
  This is glorified WaitForSingleObjectEx on global
  pgwin32_signal_event.  This is probably part of the problem.
  Can we work some of the same magic you put into check
  interrupts macro?
 

 Uh, we already do that, don't we?
 http://developer.postgresql.org/cvsweb.cgi/pgsql/src/backend/port/win32/
 socket.c?rev=1.10
 has:


Yeah, we did this. I am thinking of just use simple mechanism of the win32
sockets, which could not pick up signals, but I would like to see if there
is any difference -- do you think there is any point to try this?

Regards,
Qingqing

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] insert performance for win32

2005-11-03 Thread Tom Lane
Merlin Moncure [EMAIL PROTECTED] writes:
 ok, I generated a test case which was 250k inserts to simple two column
 table all in single transaction.  Every 50k inserts, time is recorded
 via timeofday().  

You mean something like the attached?

 Running from remote, Time progression is:
 First  50k: 20 sec
 Second: 29 sec
 [...]
 final:: 66 sec

On Unix I get a dead flat line (within measurement noise), both local
loopback and across my LAN.

after 5 30.20 sec
after 10 31.67 sec
after 15 30.98 sec
after 20 29.64 sec
after 25 29.83 sec

top shows nearly constant CPU usage over the run, too.  With a local
connection it's pretty well pegged, with LAN connection the server's
about 20% idle and the client about 90% (client machine is much faster
than server which may affect this, but I'm too lazy to try it in the
other direction).

I think it's highly likely that you are looking at some strange behavior
of the Windows TCP stack.

regards, tom lane



binU7zmyKjiMS.bin
Description: timeit.c

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] insert performance for win32

2005-11-03 Thread Qingqing Zhou


On Thu, 3 Nov 2005, Tom Lane wrote:


 On Unix I get a dead flat line (within measurement noise), both local
 loopback and across my LAN.

 after 5 30.20 sec
 after 10 31.67 sec
 after 15 30.98 sec
 after 20 29.64 sec
 after 25 29.83 sec


Confirmed in Linux. And on a winxp machine(sp2) with server, client
together, with (see almost no performance difference) or without my fake
write, the observation is still hold for both cases:

after 5 25.21 sec
after 10 26.26 sec
after 15 25.23 sec
after 20 26.25 sec
after 25 26.58 sec

In both cases, postgres 67% cpu, psql 15~20%, rest: system process. Kernel
time is 40+% -- where from?

Regards,
Qingqing

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] insert performance for win32

2005-11-02 Thread Merlin Moncure
 I've done the tests with rc1. This is still as slow on windows ...
about
 6-10
 times slower thant linux (via Ip socket). (depending on using prepared
 queries, etc...)
 
 By the way, we've tried to insert into the windows database from a
linux
 psql
 client, via the network. In this configuration, inserting is only
about 2
 times slower than inserting locally (the linux client had a slower CPU
 1700Mhz agains 3000).
 Could it be related to a problem in the windows psql client ?

[OK, I'm bringing this back on-list, and bringing it to QingQing's
attention, who I secretly hope is the right person to be looking at this
problem :)]

Just to recap Marc and I have been looking at the performance disparity
between windows and linux for a single transaction statement by
statement insert on a very narrow table with no keys from a remote
client.  Marc's observations showed (and I verified) that windows is
much slower in this case than it should be.  I gprof'ed both the psql
client and the server during the insert and didn't see anything
seriously out of order...unfortunately QQ's latest win32 performance
tweaks haven't helped.

Marc's observation that by switching to a linux client drops time down
drastically is really intersing!

Merlin



---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PERFORM] insert performance for win32

2005-11-02 Thread Magnus Hagander
  I've done the tests with rc1. This is still as slow on windows ...
 about
  6-10
  times slower thant linux (via Ip socket). (depending on 
 using prepared 
  queries, etc...)
  
  By the way, we've tried to insert into the windows database from a
 linux
  psql
  client, via the network. In this configuration, inserting is only
 about 2
  times slower than inserting locally (the linux client had a 
 slower CPU 
  1700Mhz agains 3000).
  Could it be related to a problem in the windows psql client ?
 
 [OK, I'm bringing this back on-list, and bringing it to 
 QingQing's attention, who I secretly hope is the right person 
 to be looking at this problem :)]
 
 Just to recap Marc and I have been looking at the performance 
 disparity between windows and linux for a single transaction 
 statement by statement insert on a very narrow table with no 
 keys from a remote client.  Marc's observations showed (and I 
 verified) that windows is much slower in this case than it 
 should be.  I gprof'ed both the psql client and the server 
 during the insert and didn't see anything seriously out of 
 order...unfortunately QQ's latest win32 performance tweaks 
 haven't helped.
 
 Marc's observation that by switching to a linux client drops 
 time down drastically is really intersing!

Could this be a case of the network being slow, as we've seen a couple
of times before? And if you run psql on the local box, you get it
double.

Do you get a speed difference between the local windows box and a remote
wnidows box?

//Magnus

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] insert performance for win32

2005-11-02 Thread Marc Cousin
Le Mercredi 02 Novembre 2005 14:54, Magnus Hagander a écrit :
   I've done the tests with rc1. This is still as slow on windows ...
 
  about
 
   6-10
   times slower thant linux (via Ip socket). (depending on
 
  using prepared
 
   queries, etc...)
  
   By the way, we've tried to insert into the windows database from a
 
  linux
 
   psql
   client, via the network. In this configuration, inserting is only
 
  about 2
 
   times slower than inserting locally (the linux client had a
 
  slower CPU
 
   1700Mhz agains 3000).
   Could it be related to a problem in the windows psql client ?
 
  [OK, I'm bringing this back on-list, and bringing it to
  QingQing's attention, who I secretly hope is the right person
  to be looking at this problem :)]
 
  Just to recap Marc and I have been looking at the performance
  disparity between windows and linux for a single transaction
  statement by statement insert on a very narrow table with no
  keys from a remote client.  Marc's observations showed (and I
  verified) that windows is much slower in this case than it
  should be.  I gprof'ed both the psql client and the server
  during the insert and didn't see anything seriously out of
  order...unfortunately QQ's latest win32 performance tweaks
  haven't helped.
 
  Marc's observation that by switching to a linux client drops
  time down drastically is really intersing!

 Could this be a case of the network being slow, as we've seen a couple
 of times before? And if you run psql on the local box, you get it
 double.

 Do you get a speed difference between the local windows box and a remote
 wnidows box?

 //Magnus
The Windows-Windows test is local (via loopback interface)
The Linux (client) - Windows (server) is via network (100Mbits)

I can't test with 2 windows box ... I haven't got that much (all machines 
linux, except the test one...)

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] insert performance for win32

2005-11-02 Thread Tom Lane
Magnus Hagander [EMAIL PROTECTED] writes:
 Marc's observation that by switching to a linux client drops 
 time down drastically is really intersing!

 Could this be a case of the network being slow,

I'm wondering about nonstandard junk lurking in the TCP stack of the
Windows client machine.  Also, I seem to recall something about a QOS
patch that people are supposed to apply, or not apply as the case may
be, to get Windows' TCP stack to behave reasonably ... ring a bell?

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PERFORM] insert performance for win32

2005-11-02 Thread Qingqing Zhou


On Wed, 2 Nov 2005, Merlin Moncure wrote:

 
  By the way, we've tried to insert into the windows database from a
  linux psql client, via the network. In this configuration, inserting
  is only about 2 times slower than inserting locally (the linux client
  had a slower CPU 1700Mhz agains 3000). Could it be related to a
  problem in the windows psql client ?
 

If you put client/server on the same machine, then we don't know how the
CPU is splitted. Can you take a look at the approximate number by
observing the task manager data while running?

If communication code is the suspect, can we measure the difference if we
disable the redefinition of recv()/send() etc in port/win32.h (may require
change related code a little bit as well). In this way, the socket will
not be able to pickup signals, but let see if there is any performance
difference first.

Regards,
Qingqing



 [OK, I'm bringing this back on-list, and bringing it to QingQing's
 attention, who I secretly hope is the right person to be looking at this
 problem :)]

P.s. You scared me ;-)

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PERFORM] insert performance for win32

2005-09-07 Thread Merlin Moncure
  One thing I did notice that in a 250k insert transaction the insert
time
  grows with #recs inserted.  Time to insert first 50k recs is about
27
  sec and last 50 k recs is 77 sec.  I also confimed that size of
table is
  not playing a role here.
 
  Marc, can you do select timeofday() every 50k recs from linux?  Also
a
  gprof trace from linux would be helpful.
 
 
 Here's the timeofday ... i'll do the gprof as soon as I can.
 Every 5 rows...
 
Were those all in a single transaction?

Merlin

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] insert performance for win32

2005-09-07 Thread Marc Cousin
On Tuesday 06 September 2005 19:11, Merlin Moncure wrote:
  This makes me wonder if we are looking in the wrong place.  Maybe the
  problem is coming from psql?  More results to follow.

 problem is not coming from psql.

 One thing I did notice that in a 250k insert transaction the insert time
 grows with #recs inserted.  Time to insert first 50k recs is about 27
 sec and last 50 k recs is 77 sec.  I also confimed that size of table is
 not playing a role here.

 Marc, can you do select timeofday() every 50k recs from linux?  Also a
 gprof trace from linux would be helpful.


Here's the timeofday ... i'll do the gprof as soon as I can.
Every 5 rows...

Wed Sep 07 13:58:13.860378 2005 CEST
Wed Sep 07 13:58:20.926983 2005 CEST
Wed Sep 07 13:58:27.928385 2005 CEST
Wed Sep 07 13:58:35.472813 2005 CEST
Wed Sep 07 13:58:42.825709 2005 CEST
Wed Sep 07 13:58:50.789486 2005 CEST
Wed Sep 07 13:58:57.553869 2005 CEST
Wed Sep 07 13:59:04.298136 2005 CEST
Wed Sep 07 13:59:11.066059 2005 CEST
Wed Sep 07 13:59:19.368694 2005 CEST





 Merlin

 ---(end of broadcast)---
 TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PERFORM] insert performance for win32

2005-09-07 Thread Merlin Moncure
 On Tuesday 06 September 2005 19:11, Merlin Moncure wrote:
 Here's the timeofday ... i'll do the gprof as soon as I can.
 Every 5 rows...
 
 Wed Sep 07 13:58:13.860378 2005 CEST
 Wed Sep 07 13:58:20.926983 2005 CEST
 Wed Sep 07 13:58:27.928385 2005 CEST
 Wed Sep 07 13:58:35.472813 2005 CEST
 Wed Sep 07 13:58:42.825709 2005 CEST
 Wed Sep 07 13:58:50.789486 2005 CEST
 Wed Sep 07 13:58:57.553869 2005 CEST
 Wed Sep 07 13:59:04.298136 2005 CEST
 Wed Sep 07 13:59:11.066059 2005 CEST
 Wed Sep 07 13:59:19.368694 2005 CEST

ok, I've been in crunching profile profile graphs, and so far have been
only been able to draw following conclusions.

For bulk, 'in-transaction' insert:
1. win32 is slower than linux.  win32 time for each insert grows with #
inserts in xact, linux does not (or grows much slower).  Win32 starts
out about 3x slower and grows to 10x slower after 250k inserts.

2. ran a 50k profile vs. 250k profile.  Nothing jumps out as being
slower or faster: most time is spent in yyparse on either side.  From
this my preliminary conclusion is that there is something going on in
the win32 api which is not showing in the profile.

3. The mingw gprof cumulative seconds does not show measurable growth in
cpu time/insert in 50k/250k profile.

I'm now talking suggestions about where to look for performance problems
:(.
Merlin

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


[PERFORM] insert performance for win32

2005-09-06 Thread Marc Cousin
Hi, 

I usually use PostgreSQL coupled with Linux, but I have to use Windows for a 
perticular project.

So I wanted to do some tests to know if the performance will be acceptable (I 
don't need PostgreSQL to be as fast with windows as with linux, but it has to 
be usable...).

I started with trying to do lots of inserts, and I'm quite astonished by the 
catastrophics results ...

The test :
The computer was the same (my workstation, a PIV Dell with SATA disk), dual 
boot

The windows OS is XP.

Both Oses are PostgreSQL 8.0.3

Both PostgreSQL clusters (windows and linux) have the same tuning 
(shared_buffers=2, wal_buffers=128, checkpoint_segments=10)

Before each test, the clusters are vacuum analyzed, and the test database is 
recreated.

The script is quite dumb :
BEGIN;
CREATE TABLE test (col1 serial, col2 text);
INSERT INTO test (col2) values ('test');
INSERT INTO test (col2) values ('test');
INSERT INTO test (col2) values ('test');
INSERT INTO test (col2) values ('test');
INSERT INTO test (col2) values ('test');
.. 500,000 times
Then COMMIT.

I know it isn't realistic, but I needed to start with something :)

The results are as follows :
Linux : 1'9''
Windows : 9'38''

What I've tried to solve, and didn't work :

- Deactivate antivirus on windows
- fsync=no
- raise the checkpoint_segments value (32)
- remove hyperthreading (who knows...)

I don't know what could cause this (I'm not a windows admin...at all). All I 
see is a very high kernel load during the execution of this script, but I 
can't determine where it comes from.


I'd like to know if this is a know problem, if there is something I can do, 
etc...

Thanks a lot.

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PERFORM] insert performance for win32

2005-09-06 Thread Merlin Moncure
 Hi,
 
 I usually use PostgreSQL coupled with Linux, but I have to use Windows
for
 a
 perticular project.
 
 So I wanted to do some tests to know if the performance will be
acceptable
 (I
 don't need PostgreSQL to be as fast with windows as with linux, but it
has
 to
 be usable...).

In my experience win32 is par with linux generally with a few gotchas on
either side.  Are your times with fsync=no? It's much harder to give
apples-apples comparison with fsync=on for various reasons.

Are you running stats_command_string=on?  Try disabling and compare
results.
Is your loading app running locally or on the server?

I am very interesting in discovering sources of high cpu load problems
on win32.  If you are still having problems could you get a gprof
profile together?  There is a recent thread on win32-hackers discussing
how to do this.

Merlin



---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PERFORM] insert performance for win32

2005-09-06 Thread Marc Cousin

 In my experience win32 is par with linux generally with a few gotchas on
 either side.  Are your times with fsync=no? It's much harder to give
 apples-apples comparison with fsync=on for various reasons.
It is with fsync=off on windows, fsync=on on linux


 Are you running stats_command_string=on?  Try disabling and compare
 results.
Deactivated on windows, activated on linux

 Is your loading app running locally or on the server?
Yes

 I am very interesting in discovering sources of high cpu load problems
 on win32.  If you are still having problems could you get a gprof
 profile together?  There is a recent thread on win32-hackers discussing
 how to do this.
I'll give it a look

 Merlin

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] insert performance for win32

2005-09-06 Thread Merlin Moncure
  In my experience win32 is par with linux generally with a few gotchas on
  either side.  Are your times with fsync=no? It's much harder to give
  apples-apples comparison with fsync=on for various reasons.
 It is with fsync=off on windows, fsync=on on linux

well, inside a transaction this shouldn't have mattered anyways.
 
  Are you running stats_command_string=on?  Try disabling and compare
  results.
 Deactivated on windows, activated on linux
 
  Is your loading app running locally or on the server?
 Yes

hm :(.  Well, you had me curious so I went ahead and re-ran your test case and 
profiled it (on windows).  I got similar results time wise.  It's interesting 
to note that the command I used to generate the test table before dumping 
w/inserts

insert into test select nextval('test_id_seq'), 'test' from 
generate_series(1,50) 

ran in just a few seconds.  

Well, I cut the #recs down to 50k and here is profile trace:
  %   cumulative   self  self total   
 time   seconds   secondscalls   s/call   s/call  name
 10.78  0.62 0.6250001 0.00 0.00  yyparse
  5.39  0.93 0.31  5101422 0.00 0.00  AllocSetAlloc
  4.52  1.19 0.26   799970 0.00 0.00  base_yylex
  2.78  1.35 0.16   28 0.00 0.00  SearchCatCache
  2.43  1.49 0.14   554245 0.00 0.00  hash_search
  2.26  1.62 0.1349998 0.00 0.00  XLogInsert
  1.74  1.72 0.10   453363 0.00 0.00  LWLockAcquire
  1.74  1.82 0.10   299988 0.00 0.00  ScanKeywordLookup

This makes me wonder if we are looking in the wrong place.  Maybe the problem 
is coming from psql?  More results to follow.

Merlin

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] insert performance for win32

2005-09-06 Thread Merlin Moncure
 This makes me wonder if we are looking in the wrong place.  Maybe the
 problem is coming from psql?  More results to follow.

problem is not coming from psql.  

One thing I did notice that in a 250k insert transaction the insert time
grows with #recs inserted.  Time to insert first 50k recs is about 27
sec and last 50 k recs is 77 sec.  I also confimed that size of table is
not playing a role here.

Marc, can you do select timeofday() every 50k recs from linux?  Also a
gprof trace from linux would be helpful.

Merlin

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match