Re: [PERFORM] Sorted union

2005-11-03 Thread Merlin Moncure
  select    when_stopped as when_happened,
            1 as order_hint
  from      transaction t
  where     '2005-10-25 15:00:00' <= when_stopped
        and when_stopped <= '2005-10-26 10:00:00'
  union all
  select    when_stopped as when_happened,
            2 as order_hint
  from      transaction t
  where     '2005-10-25 15:00:00' <= when_stopped
        and when_stopped <= '2005-10-26 10:00:00'
  order by  when_happened, order_hint;

hmm, try pushing the union into a subquery...this is better style
because it's kind of ambiguous if the ordering will apply before/after
the union.

select q.when from
(
 select 1 as hint, start_time as when [...]
 union all
 select 2 as hint, end_time as when [...]
) q order by q.when, q.hint

question: why do you want to flatten the table...is it not easier to
work with as records?

Merlin
 

---(end of broadcast)---
TIP 6: explain analyze is your friend


[PERFORM] Function with table%ROWTYPE globbing

2005-11-03 Thread Sven Willenberger
Postgresql 8.0.4 using plpgsql

The basic function is set up as:
CREATE FUNCTION add_data(t_row mytable) RETURNS VOID AS $func$
DECLARE
newtable text;
thesql text;
BEGIN
INSERT INTO newtable thename from mytable where lookup.id =
t_row.id;
thesql := 'INSERT INTO ' || newtable || ' VALUES (' || t_row.* || ')';
EXECUTE thesql;
RETURN;
END;
$func$ LANGUAGE plpgsql VOLATILE;

SELECT add_data(t.*) FROM mytable t where 
ERROR:  column * not found in data type mytable

Now I have tried to drop the * but then there is no concatenation
function to join text to a table%ROWTYPE. So my question is how can I
make this dynamic insert statement without listing out every
t_row.colname? Or, alternatively, is there a better way to parse out
each row of a table into subtables based on a column value?

Sven



---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] Function with table%ROWTYPE globbing

2005-11-03 Thread Merlin Moncure
 Postgresql 8.0.4 using plpgsql
 
 The basic function is set up as:
 CREATE FUNCTION add_data(t_row mytable) RETURNS VOID AS $func$
 DECLARE
 newtable text;
 thesql text;
 BEGIN
 INSERT INTO newtable thename from mytable where lookup.id =
 t_row.id;
 thesql := 'INSERT INTO ' || newtable || ' VALUES (' || t_row.* || ')';
 EXECUTE thesql;
 RETURN;
 END;
 $func$ LANGUAGE plpgsql VOLATILE;
 
 SELECT add_data(t.*) FROM mytable t where 
 ERROR:  column * not found in data type mytable
 
 Now I have tried to drop the * but then there is no concatenation
 function to join text to a table%ROWTYPE. So my question is how can I
 make this dynamic insert statement without listing out every
 t_row.colname? Or, alternatively, is there a better way to parse out
 each row of a table into subtables based on a column value?

I don't think it's possible.  Rowtypes, etc. are not first class yet (it's
on the TODO list).  What I would do is pass the table name, where clause,
etc. into the add_data function and rewrite it as an insert...select, doing
the whole thing in one operation.

Merlin

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PERFORM] Sorted union

2005-11-03 Thread Scott Lamb

Merlin Moncure wrote:

hmm, try pushing the union into a subquery...this is better style
because it's kind of ambiguous if the ordering will apply before/after
the union.


Seems to be a little slower. There's a new subquery scan step.

explain analyze
select q.when_happened from (
    select  when_stopped as when_happened,
            1 as order_hint
    from    transaction t
    where   '2005-10-25 15:00:00' <= when_stopped
        and when_stopped <= '2005-10-26 10:00:00'
    union all
    select  when_stopped as when_happened,
            2 as order_hint
    from    transaction t
    where   '2005-10-25 15:00:00' <= when_stopped
        and when_stopped <= '2005-10-26 10:00:00'
) q order by when_happened, order_hint;

                                  QUERY PLAN
------------------------------------------------------------------------------
 Sort  (cost=713013.96..721751.25 rows=3494916 width=12)
   (actual time=34392.264..37237.148 rows=3364006 loops=1)
   Sort Key: when_happened, order_hint
   ->  Subquery Scan q  (cost=0.00..229474.11 rows=3494916 width=12)
       (actual time=0.194..20283.452 rows=3364006 loops=1)
         ->  Append  (cost=0.00..194524.95 rows=3494916 width=8)
             (actual time=0.191..14967.632 rows=3364006 loops=1)
               ->  Subquery Scan "*SELECT* 1"  (cost=0.00..97262.48 rows=1747458 width=8)
                   (actual time=0.189..5535.139 rows=1682003 loops=1)
                     ->  Index Scan using transaction_stopped on transaction t
                         (cost=0.00..79787.90 rows=1747458 width=8)
                         (actual time=0.186..3097.268 rows=1682003 loops=1)
                           Index Cond: (('2005-10-25 15:00:00'::timestamp without time zone <= when_stopped)
                                   AND (when_stopped <= '2005-10-26 10:00:00'::timestamp without time zone))
               ->  Subquery Scan "*SELECT* 2"  (cost=0.00..97262.48 rows=1747458 width=8)
                   (actual time=0.173..5625.155 rows=1682003 loops=1)
                     ->  Index Scan using transaction_stopped on transaction t
                         (cost=0.00..79787.90 rows=1747458 width=8)
                         (actual time=0.169..3146.714 rows=1682003 loops=1)
                           Index Cond: (('2005-10-25 15:00:00'::timestamp without time zone <= when_stopped)
                                   AND (when_stopped <= '2005-10-26 10:00:00'::timestamp without time zone))
 Total runtime: 39775.225 ms
(11 rows)


question: why do you want to flatten the table...is it not easier to
work with as records?


For most things, yes. But I'm making a bunch of different graphs from 
these data, and a few of them are much easier with events. The best 
example is my concurrency graph. Whenever there's a start event, it goes 
up one. Whenever there's a stop event, it goes down one. It's completely 
trivial once you have it separated into events.
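The start/stop counting described above can be sketched in a few lines of Python (a hypothetical illustration; the event tuples and names are made up, not taken from the original schema):

```python
# Sketch of the concurrency graph logic: walk the events in time order,
# +1 on each start event, -1 on each stop event.

def concurrency_series(events):
    """events: (time, kind) pairs sorted by time, kind in {'start', 'stop'}.
    Returns a list of (time, active_count_after_event)."""
    active = 0
    series = []
    for t, kind in events:
        active += 1 if kind == 'start' else -1
        series.append((t, active))
    return series

events = [(1, 'start'), (2, 'start'), (3, 'stop'), (4, 'start'),
          (5, 'stop'), (6, 'stop')]
print(concurrency_series(events))
# [(1, 1), (2, 2), (3, 1), (4, 2), (5, 1), (6, 0)]
```

This is exactly why the flattened, time-ordered event stream is convenient: the graph is a single linear pass.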


Thanks,
Scott

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] Sorted union

2005-11-03 Thread Merlin Moncure

 Merlin Moncure wrote:
  hmm, try pushing the union into a subquery...this is better style
  because it's kind of ambiguous if the ordering will apply
before/after
  the union.
 
 Seems to be a little slower. There's a new subquery scan step.

I figured.  However, it's more correct; I'm not sure the original
query is guaranteed to give the right answer (in terms of ordering).
It might, though.

 
  question: why do you want to flatten the table...is it not easier to
  work with as records?
 
 For most things, yes. But I'm making a bunch of different graphs from
 these data, and a few of them are much easier with events. The best
 example is my concurrency graph. Whenever there's a start event, it
goes
 up one. Whenever there's a stop event, it goes down one. It's
completely
 trivial once you have it separated into events.

well, if you don't mind attempting things that are not trivial, how
about trying: 

select t, (select count(*) from  transaction where t between happened
and when_stopped) from
(
select ((generate_series(1,60) * scale)::text::interval) + '12:00
pm'::time as t
) q;
for example, to check concurrency at every second for a minute (starting
from 1 second) after 12:00 pm (scale is zero in this case):

select t, (select count(*) from  transaction where t between happened
and when_stopped) from
(
select (generate_series(1,60)::text::interval) + '12:00 pm'::time as
t
) q;

this could be a win depending on how much data you pull into your
concurrency graph.  maybe not though.  

Merlin

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] Sorted union

2005-11-03 Thread Scott Lamb

On Nov 3, 2005, at 8:20 AM, Merlin Moncure wrote:

select t, (select count(*) from  transaction where t between happened
and when_stopped) from
(
select ((generate_series(1,60) * scale)::text::interval) + '12:00
pm'::time as t
) q;


Wow. I hadn't known about generate_series, but there are a bunch of  
places I've needed it.


As cool as this is, though, I don't think it helps me. There's  
another event-driven graph that I need. For lack of a better name, I  
call it the slot graph. Every single transaction is graphed as a  
horizontal line from its start time to its end time, with a vertical  
line at the start and stop. Successful, timed out, and failed  
transactions are green, black, and red, respectively. I use it in a  
couple different ways:


(1) on short timescales, it's nice to look at individual  
transactions. My tester will max out at either a rate or a  
concurrency. If I'm having problems, I'll get bursts of timeouts.  
This graph is the one that makes it clear why - it shows how things  
align, etc. Actually, even for longer timespans, this is still  
helpful - it's nice to see that most of the slots are filled with  
timing-out transactions when the rate falls.


(2) It can show you if something affects all of the transactions at  
once. When we did a database failover test, we saw a bunch of  
failures (as expected; our application isn't responsible for  
retries). This graph is the one that showed us that _all_  
transactions that were active at a specific time failed and that no  
other transactions failed. (There was a sharp vertical line of reds  
and blacks in the larger block of greens).


I wish I could just show these to you, rather than describing them.  
It's all proprietary data, though. Maybe soon I'll have similar  
graphs of my open source SSL proxy.


But the point is, I don't think I can represent this information  
without sending every data point to my application. I assign slots by  
the start time and free them by the stop time.


But I think there is something I can do: I can just do a query of the  
transaction table sorted by start time. My graph tool can keep a  
priority queue of all active transactions, keyed by the stop time.  
Whenever it grabs a new event, it can peek at the next start time but  
check if there are any stop times before it. Then at the end, it can  
pick up the rest of the stop times. The concurrency will never exceed  
a few thousand, so the additional CPU time and memory complexity are  
not a problem. As a bonus, I will no longer need my index on the stop  
time. Dropping it will save a lot of disk space.
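The priority-queue scheme just described can be sketched with Python's heapq (a hypothetical sketch of the graph tool's event loop, not code from the thread; row data is made up):

```python
import heapq

# Scan rows sorted by start time, keep a min-heap of stop times for
# still-active transactions, and emit start/stop events in global time
# order.  Memory is bounded by peak concurrency, as noted above.

def event_stream(rows):
    """rows: (start, stop) pairs sorted by start, as from a query
    ordered by start time.  Yields ('start', t) / ('stop', t) events."""
    active = []  # heap of pending stop times
    for start, stop in rows:
        # flush any stop events that happen before this start
        while active and active[0] <= start:
            yield ('stop', heapq.heappop(active))
        yield ('start', start)
        heapq.heappush(active, stop)
    while active:  # drain remaining stop events at the end
        yield ('stop', heapq.heappop(active))

rows = [(1, 5), (2, 3), (4, 6)]
print(list(event_stream(rows)))
# [('start', 1), ('start', 2), ('stop', 3), ('start', 4), ('stop', 5), ('stop', 6)]
```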


Thanks for getting me off the "I need a fast query that returns these  
exact results" mindset. It is good to step back and look at the big  
picture.


Mind you, I still think PostgreSQL should be able to perform that  
sorted union fast. Maybe sometime I'll have enough free time to take  
my first plunge into looking at a database query planner.


Regards,
Scott

--
Scott Lamb http://www.slamb.org/



---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] insert performance for win32

2005-11-03 Thread Merlin Moncure
 On Wed, 2 Nov 2005, Merlin Moncure wrote:
 If you put client/server on the same machine, then we don't know how
the
 CPU is splitted. Can you take a look at the approximate number by
 observing the task manager data while running?

ok, I generated a test case which was 250k inserts to simple two column
table all in single transaction.  Every 50k inserts, time is recorded
via timeofday().  

Running from remote, Time progression is:
First  50k: 20 sec
Second: 29 sec
[...]
final:: 66 sec

so, clear upward progression of time/rec.  Initial time is 2.5k
inserts/sec which is decent but not great for such a narrow table.  CPU
time on server starts around 50% and drops in exact proportion to insert
performance.  My earlier gprof test also suggests there is no smoking gun
sucking down all the cpu time.

cpu time on the client is very volatile but with a clear increase over
time starting around 20 and ending perhaps 60. My client box is pretty
quick, 3ghz p4.

Running the script locally on the server, cpu time is pegged at 100%
and stays there...the first 50k takes 23 sec, degrading to almost three
minutes for the final 50k.

Merlin



 
 If communication code is the suspect, can we measure the difference if
we
 disable the redefinition of recv()/send() etc in port/win32.h (may
require
 change related code a little bit as well). In this way, the socket
will
 not be able to pickup signals, but let see if there is any performance
 difference first.
 
 Regards,
 Qingqing
 
 
 
  [OK, I'm bringing this back on-list, and bringing it to QingQing's
  attention, who I secretly hope is the right person to be looking at
this
  problem :)]
 
 P.s. You scared me ;-)

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PERFORM] Sorted union

2005-11-03 Thread Merlin Moncure
 Wow. I hadn't known about generate_series, but there are a bunch of
 places I've needed it.

It's a wonder tool :).
 
 But I think there is something I can do: I can just do a query of the
 transaction table sorted by start time. My graph tool can keep a

Reading the previous paragraphs I was just about to suggest this.  This
is a much more elegant method...you are reaping the benefits of having
normalized your working set.  You were trying to denormalize it back to
what you were used to.  Yes, now you can drop your index and simplify
your queries...normalized data is always more 'natural'.

 Mind you, I still think PostgreSQL should be able to perform that
 sorted union fast. Maybe sometime I'll have enough free time to take
 my first plunge into looking at a database query planner.

I'm not so sure I agree, by using union you were basically pulling two
independent sets (even if they were from the same table) that needed to
be ordered.  There is zero chance of using the index here for ordering
because you are ordering a different set than the one being indexed.
Had I not been able to talk you out of de-normalizing your table I was
going to suggest rigging up a materialized view and indexing that:

http://jonathangardner.net/PostgreSQL/materialized_views/matviews.html

Merlin

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PERFORM] Sorted union

2005-11-03 Thread Kevin Grittner
The ANSI/ISO specs are not at all ambiguous on this.  An
ORDER BY is not allowed for the SELECT statements within
a UNION.  It must come at the end and be applied to the resulting
UNION.

Similarly, the column names in the result come from the first
query in the UNION.  Column names in the query on the right
side of a UNION are immaterial.

Unless we have reason to believe that PostgreSQL is
non-compliant on this point, I don't think it is a good idea to
slow the query down with the subquery.

-Kevin


 Merlin Moncure [EMAIL PROTECTED]  

 Merlin Moncure wrote:
  hmm, try pushing the union into a subquery...this is better style
  because it's kind of ambiguous if the ordering will apply
before/after
  the union.
 
 Seems to be a little slower. There's a new subquery scan step.

I figured.  However it's more correct, I'm not sure if the original
query is necessarily guaranteed to give the right answer (in terms of
ordering).  It might though.


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PERFORM] Sorted union

2005-11-03 Thread Merlin Moncure
 The ANSI/ISO specs are not at all ambiguous on this.  An
 ORDER BY is not allowed for the SELECT statements within
 a UNION.  It must come at the end and applied to the resulting
 UNION.

Interesting :/ 

Merlin

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] insert performance for win32

2005-11-03 Thread Merlin Moncure
 Both win32 send/recv have pgwin32_poll_signals() in them.  This is
 glorified WaitForSingleObjectEx on global pgwin32_signal_event.  This
is
 probably part of the problem.  Can we work some of the same magic you
put
 into check interrupts macro?

Whoop! following a cvs update I see this is already nailed :) Back to
the drawing board...

Merlin

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PERFORM] Sorted union

2005-11-03 Thread Scott Lamb

On Nov 3, 2005, at 10:21 AM, Merlin Moncure wrote:
Reading the previous paragraphs I was just about to suggest this.  This
is a much more elegant method...you are reaping the benefits of having
normalized your working set.  You were trying to denormalize it back to
what you were used to.  Yes, now you can drop your index and simplify
your queries...normalized data is always more 'natural'.


I'm not sure normalized is the right word. In either case, I'm  
storing it in the same form. In either case, my ConcurrencyProcessor  
class gets the same form. The only difference is if the database  
splits the rows or if my application does so.


But we're essentially agreed. This is the algorithm I'm going to try  
implementing, and I think it will work out well. It also means  
sending about half as much data from the database to the application.



Mind you, I still think PostgreSQL should be able to perform that
sorted union fast. Maybe sometime I'll have enough free time to take
my first plunge into looking at a database query planner.


I'm not so sure I agree, by using union you were basically pulling two
independent sets (even if they were from the same table) that needed to
be ordered.


Yes.


  There is zero chance of using the index here for ordering
because you are ordering a different set than the one being indexed.


I don't think that's true. It just needs to look at the idea of  
independently ordering each element of the union and then merging  
that, compared to the cost of grabbing the union and then ordering  
it. In this case, the former cost is about 0 - it already has  
independently ordered them, and the merge algorithm is trivial.  
http://en.wikipedia.org/wiki/Merge_algorithm
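The merge step referred to here is the classic linear merge of two already-sorted inputs, which costs O(n + m) instead of an O(n log n) re-sort (illustrative sketch only):

```python
# The merge algorithm linked above: combine two already-sorted
# sequences in a single pass with two cursors, no re-sort needed.

def merge_sorted(a, b):
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        if a[i] <= b[j]:
            out.append(a[i]); i += 1
        else:
            out.append(b[j]); j += 1
    out.extend(a[i:])  # one side is exhausted;
    out.extend(b[j:])  # append the remainder of the other
    return out

print(merge_sorted([1, 3, 5], [2, 3, 6]))
# [1, 2, 3, 3, 5, 6]
```

This is the cost Scott argues is "about 0" once each branch of the union is independently index-ordered.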


Regards,
Scott

--
Scott Lamb http://www.slamb.org/



---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] insert performance for win32

2005-11-03 Thread Qingqing Zhou


On Thu, 3 Nov 2005, Magnus Hagander wrote:


 Sorry, I don't follow you here - what do you mean to do? Remove the
 event completely so we can't wait on it?


I'd like to use the win32-provided recv(), send() functions instead of
redirecting them to pgwin32_recv()/pgwin32_send(), just like libpq does. If
we do this, we will lose some functionality, but I'd like to see the
performance difference first -- do you think there will be any difference?

Regards,
Qingqing

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] insert performance for win32

2005-11-03 Thread Magnus Hagander
   Both win32 send/recv have pgwin32_poll_signals() in them.
   This is glorified WaitForSingleObjectEx on global 
   pgwin32_signal_event.  This is probably part of the problem.
   Can we work some of the same magic you put into check interrupts 
   macro?
  
 
  Uh, we already do that, don't we?
  
 http://developer.postgresql.org/cvsweb.cgi/pgsql/src/backend/port/win3
  2/
  socket.c?rev=1.10
  has:
 
 
 Yeah, we did this. I am thinking of just use simple mechanism 
 of the win32 sockets, which could not pick up signals, but I 
 would like to see if there is any difference -- do you think 
 there is any point to try this?

Sorry, I don't follow you here - what do you mean to do? Remove the
event completely so we can't wait on it?

//Magnus

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] insert performance for win32

2005-11-03 Thread Qingqing Zhou


On Thu, 3 Nov 2005, Magnus Hagander wrote:

   Sorry, I don't follow you here - what do you mean to do? Remove the
   event completely so we can't wait on it?
  
 
  I'd like to use the win32 provided recv(), send() functions
  instead of redirect them to pgwin32_recv()/pgwin32_send(),
  just like libpq does. If we do this, we will lose some
  functionalities, but I'd like to see the performance
  difference first. -- do you think that will be any difference?

 Doesn't work, really. It will no longer be possible to send a signal to
 an idle backend. The idle backend will be blocking on recv(), that's how
 it works. So unless we can get around that somehow, it's a non-starter I
 think.

Yeah, agreed.  An alternative is to set a timeout of 100 ms or so. When
the timeout happens, check the signals. But I guess you will be strongly
against it.


 I doubt there will be much performance difference, as you have to hit
 the kernel anyway (in the recv/send call). But that part is just a guess
 :-)

I know what you mean ...  I will take a look -- if the patch (not
including a fix for the signaling problem) doesn't change much, I will
give it a try.

Regards,
Qingqing

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PERFORM] insert performance for win32

2005-11-03 Thread Magnus Hagander
   I'd like to use the win32 provided recv(), send() 
 functions instead 
   of redirect them to pgwin32_recv()/pgwin32_send(), just 
 like libpq 
   does. If we do this, we will lose some functionalities, 
 but I'd like 
   to see the performance difference first. -- do you think 
 that will 
   be any difference?
 
  Doesn't work, really. It will no longer be possible to send 
 a signal 
  to an idle backend. The idle backend will be blocking on recv(), 
  that's how it works. So unless we can get around that 
 somehow, it's a 
  non-starter I think.
 
 Yeah, agreed.  An alternative is to set a timeout of 100 ms or
 so. When the timeout happens, check the signals. But I guess you
 will be strongly against it.

Not on principle, but I don't think it'll give us enough gain for the
cost. But if it does, I'm certainly not against it.



//Magnus

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] insert performance for win32

2005-11-03 Thread Qingqing Zhou

Merlin Moncure [EMAIL PROTECTED] wrote

 Running from remote, Time progression is:
 First  50k: 20 sec
 Second: 29 sec
 [...]
 final:: 66 sec

This may be due to the maintenance cost of a big transaction, I am not sure
... Tom?

 so, clear upward progression of time/rec.  Initial time is 2.5k
 inserts/sec which is decent but not great for such a narrow table.  CPU
 time on server starts around 50% and drops in exact proportion to insert
 performance.  My earlier gprof test also suggest there is no smoking gun
 sucking down all the cpu time.


Not to 100%, so this means the server is always starving. It is waiting on
something -- of course not a lock. That's why I think there is some problem in
the network communication. Another suspect is the write -- I know the NTFS
system will issue an internal log when extending a file. To remove the
second suspect, I will try to hack the source to do a fake write ...

Regards,
Qingqing 



---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


[PERFORM] Encoding on 8.0.4

2005-11-03 Thread Don Drake
I recently upgraded my DB from 7.4.3 to 8.0.4 and I've noticed the following
errors appearing in my serverlog:

2005-11-03 05:56:57 CST 127.0.0.1(38858) ERROR: Unicode characters greater than or equal to 0x10000 are not supported
2005-11-03 06:04:09 CST 127.0.0.1(38954) ERROR: invalid byte sequence for encoding "UNICODE": 0xe02d76
2005-11-03 06:04:21 CST 127.0.0.1(38964) ERROR: invalid byte sequence for encoding "UNICODE": 0xe02d76
2005-11-03 06:11:35 CST 127.0.0.1(39072) ERROR: Unicode characters greater than or equal to 0x10000 are not supported
2005-11-03 06:23:23 CST 127.0.0.1(39657) ERROR: invalid byte sequence for encoding "UNICODE": 0xd40d
2005-11-03 08:10:02 CST 127.0.0.1(44073) ERROR: invalid byte sequence for encoding "UNICODE": 0xe46973
2005-11-03 08:21:13 CST 127.0.0.1(44711) ERROR: Unicode characters greater than or equal to 0x10000 are not supported
2005-11-03 08:26:36 CST 127.0.0.1(44745) ERROR: invalid byte sequence for encoding "UNICODE": 0xc447
2005-11-03 08:40:59 CST 127.0.0.1(45087) ERROR: invalid byte sequence for encoding "UNICODE": 0xdd20
2005-11-03 09:14:52 CST 127.0.0.1(46009) ERROR: Unicode characters greater than or equal to 0x10000 are not supported



I never received these errors when running 7.4.3. I used the default
encodings on 7.4.3, and I tried changing client_encoding from sql_ascii to
UNICODE and I'm still seeing this. I'm storing email, which contains
characters from other character sets, in a text column.



Any ideas on how to resolve this?



-Don

--
Donald Drake
President
Drake Consulting
http://www.drakeconsult.com/
http://www.MailLaunder.com/
http://www.mobilemeridian.com/
312-560-1574




Re: [PERFORM] insert performance for win32

2005-11-03 Thread Qingqing Zhou

Qingqing Zhou [EMAIL PROTECTED] wrote

 Not to 100%, so this means the server is always starving. It is waiting on
 something -- of course not a lock. That's why I think there is some problem
 in the network communication. Another suspect is the write -- I know the
 NTFS system will issue an internal log when extending a file. To remove the
 second suspect, I will try to hack the source to do a fake write ...


To patch:
-
Here is a fairly straightforward hack to implement a fake write for both
relation and xlog. Now the server becomes pure CPU play.

1. RelationGetBufferForTuple()/hio.c: remove line (if you do not enable 
cassert, then doesn't matter):
- Assert(PageIsNew((PageHeader) pageHeader));

2. ReadBuffer()/bufmgr.c: remove line
- smgrextend(reln->rd_smgr, blockNum, (char *) bufBlock,
- reln->rd_istemp);

3. XLogWrite()/xlog.c
   errno = 0;
+  goto fake;
   if (write(openLogFile, from, nbytes) != nbytes)
   {
/* if write didn't set errno, assume no disk space */
...
   }
+
+ fake:
   /* Update state for write */


To use it:
-
1. have several copies of a correct data;

2. patch the server;

3. when you startup postmaster, use the following parameters:
postmaster -ccheckpoint_timeout=3600 -cbgwriter_all_percent=0 -Ddata

Note now the database server is one-shot usable -- after you shut it down, it
won't start up again. Just run
begin;
many inserts;
end;

To observe:
-
(1) In this case, what's the remote server CPU usage -- 100%? I don't have
several machines to test it. On my single machine, I ran 35000 insert
commands from psql by cut and paste and could observe that:
---
25% kernel time
75% user time

20% postgresql (--enable-debug --enable-cassert)
65% psql (as same above)
10% csrss (system process, manage graphics commands (not sure, just googled 
it), etc)
5%  system (system process)
---

(2) In this case, is Linux still almost 10 times faster?

After this, we may need more observations, like a comparison of a simple
"select 1;", to reduce the code space we may want to explore ...

Regards,
Qingqing 



---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PERFORM] insert performance for win32

2005-11-03 Thread Qingqing Zhou


On Thu, 3 Nov 2005, Magnus Hagander wrote:

  Both win32 send/recv have pgwin32_poll_signals() in them.
  This is glorified WaitForSingleObjectEx on global
  pgwin32_signal_event.  This is probably part of the problem.
  Can we work some of the same magic you put into check
  interrupts macro?
 

 Uh, we already do that, don't we?
 http://developer.postgresql.org/cvsweb.cgi/pgsql/src/backend/port/win32/
 socket.c?rev=1.10
 has:


Yeah, we did this. I am thinking of just using the simple mechanism of the
win32 sockets, which cannot pick up signals, but I would like to see if there
is any difference -- do you think there is any point in trying this?

Regards,
Qingqing

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] insert performance for win32

2005-11-03 Thread Tom Lane
Merlin Moncure [EMAIL PROTECTED] writes:
 ok, I generated a test case which was 250k inserts to simple two column
 table all in single transaction.  Every 50k inserts, time is recorded
 via timeofday().  

You mean something like the attached?

 Running from remote, Time progression is:
 First  50k: 20 sec
 Second: 29 sec
 [...]
 final:: 66 sec

On Unix I get a dead flat line (within measurement noise), both local
loopback and across my LAN.

after 5 30.20 sec
after 10 31.67 sec
after 15 30.98 sec
after 20 29.64 sec
after 25 29.83 sec

top shows nearly constant CPU usage over the run, too.  With a local
connection it's pretty well pegged, with LAN connection the server's
about 20% idle and the client about 90% (client machine is much faster
than server which may affect this, but I'm too lazy to try it in the
other direction).

I think it's highly likely that you are looking at some strange behavior
of the Windows TCP stack.

regards, tom lane



binU7zmyKjiMS.bin
Description: timeit.c

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] insert performance for win32

2005-11-03 Thread Qingqing Zhou


On Thu, 3 Nov 2005, Tom Lane wrote:


 On Unix I get a dead flat line (within measurement noise), both local
 loopback and across my LAN.

 after 5 30.20 sec
 after 10 31.67 sec
 after 15 30.98 sec
 after 20 29.64 sec
 after 25 29.83 sec


Confirmed on Linux. And on a winxp machine (sp2) with server and client
together, with or without my fake write (almost no performance difference
between the two), the observation still holds in both cases:

after 5 25.21 sec
after 10 26.26 sec
after 15 25.23 sec
after 20 26.25 sec
after 25 26.58 sec

In both cases, postgres 67% cpu, psql 15~20%, rest: system process. Kernel
time is 40+% -- where from?

Regards,
Qingqing

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings