Re: [HACKERS] Performance monitor signal handler

2001-03-22 Thread Jan Wieck

Bruce Momjian wrote:
 I have talked to Jan over the phone, and he has convinced me that UDP is
 the proper way to communicate stats to the collector, rather than my
 shared memory idea.

 The advantages of his UDP approach is that the collector can sleep on
 the UDP socket rather than having the collector poll the shared memory
 area.  It also has the auto-discard option.  He will make logging
 configurable on a per-database level, so it can be turned off when not
 in use.

 He has a trial UDP implementation that he will post soon.  Also, I asked
 him to try DGRAM Unix-domain sockets for performance reasons.  My
 Stevens book says they should be supported.  He can put the socket
 file in /data.

"Trial" implementation attached :-)

First attachment is a patch for various backend files plus
two new source files.  If your patch(1) doesn't put
'em there automatically, they go to src/include/pgstat.h and
src/backend/postmaster/pgstat.c.

BTW: tgl on 2/99 was right, the hash_destroy() really
crashes.  Maybe we want to pull out the fix I've done
(it includes a new feature for hash table memory allocation)
and apply that to 7.1?

Second   attachment  is  a  tarfile  that  should  unpack  to
contrib/pgstat_tmp.  I've placed the SQL level functions into
a shared module for now. The sql script also creates a couple
of views.

-   pgstat_all_tables shows scan- and tuple based  statistics
for all tables.  pgstat_sys_tables and pgstat_user_tables
filter out (you guess what) system or user tables.

-   pgstatio_all_tables, pgstatio_sys_tables and
pgstatio_user_tables show buffer IO statistics for
tables.

-   pgstat_*_indexes and pgstatio_*_indexes are similar to
the above, except that they give detailed info about each
single index.

-   pgstatio_*_sequences shows buffer IO statistics about -
right, sequences.  Since sequences aren't scanned
regularly, they have no scan- and tuple-related view.

-   pgstat_activity shows information about all currently
running backends of the entire instance.  The underlying
function for displaying the actual query always returns
NULL for non-superusers.

-   pgstat_database shows transaction commit/abort counts and
cumulative buffer IO statistics for all existing
databases.

The collector frequently writes a file data/pgstat.stat
(approx. every 500 milliseconds, as long as there is something
to tell, so nothing is done if the entire installation
sleeps).  It also reads this file on startup, so collected
statistics survive postmaster restarts.

TODO:

-   Are PF_UNIX SOCK_DGRAM sockets supported on all the
platforms we support?  If not, what's wrong with the current
implementation?

-   There  is  no way yet to tell the collector about objects
(relations and  databases)  removed  from  the  database.
Basically  that  could be done with messages too, but who
will send them and how can we guarantee that  they'll  be
generated  even if somebody never queries the statistics?
Thus, the current collector will grow, and grow, and grow
until   you   remove   the  pgstat.stat  file  while  the
postmaster is down.

-   Also there aren't functions or  messages  implemented  to
explicitly reset statistics.

-   Possible additions would be to remember when the backends
started and collect resource usage (rstat(2)) information
as well.

-   The entire thing needs an additional attribute in
pg_database that tells the backends what to report to the
collector at all.  Just to make them quiet again.

So much for the current snapshot. Comments?


Jan

--

#==#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.  #
#== [EMAIL PROTECTED] #



 pgstat_tmp.tar.gz
 pgstat.diff.gz


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]



Re: [HACKERS] Performance monitor signal handler

2001-03-20 Thread Bruce Momjian

I have talked to Jan over the phone, and he has convinced me that UDP is
the proper way to communicate stats to the collector, rather than my
shared memory idea.

The advantages of his UDP approach is that the collector can sleep on
the UDP socket rather than having the collector poll the shared memory
area.  It also has the auto-discard option.  He will make logging
configurable on a per-database level, so it can be turned off when not
in use.

He has a trial UDP implementation that he will post soon.  Also, I asked
him to try DGRAM Unix-domain sockets for performance reasons.  My
Stevens book says they should be supported.  He can put the socket
file in /data.



   I figured it could just wake up every few seconds and check.  It will
   remember the loop counter and current pointer, and read any new
   information.  I was thinking of a 20k buffer, which could cover about 4k
   events.
  
  Here I wonder what your EVENT is. With an Oid as identifier
  and a 1-byte action (even if it'd be another 32-bit value), how many
  messages do you want to generate to get these statistics:
  
  -   Number of sequential scans done per table.
  -   Number of tuples returned via sequential scans per table.
  -   Number of buffer cache lookups  done  through  sequential
  scans per table.
  -   Number  of  buffer  cache  hits  for sequential scans per
  table.
  -   Number of tuples inserted per table.
  -   Number of tuples updated per table.
  -   Number of tuples deleted per table.
  -   Number of index scans done per index.
  -   Number of index tuples returned per index.
  -   Number of buffer cache lookups  done  due  to  scans  per
  index.
  -   Number of buffer cache hits per index.
  -   Number  of  valid heap tuples returned via index scan per
  index.
  -   Number of buffer cache lookups done for heap fetches  via
  index scan per index.
  -   Number  of  buffer  cache hits for heap fetches via index
  scan per index.
  -   Number of buffer cache lookups not accountable for any of
  the above.
  -   Number  of  buffer  cache hits not accountable for any of
  the above.
  
  What I see is that there's a difference in what we  two  want
  to see in the statistics. You're talking about looking at the
  actual querystring and such. That's  information  useful  for
  someone   actually  looking  at  a  server,  to  see  what  a
  particular backend is doing. On my notebook a parallel
  regression test (containing 4,000 queries) passes by in under
  1:30; that's more than 40 queries per second. So that doesn't
  tell me much.
  
  What I'm after is to collect the above data over a week or so
  and then generate a report to identify the hot spots  of  the
  schema.  Which tables/indices cause the most disk I/O, what's
  the average percentage of tuples returned in scans (not  from
  the  query, I mean from the single scan inside of the joins).
  That's the information I need to know where to look for
  possibly better qualifications, useless indices that aren't
  worth maintaining, and the like.
  
 
 I was going to have the per-table stats insert a stat record every time
 it does a sequential scan, so it should be [oid][sequential_scan_value]
 and allow the collector to gather that and aggregate it.
 
 I didn't think we wanted each backend to do the aggregation per oid. 
 Seems expensive. Maybe we would need a count for things like "number of
 rows returned" so it would be [oid][stat_type][value].
 
 


-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 853-3000
  +  If your life is a hard drive, |  830 Blythe Avenue
  +  Christ can be your backup.|  Drexel Hill, Pennsylvania 19026




Re: [HACKERS] Performance monitor signal handler

2001-03-19 Thread Tom Lane

Bruce Momjian [EMAIL PROTECTED] writes:
 Only shared memory gives us near-zero cost for write/read.  99% of
 backends will not be using stats, so it has to be cheap.

Not with a circular buffer it's not cheap, because you need interlocking
on writes.  Your claim that you can get away without that is simply
false.  You won't just get lost messages, you'll get corrupted messages.

 The collector program can read the shared memory stats and keep hashed
 values of accumulated stats.  It uses the "Loops" variable to know if it
 has read the current information in the buffer.

And how does it sleep until the counter has been advanced?  Seems to me
it has to busy-wait (bad) or sleep (worse; if the minimum sleep delay
is 10 ms then it's guaranteed to miss a lot of data under load).

regards, tom lane




Re: [HACKERS] Performance monitor signal handler

2001-03-19 Thread Bruce Momjian

 Bruce Momjian [EMAIL PROTECTED] writes:
  Only shared memory gives us near-zero cost for write/read.  99% of
  backends will not be using stats, so it has to be cheap.
 
 Not with a circular buffer it's not cheap, because you need interlocking
 on writes.  Your claim that you can get away without that is simply
 false.  You won't just get lost messages, you'll get corrupted messages.

How do I get corrupt messages if they are all five bytes?  If I write
five bytes, and another does the same, I guess the assembler could
intersperse the writes so the oid gets to be a corrupt value.  Any cheap
way around this, perhaps by skipping/clearing the write on a collision?

 
  The collector program can read the shared memory stats and keep hashed
  values of accumulated stats.  It uses the "Loops" variable to know if it
  has read the current information in the buffer.
 
 And how does it sleep until the counter has been advanced?  Seems to me
 it has to busy-wait (bad) or sleep (worse; if the minimum sleep delay
 is 10 ms then it's guaranteed to miss a lot of data under load).

I figured it could just wake up every few seconds and check.  It will
remember the loop counter and current pointer, and read any new
information.  I was thinking of a 20k buffer, which could cover about 4k
events.

Should we think about doing these writes into an OS file, and only
enabling the writes when we know there is a collector reading them,
perhaps using a /tmp file to activate recording?  We could allocate
1MB and be sure not to miss anything, even with a circular setup.






Re: [HACKERS] Performance monitor signal handler

2001-03-19 Thread Bruce Momjian

I have a new statistics collection proposal.

I suggest three shared memory areas:

One per backend to hold the query string and other per-backend stats
One global area to hold accumulated stats for all backends
One global circular buffer to hold per-table/object stats

The circular buffer will look like:

(Loops) Start--------------------------End
                    |
             current pointer

Loops is incremented every time the pointer reaches "end".

Each statistics record will have a length of five bytes, made up of
oid(4) and action(1).  By having the same length for all statistics
records, we don't need to perform any locking of the buffer.  A backend
will grab the current pointer, add five to it, and write into the
reserved 5-byte area.  If two backends write at the same time, one
overwrites the other, but this is just statistics information, so it is
not a great loss.

Only shared memory gives us near-zero cost for write/read.  99% of
backends will not be using stats, so it has to be cheap.

The collector program can read the shared memory stats and keep hashed
values of accumulated stats.  It uses the "Loops" variable to know if it
has read the current information in the buffer.  When it receives a
signal, it can dump its stats to a file in standard COPY format of
oid<tab>action<tab>count.  It can also reset its counters with a
signal.

Comments?





Re: [HACKERS] Performance monitor signal handler

2001-03-19 Thread Jan Wieck

Bruce Momjian wrote:
  Bruce Momjian [EMAIL PROTECTED] writes:
   Only shared memory gives us near-zero cost for write/read.  99% of
   backends will not be using stats, so it has to be cheap.
 
  Not with a circular buffer it's not cheap, because you need interlocking
  on writes.  Your claim that you can get away without that is simply
  false.  You won't just get lost messages, you'll get corrupted messages.

 How do I get corrupt messages if they are all five bytes?  If I write
 five bytes, and another does the same, I guess the assembler could
 intersperse the writes so the oid gets to be a corrupt value.  Any cheap
 way around this, perhaps by skipping/clearing the write on a collision?

 
   The collector program can read the shared memory stats and keep hashed
   values of accumulated stats.  It uses the "Loops" variable to know if it
   has read the current information in the buffer.
 
  And how does it sleep until the counter has been advanced?  Seems to me
  it has to busy-wait (bad) or sleep (worse; if the minimum sleep delay
  is 10 ms then it's guaranteed to miss a lot of data under load).

 I figured it could just wake up every few seconds and check.  It will
 remember the loop counter and current pointer, and read any new
 information.  I was thinking of a 20k buffer, which could cover about 4k
 events.

Here I wonder what your EVENT is. With an Oid as identifier
and a 1-byte action (even if it'd be another 32-bit value), how many
messages do you want to generate to get these statistics:

-   Number of sequential scans done per table.
-   Number of tuples returned via sequential scans per table.
-   Number of buffer cache lookups  done  through  sequential
scans per table.
-   Number  of  buffer  cache  hits  for sequential scans per
table.
-   Number of tuples inserted per table.
-   Number of tuples updated per table.
-   Number of tuples deleted per table.
-   Number of index scans done per index.
-   Number of index tuples returned per index.
-   Number of buffer cache lookups  done  due  to  scans  per
index.
-   Number of buffer cache hits per index.
-   Number  of  valid heap tuples returned via index scan per
index.
-   Number of buffer cache lookups done for heap fetches  via
index scan per index.
-   Number  of  buffer  cache hits for heap fetches via index
scan per index.
-   Number of buffer cache lookups not accountable for any of
the above.
-   Number  of  buffer  cache hits not accountable for any of
the above.

What I see is that there's a difference in what we  two  want
to see in the statistics. You're talking about looking at the
actual querystring and such. That's  information  useful  for
someone   actually  looking  at  a  server,  to  see  what  a
particular backend is doing. On my notebook a parallel
regression test (containing 4,000 queries) passes by in under
1:30; that's more than 40 queries per second. So that doesn't
tell me much.

What I'm after is to collect the above data over a week or so
and then generate a report to identify the hot spots  of  the
schema.  Which tables/indices cause the most disk I/O, what's
the average percentage of tuples returned in scans (not  from
the  query, I mean from the single scan inside of the joins).
That's the information I need to know where to look for
possibly better qualifications, useless indices that aren't
worth maintaining, and the like.


Jan









Re: [HACKERS] Performance monitor signal handler

2001-03-19 Thread Bruce Momjian

  I figured it could just wake up every few seconds and check.  It will
  remember the loop counter and current pointer, and read any new
  information.  I was thinking of a 20k buffer, which could cover about 4k
  events.
 
 Here I wonder what your EVENT is. With an Oid as identifier
 and a 1-byte action (even if it'd be another 32-bit value), how many
 messages do you want to generate to get these statistics:
 
 -   Number of sequential scans done per table.
 -   Number of tuples returned via sequential scans per table.
 -   Number of buffer cache lookups  done  through  sequential
 scans per table.
 -   Number  of  buffer  cache  hits  for sequential scans per
 table.
 -   Number of tuples inserted per table.
 -   Number of tuples updated per table.
 -   Number of tuples deleted per table.
 -   Number of index scans done per index.
 -   Number of index tuples returned per index.
 -   Number of buffer cache lookups  done  due  to  scans  per
 index.
 -   Number of buffer cache hits per index.
 -   Number  of  valid heap tuples returned via index scan per
 index.
 -   Number of buffer cache lookups done for heap fetches  via
 index scan per index.
 -   Number  of  buffer  cache hits for heap fetches via index
 scan per index.
 -   Number of buffer cache lookups not accountable for any of
 the above.
 -   Number  of  buffer  cache hits not accountable for any of
 the above.
 
 What I see is that there's a difference in what we  two  want
 to see in the statistics. You're talking about looking at the
 actual querystring and such. That's  information  useful  for
 someone   actually  looking  at  a  server,  to  see  what  a
 particular backend is doing. On my notebook a parallel
 regression test (containing 4,000 queries) passes by in under
 1:30; that's more than 40 queries per second. So that doesn't
 tell me much.
 
 What I'm after is to collect the above data over a week or so
 and then generate a report to identify the hot spots  of  the
 schema.  Which tables/indices cause the most disk I/O, what's
 the average percentage of tuples returned in scans (not  from
 the  query, I mean from the single scan inside of the joins).
 That's the information I need to know where to look for
 possibly better qualifications, useless indices that aren't
 worth maintaining, and the like.
 

I was going to have the per-table stats insert a stat record every time
it does a sequential scan, so it should be [oid][sequential_scan_value]
and allow the collector to gather that and aggregate it.

I didn't think we wanted each backend to do the aggregation per oid. 
Seems expensive. Maybe we would need a count for things like "number of
rows returned" so it would be [oid][stat_type][value].





Re: [HACKERS] Performance monitor signal handler

2001-03-18 Thread Patrick Welche

On Fri, Mar 16, 2001 at 05:25:24PM -0500, Jan Wieck wrote:
 Jan Wieck wrote:
...
 Just  to  get  some  evidence  at hand - could some owners of
 different platforms compile and run  the  attached  little  C
 source please?
... 
 Seems Tom is (unfortunately) right. The pipe blocks at 4K.

On NetBSD-1.5S/i386 with just the highly conservative shmem defaults:

Pipe buffer is 4096 bytes
Sys-V message queue buffer is 2048 bytes

Cheers,

Patrick




Re: [HACKERS] Performance monitor signal handler

2001-03-18 Thread Tom Lane

Jan Wieck [EMAIL PROTECTED] writes:
 Just  to  get  some  evidence  at hand - could some owners of
 different platforms compile and run  the  attached  little  C
 source please?
 (The  program  tests how much data can be stuffed into a pipe
 or a Sys-V message queue before the writer would block or get
 an EAGAIN error).

One final followup on this --- I wasted a fair amount of time just
now trying to figure out why Perl 5.6.0 was silently hanging up
in its self-tests (at op/taint, which seems pretty unrelated...).

The upshot: Jan's test program had left a 16k SysV message queue
hanging about, and that queue was filling all available SysV message
space on my machine.  Seems Perl tries to test message-queue sending,
and it was patiently waiting for some message space to come free.

In short, the SysV message queue limits are so tiny that not only
are you quite likely to get bollixed up if you use messages, but
you're likely to bollix anything else that's using message queues too.

regards, tom lane




Re: [HACKERS] Performance monitor signal handler

2001-03-17 Thread Philip Warner

At 13:49 16/03/01 -0500, Jan Wieck wrote:

Similar problem as with shared memory - size. If a long
running backend of a multithousand-table database needs to
send access stats per table - and had accessed them all up to
now - it'll be a lot of wasted bandwidth.

Not if you only send totals for individual counters when they change; some
stats may never be resynced, but for the most part it will work. Also, does
Unix allow interrupts to occur as a result of data arriving in a pipe? If
so, how about:

- All backends to do *blocking* IO to collector.

- Collector to receive an interrupt when a message arrives; while in the
interrupt it reads the buffer into a local queue, and returns from the
interrupt.

- Main line code processes the queue and writes it to a memory mapped file
for durability.

- If the collector dies, the postmaster starts another immediately, which
clears the backlog of data in the pipe and then remaps the file.

- Each backend has its own local copy of its counters, which the collector
can *possibly* ask for when it restarts.





Philip Warner| __---_
Albatross Consulting Pty. Ltd.   |/   -  \
(A.B.N. 75 008 659 498)  |  /(@)   __---_
Tel: (+61) 0500 83 82 81 | _  \
Fax: (+61) 0500 83 82 82 | ___ |
Http://www.rhyme.com.au  |/   \|
 |----
PGP key available upon request,  |  /
and from pgp5.ai.mit.edu:11371   |/




Re: [HACKERS] Performance monitor signal handler

2001-03-17 Thread Jan Wieck

Philip Warner wrote:
 At 13:49 16/03/01 -0500, Jan Wieck wrote:
 
 Similar problem as with shared memory - size. If a long
 running backend of a multithousand-table database needs to
 send access stats per table - and had accessed them all up to
 now - it'll be a lot of wasted bandwidth.

 Not if you only send totals for individual counters when they change; some
 stats may never be resynced, but for the most part it will work. Also, does
 Unix allow interrupts to occur as a result of data arriving in a pipe? If
 so, how about:

 - All backends to do *blocking* IO to collector.

The general problem remains. We only have one central
collector with a limited receive capacity. The more load is
on the machine, the smaller its capacity gets. The more
complex the DB schemas get and the more load is on the
system, the more interesting accurate statistics get. Both
factors are counterproductive. More complex schemas mean more
tables and thus bigger messages. More load means more
messages. Having good statistics on a toy system while they
get worse for a web backend server that's really under
pressure is braindead from the start.

We don't want the backends to block,  so  that  they  can  do
THEIR work. That's to process queries, nothing else.

Pipes  seem  to  be  inappropriate  because  their  buffer is
limited to 4K on Linux and most BSD flavours. Message  queues
are too because they are limited to 2K on most BSD's. So only
sockets remain.

If we have multiple processes that try to  receive  from  the
UDP  socket,  condense  the  received  packets  into  summary
messages and send them to the central collector,  this  might
solve the problem.


Jan









Re: [HACKERS] Performance monitor signal handler

2001-03-17 Thread Samuel Sieb

On Sat, Mar 17, 2001 at 09:33:03AM -0500, Jan Wieck wrote:
 
 The general problem remains. We only have one central
 collector with a limited receive capacity. The more load is
 on the machine, the smaller its capacity gets. The more
 complex the DB schemas get and the more load is on the
 system, the more interesting accurate statistics get. Both
 factors are counterproductive. More complex schemas mean more
 tables and thus bigger messages. More load means more
 messages. Having good statistics on a toy system while they
 get worse for a web backend server that's really under
 pressure is braindead from the start.
 
Just as another suggestion, what about sending the data to a different
computer, so instead of tying up the database server with processing the
statistics, you have another computer that has some free time to do the
processing.

Some drawbacks are that you can't automatically start/restart it from the
postmaster and it will put a little more load on the network, but it seems
to mostly solve the issues of blocked pipes and using too much cpu time
on the database server.





Re: [HACKERS] Performance monitor signal handler

2001-03-17 Thread Tom Lane

Samuel Sieb [EMAIL PROTECTED] writes:
 Just as another suggestion, what about sending the data to a different
 computer, so instead of tying up the database server with processing the
 statistics, you have another computer that has some free time to do the
 processing.

 Some drawbacks are that you can't automatically start/restart it from the
 postmaster and it will put a little more load on the network,

... and a lot more load on the CPU.  Same-machine "network" connections
are much cheaper (on most kernels, anyway) than real network
connections.

I think all of this discussion is vast overkill.  No one has yet
demonstrated that it's not sufficient to have *one* collector process
and a lossy transmission method.  Let's try that first, and if it really
proves to be unworkable then we can get out the lily-gilding equipment.
But there is tons more stuff to do before we have useful stats at all,
and I don't think that this aspect is the most critical part of the
problem.

regards, tom lane




Re: [HACKERS] Performance monitor signal handler

2001-03-17 Thread Bruce Momjian

 ... and a lot more load on the CPU.  Same-machine "network" connections
 are much cheaper (on most kernels, anyway) than real network
 connections.
 
 I think all of this discussion is vast overkill.  No one has yet
 demonstrated that it's not sufficient to have *one* collector process
 and a lossy transmission method.  Let's try that first, and if it really
 proves to be unworkable then we can get out the lily-gilding equipment.
 But there is tons more stuff to do before we have useful stats at all,
 and I don't think that this aspect is the most critical part of the
 problem.

Agreed.  Sounds like overkill.

How about a per-backend shared memory area for stats, plus a global
shared memory area that each backend can add to when it exits.  That
covers most of our problem.

The only open issue is per-table stuff, and I would like to see some
circular buffer implemented to handle that, with a collection process
that has access to shared memory.  Even better, have an SQL table
updated with the per-table stats periodically.  How about a collector
process that periodically reads through the shared memory and UPDATEs
SQL tables with the information?





Re: [HACKERS] Performance monitor signal handler

2001-03-17 Thread Tom Lane

Bruce Momjian [EMAIL PROTECTED] writes:
 The only open issue is per-table stuff, and I would like to see some
 circular buffer implemented to handle that, with a collection process
 that has access to shared memory.

That will get us into locking/contention issues.  OTOH, frequent trips
to the kernel to send stats messages --- regardless of the transport
mechanism chosen --- don't seem all that cheap either.

 Even better, have an SQL table updated with the per-table stats
 periodically.

That will be horribly expensive, if it's a real table.

I think you missed the point that somebody made a little while ago
about waiting for functions that can return tuple sets.  Once we have
that, the stats tables can be *virtual* tables, ie tables that are
computed on-demand by some function.  That will be a lot less overhead
than physically updating an actual table.

regards, tom lane




Re: [HACKERS] Performance monitor signal handler

2001-03-17 Thread Bruce Momjian

 Bruce Momjian [EMAIL PROTECTED] writes:
  The only open issue is per-table stuff, and I would like to see some
  circular buffer implemented to handle that, with a collection process
  that has access to shared memory.
 
 That will get us into locking/contention issues.  OTOH, frequent trips
 to the kernel to send stats messages --- regardless of the transport
 mechanism chosen --- don't seem all that cheap either.

I am confused.  Reading/writing shared memory is not a kernel call,
right?

I agree on the locking contention problems of a circular buffer.

 
  Even better, have an SQL table updated with the per-table stats
  periodically.
 
 That will be horribly expensive, if it's a real table.

But per-table stats aren't something that people will look at often,
right?  They can sit in the collector's memory for quite a while.  I see
people wanting to look at per-backend stuff frequently, and that is why
I thought shared memory would be good, plus a global area for aggregate
stats for all backends.

 I think you missed the point that somebody made a little while ago
 about waiting for functions that can return tuple sets.  Once we have
 that, the stats tables can be *virtual* tables, ie tables that are
 computed on-demand by some function.  That will be a lot less overhead
 than physically updating an actual table.

Yes, but do we want to keep these stats between postmaster restarts? 
And what about writing them to tables when our storage of table stats
gets too big?

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 853-3000
  +  If your life is a hard drive, |  830 Blythe Avenue
  +  Christ can be your backup.|  Drexel Hill, Pennsylvania 19026




Re: [HACKERS] Performance monitor signal handler

2001-03-17 Thread Tom Lane

Bruce Momjian [EMAIL PROTECTED] writes:
 Even better, have an SQL table updated with the per-table stats
 periodically.
 
 That will be horribly expensive, if it's a real table.

 But per-table stats aren't something that people will look at often,
 right?  They can sit in the collector's memory for quite a while.  I see
 people wanting to look at per-backend stuff frequently, and that is why
 I thought shared memory should be good, with a global area for aggregate
 stats for all backends.

 I think you missed the point that somebody made a little while ago
 about waiting for functions that can return tuple sets.  Once we have
 that, the stats tables can be *virtual* tables, ie tables that are
 computed on-demand by some function.  That will be a lot less overhead
 than physically updating an actual table.

 Yes, but do we want to keep these stats between postmaster restarts? 
 And what about writing them to tables when our storage of table stats
 gets too big?

All those points seem to me to be arguments in *favor* of a virtual-
table approach, not arguments against it.

Or are you confusing the method of collecting stats with the method
of making the collected stats available for use?

regards, tom lane




Re: [HACKERS] Performance monitor signal handler

2001-03-17 Thread Bruce Momjian

  But per-table stats aren't something that people will look at often,
  right?  They can sit in the collector's memory for quite a while.  I see
  people wanting to look at per-backend stuff frequently, and that is why
  I thought shared memory should be good, with a global area for aggregate
  stats for all backends.
 
  I think you missed the point that somebody made a little while ago
  about waiting for functions that can return tuple sets.  Once we have
  that, the stats tables can be *virtual* tables, ie tables that are
  computed on-demand by some function.  That will be a lot less overhead
  than physically updating an actual table.
 
  Yes, but do we want to keep these stats between postmaster restarts? 
  And what about writing them to tables when our storage of table stats
  gets too big?
 
 All those points seem to me to be arguments in *favor* of a virtual-
 table approach, not arguments against it.
 
 Or are you confusing the method of collecting stats with the method
 of making the collected stats available for use?

Maybe I am confusing them.  I didn't see a distinction in the
discussion.

I assumed the UDP/message passing of information to the collector was
the way statistics were collected, and I don't understand why a
per-backend area and a global area, with some kind of circular buffer
for per-table stuff, isn't the cheapest, cleanest solution.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 853-3000
  +  If your life is a hard drive, |  830 Blythe Avenue
  +  Christ can be your backup.|  Drexel Hill, Pennsylvania 19026




Re: [HACKERS] Performance monitor signal handler

2001-03-17 Thread Jan Wieck

Tom Lane wrote:
 Samuel Sieb [EMAIL PROTECTED] writes:
  Just as another suggestion, what about sending the data to a different
  computer, so instead of tying up the database server with processing the
  statistics, you have another computer that has some free time to do the
  processing.

  Some drawbacks are that you can't automatically start/restart it from the
  postmaster and it will put a little more load on the network,

 ... and a lot more load on the CPU.  Same-machine "network" connections
 are much cheaper (on most kernels, anyway) than real network
 connections.

 I think all of this discussion is vast overkill.  No one has yet
 demonstrated that it's not sufficient to have *one* collector process
 and a lossy transmission method.  Let's try that first, and if it really
 proves to be unworkable then we can get out the lily-gilding equipment.
 But there is tons more stuff to do before we have useful stats at all,
 and I don't think that this aspect is the most critical part of the
 problem.

Well,

back  to my initial approach with the UDP socket collector. I
now have a collector simply reading  all  messages  from  the
socket.  It  doesn't  do  anything useful except for counting
their number.

Every backend sends a couple  of  1K  junk  messages  at  the
beginning  of  the  main loop. Up to 16 messages, there is no
time(1) measurable  delay  in  the  execution  of  the  "make
runcheck".

The dummy collector can keep up during the parallel
regression test until the backends send 64 messages each
time; at that point it lost 1.25% of the messages.  That is
about 256MB of statistics data to be collected.  Most of the
test queries will never generate 1K of messages, so there
should be some headroom here.

My plan  now  is  to  add  some  real  functionality  to  the
collector and the backend, to see if that has an impact.


Jan

--

#==#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.  #
#== [EMAIL PROTECTED] #



_
Do You Yahoo!?
Get your free @yahoo.com address at http://mail.yahoo.com





Re: [HACKERS] Performance monitor signal handler

2001-03-16 Thread Tom Lane

Jan Wieck [EMAIL PROTECTED] writes:
 Uh - not much time to spend if the statistics should at least
 be  half  accurate. And it would become worse in SMP systems.
 So that was a nifty idea, but I think it'd  cause  much  more
 statistic losses than I assumed at first.

 Back to drawing board. Maybe a SYS-V message queue can serve?

That would be the same as a pipe: backends would block if the collector
stopped accepting data.  I do like the "auto discard" aspect of this
UDP-socket approach.

I think Philip had the right idea: each backend should send totals,
not deltas, in its messages.  Then, it doesn't matter (much) if the
collector loses some messages --- that just means that sometimes it
has a slightly out-of-date idea about how much work some backends have
done.  It should be easy to design the software so that that just makes
a small, transient error in the currently displayed statistics.
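
The loss-tolerance of totals can be sketched in a few lines of C.  The
struct and function names here are hypothetical illustrations, not from
any patch: the point is only that applying a totals message is a plain
overwrite, so a dropped message just means a stale value.

```c
#include <string.h>

/* Each backend periodically sends its *cumulative* counters.  The
 * collector keeps one slot per backend and simply replaces its
 * last-known snapshot, so a lost message only leaves the displayed
 * numbers slightly stale until the next message arrives. */
typedef struct backend_totals
{
    long        queries_executed;
    long        blocks_fetched;
} backend_totals;

void
collector_apply(backend_totals *slot, const backend_totals *msg)
{
    /* newer totals simply supersede older ones; applying the same
     * message twice changes nothing */
    memcpy(slot, msg, sizeof(*slot));
}
```

With deltas, a dropped message would be a permanent undercount; with
totals, the error heals itself on the next message.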

regards, tom lane




Re: [HACKERS] Performance monitor signal handler

2001-03-16 Thread Philip Warner

At 17:10 15/03/01 -0800, Alfred Perlstein wrote:
 
 Which is why the backends should not do anything other than maintain the
 raw data. If there is atomic data that can cause inconsistency, then a
 dropped UDP packet will do the same.

The UDP packet (a COPY) can contain a consistent snapshot of the data.
If you have dependencies, you fit a consistent snapshot into a single
packet.

If we were going to go the shared memory way, then yes, as soon as we
start collecting dependent data we would need locking, but IOs, locking
stats, flushes, and cache hits/misses are not really in this category.

But I prefer the UDP/Collector model anyway; it gives us greater
flexibility + the ability to keep stats past backend termination, and,
as you say, removes any possible locking requirements from the backends.




Philip Warner| __---_
Albatross Consulting Pty. Ltd.   |/   -  \
(A.B.N. 75 008 659 498)  |  /(@)   __---_
Tel: (+61) 0500 83 82 81 | _  \
Fax: (+61) 0500 83 82 82 | ___ |
Http://www.rhyme.com.au  |/   \|
 |----
PGP key available upon request,  |  /
and from pgp5.ai.mit.edu:11371   |/




Re: [HACKERS] Performance monitor signal handler

2001-03-16 Thread Jan Wieck

Alfred Perlstein wrote:
 * Jan Wieck [EMAIL PROTECTED] [010316 08:08] wrote:
  Philip Warner wrote:
  
    But I prefer the UDP/Collector model anyway; it gives us greater
    flexibility + the ability to keep stats past backend termination, and,
    as you say, removes any possible locking requirements from the backends.
 
  OK, did some tests...
 
  The postmaster can create a SOCK_DGRAM socket at startup and
  bind(2) it to "127.0.0.1:0", which causes the kernel to assign
  a non-privileged port number that then can be read with
  getsockname(2). No other process can have a socket with the
  same port number for the lifetime of the postmaster.
 
  If the socket gets ready, it'll read one backend message
  from it with recvfrom(2). The fromaddr must be
  "127.0.0.1:xxx" where xxx is the port number the kernel
  assigned to the above socket.  Yes, this is its own one,
  shared with postmaster and all backends.  So both the
  postmaster and the backends can use this one UDP socket,
  which the backends inherit on fork(2), to send messages to
  the collector. If such a UDP packet really came from a
  process other than the postmaster or a backend, well then the
  sysadmin has a more severe problem than manipulated DB
  runtime statistics :-)

 Doing this is a bad idea:

 a) it allows any program to start spamming localhost:randport with
 messages and screw with the postmaster.

 b) it may even allow remote people to mess with it, (see recent
 bugtraq articles about this)

So  it's  possible  for  a  UDP socket to recvfrom(2) and get
packets with  a  fromaddr  localhost:my_own_non_SO_REUSE_port
that really came from somewhere else?

If that's possible, the packets must be coming over the
network.  Otherwise it's the local superuser sending them, and
in that case it's not worth any more discussion, because root
on your system has more powerful ways to muck around
with your database. And if someone outside the local system
is doing it, it's time for some filter rules, isn't it?

 You should use a unix domain socket (at least when possible).

Unix domain UDP?


  Running a 500MHz P-III, 192MB, RedHat 6.1 Linux 2.2.17 here,
  I did not lose a single message during the parallel
  regression test, if each backend sends one 1K sized message
  per query executed and the collector simply sucks them out
  of the socket. Message losses start if the collector does a
  per message idle loop like this:
 
  for (i = 0, sum = 0; i < 25; i++, sum += 1);
 
  Uh - not much time to spend if the statistics should at least
  be  half  accurate. And it would become worse in SMP systems.
  So that was a nifty idea, but I think it'd  cause  much  more
  statistic losses than I assumed at first.
 
  Back to drawing board. Maybe a SYS-V message queue can serve?

 I wouldn't say back to the drawing board, I would say two steps back.

 What about instead of sending deltas, you send totals?  This would
 allow you to lose messages and still maintain accurate stats.

Similar problem as with shared memory - size.  If a long
running backend of a multithousand-table database needs to
send access stats per table - and had accessed them all up to
now - it'll be a lot of wasted bandwidth.


 You can also enable SIGIO on the socket, then have a signal handler
 buffer packets that arrive when not actively select()ing on the
 UDP socket.  You can then use sigsetmask(2) to provide mutual
 exclusion with your SIGIO handler and general select()ing on the
 socket.

I already thought of prioritizing the socket drain this way:
there is a fairly big receive buffer.  If the buffer is empty,
the collector does a blocking select(2).  If it's not, it does
a non-blocking (0-timeout) one, and only if that non-blocking
select reports no new messages waiting does it process one
buffered message and try to receive again.

Will give it a shot.
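
A minimal sketch of the readiness check this drain priority needs
(`socket_has_data` is a hypothetical helper name): with a zero timeout,
select(2) returns immediately instead of waiting, so the collector can
ask "is there more on the wire?" before touching its backlog.

```c
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>

/* Non-blocking readiness check: a zero timeout makes select(2)
 * return immediately.  The drain loop then becomes: buffer empty ->
 * blocking select; buffer non-empty -> this check, and only when it
 * reports nothing new, process one buffered message and receive
 * again. */
int
socket_has_data(int fd)
{
    fd_set          rfds;
    struct timeval  tv = {0, 0};    /* poll, don't wait */

    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);
    return select(fd + 1, &rfds, NULL, NULL, &tv) > 0;
}
```

The same check works for pipes and UDP sockets alike, since both are
plain file descriptors to select(2).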


Jan

--

#==#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.  #
#== [EMAIL PROTECTED] #



_
Do You Yahoo!?
Get your free @yahoo.com address at http://mail.yahoo.com





Re: [HACKERS] Performance monitor signal handler

2001-03-16 Thread Jan Wieck


Tom Lane wrote:
 Jan Wieck [EMAIL PROTECTED] writes:
  Uh - not much time to spend if the statistics should at least
  be  half  accurate. And it would become worse in SMP systems.
  So that was a nifty idea, but I think it'd  cause  much  more
  statistic losses than I assumed at first.

  Back to drawing board. Maybe a SYS-V message queue can serve?

 That would be the same as a pipe: backends would block if the collector
 stopped accepting data.  I do like the "auto discard" aspect of this
 UDP-socket approach.

Does a pipe guarantee that a buffer, written with one atomic
write(2), never can get intermixed with other data at the
reader's end?  I know that you know what I mean, but for the
broader audience: Let's define a message to the collector to
be 4byte-len,len-bytes.  Now hundreds of backends hammer
messages into the (shared) writing end of the pipe, all with
different sizes.  Is it GUARANTEED that a
read(4 bytes), read(n bytes) sequence will always return one
complete message and never intermixed parts of different
write(2)s?

With message queues, this is guaranteed. Also, message queues
would  make  it  easy  to query the collected statistics (see
below).

 I think Philip had the right idea: each backend should send totals,
 not deltas, in its messages.  Then, it doesn't matter (much) if the
 collector loses some messages --- that just means that sometimes it
 has a slightly out-of-date idea about how much work some backends have
 done.  It should be easy to design the software so that that just makes
 a small, transient error in the currently displayed statistics.

If we use two message queues (IPC_PRIVATE  is  enough  here),
one  into collector and one into backend direction, this'd be
an easy way to collect and query statistics.

The backends send delta stats messages to the collector on
one queue.  Message queues block by default, but the backend
could use IPC_NOWAIT and just go on and collect up, as long
as it finally uses a blocking call before exiting.  We'll
lose statistics for backends that go down in flames
(coredump), but who cares about statistics then?

To query statistics, we have a set of new builtin functions.
All functions share a global statistics snapshot in the
backend.  If, on function call, the snapshot doesn't exist or
was generated by another XACT/commandcounter, the backend
sends a statistics request for its database ID to the
collector and waits for the messages to arrive on the second
message queue.  It can pick up the messages meant for it via
message type, which is equal to its backend number + 1,
because the collector will send 'em as such.  For table access
stats, for example, the snapshot will have slots identified by
the table's OID, so a function pg_get_tables_seqscan_count(oid)
should be easy to implement.  And setting up views that
present access stats in readable format is a no-brainer.
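
The mtype addressing could look roughly like this (struct and function
names are hypothetical, not from any patch): the collector tags each
reply with mtype = backend number + 1 (mtype must be positive), and
msgrcv(2) lets each backend pick up only messages addressed to it.

```c
#include <stddef.h>
#include <string.h>
#include <sys/ipc.h>
#include <sys/msg.h>

typedef struct stat_reply
{
    long    mtype;              /* backend number + 1 */
    char    mtext[64];
} stat_reply;

/* collector side: address a reply to one backend */
int
collector_send_reply(int mq, int backend_no, const char *text)
{
    stat_reply  msg;

    msg.mtype = backend_no + 1;
    memset(msg.mtext, 0, sizeof(msg.mtext));
    strncpy(msg.mtext, text, sizeof(msg.mtext) - 1);
    return msgsnd(mq, &msg, sizeof(msg.mtext), 0);
}

/* backend side: the kernel filters on mtype for us */
int
backend_recv_reply(int mq, int backend_no, char *buf, size_t buflen)
{
    stat_reply  msg;

    if (msgrcv(mq, &msg, sizeof(msg.mtext), backend_no + 1, 0) < 0)
        return -1;
    strncpy(buf, msg.mtext, buflen - 1);
    buf[buflen - 1] = '\0';
    return 0;
}
```

This is the kernel-side filtering that pipes and UDP sockets lack, and
it is what makes the two-queue scheme attractive despite Sys-V IPC's
other warts.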

Now  we  have communication only between the backends and the
collector.  And we're  certain  that  only  someone  able  to
SELECT from a system view will ever see this information.


Jan

--

#==#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.  #
#== [EMAIL PROTECTED] #


_
Do You Yahoo!?
Get your free @yahoo.com address at http://mail.yahoo.com





Re: [HACKERS] Performance monitor signal handler

2001-03-16 Thread Tom Lane

Jan Wieck [EMAIL PROTECTED] writes:
 Does  a pipe guarantee that a buffer, written with one atomic
 write(2), never can get intermixed with  other  data  on  the
 readers  end?

Yes.  The HPUX man page for write(2) sez:

  o  Write requests of {PIPE_BUF} bytes or less will not be
 interleaved with data from other processes doing writes on the
 same pipe.  Writes of greater than {PIPE_BUF} bytes may have
 data interleaved, on arbitrary boundaries, with writes by
 other processes, whether or not the O_NONBLOCK flag of the
 file status flags is set.

Stevens' _UNIX Network Programming_ (1990) states this is true for all
pipes (nameless or named) on all flavors of Unix, and furthermore states
that PIPE_BUF is at least 4K on all systems.  I don't have any relevant
Posix standards to look at, but I'm not worried about assuming this to
be true.

 With message queues, this is guaranteed. Also, message queues
 would  make  it  easy  to query the collected statistics (see
 below).

I will STRONGLY object to any proposal that we use message queues.
We've already had enough problems with the ridiculously low kernel
limits that are commonly imposed on shmem and SysV semaphores.
We don't need to buy into that silliness yet again with message queues.
I don't believe they gain us anything over pipes anyway.

The real problem with either pipes or message queues is that backends
will block if the collector stops collecting data.  I don't think we
want that.  I suppose we could have the backends write a pipe with
O_NONBLOCK and ignore failure, however:

  o  If the O_NONBLOCK flag is set, write() requests will  be
 handled differently, in the following ways:

 -  The write() function will not block the process.

 -  A write request for {PIPE_BUF} or fewer bytes  will have
the following effect:  If there is sufficient space
available in the pipe, write() will transfer all the data
and return the number of bytes  requested.  Otherwise,
write() will transfer no data and return -1 with errno set
to EAGAIN.

Since we already ignore SIGPIPE, we don't need to worry about losing the
collector entirely.

Now this would put a pretty tight time constraint on the collector:
fall more than 4K behind, you start losing data.  I am not sure if
a UDP socket would provide more buffering or not; anyone know?
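
The buffering question can be answered empirically: SO_RCVBUF reports
how many bytes the kernel will queue on a socket before it starts
discarding datagrams (on Linux the returned value includes some
bookkeeping overhead).  A sketch, with a hypothetical function name:

```c
#include <sys/socket.h>

/* Query the kernel's receive-buffer size for a socket. */
int
get_recv_buffer_size(int fd)
{
    int         rcvbuf = 0;
    socklen_t   len = sizeof(rcvbuf);

    if (getsockopt(fd, SOL_SOCKET, SO_RCVBUF, &rcvbuf, &len) < 0)
        return -1;
    return rcvbuf;
}
```

The same call on a pipe fd fails (pipes aren't sockets), which is why
the pipe capacity had to be measured by writing until EAGAIN instead.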

regards, tom lane




Re: [HACKERS] Performance monitor signal handler

2001-03-16 Thread Jan Wieck

Tom Lane wrote:
 Jan Wieck [EMAIL PROTECTED] writes:
  Does  a pipe guarantee that a buffer, written with one atomic
  write(2), never can get intermixed with  other  data  on  the
  readers  end?

 Yes.  The HPUX man page for write(2) sez:

   o  Write requests of {PIPE_BUF} bytes or less will not be
  interleaved with data from other processes doing writes on the
  same pipe.  Writes of greater than {PIPE_BUF} bytes may have
  data interleaved, on arbitrary boundaries, with writes by
  other processes, whether or not the O_NONBLOCK flag of the
  file status flags is set.

 Stevens' _UNIX Network Programming_ (1990) states this is true for all
 pipes (nameless or named) on all flavors of Unix, and furthermore states
 that PIPE_BUF is at least 4K on all systems.  I don't have any relevant
 Posix standards to look at, but I'm not worried about assuming this to
 be true.

That's good news - and maybe a Good Assumption (TM).

  With message queues, this is guaranteed. Also, message queues
  would  make  it  easy  to query the collected statistics (see
  below).

 I will STRONGLY object to any proposal that we use message queues.
 We've already had enough problems with the ridiculously low kernel
 limits that are commonly imposed on shmem and SysV semaphores.
 We don't need to buy into that silliness yet again with message queues.
 I don't believe they gain us anything over pipes anyway.

   OK.

 The real problem with either pipes or message queues is that backends
 will block if the collector stops collecting data.  I don't think we
 want that.  I suppose we could have the backends write a pipe with
 O_NONBLOCK and ignore failure, however:

   o  If the O_NONBLOCK flag is set, write() requests will  be
  handled differently, in the following ways:

  -  The write() function will not block the process.

  -  A write request for {PIPE_BUF} or fewer bytes  will have
 the following effect:  If there is sufficient space
 available in the pipe, write() will transfer all the data
 and return the number of bytes  requested.  Otherwise,
 write() will transfer no data and return -1 with errno set
 to EAGAIN.

 Since we already ignore SIGPIPE, we don't need to worry about losing the
 collector entirely.

That's not what the manpage said.  It said that as long as
you stay inside PIPE_BUF size and use O_NONBLOCK, you either
send complete messages or nothing, getting an EAGAIN then.

So  we  could  do the same here and write to the pipe. In the
case we cannot, just count up and try  again  next  year  (or
so).


 Now this would put a pretty tight time constraint on the collector:
 fall more than 4K behind, you start losing data.  I am not sure if
 a UDP socket would provide more buffering or not; anyone know?

Again, this isn't what the manpage said.  That there must be
sufficient space available in the pipe, in combination with
PIPE_BUF being at least 4K, doesn't necessarily mean that
the pipe's buffer space is only 4K.

Well, what I'm missing is the ability to filter out
statistics reports on the backend side via msgrcv(2)'s
msgtype :-(


Jan

--

#==#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.  #
#== [EMAIL PROTECTED] #



_
Do You Yahoo!?
Get your free @yahoo.com address at http://mail.yahoo.com





Re: [HACKERS] Performance monitor signal handler

2001-03-16 Thread Jan Wieck

Tom Lane wrote:
 Now this would put a pretty tight time constraint on the collector:
 fall more than 4K behind, you start losing data.  I am not sure if
 a UDP socket would provide more buffering or not; anyone know?

Looks  like Linux has something around 16-32K of buffer space
for UDP sockets. Just from eyeballing the  fprintf(3)  output
of my destructively hacked postleprechaun.


Jan

--

#==#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.  #
#== [EMAIL PROTECTED] #



_
Do You Yahoo!?
Get your free @yahoo.com address at http://mail.yahoo.com





Re: [HACKERS] Performance monitor signal handler

2001-03-16 Thread Jan Wieck

Jan Wieck wrote:
 Tom Lane wrote:
  Now this would put a pretty tight time constraint on the collector:
  fall more than 4K behind, you start losing data.  I am not sure if
  a UDP socket would provide more buffering or not; anyone know?

 Looks  like Linux has something around 16-32K of buffer space
 for UDP sockets. Just from eyeballing the  fprintf(3)  output
 of my destructively hacked postleprechaun.

Just  to  get  some  evidence  at hand - could some owners of
different platforms compile and run  the  attached  little  C
source please?

(The  program  tests how much data can be stuffed into a pipe
or a Sys-V message queue before the writer would block or get
an EAGAIN error).

My output on RedHat6.1 Linux 2.2.17 is:

Pipe buffer is 4096 bytes
Sys-V message queue buffer is 16384 bytes

Seems Tom is (unfortunately) right. The pipe blocks at 4K.

So a Sys-V message queue, with the ability to distribute
messages from the collector to individual backends with
kernel support via "mtype", is four times better here - at
some not-yet-estimated cost in complexity.  What does your
system say?

I really never thought that Sys-V IPC is a good way to go at
all.  I hate its incompatibility with the select(2) system
call and all these OS/installation dependent restrictions.
But I'm tempted to reevaluate it "for this case".


Jan

--

#==#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.  #
#== [EMAIL PROTECTED] #




#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <fcntl.h>
#include <errno.h>

#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>


typedef struct  test_message
{
    long            mtype;
    char            mtext[512 - sizeof(long)];
} test_message;


static int  test_pipe(void);
static int  test_msg(void);


int
main(int argc, char *argv[])
{
    if (test_pipe() < 0)
        return 1;

    if (test_msg() < 0)
        return 1;

return 0;
}


static int
test_pipe(void)
{
int p[2];
charbuf[512];
int done;
int rc;

    if (pipe(p) < 0)
{
perror("pipe(2)");
return -1;
}

    if (fcntl(p[1], F_SETFL, O_NONBLOCK) < 0)
{
perror("fcntl(2)");
return -1;
}

for(done = 0; ; )
{
if ((rc = write(p[1], buf, sizeof(buf))) != sizeof(buf))
{
            if (rc < 0)
            {
                if (errno == EAGAIN)
{
printf("Pipe buffer is %d bytes\n", done);
return 0;
}

perror("write(2)");
return -1;
}

fprintf(stderr, "whatever happened - rc = %d on write(2)\n", rc);
return -1;
}
done += rc;
}

fprintf(stderr, "Endless write loop returned - what's that?\n");
return -1;
}


static int
test_msg(void)
{
int mq;
test_messagemsg;
int done;

    if ((mq = msgget(IPC_PRIVATE, IPC_CREAT | 0600)) < 0)
{
perror("msgget(2)");
return -1;
}

for (done = 0; ; )
{
msg.mtype = 1;
if (msgsnd(mq, msg, sizeof(msg), IPC_NOWAIT)  0)
{
extern int  errno;

if (errno == EAGAIN)
{
printf("Sys-V message queue buffer is %d bytes\n", done);
return 0;
}

perror("msgsnd(2)");
return -1;
}
        done += sizeof(msg.mtext);
}

fprintf(stderr, "Endless write loop returned - what's that?\n");
return -1;
}








Re: [HACKERS] Performance monitor signal handler

2001-03-16 Thread Tom Lane

Jan Wieck [EMAIL PROTECTED] writes:
 Just  to  get  some  evidence  at hand - could some owners of
 different platforms compile and run  the  attached  little  C
 source please?

HPUX 10.20:

Pipe buffer is 8192 bytes
Sys-V message queue buffer is 16384 bytes

regards, tom lane




Re: [HACKERS] Performance monitor signal handler

2001-03-16 Thread Giles Lean


 Just  to  get  some  evidence  at hand - could some owners of
 different platforms compile and run  the  attached  little  C
 source please?

$ uname -srm
FreeBSD 4.1.1-STABLE
$ ./jan
Pipe buffer is 16384 bytes
Sys-V message queue buffer is 2048 bytes

$ uname -srm
NetBSD 1.5 alpha
$ ./jan
Pipe buffer is 4096 bytes
Sys-V message queue buffer is 2048 bytes

$ uname -srm
NetBSD 1.5_BETA2 i386
$ ./jan
Pipe buffer is 4096 bytes
Sys-V message queue buffer is 2048 bytes

$ uname -srm
NetBSD 1.4.2 i386
$ ./jan
Pipe buffer is 4096 bytes
Sys-V message queue buffer is 2048 bytes

$ uname -srm
NetBSD 1.4.1 sparc
$ ./jan
Pipe buffer is 4096 bytes
Bad system call (core dumped)   # no SysV IPC in running kernel

$ uname -srm
HP-UX B.11.11 9000/800
$ ./jan
Pipe buffer is 8192 bytes
Sys-V message queue buffer is 16384 bytes

$ uname -srm
HP-UX B.11.00 9000/813
$ ./jan
Pipe buffer is 8192 bytes
Sys-V message queue buffer is 16384 bytes

$ uname -srm
HP-UX B.10.20 9000/871
$ ./jan
Pipe buffer is 8192 bytes
Sys-V message queue buffer is 16384 bytes

HP-UX can also use STREAMS based pipes if the kernel parameter
streampipes is set.  Using STREAMS based pipes increases the pipe
buffer size by a lot:

# uname -srm 
HP-UX B.11.11 9000/800
# ./jan
Pipe buffer is 131072 bytes
Sys-V message queue buffer is 16384 bytes

# uname -srm
HP-UX B.11.00 9000/800
# ./jan
Pipe buffer is 131072 bytes
Sys-V message queue buffer is 16384 bytes

Regards,

Giles




Re: [HACKERS] Performance monitor signal handler

2001-03-16 Thread Larry Rosenman

* Jan Wieck [EMAIL PROTECTED] [010316 16:35]:
 Jan Wieck wrote:
  Tom Lane wrote:
   Now this would put a pretty tight time constraint on the collector:
   fall more than 4K behind, you start losing data.  I am not sure if
   a UDP socket would provide more buffering or not; anyone know?
 
  Looks  like Linux has something around 16-32K of buffer space
  for UDP sockets. Just from eyeballing the  fprintf(3)  output
  of my destructively hacked postleprechaun.
 
 Just  to  get  some  evidence  at hand - could some owners of
 different platforms compile and run  the  attached  little  C
 source please?
 
 (The  program  tests how much data can be stuffed into a pipe
 or a Sys-V message queue before the writer would block or get
 an EAGAIN error).
 
 My output on RedHat6.1 Linux 2.2.17 is:
 
 Pipe buffer is 4096 bytes
 Sys-V message queue buffer is 16384 bytes
 
 Seems Tom is (unfortunately) right. The pipe blocks at 4K.
 
 So a Sys-V message queue, with the ability to distribute
 messages from the collector to individual backends with
 kernel support via "mtype", is four times better here - at
 some not-yet-estimated cost in complexity.  What does your system say?
 
 I really never thought that Sys-V IPC is a good way to go at
 all.  I hate its incompatibility with the select(2) system
 call and all these OS/installation dependent restrictions.
 But I'm tempted to reevaluate it "for this case".
 
 
 Jan
$ ./queuetest
Pipe buffer is 32768 bytes
Sys-V message queue buffer is 4096 bytes
$ uname -a
UnixWare lerami 5 7.1.1 i386 x86at SCO UNIX_SVR5
$ 

I think some of these are configurable...

LER

-- 
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 972-414-9812 E-Mail: [EMAIL PROTECTED]
US Mail: 1905 Steamboat Springs Drive, Garland, TX 75044-6749




Re: [HACKERS] Performance monitor signal handler

2001-03-16 Thread Larry Rosenman

* Larry Rosenman [EMAIL PROTECTED] [010316 20:47]:
 * Jan Wieck [EMAIL PROTECTED] [010316 16:35]:
 $ ./queuetest
 Pipe buffer is 32768 bytes
 Sys-V message queue buffer is 4096 bytes
 $ uname -a
 UnixWare lerami 5 7.1.1 i386 x86at SCO UNIX_SVR5
 $ 
 
 I think some of these are configurable...
They both are.  FIFOBLKSIZE and MSGMNB or some such kernel tunable.

I can get more info if you need it.

LER

-- 
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 972-414-9812 E-Mail: [EMAIL PROTECTED]
US Mail: 1905 Steamboat Springs Drive, Garland, TX 75044-6749




Re: [HACKERS] Performance monitor signal handler

2001-03-15 Thread Tom Lane

Jan Wieck [EMAIL PROTECTED] writes:
 What about a collector daemon, fired up by the postmaster and
 receiving UDP packets from the backends. Under heavy load, it
 might miss some statistic messages, well, but that's not as
 bad as having locks causing backends to lose performance.

Interesting thought, but we don't want UDP I think; that just opens
up a whole can of worms about checking access permissions and so forth.
Why not a simple pipe?  The postmaster creates the pipe and the
collector daemon inherits one end, while all the backends inherit the
other end.

regards, tom lane




Re: [HACKERS] Performance monitor signal handler

2001-03-15 Thread Jan Wieck

Tom Lane wrote:
 Jan Wieck [EMAIL PROTECTED] writes:
  What about a collector daemon, fired up by the postmaster and
  receiving UDP packets from the backends. Under heavy load, it
  might miss some statistic messages, well, but that's not as
  bad as having locks causing backends to lose performance.

 Interesting thought, but we don't want UDP I think; that just opens
 up a whole can of worms about checking access permissions and so forth.
 Why not a simple pipe?  The postmaster creates the pipe and the
 collector daemon inherits one end, while all the backends inherit the
 other end.

I don't think so. Though I haven't tested the following yet,
AFAIR it's correct.

Have the postmaster create two UDP sockets before it forks
off the collector. It can examine the peer addresses of both,
so they don't need well-known port numbers; they can be the
random ones assigned by the kernel. Thus, we don't need
SO_REUSE on them either.

Now, since the collector is forked off by the postmaster, it
knows the peer address of the other socket. And since all
backends get forked off from the postmaster as well, they'll
all use the same peer address, won't they? So all the
collector has to look at is the sender address, including
port number, of the packets. It needs to be what the
postmaster examined; anything else is from someone else and
goes to bit heaven. The same way, the backends know where to
send their statistics.

If I'm right that in the case of fork() all children share
the same socket with the same peer address, then it's even
safe in the case the collector dies. The postmaster can still
hold the collector's socket and will notice that the collector
died (due to a wait() returning its PID) and can fire up
another one. Again some packets get lost (plus all the so far
collected statistics, hmmm - ain't that a cool way to reset
statistic counters, killing the collector?), but it does not
disturb any live backend in any way. They will never get any
signal, don't care about what's done with their statistics
and such. They just do their work...


Jan

--

#==#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.  #
#== [EMAIL PROTECTED] #







Re: [HACKERS] Performance monitor signal handler

2001-03-15 Thread Alfred Perlstein

* Philip Warner [EMAIL PROTECTED] [010315 16:14] wrote:
 At 06:57 15/03/01 -0500, Jan Wieck wrote:
 
 And  shared  memory has all the interlocking problems we want
 to avoid.
 
 I suspect that if we keep per-backend data in a separate area, then we
 don't need locking since there is only one writer. It does not matter if a
 reader gets an inconsistent view, the same as if you drop a few UDP packets.

No, this is completely different.

Lost data is probably better than incorrect data.  Either use locks
or a copying mechanism.  People will depend on the data returned
making sense.

-- 
-Alfred Perlstein - [[EMAIL PROTECTED]|[EMAIL PROTECTED]]





Re: [HACKERS] Performance monitor signal handler

2001-03-15 Thread Philip Warner

At 06:57 15/03/01 -0500, Jan Wieck wrote:

And  shared  memory has all the interlocking problems we want
to avoid.

I suspect that if we keep per-backend data in a separate area, then we
don't need locking since there is only one writer. It does not matter if a
reader gets an inconsistent view, the same as if you drop a few UDP packets.


What about a collector daemon, fired up by the postmaster and
receiving UDP packets from the backends. 

This does sound appealing; it means that individual backend data (IO etc)
will survive past the termination of the backend. I'd like to see the stats
survive the death of the collector if possible, possibly even survive a
stop/start of the postmaster.


Now whatever the backend has to tell the collector, it simply
throws a UDP packet in its direction. Whether the collector
catches it or not is not the backend's problem.

If we get the backends to keep the stats they are sending in local counters
as well, then they can send the counter value (not the delta) each time, which
would mean that the collector would not 'miss' anything - just its
operations/sec might see a hiccough. This could have a side benefit that (if
we wanted to?) we could allow a client to query their own counters to get an
idea of the costs of their queries.

When we need to reset the counters, that should be done explicitly, I think.
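The totals-not-deltas idea fits in a few lines. A sketch (the field names are illustrative), with one report dropped in transit:

```python
# Backends send running totals; the collector simply overwrites its copy.
# A lost datagram never corrupts the totals -- it only delays them.
collector_view = {}

def receive_report(backend_id, totals):
    """Collector side: the latest totals replace the previous ones."""
    collector_view[backend_id] = dict(totals)

receive_report(1, {"scans": 10, "tuples": 200})
# second report {"scans": 25, "tuples": 450} is lost in transit: nothing to do
receive_report(1, {"scans": 40, "tuples": 700})

# The view is exact again after the next packet arrives.  With deltas,
# the lost 15 scans / 250 tuples would be gone for good.
```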



Philip Warner| __---_
Albatross Consulting Pty. Ltd.   |/   -  \
(A.B.N. 75 008 659 498)  |  /(@)   __---_
Tel: (+61) 0500 83 82 81 | _  \
Fax: (+61) 0500 83 82 82 | ___ |
Http://www.rhyme.com.au  |/   \|
 |----
PGP key available upon request,  |  /
and from pgp5.ai.mit.edu:11371   |/




Re: [HACKERS] Performance monitor signal handler

2001-03-15 Thread Philip Warner

At 16:17 15/03/01 -0800, Alfred Perlstein wrote:

Lost data is probably better than incorrect data.  Either use locks
or a copying mechanism.  People will depend on the data returned
making sense.


But with per-backend data, there is only ever *one* writer to a given set
of counters. Everyone else is a reader.



Philip Warner| __---_
Albatross Consulting Pty. Ltd.   |/   -  \
(A.B.N. 75 008 659 498)  |  /(@)   __---_
Tel: (+61) 0500 83 82 81 | _  \
Fax: (+61) 0500 83 82 82 | ___ |
Http://www.rhyme.com.au  |/   \|
 |----
PGP key available upon request,  |  /
and from pgp5.ai.mit.edu:11371   |/




Re: [HACKERS] Performance monitor signal handler

2001-03-15 Thread Alfred Perlstein

* Philip Warner [EMAIL PROTECTED] [010315 16:46] wrote:
 At 16:17 15/03/01 -0800, Alfred Perlstein wrote:
 
 Lost data is probably better than incorrect data.  Either use locks
 or a copying mechanism.  People will depend on the data returned
 making sense.
 
 
 But with per-backend data, there is only ever *one* writer to a given set
 of counters. Everyone else is a reader.

This doesn't prevent a reader from getting an inconsistent view.

Think about a 64-bit counter on a 32-bit machine.  If you charged per
megabyte, wouldn't it upset you to have a small chance of losing
4 billion units of sale?

(i.e., doing a read after an addition that wraps the low 32 bits
but before the carry is done to the topmost significant 32 bits?)

OK, but what if everything can be read atomically by itself?

You're still busted the minute you need to export any sort of
compound stat.

If A, B and C need to add up to 100 you have a read race.
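The torn-read hazard can be simulated deterministically: keep the 64-bit counter as two 32-bit words and read between the writer's two stores. A sketch of the race described above, with the interleaving forced by hand rather than by a second thread:

```python
MASK32 = 0xFFFFFFFF

class Counter64:
    """A 64-bit counter stored as two 32-bit words, as on a 32-bit CPU,
    where the writer needs two separate stores per increment."""
    def __init__(self, value):
        self.lo = value & MASK32
        self.hi = value >> 32

    def increment_low(self):          # first store: low word wraps to 0
        self.lo = (self.lo + 1) & MASK32

    def carry_high(self):             # second store: propagate the carry
        if self.lo == 0:
            self.hi += 1

    def read(self):                   # an unlocked reader
        return (self.hi << 32) | self.lo

c = Counter64(2**32 - 1)
c.increment_low()
torn = c.read()      # reader runs between the two stores: sees 0
c.carry_high()
final = c.read()     # the correct value, 2**32
```

The reader that lands between the stores sees a value that is off by exactly 2**32 - the "4 billion units of sale" above.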

-- 
-Alfred Perlstein - [[EMAIL PROTECTED]|[EMAIL PROTECTED]]





Re: [HACKERS] Performance monitor signal handler

2001-03-15 Thread Alfred Perlstein

* Philip Warner [EMAIL PROTECTED] [010315 17:08] wrote:
 At 16:55 15/03/01 -0800, Alfred Perlstein wrote:
 * Philip Warner [EMAIL PROTECTED] [010315 16:46] wrote:
  At 16:17 15/03/01 -0800, Alfred Perlstein wrote:
  
  Lost data is probably better than incorrect data.  Either use locks
  or a copying mechanism.  People will depend on the data returned
  making sense.
  
  
  But with per-backend data, there is only ever *one* writer to a given set
  of counters. Everyone else is a reader.
 
 This doesn't prevent a reader from getting an inconsistent view.
 
 Think about a 64-bit counter on a 32-bit machine.  If you charged per
 megabyte, wouldn't it upset you to have a small chance of losing
 4 billion units of sale?
 
 (i.e., doing a read after an addition that wraps the low 32 bits
 but before the carry is done to the topmost significant 32 bits?)
 
 I assume this means we can not rely on the existence of any kind of
 interlocked add on 64 bit machines?
 
 
 OK, but what if everything can be read atomically by itself?
 
 You're still busted the minute you need to export any sort of
 compound stat.
 
 Which is why the backends should not do anything other than maintain the
  raw data. If there is atomic data that can cause inconsistency, then a
 dropped UDP packet will do the same.

The UDP packet (a COPY) can contain a consistent snapshot of the data.
If you have dependencies, you fit a consistent snapshot into a single
packet.

-- 
-Alfred Perlstein - [[EMAIL PROTECTED]|[EMAIL PROTECTED]]





Re: [HACKERS] Performance monitor signal handler

2001-03-15 Thread Philip Warner

At 16:55 15/03/01 -0800, Alfred Perlstein wrote:
* Philip Warner [EMAIL PROTECTED] [010315 16:46] wrote:
 At 16:17 15/03/01 -0800, Alfred Perlstein wrote:
 
 Lost data is probably better than incorrect data.  Either use locks
 or a copying mechanism.  People will depend on the data returned
 making sense.
 
 
 But with per-backend data, there is only ever *one* writer to a given set
 of counters. Everyone else is a reader.

This doesn't prevent a reader from getting an inconsistent view.

Think about a 64-bit counter on a 32-bit machine.  If you charged per
megabyte, wouldn't it upset you to have a small chance of losing
4 billion units of sale?

(i.e., doing a read after an addition that wraps the low 32 bits
but before the carry is done to the topmost significant 32 bits?)

I assume this means we can not rely on the existence of any kind of
interlocked add on 64 bit machines?


OK, but what if everything can be read atomically by itself?

You're still busted the minute you need to export any sort of
compound stat.

Which is why the backends should not do anything other than maintain the
raw data. If there is atomic data that can cause inconsistency, then a
dropped UDP packet will do the same.





Philip Warner| __---_
Albatross Consulting Pty. Ltd.   |/   -  \
(A.B.N. 75 008 659 498)  |  /(@)   __---_
Tel: (+61) 0500 83 82 81 | _  \
Fax: (+61) 0500 83 82 82 | ___ |
Http://www.rhyme.com.au  |/   \|
 |----
PGP key available upon request,  |  /
and from pgp5.ai.mit.edu:11371   |/




Re: [HACKERS] Performance monitor

2001-03-13 Thread Denis Perchine

  Small question... Will it work in console? Or will it be X only?

 It will be Tcl/Tk, so I guess X only.

That's bad, because it will be useless for people who have their databases far away...
Like me... :-((( Another point is that it is a little bit strange to have
X-Window on the machine with the database server... At least if it is not for play,
but a production one...

Also there should be a possibility of remote monitoring of the database. But
that's just a dream... :-)))

-- 
Sincerely Yours,
Denis Perchine

--
E-Mail: [EMAIL PROTECTED]
HomePage: http://www.perchine.com/dyp/
FidoNet: 2:5000/120.5
--




RE: [HACKERS] Performance monitor

2001-03-13 Thread Mike Mascari

I don't want to look a gift horse in the mouth, but it seems to me that the 
performance monitor should wait until the now-famous query tree redesign 
which will allow for sets from functions. I realize that the shared memory 
requirements might be a bit large, but somehow Oracle accomplishes this 
nicely, with some 50 views (V$ACCESS through V$WAITSTAT) which can be 
queried, usually via SQL*DBA, for performance statistics. More than 50 
performance views may be overkill, but having the ability to fetch the 
performance statistics with normal queries sure is nice. Perhaps a 
postmaster option which would enable/disable the accumulation of 
performance statistics in shared memory might ease the hesitation against 
it?

Mike Mascari
[EMAIL PROTECTED]

-Original Message-
From:   Denis Perchine [SMTP:[EMAIL PROTECTED]]

That's bad, because it will be useless for people who have their databases far
away... Like me... :-((( Another point is that it is a little bit strange to have
X-Window on the machine with the database server... At least if it is not for play,
but a production one...

Also there should be a possibility of remote monitoring of the database. But
that's just a dream... :-)))

--
Sincerely Yours,
Denis Perchine





Re: [HACKERS] Performance monitor signal handler

2001-03-13 Thread Alfred Perlstein

* Philip Warner [EMAIL PROTECTED] [010312 18:56] wrote:
 At 13:34 12/03/01 -0800, Alfred Perlstein wrote:
 Is it possible
 to have a spinlock over it so that an external utility can take a snapshot
 of it with the spinlock held?
 
 I'd suggest that locking the stats area might be a bad idea; there is only
 one writer for each backend-specific chunk, and it won't matter a hell of a
 lot if a reader gets inconsistent views (since I assume they will be
 re-reading every second or so). All the stats area should contain is
 a bunch of counters with timestamps, I think, and the cost of writing to it
 should be kept to an absolute minimum.
 
 
 
 just some ideas..
 
 
 Unfortunately, based on prior discussions, Bruce seems quite opposed to a
 shared memory solution.

Ok, here's another nifty idea.

On receipt of the info signal, the backends collaborate to piece
together a status file.  The status file is given a temporary name.
When complete, the status file is rename(2)'d over a well-known
file.

This ought to always give a consistent snapshot of the file to
whoever opens it.
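The rename(2) trick is easy to sketch: write the snapshot under a temporary name, then atomically rename it into place, so a reader opening the well-known name always sees a complete file. Python sketch; the filename is illustrative, and os.replace() maps to rename(2) on POSIX.

```python
import os
import tempfile

SNAPSHOT = "pgstat.snapshot"   # well-known name (illustrative)

def publish(stats_text):
    """Write the snapshot under a temporary name, then rename it over
    the well-known file.  rename(2) is atomic within a filesystem, so a
    concurrent reader sees either the old file or the new one, complete,
    never a half-written mixture."""
    fd, tmp = tempfile.mkstemp(dir=".")
    with os.fdopen(fd, "w") as f:
        f.write(stats_text)
    os.replace(tmp, SNAPSHOT)

publish("backend 17: 42 scans\n")
publish("backend 17: 57 scans\n")   # atomically replaces the first snapshot

with open(SNAPSHOT) as f:
    latest = f.read()
os.remove(SNAPSHOT)
```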

-- 
-Alfred Perlstein - [[EMAIL PROTECTED]|[EMAIL PROTECTED]]
Daemon News Magazine in your snail-mail! http://magazine.daemonnews.org/





Re: [HACKERS] Performance monitor signal handler

2001-03-13 Thread Bruce Momjian

 
 This ought to always give a consistent snapshot of the file to
 whoever opens it.
 
 
 I think Tom has previously stated that there are technical reasons not to
 do IO in signal handlers, and I have philosophical problems with
 performance monitors that ask 50 backends to do file IO. I really do think
 shared memory is TWTG.

The good news is that right now pgmonitor gets all its information from
'ps', and only shows the query when the user asks for it.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 853-3000
  +  If your life is a hard drive, |  830 Blythe Avenue
  +  Christ can be your backup.|  Drexel Hill, Pennsylvania 19026




Re: [HACKERS] Performance monitor signal handler

2001-03-13 Thread Philip Warner


This ought to always give a consistent snapshot of the file to
whoever opens it.


I think Tom has previously stated that there are technical reasons not to
do IO in signal handlers, and I have philosophical problems with
performance monitors that ask 50 backends to do file IO. I really do think
shared memory is TWTG.





Philip Warner| __---_
Albatross Consulting Pty. Ltd.   |/   -  \
(A.B.N. 75 008 659 498)  |  /(@)   __---_
Tel: (+61) 0500 83 82 81 | _  \
Fax: (+61) 0500 83 82 82 | ___ |
Http://www.rhyme.com.au  |/   \|
 |----
PGP key available upon request,  |  /
and from pgp5.ai.mit.edu:11371   |/




Re: [HACKERS] Performance monitor signal handler

2001-03-13 Thread Alfred Perlstein

* Philip Warner [EMAIL PROTECTED] [010313 06:42] wrote:
 
 This ought to always give a consistent snapshot of the file to
 whoever opens it.
 
 
 I think Tom has previously stated that there are technical reasons not to
 do IO in signal handlers, and I have philosophical problems with
 performance monitors that ask 50 backends to do file IO. I really do think
 shared memory is TWTG.

I wasn't really suggesting any of those courses of action; all I
suggested was using rename(2) to give a separate application a
consistent snapshot of the stats.

Actually, what makes the most sense (although it may be a performance
killer) is to have the backends update a system table that the external
app can query.

-- 
-Alfred Perlstein - [[EMAIL PROTECTED]|[EMAIL PROTECTED]]
Daemon News Magazine in your snail-mail! http://magazine.daemonnews.org/





Re: [HACKERS] Performance monitor signal handler

2001-03-13 Thread Bruce Momjian

 At 13:34 12/03/01 -0800, Alfred Perlstein wrote:
 Is it possible
 to have a spinlock over it so that an external utility can take a snapshot
 of it with the spinlock held?
 
 I'd suggest that locking the stats area might be a bad idea; there is only
 one writer for each backend-specific chunk, and it won't matter a hell of a
 lot if a reader gets inconsistent views (since I assume they will be
 re-reading every second or so). All the stats area should contain is
 a bunch of counters with timestamps, I think, and the cost of writing to it
 should be kept to an absolute minimum.
 
 
 
 just some ideas..
 
 
 Unfortunately, based on prior discussions, Bruce seems quite opposed to a
 shared memory solution.

No, I like the shared memory idea.  But first, such an idea will have to
wait for 7.2, and second, there are limits to how much shared memory I can use.

Eventually, I think shared memory will be the way to go.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 853-3000
  +  If your life is a hard drive, |  830 Blythe Avenue
  +  Christ can be your backup.|  Drexel Hill, Pennsylvania 19026




Re: [HACKERS] Performance monitor

2001-03-13 Thread Bruce Momjian

 I don't want to look a gift horse in the mouth, but it seems to me that the 
 performance monitor should wait until the now-famous query tree redesign 
 which will allow for sets from functions. I realize that the shared memory 
 requirements might be a bit large, but somehow Oracle accomplishes this 
 nicely, with some  50 views (V$ACCESS through V$WAITSTAT) which can be 
 queried, usually via SQL*DBA, for performance statistics. More then 50 
 performance views may be over-kill, but having the ability to fetch the 
 performance statistics with normal queries sure is nice. Perhaps a 
 postmaster option which would enable/disable the use of accumulating 
 performance statistics in shared memory might ease the hesitation against 
 it?

I don't think query design is an issue here.  We can already create
views to do such things.  Right now, pgmonitor simply uses 'ps', and
uses gdb to attach to the running process and show the query being
executed.  For 7.2, I hope to improve it.  I like the shared memory
ideas, and the ability to use a query rather than accessing shared
memory directly.

Seems we should have each backend store query/stat information in shared
memory, and create special views to access that information.  We can
restrict such views to the postgres super-user.


-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 853-3000
  +  If your life is a hard drive, |  830 Blythe Avenue
  +  Christ can be your backup.|  Drexel Hill, Pennsylvania 19026




Re: [HACKERS] Performance monitor

2001-03-13 Thread Bruce Momjian

   Small question... Will it work in console? Or will it be X only?
 
  It will be Tcl/Tk, so I guess X only.
 
 That's bad, because it will be useless for people who have their databases far away...
 Like me... :-((( Another point is that it is a little bit strange to have 
 X-Window on the machine with the database server... At least if it is not for play, 
 but a production one...
 
 Also there should be a possibility of remote monitoring of the database. But 
 that's just a dream... :-)))

What about remote-X using the DISPLAY variable?

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 853-3000
  +  If your life is a hard drive, |  830 Blythe Avenue
  +  Christ can be your backup.|  Drexel Hill, Pennsylvania 19026




Re: [HACKERS] Performance monitor

2001-03-13 Thread Bruce Momjian

 Hard to say.  'ps' gives some great information about cpu/memory usage
 that may be hard/costly to put in shared memory.  One idea would be to
 issue periodic 'ps/kill' commands through a telnet/ssh pipe to the
 remote machine, or just to the remote X display option.
 
 Of course, getrusage() gives us much of that information.

Forget getrusage().  It only works on the current process, so each
backend would have to update its own statistics.  Sounds expensive.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 853-3000
  +  If your life is a hard drive, |  830 Blythe Avenue
  +  Christ can be your backup.|  Drexel Hill, Pennsylvania 19026




Re: [HACKERS] Performance monitor

2001-03-13 Thread Justin Clift

Hi Guys,

I'd just like to point out that for most secure installations, X is
removed from servers as part of the "remove all software which isn't
absolutely needed."

I know of Solaris machines which perform as servers with a total of 19
OS packages installed, and then precompiled binaries of the server
programs are loaded onto these machines.

Removal of all not-absolutely-necessary software is also the
recommended procedure by Sun for setting up server platforms.

Having something based on X will be usable by lots of people, just not
by those who make the effort to take correct security precautions.

Regards and best wishes,

Justin Clift

Bruce Momjian wrote:
 
   It will be Tcl/Tk, so I guess X only.
 
   Good point. A typical DB server -- where performance is important --
  has X installed?
 
   BTW, I hate Oracle 8.x.x because it has an X+Java based installer, but some of
  my servers have no monitor or keyboard, let alone X.
 
   What about implementing the performance monitor as a client/server application
  where the client is some shared lib? This solution allows creating more
  clients for different GUIs. I know... it's easy planning, but the other
  thing is programming it  :-)
 
 My idea is that they can telnet into the server machine and do remote-X
 with the application.  Just set the DISPLAY variable and it should work.
 
 --
   Bruce Momjian|  http://candle.pha.pa.us
   [EMAIL PROTECTED]   |  (610) 853-3000
   +  If your life is a hard drive, |  830 Blythe Avenue
   +  Christ can be your backup.|  Drexel Hill, Pennsylvania 19026
 




Re: [HACKERS] Performance monitor

2001-03-13 Thread Bruce Momjian

  It will be Tcl/Tk, so I guess X only.
 
  Good point. A typical DB server -- where performance is important --
 has X installed?
 
  BTW, I hate Oracle 8.x.x because it has an X+Java based installer, but some of
 my servers have no monitor or keyboard, let alone X.
 
  What about implementing the performance monitor as a client/server application
 where the client is some shared lib? This solution allows creating more
 clients for different GUIs. I know... it's easy planning, but the other
 thing is programming it  :-)

My idea is that they can telnet into the server machine and do remote-X
with the application.  Just set the DISPLAY variable and it should work.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 853-3000
  +  If your life is a hard drive, |  830 Blythe Avenue
  +  Christ can be your backup.|  Drexel Hill, Pennsylvania 19026




Re: [HACKERS] Performance monitor

2001-03-13 Thread Tom Lane

Bruce Momjian [EMAIL PROTECTED] writes:
 My idea is that they can telnet into the server machine and do remote-X
 with the application.  Just set the DISPLAY variable and it should work.

Remote X pretty well sucks in the real world.  Aside from speed issues
there is the little problem of firewalls filtering out X connections.

If you've got ssh running then you can tunnel the X connection through
the ssh connection, which fixes the firewall problem, but it makes the
speed problem worse.  And getting ssh plus X forwarding working is not
something I want to have to hassle with when my remote database is down.

If you are thinking of telnet-based remote admin then I suggest you get
out your curses man page and do up a curses GUI.  (No smiley... I'd
seriously prefer that to something that depends on remote X.)

regards, tom lane




[HACKERS] Performance monitor signal handler

2001-03-12 Thread Bruce Momjian

I was going to implement the signal handler like we do with Cancel,
where the signal sets a flag and we check the status of the flag in
various _safe_ places.

Can anyone think of a better way to get information out of a backend?
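The Cancel-style pattern: the handler does nothing but set a flag, and the main loop tests that flag only at points where the backend state is known to be consistent. A minimal runnable sketch (SIGUSR1 is chosen arbitrarily here; the real backend uses its own signal assignments):

```python
import os
import signal

dump_requested = False

def request_dump(signum, frame):
    # Only set a flag -- no I/O or allocation inside the handler itself.
    global dump_requested
    dump_requested = True

signal.signal(signal.SIGUSR1, request_dump)
os.kill(os.getpid(), signal.SIGUSR1)   # a monitor would send this signal

status_dumped = False
# ... later, at a _safe_ place in the main loop ...
if dump_requested:
    dump_requested = False
    status_dumped = True               # here the backend would dump its status
```

In C the flag would be a `volatile sig_atomic_t`, the one type guaranteed safe to write from a signal handler.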

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 853-3000
  +  If your life is a hard drive, |  830 Blythe Avenue
  +  Christ can be your backup.|  Drexel Hill, Pennsylvania 19026




Re: [HACKERS] Performance monitor

2001-03-12 Thread Karl DeBisschop


On 2001.03.07 22:06 Bruce Momjian wrote:
  I think Bruce wants per-backend data, and this approach would seem to
  only get the data for the current backend.
  
  Also, I really don't like the proposal to write files to /tmp. If we
  want a perf tool, then we need to have something like 'top', which will
  continuously update. With 40 backends, the idea of writing 40 files to
  /tmp every second seems a little excessive to me.
 
 My idea was to use 'ps' to gather most of the information, and just use
 the internal stats when someone clicked on a backend and wanted more
 information.

My own experience is that parsing ps can be difficult if you want to be
portable and want more than basic information. Quite clearly, I could just
be dense, but if it helps, you can look at the configure.in in the CVS tree
at http://sourceforge.net/projects/netsaintplug (GPL, sorry. But if you
find anything worthwhile, and borrowing concepts results in similar code, I
won't complain).

I wouldn't be at all surprised if you found a better approach - my
configuration above, to my mind at least, is not pretty. I hope you do find
a better approach - I know I'll be peeking at your code to see. 

-- 
Karl





Re: [HACKERS] Performance monitor

2001-03-12 Thread Bruce Momjian

 On Wednesday 07 March 2001 21:56, Bruce Momjian wrote:
  I have started coding a PostgreSQL performance monitor.  It will be like
  top, but allow you to click on a backend to see additional information.
 
  It will be written in Tcl/Tk.  I may ask to add something to 7.1 so when
  a backend receives a special signal, it dumps a file in /tmp with some
  backend status.  It would be done similar to how we handle Cancel
  signals.
 
 Small question... Will it work in console? Or will it be X only?

It will be Tcl/Tk, so I guess X only.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 853-3000
  +  If your life is a hard drive, |  830 Blythe Avenue
  +  Christ can be your backup.|  Drexel Hill, Pennsylvania 19026




Re: [HACKERS] Performance monitor signal handler

2001-03-12 Thread Alfred Perlstein

* Bruce Momjian [EMAIL PROTECTED] [010312 12:12] wrote:
 I was going to implement the signal handler like we do with Cancel,
 where the signal sets a flag and we check the status of the flag in
 various _safe_ places.
 
 Can anyone think of a better way to get information out of a backend?

Why not use a static area of the shared memory segment?  Is it possible
to have a spinlock over it so that an external utility can take a snapshot
of it with the spinlock held?

Also, this could work for other stuff as well, instead of overloading
a lot of signal handlers one could just periodically poll a region of
the shared segment.

just some ideas..

-- 
-Alfred Perlstein - [[EMAIL PROTECTED]|[EMAIL PROTECTED]]
Daemon News Magazine in your snail-mail! http://magazine.daemonnews.org/





Re: [HACKERS] Performance monitor signal handler

2001-03-12 Thread Philip Warner

At 13:34 12/03/01 -0800, Alfred Perlstein wrote:
Is it possible
to have a spinlock over it so that an external utility can take a snapshot
of it with the spinlock held?

I'd suggest that locking the stats area might be a bad idea; there is only
one writer for each backend-specific chunk, and it won't matter a hell of a
lot if a reader gets inconsistent views (since I assume they will be
re-reading every second or so). All the stats area should contain is
a bunch of counters with timestamps, I think, and the cost of writing to it
should be kept to an absolute minimum.



just some ideas..


Unfortunately, based on prior discussions, Bruce seems quite opposed to a
shared memory solution.



Philip Warner| __---_
Albatross Consulting Pty. Ltd.   |/   -  \
(A.B.N. 75 008 659 498)  |  /(@)   __---_
Tel: (+61) 0500 83 82 81 | _  \
Fax: (+61) 0500 83 82 82 | ___ |
Http://www.rhyme.com.au  |/   \|
 |----
PGP key available upon request,  |  /
and from pgp5.ai.mit.edu:11371   |/




[HACKERS] Performance monitor renamed

2001-03-11 Thread Bruce Momjian

I have renamed pgtop.tcl to pgmonitor.  I think the new name is clearer.

ftp://candle.pha.pa.us/pub/postgresql/pgmonitor

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 853-3000
  +  If your life is a hard drive, |  830 Blythe Avenue
  +  Christ can be your backup.|  Drexel Hill, Pennsylvania 19026

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/users-lounge/docs/faq.html



Re: [HACKERS] Performance monitor

2001-03-09 Thread Gordon A. Runkle

In article [EMAIL PROTECTED], "Bruce Momjian"
[EMAIL PROTECTED] wrote:
 The problem I see with the shared memory idea is that some of the
 information needed may be quite large.  For example, query strings can
 be very long.  Do we just allocate 512 bytes and clip off the rest?  And
 as I add more info, I need more shared memory per backend.  I just liked
 the file system dump solution because I could modify it pretty easily,
 and because the info only appears when you click on the process, it
 doesn't happen often.
 
 Of course, if we start getting the full display partly from each
 backend, we will have to use shared memory.

Long-term, perhaps a monitor server (like Sybase ASE uses) might 
be a reasonable approach.  That way, only one process (and a well-
regulated one at that) would be accessing the shared memory, which
should make it safer and have less of an impact performance-wise
if semaphores are needed to regulate access to the various regions
of shared memory.

Then, 1-N clients may access the monitor server to get performance
data w/o impacting the backends.

Gordon.
-- 
It doesn't get any easier, you just go faster.
   -- Greg LeMond

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your
message can get through to the mailing list cleanly



Re: [HACKERS] Performance monitor

2001-03-08 Thread Karel Zak

On Wed, Mar 07, 2001 at 10:06:38PM -0500, Bruce Momjian wrote:
  I think Bruce wants per-backend data, and this approach would seem to only
  get the data for the current backend. 
  
  Also, I really don't like the proposal to write files to /tmp. If we want a
  perf tool, then we need to have something like 'top', which will
  continuously update. With 40 backends, the idea of writing 40 files to /tmp
  every second seems a little excessive to me.
 
 My idea was to use 'ps' to gather most of the information, and just use
 the internal stats when someone clicked on a backend and wanted more
 information.

 Are you sure about the portability of the 'ps' stuff? I don't know what
data you want to read from 'ps', but the /proc utilities are very
OS-specific; on Linux, for example, libproc has been overhauled several
times within a few years.
 I spent several years working on /proc stuff (process manager:
http://home.zf.jcu.cz/~zakkr/kim).

Karel

-- 
 Karel Zak  [EMAIL PROTECTED]
 http://home.zf.jcu.cz/~zakkr/
 
 C, PostgreSQL, PHP, WWW, http://docs.linux.cz, http://mape.jcu.cz

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your
message can get through to the mailing list cleanly



Re: [HACKERS] Performance monitor

2001-03-08 Thread Bruce Momjian

 Tom Lane writes:
 
  How many of our supported platforms actually have working ps-status
  code?  (This is an honest question: I don't know.)
 
 BeOS, DG/UX, and Cygwin don't have support code, the rest *should* work.

Seems we will find out when people complain that my performance monitor
doesn't show the proper columns.  :-)

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 853-3000
  +  If your life is a hard drive, |  830 Blythe Avenue
  +  Christ can be your backup.|  Drexel Hill, Pennsylvania 19026

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])



Re: [HACKERS] Performance monitor

2001-03-08 Thread Larry Rosenman


I don't believe that UnixWare will accept the ps change without root.

LER

 Original Message 

On 3/8/01, 3:54:31 PM, Peter Eisentraut [EMAIL PROTECTED] wrote regarding 
Re: [HACKERS] Performance monitor :


 Tom Lane writes:

  How many of our supported platforms actually have working ps-status
  code?  (This is an honest question: I don't know.)

 BeOS, DG/UX, and Cygwin don't have support code, the rest *should* work.

 --
 Peter Eisentraut  [EMAIL PROTECTED]   http://yi.org/peter-e/


 ---(end of broadcast)---
 TIP 3: if posting/reading through Usenet, please send an appropriate
 subscribe-nomail command to [EMAIL PROTECTED] so that your
 message can get through to the mailing list cleanly

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/users-lounge/docs/faq.html



Re: [HACKERS] Performance monitor

2001-03-07 Thread The Hermit Hacker

On Wed, 7 Mar 2001, Bruce Momjian wrote:

 I have started coding a PostgreSQL performance monitor.  It will be like
 top, but allow you to click on a backend to see additional information.

 It will be written in Tcl/Tk.  I may ask to add something to 7.1 so when
 a backend receives a special signal, it dumps a file in /tmp with some
 backend status.  It would be done similar to how we handle Cancel
 signals.

 How do people feel about adding a single handler to 7.1?  Is it
 something I can slip into the current CVS, or will it have to exist as a
 patch to 7.1?  Seems it would be pretty isolated unless someone sends
 the signal, but it is clearly a feature addition.

Totally dead set against it ...

... the only hold up on RC1 right now was awaiting Vadim getting back so
that he and Tom could work out the WAL related issues ... adding a new
signal handler *definitely* counts as "adding a new feature" ...




---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your
message can get through to the mailing list cleanly



Re: [HACKERS] Performance monitor

2001-03-07 Thread Bruce Momjian

  How do people feel about adding a single handler to 7.1?  Is it
  something I can slip into the current CVS, or will it have to exist as a
  patch to 7.1.  Seems it would be pretty isolated unless someone sends
  the signal, but it is clearly a feature addition.
 
 Totally dead set against it ...
 
 ... the only hold up on RC1 right now was awaiting Vadim getting back so
 that he and Tom could work out the WAL related issues ... adding a new
 signal handler *definitely* counts as "adding a new feature" ...

OK, I will distribute it as a patch.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 853-3000
  +  If your life is a hard drive, |  830 Blythe Avenue
  +  Christ can be your backup.|  Drexel Hill, Pennsylvania 19026

---(end of broadcast)---
TIP 6: Have you searched our list archives?

http://www.postgresql.org/search.mpl



Re: [HACKERS] Performance monitor

2001-03-07 Thread Tom Lane

The Hermit Hacker [EMAIL PROTECTED] writes:
 How do people feel about adding a single handler to 7.1?

 Totally dead set against it ...

Ditto.  Particularly a signal handler that performs I/O.  That's going
to create all sorts of re-entrancy problems.

regards, tom lane

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your
message can get through to the mailing list cleanly



Re: [HACKERS] Performance monitor

2001-03-07 Thread Tom Lane

Bruce Momjian [EMAIL PROTECTED] writes:
 How do people feel about adding a single handler to 7.1?  Is it
 something I can slip into the current CVS, or will it have to exist as a
 patch to 7.1.  Seems it would be pretty isolated unless someone sends
 the signal, but it is clearly a feature addition.

 OK, I will distribute it as a patch.

Patch or otherwise, this approach seems totally unworkable.  A signal
handler cannot do I/O safely, it cannot look at shared memory safely,
it cannot even look at the backend's own internal state safely.  How's
it going to do any useful status reporting?

Firing up a separate backend process that looks at shared memory seems
like a more useful design in the long run.  That will mean exporting
more per-backend status into shared memory, however, and that means that
this is not a trivial change.

regards, tom lane

---(end of broadcast)---
TIP 6: Have you searched our list archives?

http://www.postgresql.org/search.mpl



Re: [HACKERS] Performance monitor

2001-03-07 Thread Bruce Momjian

 Bruce Momjian [EMAIL PROTECTED] writes:
  How do people feel about adding a single handler to 7.1?  Is it
  something I can slip into the current CVS, or will it have to exist as a
  patch to 7.1.  Seems it would be pretty isolated unless someone sends
  the signal, but it is clearly a feature addition.
 
  OK, I will distribute it as a patch.
 
 Patch or otherwise, this approach seems totally unworkable.  A signal
 handler cannot do I/O safely, it cannot look at shared memory safely,
 it cannot even look at the backend's own internal state safely.  How's
 it going to do any useful status reporting?

Why can't we do what we do with Cancel, where we set a flag and check it
at safe places?

 Firing up a separate backend process that looks at shared memory seems
 like a more useful design in the long run.  That will mean exporting
 more per-backend status into shared memory, however, and that means that
 this is not a trivial change.

Right, that is a lot of work.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 853-3000
  +  If your life is a hard drive, |  830 Blythe Avenue
  +  Christ can be your backup.|  Drexel Hill, Pennsylvania 19026

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/users-lounge/docs/faq.html



Re: [HACKERS] Performance monitor

2001-03-07 Thread Philip Warner

At 18:05 7/03/01 -0500, Bruce Momjian wrote:
 All in all, I do not see this as an easy task that you can whip out and
 then release as a 7.1 patch without extensive testing.  And given that,
 I'd rather see it done with what I consider the right long-term approach,
 rather than a dead-end hack.  I think doing it in a signal handler is
 ultimately going to be a dead-end hack.

Well, the signal stuff will get me going at least.

Didn't someone say this can't be done safely - or am I missing something?

ISTM that doing the work to put things in shared memory will be much more
profitable in the long run. You have previously advocated self-tuning
algorithms for performance - a prerequisite for these will be performance
data in shared memory.



Philip Warner| __---_
Albatross Consulting Pty. Ltd.   |/   -  \
(A.B.N. 75 008 659 498)  |  /(@)   __---_
Tel: (+61) 0500 83 82 81 | _  \
Fax: (+61) 0500 83 82 82 | ___ |
Http://www.rhyme.com.au  |/   \|
 |----
PGP key available upon request,  |  /
and from pgp5.ai.mit.edu:11371   |/

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your
message can get through to the mailing list cleanly



Re: [HACKERS] Performance monitor

2001-03-07 Thread Justin Clift

Hi all,

Wouldn't another approach be to write a C function that does the
necessary work, then just call it like any other C function?

i.e.  Connect to the database and issue a "select
perf_stats('/tmp/stats-2001-03-08-01.txt')" ?

Or similar?

Sure, that means another database connection, which would change the
resource count, but it sounds like a more consistent approach.

Regards and best wishes,

Justin Clift

Philip Warner wrote:
 
 At 18:05 7/03/01 -0500, Bruce Momjian wrote:
  All in all, I do not see this as an easy task that you can whip out and
  then release as a 7.1 patch without extensive testing.  And given that,
  I'd rather see it done with what I consider the right long-term approach,
  rather than a dead-end hack.  I think doing it in a signal handler is
  ultimately going to be a dead-end hack.
 
 Well, the signal stuff will get me going at least.
 
 Didn't someone say this can't be done safely - or am I missing something?
 
 ISTM that doing the work to put things in shared memory will be much more
 profitable in the long run. You have previously advocated self-tuning
 algorithms for performance - a prerequisite for these will be performance
 data in shared memory.
 
 
 Philip Warner| __---_
 Albatross Consulting Pty. Ltd.   |/   -  \
 (A.B.N. 75 008 659 498)  |  /(@)   __---_
 Tel: (+61) 0500 83 82 81 | _  \
 Fax: (+61) 0500 83 82 82 | ___ |
 Http://www.rhyme.com.au  |/   \|
  |----
 PGP key available upon request,  |  /
 and from pgp5.ai.mit.edu:11371   |/
 
 ---(end of broadcast)---
 TIP 3: if posting/reading through Usenet, please send an appropriate
 subscribe-nomail command to [EMAIL PROTECTED] so that your
 message can get through to the mailing list cleanly

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/users-lounge/docs/faq.html



Re: [HACKERS] Performance monitor

2001-03-07 Thread Philip Warner

At 11:33 8/03/01 +1100, Justin Clift wrote:
Hi all,

Wouldn't another approach be to write a C function that does the
necessary work, then just call it like any other C function?

i.e.  Connect to the database and issue a "select
perf_stats('/tmp/stats-2001-03-08-01.txt')" ?


I think Bruce wants per-backend data, and this approach would seem to only
get the data for the current backend. 

Also, I really don't like the proposal to write files to /tmp. If we want a
perf tool, then we need to have something like 'top', which will
continuously update. With 40 backends, the idea of writing 40 files to /tmp
every second seems a little excessive to me.




Philip Warner| __---_
Albatross Consulting Pty. Ltd.   |/   -  \
(A.B.N. 75 008 659 498)  |  /(@)   __---_
Tel: (+61) 0500 83 82 81 | _  \
Fax: (+61) 0500 83 82 82 | ___ |
Http://www.rhyme.com.au  |/   \|
 |----
PGP key available upon request,  |  /
and from pgp5.ai.mit.edu:11371   |/

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]



RE: [HACKERS] Performance monitor

2001-03-07 Thread Mike Mascari

I like the idea of updating shared memory with the performance statistics, 
current query execution information, etc., providing a function to fetch 
those statistics, and perhaps providing a system view (i.e. pg_performance) 
based upon such functions which can be queried by the administrator.

FWIW,

Mike Mascari
[EMAIL PROTECTED]

-Original Message-
From:   Philip Warner [SMTP:[EMAIL PROTECTED]]
Sent:   Wednesday, March 07, 2001 7:42 PM
To: Justin Clift
Cc: Bruce Momjian; Tom Lane; The Hermit Hacker; PostgreSQL-development
Subject:Re: [HACKERS] Performance monitor

At 11:33 8/03/01 +1100, Justin Clift wrote:
Hi all,

Wouldn't another approach be to write a C function that does the
necessary work, then just call it like any other C function?

i.e.  Connect to the database and issue a "select
perf_stats('/tmp/stats-2001-03-08-01.txt')" ?


I think Bruce wants per-backend data, and this approach would seem to only
get the data for the current backend.

Also, I really don't like the proposal to write files to /tmp. If we want a
perf tool, then we need to have something like 'top', which will
continuously update. With 40 backends, the idea of writing 40 files to /tmp
every second seems a little excessive to me.




Philip Warner| __---_
Albatross Consulting Pty. Ltd.   |/   -  \
(A.B.N. 75 008 659 498)  |  /(@)   __---_
Tel: (+61) 0500 83 82 81 | _  \
Fax: (+61) 0500 83 82 82 | ___ |
Http://www.rhyme.com.au  |/   \|
 |----
PGP key available upon request,  |  /
and from pgp5.ai.mit.edu:11371   |/

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/users-lounge/docs/faq.html



RE: [HACKERS] Performance monitor

2001-03-07 Thread Philip Warner

At 19:59 7/03/01 -0500, Mike Mascari wrote:
I like the idea of updating shared memory with the performance statistics, 
current query execution information, etc., providing a function to fetch 
those statistics, and perhaps providing a system view (i.e. pg_performance) 
based upon such functions which can be queried by the administrator.

This sounds like The Way to me. Although I worry that using a view (or
standard libpq methods) might be too expensive in high load situations
(this is not based on any knowledge of the likely costs, however!).

We do need to make this as cheap as possible: we don't want to distort the
stats, and since the tool will often be used to diagnose performance
problems, we don't want to contribute to those problems.



Philip Warner| __---_
Albatross Consulting Pty. Ltd.   |/   -  \
(A.B.N. 75 008 659 498)  |  /(@)   __---_
Tel: (+61) 0500 83 82 81 | _  \
Fax: (+61) 0500 83 82 82 | ___ |
Http://www.rhyme.com.au  |/   \|
 |----
PGP key available upon request,  |  /
and from pgp5.ai.mit.edu:11371   |/

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])



Re: [HACKERS] Performance monitor

2001-03-07 Thread Bruce Momjian

 I think Bruce wants per-backend data, and this approach would seem to only
 get the data for the current backend. 
 
 Also, I really don't like the proposal to write files to /tmp. If we want a
 perf tool, then we need to have something like 'top', which will
 continuously update. With 40 backends, the idea of writing 40 files to /tmp
 every second seems a little excessive to me.

My idea was to use 'ps' to gather most of the information, and just use
the internal stats when someone clicked on a backend and wanted more
information.


-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 853-3000
  +  If your life is a hard drive, |  830 Blythe Avenue
  +  Christ can be your backup.|  Drexel Hill, Pennsylvania 19026

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])



Re: [HACKERS] Performance monitor

2001-03-07 Thread Bruce Momjian

 At 18:05 7/03/01 -0500, Bruce Momjian wrote:
  All in all, I do not see this as an easy task that you can whip out and
  then release as a 7.1 patch without extensive testing.  And given that,
  I'd rather see it done with what I consider the right long-term approach,
  rather than a dead-end hack.  I think doing it in a signal handler is
  ultimately going to be a dead-end hack.
 
 Well, the signal stuff will get me going at least.
 
 Didn't someone say this can't be done safely - or am I missing something?

OK, I will write just the all-process display part, which doesn't need
any per-backend info because it gets it all from 'ps'.  Then maybe
someone will come up with a nifty idea, or I will play with my local
copy to see how it can be done.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 853-3000
  +  If your life is a hard drive, |  830 Blythe Avenue
  +  Christ can be your backup.|  Drexel Hill, Pennsylvania 19026

---(end of broadcast)---
TIP 6: Have you searched our list archives?

http://www.postgresql.org/search.mpl



Re: [HACKERS] Performance monitor

2001-03-07 Thread Justin Clift

Mike Mascari's idea (er... his assembling of the other ideas) still
sounds like the Best Solution though.

:-)

+ Justin

+++

I like the idea of updating shared memory with the performance
statistics, 
current query execution information, etc., providing a function to fetch 
those statistics, and perhaps providing a system view (i.e.
pg_performance) 
based upon such functions which can be queried by the administrator.

FWIW,

Mike Mascari
[EMAIL PROTECTED]

+++

Bruce Momjian wrote:
 
  I think Bruce wants per-backend data, and this approach would seem to only
  get the data for the current backend.
 
  Also, I really don't like the proposal to write files to /tmp. If we want a
  perf tool, then we need to have something like 'top', which will
  continuously update. With 40 backends, the idea of writing 40 files to /tmp
  every second seems a little excessive to me.
 
 My idea was to use 'ps' to gather most of the information, and just use
 the internal stats when someone clicked on a backend and wanted more
 information.
 
 --
   Bruce Momjian|  http://candle.pha.pa.us
   [EMAIL PROTECTED]   |  (610) 853-3000
   +  If your life is a hard drive, |  830 Blythe Avenue
   +  Christ can be your backup.|  Drexel Hill, Pennsylvania 19026

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])



Re: [HACKERS] Performance monitor

2001-03-07 Thread Bruce Momjian

Yes, seems that is best.  I will probably hack something up here so I
can do some testing of the app itself.

 Mike Mascari's idea (er... his assembling of the other ideas) still
 sounds like the Best Solution though.
 
 :-)
 
 + Justin
 
 +++
 
 I like the idea of updating shared memory with the performance
 statistics, 
 current query execution information, etc., providing a function to fetch 
 those statistics, and perhaps providing a system view (i.e.
 pg_performance) 
 based upon such functions which can be queried by the administrator.
 
 FWIW,
 
 Mike Mascari
 [EMAIL PROTECTED]
 
 +++
 
 Bruce Momjian wrote:
  
   I think Bruce wants per-backend data, and this approach would seem to only
   get the data for the current backend.
  
   Also, I really don't like the proposal to write files to /tmp. If we want a
   perf tool, then we need to have something like 'top', which will
   continuously update. With 40 backends, the idea of writing 40 files to /tmp
   every second seems a little excessive to me.
  
  My idea was to use 'ps' to gather most of the information, and just use
  the internal stats when someone clicked on a backend and wanted more
  information.
  
  --
Bruce Momjian|  http://candle.pha.pa.us
[EMAIL PROTECTED]   |  (610) 853-3000
+  If your life is a hard drive, |  830 Blythe Avenue
+  Christ can be your backup.|  Drexel Hill, Pennsylvania 19026
 


-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 853-3000
  +  If your life is a hard drive, |  830 Blythe Avenue
  +  Christ can be your backup.|  Drexel Hill, Pennsylvania 19026

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])



Re: [HACKERS] Performance monitor

2001-03-07 Thread Bruce Momjian

 I wouldn't be at all surprised if you found a better approach - my
 configuration above, to my mind at least, is not pretty. I hope you do find
 a better approach - I know I'll be peeking at your code to see. 

Yes, I have an idea and hope it works.
-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 853-3000
  +  If your life is a hard drive, |  830 Blythe Avenue
  +  Christ can be your backup.|  Drexel Hill, Pennsylvania 19026

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])