Re: [HACKERS] PostgreSQL 7.1 on AIX 5L is running with too poor performance

2001-07-30 Thread Tatsuo Ishii

 Leslie [EMAIL PROTECTED] writes:
  PostgreSQL 7.1 is now running on AIX 5L (S85, 6 GB memory, 6 CPUs); it was
  running on Linux before (Pentium III, 2 CPUs, as far as I
  remember... sorry).
  The performance on AIX 5L is just half as good as it was on Linux.
 
 Hmm ... is the AIX compilation selecting an appropriate TAS
 implementation for spinlocks? 

I think so. I have compiled 7.1 on an AIX 5L box and found that TAS()
was replaced by:

 cs((int *) (lock), 0, 1)

 If it's falling back to semaphore-based
 spinlocks, I can easily believe that you might take a 2X performance
 hit.  Possibly s_lock.h needs some additional #if tests for AIX5.
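
For reference, here is a hedged sketch of the sort of #if block this
implies, built around that cs() call (cs() is AIX's compare-and-swap
library routine; the macro names follow s_lock.h convention, but this
exact block is illustrative, not committed code):

    #if defined(_AIX)
    /* cs() comes from libc; on some AIX levels it is declared in
     * <sys/atomic_op.h>.  cs(addr, old, new) returns 0 if it swapped,
     * nonzero if *addr != old, so TAS() is nonzero when the lock was
     * already held, matching the s_lock.h convention. */
    typedef unsigned int slock_t;

    #define TAS(lock)       cs((int *) (lock), 0, 1)
    #define S_UNLOCK(lock)  (*(lock) = 0)
    #endif   /* _AIX */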
--
Tatsuo Ishii




Re: [HACKERS] SIGCHLD handler in Postgres C function.

2001-07-30 Thread Bill Studenmund

On Sun, 22 Jul 2001, Tatsuo Ishii wrote:

  [EMAIL PROTECTED] writes:
   I have written a postgres C function that
   uses a popen() Linux system call.  Originally, when I first tried it, I
   kept getting an ECHILD.  I read a little bit more on the pclose() function
   and the wait system calls and discovered that on Linux, if the signal
   handler for SIGCHLD is set to SIG_IGN, you will get the ECHILD error
   on pclose() (or wait4(), for that matter).  So I did some snooping around
   in the postgres backend code and found that in the traffic cop the
   SIGCHLD signal handler is set to SIG_IGN.  So in my C function, right
   before the popen() call, I set the signal handler for SIGCHLD to SIG_DFL,
   and right after the pclose() I set it back to SIG_IGN.  I tested this
   and it seems to solve my problem.

Just ignore ECHILD. It's not messy at all. :-) It sounds like your kernel
is using SIG_IGN to do the same thing as the SA_NOCLDWAIT flag in *BSD
(well, NetBSD at least). When a child dies, it gets re-parented to init
(which is wait()ing). init does the child-died cleanup, rather than the
parent needing to. That way, when the parent runs wait(), there is no
child, so you get an ECHILD.

All ECHILD is doing is saying there was no child. Since we aren't really
waiting for the child, I don't see how that's a problem.
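
A minimal sketch of the resulting idiom (illustrative only; error
handling trimmed): tolerate ECHILD from pclose(), since on an
auto-reaping kernel the child is already gone by the time pclose()
waits for it.

    #include <errno.h>
    #include <stdio.h>

    static int
    run_and_close(const char *cmd)
    {
        FILE *p = popen(cmd, "r");

        if (p == NULL)
            return -1;
        /* ... read the command's output from p here ... */
        if (pclose(p) == -1 && errno != ECHILD)
            return -1;          /* a real failure, not auto-reaping */
        return 0;
    }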

Take care,

Bill





Re: [HACKERS] SIGCHLD handler in Postgres C function.

2001-07-30 Thread Tom Lane

Bill Studenmund [EMAIL PROTECTED] writes:
 All ECHILD is doing is saying there was no child. Since we aren't really
 waiting for the child, I don't see how that's a problem.

You're missing the point: on some platforms the system() call is
returning a failure indication because of ECHILD.  It's system() that's
broken, not us, and the issue is how to work around its brokenness
without sacrificing more error detection than we have to.

regards, tom lane




RE: [HACKERS] Performance TODO items

2001-07-30 Thread Darren King

 3)  I am reading the Solaris Internals book, and there is mention of a
 "free-behind" capability for large sequential scans.  When a large
 sequential scan happens that would wipe out all the old cache entries,
 the kernel detects this and places its previous pages first
 on the free list.  For our code, if we do a sequential scan of a table
 that is larger than our buffer cache size, I think we should detect
 this and do the same.  See http://techdocs.postgresql.org for my
 performance paper for an example.

 New TODO entries are:

   * Order duplicate index entries by tid
   * Add queue of backends waiting for spinlock
   * Add free-behind capability for large sequential scans

So why do we cache sequentially-read pages?  Or at least, why not have an
option to control it?

Oracle (to the best of my knowledge) does NOT cache pages read by a
sequential scan, for at least two reasons/assumptions (two being
all that I can recall):

1. Caching pages for sequential scans over sufficiently large tables
will just cycle the cache.  The pages that will be cached at the end of
the query will be the last N pages of the table, so when the same
sequential query is run again, the scan from the beginning of the table
will start flushing the oldest cached pages, which are more than likely
the ones that will be needed at the end of the scan, and so on.  In a
multi-user environment, the effect is worse.

2. Concurrent or consecutive queries in a dynamic database will not
generate plans that use the same sequential scans, so they will tend to
thrash the cache.

Now there are some databases where the same general queries are run time
after time and caching the pages from sequential scans does make sense,
but in larger, enterprise-type systems, indices are created to help
speed up the most used queries and the sequential cache entries only
serve to clutter the cache and flush the useful pages.

Is there any way that caching pages read in by a sequential scan could
be made a configurable option?

Any chance someone could run pgbench on a test system set up to not
cache sequential reads?

Darren





Re: [HACKERS] SIGCHLD handler in Postgres C function.

2001-07-30 Thread Bill Studenmund

On Mon, 30 Jul 2001, Tom Lane wrote:

 Bill Studenmund [EMAIL PROTECTED] writes:
  All ECHILD is doing is saying there was no child. Since we aren't really
  waiting for the child, I don't see how that's a problem.

 You're missing the point: on some platforms the system() call is
 returning a failure indication because of ECHILD.  It's system() that's
 broken, not us, and the issue is how to work around its brokenness
 without sacrificing more error detection than we have to.

I think I do get the point. But perhaps I didn't make my point well. :-)

I think the problem is that on some OSs, setting SIGCHLD to SIG_IGN
actually triggers automatic child reaping. So the problem is that we are:
1) setting SIGCHLD to SIG_IGN, 2) calling system(), and 3) thinking ECHILD
means something was really wrong.

I think 4.4BSD systems will do what we expect (since there, child reaping
has to be requested explicitly with the SA_NOCLDWAIT flag), but Linux
systems will give us the ECHILD.
Looking at source on the web, I found:

kernel/signal.c:1042

* Note the silly behaviour of SIGCHLD: SIG_IGN means that the
* signal isn't actually ignored, but does automatic child
* reaping, while SIG_DFL is explicitly said by POSIX to force
* the signal to be ignored.

So we get automatic reaping on Linux systems (which isn't bad).

If automatic reaping happens, system() will give us an ECHILD, as the
waitpid() (or equivalent) will not have found a child. :-)

My suggestion is just to leave the ifs as if ((error == 0) || (error ==
ECHILD)) (or the inverse).
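
Sketched, as one illustrative reading of that suggestion:

    #include <errno.h>
    #include <stdlib.h>

    /* Treat ECHILD from system() as success: on an auto-reaping
     * kernel the command ran, but system()'s internal waitpid()
     * found no child left to report on. */
    static int
    run_command(const char *cmd)
    {
        if (system(cmd) == -1 && errno != ECHILD)
            return -1;          /* genuine failure */
        return 0;
    }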

Take care,

Bill





Re: [HACKERS] SIGCHLD handler in Postgres C function.

2001-07-30 Thread Tom Lane

Bill Studenmund [EMAIL PROTECTED] writes:
 Looking at source on the web, I found:

 kernel/signal.c:1042

 * Note the silly behaviour of SIGCHLD: SIG_IGN means that the
 * signal isn't actually ignored, but does automatic child
 * reaping, while SIG_DFL is explicitly said by POSIX to force
 * the signal to be ignored.

Hmm, interesting.  If you'll recall, the start of this thread was a
proposal to change our backends' handling of SIGCHLD from SIG_IGN to
SIG_DFL (and get rid of explicit tests for ECHILD).  I didn't quite see
why changing the handler should make a difference, but above we seem to
have the smoking gun.

Which kernel, and which version, is the above quote from?

regards, tom lane




Re: [HACKERS] SIGCHLD handler in Postgres C function.

2001-07-30 Thread Tom Lane

Bill Studenmund [EMAIL PROTECTED] writes:
 I see three choices:

 1) Change back to SIG_DFL for normal behavior. I think this will be fine
   as we run w/o problem on systems that lack this behavior. If
   turning off automatic child reaping would cause a problem, we'd
   have seen it already on the OSs which don't automatically reap
   children. Will a backend ever fork after it's started?

Backends never fork more backends --- but there are some places that
launch transient children and wait for them to finish.  A non-transient
subprocess should always be launched by the postmaster, never by a
backend, IMHO.

 2) Change to DFL around system() and then change back.

I think this is pretty ugly, and unnecessary.

 3) Realize that ECHILD means that the child was auto-reaped (which is an
   ok thing and, I think, will only happen if the child exited w/o
   error).

That's the behavior that's in place now, but I do not like it.  We
should not need to code an assumption that this error isn't really
an error --- especially when it only happens on some platforms.
On a non-Linux kernel, an ECHILD failure really would be a failure,
and the existing code would fail to detect that there was a problem.

Bottom line: I like solution #1.  Does anyone have an objection to it?

regards, tom lane




[HACKERS] developer's website

2001-07-30 Thread Vince Vielhaber


The developer's corner will soon be going away.  I'm in the process of
putting together a developer's site.  Different URL, different look,
beta announcements will be there, regression database will be there,
development docs, etc.  If you want a sneak preview:

http://developer.postgresql.org/

Vince.
-- 
======================================================================
Vince Vielhaber -- KA8CSH    email: [EMAIL PROTECTED]    http://www.pop4.net
        56K Nationwide Dialup from $16.00/mo at Pop4 Networking
     Online Campground Directory      http://www.camping-usa.com
     Online Giftshop Superstore       http://www.cloudninegifts.com
======================================================================







Re: [HACKERS] SIGCHLD handler in Postgres C function.

2001-07-30 Thread Bill Studenmund

On Mon, 30 Jul 2001, Tom Lane wrote:

 Bill Studenmund [EMAIL PROTECTED] writes:
  Looking at source on the web, I found:

  kernel/signal.c:1042

  * Note the silly behaviour of SIGCHLD: SIG_IGN means that the
  * signal isn't actually ignored, but does automatic child
  * reaping, while SIG_DFL is explicitly said by POSIX to force
  * the signal to be ignored.

 Hmm, interesting.  If you'll recall, the start of this thread was a
 proposal to change our backends' handling of SIGCHLD from SIG_IGN to
 SIG_DFL (and get rid of explicit tests for ECHILD).  I didn't quite see
 why changing the handler should make a difference, but above we seem to
 have the smoking gun.

 Which kernel, and which version, is the above quote from?

Linux kernel source, 2.4.3, I think the i386 version (though it should be
the same for this bit; it's supposed to be machine-independent). Check out
http://lxr.linux.no/source/

I do recall the reason for the thread. :-) I see three choices:

1) Change back to SIG_DFL for normal behavior. I think this will be fine
as we run w/o problem on systems that lack this behavior. If
turning off automatic child reaping would cause a problem, we'd
have seen it already on the OSs which don't automatically reap
children. Will a backend ever fork after it's started?

2) Change to SIG_DFL around system() and then change back (sketched below).

3) Realize that ECHILD means that the child was auto-reaped (which is an
ok thing and, I think, will only happen if the child exited w/o
error).
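
A minimal sketch of option 2, assuming nothing else in the backend is
waiting on some other child during the call:

    #include <signal.h>
    #include <stdlib.h>

    static int
    system_with_default_sigchld(const char *cmd)
    {
        void (*saved)(int) = signal(SIGCHLD, SIG_DFL);
        int rc = system(cmd);   /* reaps its own child via waitpid() */

        signal(SIGCHLD, saved); /* back to SIG_IGN (or whatever was set) */
        return rc;
    }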

Take care,

Bill





[HACKERS]

2001-07-30 Thread BigWhat.com



set digest


Re: [HACKERS] vacuumlo.

2001-07-30 Thread Hiroshi Inoue
Tom Lane wrote:
 
 Grant [EMAIL PROTECTED] writes:
  Can you see a scenario where a programmer would forget to delete the
  data from pg_largeobject and the database becomes very large, filled
  with orphaned large objects?
 
 Sure.  My point wasn't that the functionality isn't needed, it's that
 I'm not sure vacuumlo does it well enough to be ready to promote to
 the status of mainstream code.  It needs more review and testing before
 we can move it out of /contrib.
 

IIRC vacuumlo doesn't take the type lo (see contrib/lo) into
account, so I'm not sure vacuumlo is reliable.

regards,
Hiroshi Inoue



Re: [HACKERS] vacuumlo.

2001-07-30 Thread Grant

   Can you see a scenario where a programmer would forget to delete the
   data from pg_largeobject and the database becomes very large, filled
   with orphaned large objects?
  
  Sure.  My point wasn't that the functionality isn't needed, it's that
  I'm not sure vacuumlo does it well enough to be ready to promote to
  the status of mainstream code.  It needs more review and testing before
  we can move it out of /contrib.
  
 
 IIRC vacuumlo doesn't take the type lo (see contrib/lo) into
 account, so I'm not sure vacuumlo is reliable.

This was my roundabout way of asking whether something to combat this issue
can be placed on the TODO list. :)

Thanks.





[HACKERS] Re: Returned mail: User unknown

2001-07-30 Thread Grant

Is it just me, or is an address on the hackers list whose mail is handled
by wmail.metro.taejon.kr nonexistent?

On Tue, 31 Jul 2001, Mail Delivery Subsystem wrote:

 The original message was received at Tue, 31 Jul 2001 14:25:00 +1000 (EST)
 from IDENT:[EMAIL PROTECTED]
 
-- The following addresses had permanent fatal errors --
 [EMAIL PROTECTED]





Re: [HACKERS] Re: Returned mail: User unknown

2001-07-30 Thread Tom Lane

Grant [EMAIL PROTECTED] writes:
 Is it just me, or is an address on the hackers list whose mail is handled
 by wmail.metro.taejon.kr nonexistent?

I've had to institute a sendmail access block against that site :-(
It bounces a useless complaint for every damn posting I make.  What's
worse is that it looks like it's trying to deliver extra copies to
the people named in the To:/CC: lines --- if it somehow fails to fail
to deliver those copies, it's spamming.

Yo Marc, are you awake?  These losers should be blocked from our lists
permanently (or at least till they install less-broken mail software).

regards, tom lane




RE: [HACKERS] Performance TODO items

2001-07-30 Thread Mikheev, Vadim

 * Order duplicate index entries by tid
  
  In other words - add tid to index key: very old idea.
 
 I was thinking during index creation, it would be nice to
 order them by tid, but not do lots of work to keep it that way.

I hear this "not do lots of work" so often from you -:)
Days of simplicity are gone, Bruce. To continue, this project
requires more and more complex solutions.

 * Add queue of backends waiting for spinlock
  
  We shouldn't mix two different approaches for different
  kinds of short-time internal locks - in one case we need a
  light lmgr (when we're going to keep the lock long enough, e.g. for I/O),
  and in other cases we'd better proceed with POSIX mutexes
  or semaphores instead of spinlocks. Queueing backends waiting
  for a spinlock sounds like nonsense - how are you going to protect
  such a queue? With spinlocks? -:)
 
 Yes, I guess so but hopefully we can spin waiting for the queue lock
 rather than sleep. We could use POSIX spinlocks/semaphores now but we
 don't because of performance, right?

No - as long as no one has proved with a test that mutexes are bad for
performance...
Funny, such a test would require ~1 day of work.

 Should we be spinning waiting for spinlock on multi-cpu machines?  Is
 that the answer?

What do you mean?

Vadim




Re: [HACKERS] Performance TODO items

2001-07-30 Thread Bruce Momjian

* Order duplicate index entries by tid
   
   In other words - add tid to index key: very old idea.
  
  I was thinking during index creation, it would be nice to
  order them by tid, but not do lots of work to keep it that way.
 
 I hear this "not do lots of work" so often from you -:)
 Days of simplicity are gone, Bruce. To continue, this project
 requires more and more complex solutions.

Yep.  I can dream.  :-)

* Add queue of backends waiting for spinlock
   
   We shouldn't mix two different approaches for different
   kinds of short-time internal locks - in one case we need a
   light lmgr (when we're going to keep the lock long enough, e.g. for I/O),
   and in other cases we'd better proceed with POSIX mutexes
   or semaphores instead of spinlocks. Queueing backends waiting
   for a spinlock sounds like nonsense - how are you going to protect
   such a queue? With spinlocks? -:)
  
  Yes, I guess so but hopefully we can spin waiting for the queue lock
  rather than sleep. We could use POSIX spinlocks/semaphores now but we
  don't because of performance, right?
 
 No - as long as no one has proved with a test that mutexes are bad for
 performance...
 Funny, such a test would require ~1 day of work.

Good question.  I know the number of function calls to the spinlock stuff
is huge.  It seems real semaphores may be a big win on multi-CPU boxes.

  Should we be spinning waiting for spinlock on multi-cpu machines?  Is
  that the answer?
 
 What do you mean?

On a single-CPU machine, if you can't get the spinlock, you should just
sleep and let the process that holds it continue.  On a multi-CPU machine,
you perhaps should just keep trying, hoping that the process that holds
it finishes soon.  I wonder if we should add some kind of test at
postmaster startup to see whether we have multiple CPUs, and change from
spinlock sleeping to spinlock spin-waiting accordingly.

Add to this the fact that we can't sleep for less than one clock tick on
most machines (10 ms), and the spinlock stuff clearly needs work.

TODO updated:

* Improve spinlock code, perhaps with OS semaphores, a sleeper queue, or
  spinning to obtain the lock on multi-CPU systems

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 853-3000
  +  If your life is a hard drive, |  830 Blythe Avenue
  +  Christ can be your backup.|  Drexel Hill, Pennsylvania 19026




RE: [HACKERS] Performance TODO items

2001-07-30 Thread Mikheev, Vadim

   We could use POSIX spinlocks/semaphores now but we
   don't because of performance, right?
  
  No - as long as no one has proved with a test that mutexes are bad for
  performance...
  Funny, such a test would require ~1 day of work.
 
 Good question. I know the number of function calls to the spinlock stuff
 is huge. It seems real semaphores may be a big win on multi-CPU boxes.

OK, being tired of endless discussions, I'll try to use mutexes instead
of spinlocks and run pgbench on my Solaris WS 10 and E4500 (4-CPU) boxes.
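
For reference, a hedged sketch of that substitution: back the lock
primitives with a POSIX mutex living in shared memory.  The macro names
mirror s_lock.h conventions; the exact wiring is illustrative, and not
every platform supports process-shared mutexes.

    #include <pthread.h>

    typedef pthread_mutex_t slock_t;

    void
    s_init_lock(slock_t *lock)
    {
        pthread_mutexattr_t attr;

        /* The mutex lives in shared memory, so it must be created
         * with the process-shared attribute. */
        pthread_mutexattr_init(&attr);
        pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
        pthread_mutex_init(lock, &attr);
        pthread_mutexattr_destroy(&attr);
    }

    #define S_LOCK(lock)    pthread_mutex_lock(lock)
    #define S_UNLOCK(lock)  pthread_mutex_unlock(lock)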

Vadim




Re: [HACKERS] Re: From TODO, XML?

2001-07-30 Thread Bruce Momjian

   I have managed to get several XML files into PostgreSQL by writing a parser,
   and it is a huge hassle, the public parsers are too picky. I am thinking that a
   fuzzy parser, combined with some intelligence and an XML DTD reader, could make
   a very cool utility, one which I have not been able to find.
  
   Perhaps it is a two stage process? First pass creates a schema which can be
   modified/corrected, the second pass loads the data.
  
  Can we accept only relational XML?  Does that buy us anything?  Are the
  other database vendors outputting hierarchical XML?  Are they using
  foreign/primary keys to do it?
 
 Then what's the point? Almost no one creates non-hierarchical XML. For the
 utility to be useful beyond just a different format for pg_dump, it has to
 deal with these issues and do the right thing.

Oh, it seems XML will be much more complicated than I thought.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 853-3000
  +  If your life is a hard drive, |  830 Blythe Avenue
  +  Christ can be your backup.|  Drexel Hill, Pennsylvania 19026




Re: [HACKERS] OU8...

2001-07-30 Thread Bruce Momjian


Patch applied.  Thanks.  Still needs updated autoconf.


 Skip the patch for configure.in in that last one; use this in its
 place (I missed one sysv5uw).
 
 
 
 -- 
 Larry Rosenman http://www.lerctr.org/~ler
 Phone: +1 972-414-9812 E-Mail: [EMAIL PROTECTED]
 US Mail: 1905 Steamboat Springs Drive, Garland, TX 75044-6749

[ Attachment, skipping... ]

 

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 853-3000
  +  If your life is a hard drive, |  830 Blythe Avenue
  +  Christ can be your backup.|  Drexel Hill, Pennsylvania 19026




Re: [HACKERS] Re: Re: Storing XML in PostgreSQL

2001-07-30 Thread Bruce Momjian


Added to /contrib, with small Makefile changes.  Requires expat library.
Does not compile by default.

 I've packaged up what I've done so far and you can find it at
 http://www.cabbage.uklinux.net/pgxml.tar.gz
 
 The TODO file included indicates what still remains to be done (a lot!).
 
 In particular, it would be good to implement more of the XPath grammar.
 However, once we get into the realm of more complex paths there becomes a
 question about optimisation of XPath selection. If the documents are
 pre-parsed, then XPath query elements can be rewritten as SQL queries and
 you get the optimisation of the planner on your side.
 
 I'd like to stick with the current solution if possible, because I think
 it delivers a very simple interface to the user and is (code-wise) also
 very straightforward. Maybe less efficient queries are a penalty worth paying?
 
 Any thoughts?
 
 Regards
 
 John
 
 

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 853-3000
  +  If your life is a hard drive, |  830 Blythe Avenue
  +  Christ can be your backup.|  Drexel Hill, Pennsylvania 19026




[HACKERS] Re: From TODO, XML?

2001-07-30 Thread mlw

Bruce Momjian wrote:
 
  I have been fighting, for a while now, with idiot data vendors who think XML
  is a cure-all. The problem is that XML is a hierarchical format, whereas SQL
  is a relational format.
 
  It would be good to get pg_dump to write an XML file and DTD, but getting
  external sources of XML into PostgreSQL is WAY more complicated. If an XML
  import is to be useful beyond just a different format for pg_dump, there has to
  be some intelligent database construction based on the XML information.
 
  Go to mp3.com and download some of their XML-format data: first, it is bad
  XML; second, it is hierarchical.
 
  I have managed to get several XML files into PostgreSQL by writing a parser,
  and it is a huge hassle, the public parsers are too picky. I am thinking that a
  fuzzy parser, combined with some intelligence and an XML DTD reader, could make
  a very cool utility, one which I have not been able to find.
 
  Perhaps it is a two stage process? First pass creates a schema which can be
  modified/corrected, the second pass loads the data.
 
 Can we accept only relational XML?  Does that buy us anything?  Are the
 other database vendors outputting hierarchical XML?  Are they using
 foreign/primary keys to do it?

Then what's the point? Almost no one creates non-hierarchical XML. For the
utility to be useful beyond just a different format for pg_dump, it has to
deal with these issues and do the right thing.


 
 --
   Bruce Momjian|  http://candle.pha.pa.us
   [EMAIL PROTECTED]   |  (610) 853-3000
   +  If your life is a hard drive, |  830 Blythe Avenue
   +  Christ can be your backup.|  Drexel Hill, Pennsylvania 19026
 

-- 
5-4-3-2-1 Thunderbirds are GO!

http://www.mohawksoft.com




Re: [HACKERS] Re: From TODO, XML?

2001-07-30 Thread Karel Zak

On Mon, Jul 30, 2001 at 03:43:26PM +1000, Gavin Sherry wrote:
 On Mon, 30 Jul 2001, mlw wrote:
 
 I have had the same problem. The best XML parser I could find was the
 gnome-xml library at xmlsoft.org (libxml). I am currently using this in C

 What happens if you use a DOM-type XML parser on a large file? A dump from
 an SQL DB can be really large. IMHO a SAX-type XML parser is better for
 this (a data dump from an SQL DB).

 an XSL stylesheet to transform an XML based database dump from some third

 Yes, that's the right way to use XML.
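
For illustration, a minimal expat skeleton of that SAX style: it streams
the input and just prints element events, never building a whole document
tree in memory (the handlers here are placeholders, not part of any
existing contrib code).

    #include <stdio.h>
    #include <expat.h>

    static void
    start_element(void *userData, const char *name, const char **atts)
    {
        printf("open  <%s>\n", name);
    }

    static void
    end_element(void *userData, const char *name)
    {
        printf("close <%s>\n", name);
    }

    int
    main(void)
    {
        char        buf[8192];
        size_t      n;
        XML_Parser  parser = XML_ParserCreate(NULL);

        XML_SetElementHandler(parser, start_element, end_element);
        do
        {
            n = fread(buf, 1, sizeof(buf), stdin);
            if (!XML_Parse(parser, buf, (int) n, n == 0))
                return 1;       /* parse error */
        } while (n > 0);
        XML_ParserFree(parser);
        return 0;
    }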

Karel

-- 
 Karel Zak  [EMAIL PROTECTED]
 http://home.zf.jcu.cz/~zakkr/
 
 C, PostgreSQL, PHP, WWW, http://docs.linux.cz, http://mape.jcu.cz




[HACKERS] Performance TODO items

2001-07-30 Thread Bruce Momjian

I have thought of a few new TODO performance items:

1)  Someone at O'Reilly suggested that we order our duplicate index
entries by tid so if we are hitting the heap for lots of duplicates, the
hits will be on sequential pages.  Seems like a nice idea.

2)  After Tatsuo's report of running 1000 backends on pgbench and from a
Solaris report, I think we should have a queue of backends waiting for a
spinlock, rather than having them sleep and try again.  This is
particularly important for multi-processor machines.

3)  I am reading the Solaris Internals book, and there is mention of a
"free-behind" capability for large sequential scans.  When a large
sequential scan happens that would wipe out all the old cache entries,
the kernel detects this and places the scan's previous pages first on the
free list.  For our code, if we do a sequential scan of a table that is
larger than our buffer cache size, I think we should detect this and do
the same.  See http://techdocs.postgresql.org for my performance paper
for an example.
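
A hypothetical sketch of what that could look like in the buffer
manager; every name here (freelist_push_head() and friends) is
illustrative, not existing code:

    typedef struct BufferDesc BufferDesc;   /* stand-in for the real thing */

    extern void freelist_push_head(BufferDesc *buf);   /* reused first */
    extern void freelist_push_tail(BufferDesc *buf);   /* reused last */

    /* Free-behind: when a sequential scan covers a relation bigger
     * than the cache, recycle its pages before evicting hot ones. */
    void
    release_scan_buffer(BufferDesc *buf, int is_seqscan,
                        long rel_pages, long cache_pages)
    {
        if (is_seqscan && rel_pages > cache_pages)
            freelist_push_head(buf);    /* evict soon */
        else
            freelist_push_tail(buf);    /* normal LRU-ish behavior */
    }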

New TODO entries are:

* Order duplicate index entries by tid
* Add queue of backends waiting for spinlock
* Add free-behind capability for large sequential scans

I will modify them with any comments people have.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 853-3000
  +  If your life is a hard drive, |  830 Blythe Avenue
  +  Christ can be your backup.|  Drexel Hill, Pennsylvania 19026




[HACKERS] Portal question

2001-07-30 Thread Christopher Kings-Lynne

Hi all,

This is the situation: you are doing a big query, but you want the results
on the web page to be paginated, i.e. the user can click page 1, 2, etc.

So, you need to know how many rows would be returned in total, but you
also only need a small fraction of them.

What is an efficient way of doing this?

It seems to me that using a CURSOR would be advantageous; however, once a
CURSOR is opened, how do you get the full row count?

ie. Can you do this:?

1. Declare a cursor
2. Find the total number of rows returned
3. Fetch the subset of the rows that are required
4. Construct a pagination based on the info from 2 and 3.

If this can't be done, how do you do it?  Is the only way to run the
whole query twice, the first time doing a count(*) instead of selecting
the variables?
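
One hedged sketch of the cursor approach, using libpq: it assumes the
command tag of MOVE reports the number of rows skipped (worth verifying
on your server version), the items table is a stand-in, and all error
handling is trimmed.

    #include <stdio.h>
    #include <stdlib.h>
    #include <libpq-fe.h>

    int
    main(void)
    {
        PGconn   *conn = PQconnectdb("dbname=test");
        PGresult *res;
        long      total;

        PQclear(PQexec(conn, "BEGIN"));
        PQclear(PQexec(conn,
            "DECLARE pagecur SCROLL CURSOR FOR SELECT * FROM items"));

        /* Skip to the end; the "MOVE n" command tag carries the count. */
        res = PQexec(conn, "MOVE FORWARD ALL IN pagecur");
        total = atol(PQcmdTuples(res));
        PQclear(res);

        /* Rewind, then fetch page 2 (rows 21-40). */
        PQclear(PQexec(conn, "MOVE ABSOLUTE 20 IN pagecur"));
        res = PQexec(conn, "FETCH 20 FROM pagecur");
        printf("showing %d of %ld rows\n", PQntuples(res), total);
        PQclear(res);

        PQclear(PQexec(conn, "COMMIT"));
        PQfinish(conn);
        return 0;
    }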

Thanks,

Chris






Re: [HACKERS] Performance TODO items

2001-07-30 Thread Tom Lane

Bruce Momjian [EMAIL PROTECTED] writes:
 New TODO entries are:
   * Add queue of backends waiting for spinlock

I already see:

* Create spinlock sleepers queue so everyone doesn't wake up at once


BTW, I agree with Vadim's opinion that we should add a new type of lock
(intermediate between spinlocks and lockmanager locks) rather than try
to add new semantics onto spinlocks.  For example, it'd be very nice to
distinguish read-only and read-write access in this new kind of lock,
but we can't expect spinlocks to be able to do that.

regards, tom lane




Re: [HACKERS] Performance TODO items

2001-07-30 Thread Tom Lane

Bruce Momjian [EMAIL PROTECTED] writes:
 Should we be spinning waiting for spinlock on multi-cpu machines?  Is
 that the answer?

A multi-CPU machine is actually the only case where a true spinlock
*does* make sense.  On a single CPU you might as well yield the CPU
immediately, because you have no chance of getting the lock until the
current holder is allowed to run again.  On a multi CPU it's a
reasonable guess that the current holder is running on one of the other
CPUs and may release the lock soon ("soon" == less than a process
dispatch cycle, hence busy-waiting is better than releasing the CPU).

We are currently using spinlocks for a lot of situations where the mean
time spent holding the lock is probably larger than "soon" as defined
above.  We should have a different lock implementation for those cases.
True spinlocks should be reserved for protecting code where the maximum
time spent holding the lock is guaranteed to be very short.
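
A sketch of the acquisition policy this implies; SPINS_PER_YIELD and
n_cpus are hypothetical knobs, and slock_t/TAS() stand in for the
port-specific primitives in s_lock.h:

    #include <sched.h>

    typedef unsigned char slock_t;              /* as on many ports */
    extern int TAS(volatile slock_t *lock);     /* hardware test-and-set */
    extern int n_cpus;                          /* probed at startup */

    #define SPINS_PER_YIELD 100

    void
    s_lock(volatile slock_t *lock)
    {
        int spins = 0;

        while (TAS(lock))
        {
            /* On one CPU, or once the spin budget is spent, the holder
             * cannot release the lock until it runs: yield the CPU. */
            if (n_cpus < 2 || ++spins >= SPINS_PER_YIELD)
            {
                sched_yield();
                spins = 0;
            }
        }
    }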

regards, tom lane




Re: [HACKERS] Performance TODO items

2001-07-30 Thread Bruce Momjian

 Bruce Momjian [EMAIL PROTECTED] writes:
  New TODO entries are:
  * Add queue of backends waiting for spinlock
 
 I already see:
 
 * Create spinlock sleepers queue so everyone doesn't wake up at once

That is an old copy of the TODO.  I reworded it.  You will only see this
now:

* Improve spinlock code, perhaps with OS semaphores, a sleeper queue, or
  spinning to obtain the lock on multi-CPU systems

 BTW, I agree with Vadim's opinion that we should add a new type of lock
 (intermediate between spinlocks and lockmanager locks) rather than try
 to add new semantics onto spinlocks.  For example, it'd be very nice to
 distinguish read-only and read-write access in this new kind of lock,
 but we can't expect spinlocks to be able to do that.

Yes, I agree too.  On a uniprocessor machine, if I can't get the
spinlock, I want to yield the CPU, ideally for less than 10 ms.  On a
multi-CPU machine, if the lock is held by another CPU and that process
is running, we want to spin waiting for the lock to free.  If not, we
can sleep.  We basically need some more sophisticated semantics around
these locks, or we need to move to OS semaphores.

In fact, can't we sleep on an interruptible system call, and send
signals to waiting processes when we release the lock?  That way we don't
have the 10 ms sleep limitation.  One idea is to have a byte for each
backend in shared memory and have each backend set its byte when it is
waiting for the lock.  There would be no contention with multiple
backends registering their sleep at the same time.
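
Very roughly, and with every name here hypothetical, the
register-then-sleep side might look like this.  The signal must be
blocked before the flag is set (else a wakeup can be lost between the
two steps), and a no-op SIGUSR2 handler is assumed to be installed.

    #include <signal.h>
    #include <unistd.h>
    #include <sys/types.h>

    extern volatile char *wait_flags;   /* one byte per backend, in shmem */

    void
    sleep_on_lock(int my_backend_id)
    {
        sigset_t    block,
                    saved;

        sigemptyset(&block);
        sigaddset(&block, SIGUSR2);
        sigprocmask(SIG_BLOCK, &block, &saved);

        wait_flags[my_backend_id] = 1;
        sigsuspend(&saved);             /* sleep until the releaser signals */
        wait_flags[my_backend_id] = 0;

        sigprocmask(SIG_SETMASK, &saved, NULL);
    }

    /* The lock releaser scans the flags and wakes any sleepers. */
    void
    wake_lock_sleepers(int nbackends, pid_t *backend_pids)
    {
        int i;

        for (i = 0; i < nbackends; i++)
            if (wait_flags[i])
                kill(backend_pids[i], SIGUSR2);
    }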

We have seen reports of 4-CPU systems having poor performance while the
system is only 12% busy, perhaps because the processes are all sleeping,
waiting for the next tick.

I think multi-CPU machines are going to give us new challenges.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 853-3000
  +  If your life is a hard drive, |  830 Blythe Avenue
  +  Christ can be your backup.|  Drexel Hill, Pennsylvania 19026




Re: [HACKERS] Performance TODO items

2001-07-30 Thread Tom Lane

Bruce Momjian [EMAIL PROTECTED] writes:
 I have thought of a few new TODO performance items:
 1)  Someone at O'Reilly suggested that we order our duplicate index
 entries by tid so if we are hitting the heap for lots of duplicates, the
 hits will be on sequential pages.  Seems like a nice idea.

A more general solution is for indexscan to collect up a bunch of TIDs
from the index, sort them in-memory by TID order, and then probe into
the heap with those TIDs.  This is better than the above because you get
nice ordering of the heap accesses across multiple key values, not just
among the tuples with the same key.  (In a unique or near-unique index,
the above idea is nearly worthless.)

In the best case there are few enough TIDs retrieved from the index that
you can just do this once, but even if there are lots of TIDs, it should
be a win to do this in batches of a few thousand TIDs.  Essentially we
decouple indexscans into separate index-access and heap-access phases.
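
A small sketch of the batching idea; the Tid struct is a simplified
stand-in for PostgreSQL's ItemPointer, and the driver loop is
illustrative:

    #include <stdlib.h>

    typedef struct
    {
        unsigned    blkno;      /* heap block number */
        unsigned    offno;      /* line pointer within the block */
    } Tid;

    static int
    tid_cmp(const void *a, const void *b)
    {
        const Tid  *x = (const Tid *) a;
        const Tid  *y = (const Tid *) b;

        if (x->blkno != y->blkno)
            return (x->blkno < y->blkno) ? -1 : 1;
        return (x->offno < y->offno) ? -1 : (x->offno > y->offno);
    }

    /* Phase 2: sort one batch of TIDs collected from the index, then
     * probe the heap in block order. */
    void
    fetch_batch(Tid *tids, int ntids)
    {
        int i;

        qsort(tids, ntids, sizeof(Tid), tid_cmp);
        for (i = 0; i < ntids; i++)
        {
            /* heap_fetch(&tids[i]) would go here */
        }
    }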

One big problem is that this doesn't interact well with concurrent VACUUM:
our present solution for concurrent VACUUM assumes that indexscans hold
a pin on an index page until they've finished fetching the pointed-to
heap tuples.  Another objection is that we'd have a harder time
implementing the TODO item of marking an indextuple dead when its
associated heaptuple is dead.  Anyone see a way around these problems?

regards, tom lane




Re: [HACKERS] Performance TODO items

2001-07-30 Thread Bruce Momjian

 A more general solution is for indexscan to collect up a bunch of TIDs
 from the index, sort them in-memory by TID order, and then probe into
 the heap with those TIDs.  This is better than the above because you get
 nice ordering of the heap accesses across multiple key values, not just
 among the tuples with the same key.  (In a unique or near-unique index,
 the above idea is nearly worthless.)
 
 In the best case there are few enough TIDs retrieved from the index that
 you can just do this once, but even if there are lots of TIDs, it should
 be a win to do this in batches of a few thousand TIDs.  Essentially we
 decouple indexscans into separate index-access and heap-access phases.
 
 One big problem is that this doesn't interact well with concurrent VACUUM:
 our present solution for concurrent VACUUM assumes that indexscans hold
 a pin on an index page until they've finished fetching the pointed-to
 heap tuples.  Another objection is that we'd have a harder time
 implementing the TODO item of marking an indextuple dead when its
 associated heaptuple is dead.  Anyone see a way around these problems?

Interesting.  I figured the cache could keep most pages in such a case.
I was thinking more of helping file-system readahead by requesting the
earlier block first in a multi-block request.  Not sure how valuable that
would be.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 853-3000
  +  If your life is a hard drive, |  830 Blythe Avenue
  +  Christ can be your backup.|  Drexel Hill, Pennsylvania 19026




[HACKERS] Autoconf 2.50 is out (was Re: [PATCHES] Re: OpenUnix 8 Patch)

2001-07-30 Thread Tom Lane

I wrote:
 If autoconf releases were happening on a regular basis, we could get
 away with just tracking the released version of autoconf for these
 files.  However, they aren't and we can't.

Just moments after writing that, I was startled to read on another
mailing list that the long-mythical Autoconf 2.50 is released!

We should probably consider updating from autoconf 2.13 as our project
standard to 2.50.  However, I'd recommend waiting till Peter E. returns
from his vacation to see what his opinion about it is.  IIRC, he's been
following that project, which I have not been.

regards, tom lane




[HACKERS] Re: Autoconf 2.50 is out (was Re: [PATCHES] Re: OpenUnix 8 Patch)

2001-07-30 Thread Larry Rosenman

* Tom Lane [EMAIL PROTECTED] [010730 18:34]:
 I wrote:
  If autoconf releases were happening on a regular basis, we could get
  away with just tracking the released version of autoconf for these
  files.  However, they aren't and we can't.
 
 Just moments after writing that, I was startled to read on another
 mailing list that the long-mythical Autoconf 2.50 is released!
 
 We should probably consider updating from autoconf 2.13 as our project
 standard to 2.50.  However, I'd recommend waiting till Peter E. returns
 from his vacation to see what his opinion about it is.  IIRC, he's been
 following that project, which I have not been.
I also see LOTS of complaints about compatibility issues, which is why I
just pulled the two files from ftp://ftp.gnu.org/pub/gnu/config when I
generated that patch.  Looks like they updated them again today:

$ ftp ftp.gnu.org
Connected to gnuftp.gnu.org.
220 ProFTPD 1.2.0pre10 Server (ProFTPD) [gnuftp.gnu.org]
331 Anonymous login ok, send your complete e-mail address as password.
230-If you have any problems with the GNU software or its downloading,
    please refer your questions to [EMAIL PROTECTED].

    There are several mirrors of this archive; a complete list can be
    found on http://www.gnu.org/order/ftp.html. Please use one of the
    mirrors if possible.

    Archives of the GNU mailing lists can be found in
    ftp://ftp-mailing-list-archives.gnu.org/.

    Please note that the directory structure on ftp.gnu.org was
    reorganized fairly recently, such that there is a directory for
    each program.  One side effect of this is that if you cd into the
    gnu directory and do
        ls emacs*
    you will get a list of all the files in the emacs directory, but it
    will not be obvious from the ls output that you have to `cd emacs'
    before you can download those files.

    Note further that the non-GNU programs that were formerly in gnu/
    have moved to gnu/non-gnu/.  Most of them were just pointers in the
    format program.README.  If you are looking for such a file, be sure
    to check gnu/non-gnu/.
230 Anonymous access granted, restrictions apply.
Remote system type is UNIX.
Using binary mode to transfer files.
ftp> cd pub/gnu/config
250 CWD command successful.
ftp> pwd
257 /gnu/config is current directory.
ftp> dir
200 PORT command successful.
150 Opening ASCII mode data connection for file list.
-rw-r--r--   1 ftp  ftp 38214 Jul 30 14:00 config.guess
-rw-r--r--   1 ftp  ftp 27872 Jul 30 14:00 config.sub
226-Transfer complete.
226 Quotas off
ftp>


Larry
 

-- 
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 972-414-9812 E-Mail: [EMAIL PROTECTED]
US Mail: 1905 Steamboat Springs Drive, Garland, TX 75044-6749




Re: [HACKERS] SIGCHLD handler in Postgres C function.

2001-07-30 Thread Bruce Momjian

 Bruce Momjian [EMAIL PROTECTED] writes:
  The auto-reaping is standard SysV behavior, while BSD's really is "ignore".
 
 You'll recall the ECHILD exception was installed by Tatsuo after seeing
 problems on Solaris.  Evidently Solaris uses the auto-reap behavior too.

SVr4/Solaris took the SysV behavior.  Stevens didn't like it.  :-)


 I'm somewhat surprised that HPUX does not --- it tends to follow its
 SysV heritage when there's a conflict between that and BSD practice.
 Guess they went BSD on this one.

I thought HPUX was mostly SysV tools on a BSD kernel.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 853-3000
  +  If your life is a hard drive, |  830 Blythe Avenue
  +  Christ can be your backup.|  Drexel Hill, Pennsylvania 19026




RE: [HACKERS] Performance TODO items

2001-07-30 Thread Mikheev, Vadim

 New TODO entries are:
 
   * Order duplicate index entries by tid

In other words - add tid to index key: very old idea.

   * Add queue of backends waiting for spinlock

We shouldn't mix two different approaches for different
kinds of short-time internal locks - in one case we need a
light lmgr (when we're going to keep the lock long enough, e.g. for I/O),
and in other cases we'd better proceed with POSIX mutexes
or semaphores instead of spinlocks. Queueing backends waiting
for a spinlock sounds like nonsense - how are you going to protect
such a queue? With spinlocks? -:)

Vadim




Re: [HACKERS] developer's website

2001-07-30 Thread bpalmer

 The developer's corner will soon be going away.  I'm in the process of
 putting together a developer's site.  Different URL, different look,
 beta announcements will be there, regression database will be there,
 developement docs, etc.  If you want a sneak preview:

   http://developer.postgresql.org/

Right now, a good part of what I mirror (and of my traffic) for the web
site is the devel site.  Will a third set of mirrors (www / ftp /
devel) need to be set up, or will the main site handle all the load?

- Brandon


 b. palmer,  [EMAIL PROTECTED]    pgp: crimelabs.net/bpalmer.pgp5





Re: [HACKERS] developer's website

2001-07-30 Thread Vince Vielhaber

On Mon, 30 Jul 2001, bpalmer wrote:

  The developer's corner will soon be going away.  I'm in the process of
  putting together a developer's site.  Different URL, different look,
  beta announcements will be there, regression database will be there,
  developement docs, etc.  If you want a sneak preview:
 
  http://developer.postgresql.org/

 Right now, a good part of what I mirror (and of my traffic) for the web
 site is the devel site.  Will a third set of mirrors (www / ftp /
 devel) need to be set up, or will the main site handle all the load?

Don't know just yet for sure.

Vince.
-- 
======================================================================
Vince Vielhaber -- KA8CSH    email: [EMAIL PROTECTED]    http://www.pop4.net
        56K Nationwide Dialup from $16.00/mo at Pop4 Networking
     Online Campground Directory      http://www.camping-usa.com
     Online Giftshop Superstore       http://www.cloudninegifts.com
======================================================================







Re: [HACKERS] SIGCHLD handler in Postgres C function.

2001-07-30 Thread Tom Lane

Bruce Momjian [EMAIL PROTECTED] writes:
 The auto-reaping is standard SysV behavior, while BSD's really is "ignore".

You'll recall the ECHILD exception was installed by Tatsuo after seeing
problems on Solaris.  Evidently Solaris uses the auto-reap behavior too.

I'm somewhat surprised that HPUX does not --- it tends to follow its
SysV heritage when there's a conflict between that and BSD practice.
Guess they went BSD on this one.

regards, tom lane




Re: [HACKERS] SIGCHLD handler in Postgres C function.

2001-07-30 Thread Bruce Momjian

 Bill Studenmund [EMAIL PROTECTED] writes:
  Looking at source on the web, I found:
 
  kernel/signal.c:1042
 
  * Note the silly behaviour of SIGCHLD: SIG_IGN means that the
  * signal isn't actually ignored, but does automatic child
  * reaping, while SIG_DFL is explicitly said by POSIX to force
  * the signal to be ignored.
 
 Hmm, interesting.  If you'll recall, the start of this thread was a
 proposal to change our backends' handling of SIGCHLD from SIG_IGN to
 SIG_DFL (and get rid of explicit tests for ECHILD).  I didn't quite see
 why changing the handler should make a difference, but above we seem to
 have the smoking gun.
 
 Which kernel, and which version, is the above quote from?

The auto-reaping is standard SysV behavior, while BSD's really is "ignore".
See Stevens' UNIX programming book for more info.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 853-3000
  +  If your life is a hard drive, |  830 Blythe Avenue
  +  Christ can be your backup.|  Drexel Hill, Pennsylvania 19026




Re: [HACKERS] vacuumlo.

2001-07-30 Thread Tom Lane

Grant [EMAIL PROTECTED] writes:
 Can you see a scenario where a programmer would forget to delete the
 data from pg_largeobject and the database becomes very large, filled
 with orphaned large objects?

Sure.  My point wasn't that the functionality isn't needed, it's that
I'm not sure vacuumlo does it well enough to be ready to promote to
the status of mainstream code.  It needs more review and testing before
we can move it out of /contrib.

regards, tom lane




[HACKERS] vacuumlo.

2001-07-30 Thread Grant

I sent the email below to the creator of contrib/vacuumlo/ with no reply
just yet.

Is it possible to get his code included in the main vacuumdb program to
support vacuuming orphaned large objects?

Or... Any suggestions, what do people think?

Thanks.

-- Forwarded message --
Date: Tue, 24 Jul 2001 09:45:01 +1000 (EST)
From: Grant [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Subject: vacuumlo.

G'day,

I've recently discovered that LOs do not get deleted when a record
referencing that OID is removed. I'm assuming you created the program;
do you think it would be possible to get this included as an argument
for vacuumdb?

vacuumdb -o db

Or something along those lines, so scanning for orphaned large objects
could be done from the main vacuum binary?

What do you think?

Thanks.







Re: [HACKERS] SIGCHLD handler in Postgres C function.

2001-07-30 Thread Tom Lane

Bruce Momjian [EMAIL PROTECTED] writes:
 I'm somewhat surprised that HPUX does not --- it tends to follow its
 SysV heritage when there's a conflict between that and BSD practice.
 Guess they went BSD on this one.

 I thought HPUX was mostly SysV tools on a BSD kernel.

No, it was all SysV (or maybe even older) to start with, and later on
they adopted BSD features wholesale.  But where there's a conflict, it's
still mostly SysV.

regards, tom lane




Re: [HACKERS] vacuumlo.

2001-07-30 Thread Tom Lane

Grant [EMAIL PROTECTED] writes:
 Is it possible to get [vacuumlo] included in the main vacuumdb program for
 support to vacuum orphaned large objects?

Hmm.  I'm not convinced that vacuumlo is ready for prime time...
in particular, how safe is it in the presence of concurrent
transactions that might be adding or removing LOs?

regards, tom lane




Re: [HACKERS] vacuumlo.

2001-07-30 Thread Grant

  Is it possible to get [vacuumlo] included in the main vacuumdb program for
  support to vacuum orphaned large objects?
 
 Hmm.  I'm not convinced that vacuumlo is ready for prime time...
 in particular, how safe is it in the presence of concurrent
 transactions that might be adding or removing LOs?

I see that large objects for each database are stored in pg_largeobject,
referenced by loid. So when I delete a file from a table containing an oid
type, I have to make sure to delete the matching row(s) from
pg_largeobject.

Can you see a scenario where a programmer would forget to delete the data
from pg_largeobject and the database becomes very large, filled with
orphaned large objects? Or am I on the wrong track?

