date:20040505

[HACKERS] SPI_fnumber is case sensitive

2004-05-05 Thread Thomas Hallgren

The SPI_fnumber (returns the column number using a name) is case sensitive.
I think a patch would be appropriate. Either to the documentation that could
state this fact and suggest lowercasing the colname parameter, or to the
code to make it case insensitive. I'd be happy to provide the one chosen.

Which one is better?

Regards,

Thomas Hallgren


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]

Re: [HACKERS] PostgreSQL pre-fork speedup

2004-05-05 Thread sdv mailer

Yes, I realize it's a bit old but I just wanted to
make a small point that forking is slower. It's funny
you should ask because thread creation on Linux has in
fact improved over process creation much more in 2.4
kernel.

Benchmark at IBM shows Linux 2.4 thread creation is
30x faster than process creation. Process creation on
Windows 2000 is about twice longer than process
creation on Linux. This means forking on Win32 will be
2x slower! See 2002 benchmark below:

http://www-106.ibm.com/developerworks/linux/library/l-rt7/?Open&t=grl,l=252,p=mgth

Cheers,

--- Andrew Dunstan <[EMAIL PROTECTED]> wrote:
> sdv mailer said:
> > Forking is expensive on many systems. Linux is a
> bit
> > better but still expensive compared to threads. On
> > Windows, creating process is much more expensive
> than
> > on Linux. Check this benchmark:
> >
> >
>
http://cs.nmu.edu/~randy/Research/Papers/Scheduler/understanding.html
> >
> > Forking shouldn't be taken lightly as free thing.
> > There are pros and cons. The general trend is
> going
> > towards threads, but that's a different issue.
> >
> >
> 
> This article shows a 3x speedup for thread creation
> over fork(), not the
> numbers you have quoted. Furthermore, it talks about
> Linux kernel 2.0.30.
> Do you know how old that is? The paper itself comes
> from Linux Journal,
> January 1999, according to the author's web site.
> 
> Argument will get you nowhere - if you want it done
> then do it and prove
> everyone wrong.
> 
> cheers
> 
> andrew
> 
> 
> 
> 
> 
> ---(end of
> broadcast)---
> TIP 7: don't forget to increase your free space map
settings

__
Do you Yahoo!?
Win a $20,000 Career Makeover at Yahoo! HotJobs  
http://hotjobs.sweepstakes.yahoo.com/careermakeover 

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])

Re: [HACKERS] PostgreSQL pre-fork speedup

2004-05-05 Thread Andrew Dunstan

sdv mailer said:
> Forking is expensive on many systems. Linux is a bit
> better but still expensive compared to threads. On
> Windows, creating process is much more expensive than
> on Linux. Check this benchmark:
>
> http://cs.nmu.edu/~randy/Research/Papers/Scheduler/understanding.html
>
> Forking shouldn't be taken lightly as free thing.
> There are pros and cons. The general trend is going
> towards threads, but that's a different issue.
>
>

This article shows a 3x speedup for thread creation over fork(), not the
numbers you have quoted. Furthermore, it talks about Linux kernel 2.0.30.
Do you know how old that is? The paper itself comes from Linux Journal,
January 1999, according to the author's web site.

Argument will get you nowhere - if you want it done then do it and prove
everyone wrong.

cheers

andrew





---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings

Re: [HACKERS] PostgreSQL pre-fork speedup

2004-05-05 Thread sdv mailer

I don't think I can volunteer on this end as I am
already actively volunteering for another open
project. I was hoping someone could take up on this
since one of the last threads mentionned we don't have
something substantial to present for 7.5 if June 1 is
dateline for code freeze. Pre-fork came to mind. :-)

As for proof of concept, I think pgpool from Tatsuo
Ishii is a good indication that pre-fork works. I'll
try to see if I can generate some benchmarks using
pgpool on my Linux. 

PgPool is a server-side connection pool/load
balancer/replicator that implements pre-fork but
because it acts as a proxy there is 7% to 15% overhead
according to his README file.

http://www.mail-archive.com/[EMAIL PROTECTED]/msg44082.html

--- Andrew Dunstan <[EMAIL PROTECTED]> wrote:
> sdv mailer wrote:
> 
> [snip]
> 
> >Pre-fork will give MySQL one less argument to throw
> at
> >PostgreSQL. 
> >
> >I think optimizing is this area will speed up the
> >general case for everyone rather than optimizing a
> >feature that affects 10% of the users. On top of
> that,
> >it will make a strong marketing case because
> forking
> >will no longer become a speed issue when compared
> to
> >MySQL.
> >
> >  
> >
> 
> So when can we expect to see your proof of concept
> code and benchmarks 
> to show the speedup achieved?
> 
> cheers
> 
> andrew
> 
> ---(end of
> broadcast)---
> TIP 7: don't forget to increase your free space map
settings

__
Do you Yahoo!?
Win a $20,000 Career Makeover at Yahoo! HotJobs  
http://hotjobs.sweepstakes.yahoo.com/careermakeover 

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]

Re: [HACKERS] PostgreSQL pre-fork speedup

2004-05-05 Thread sdv mailer


I'll pretend I didn't see that last comment on
Windows. I wouldn't want to disappoint the users who
are eagerly expecting the Win32 port to complete
including myself.  ;-)

Having said that, I think it's more the reason to get
a working pre-fork for Win32. Don't you think so?



--- "scott.marlowe" <[EMAIL PROTECTED]> wrote:
> On Wed, 5 May 2004, sdv mailer wrote:
> 
> > Forking is quite fast on Linux but creating a new
> > process is still 10x more expensive than creating
> a
> > thread and is even worse on Win32 platform. CPU
> load
> > goes up because the OS needs to
> allocate/deallocate
> > memory making it difficult to get a steady state
> > resource consumption.
> 
> Just a nit to pick here.  In Linux, the difference
> between forking and 
> spawning a new thread is almost nothing.  Definitely
> less than a factor of 
> 2, and most assuredly less than the quoted factor of
> 10 here.
> 
> The fact that windows has a heavy process /
> lightweight thread design 
> means little to me, since I'll likely never deploy a
> production postgresql 
> server on it that needs to handle any serious load.
> 





__
Do you Yahoo!?
Win a $20,000 Career Makeover at Yahoo! HotJobs  
http://hotjobs.sweepstakes.yahoo.com/careermakeover 

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])

Re: [HACKERS] PostgreSQL pre-fork speedup

2004-05-05 Thread sdv mailer

Forking is expensive on many systems. Linux is a bit
better but still expensive compared to threads. On
Windows, creating process is much more expensive than
on Linux. Check this benchmark:

http://cs.nmu.edu/~randy/Research/Papers/Scheduler/understanding.html

Forking shouldn't be taken lightly as free thing.
There are pros and cons. The general trend is going
towards threads, but that's a different issue.

--- "scott.marlowe" <[EMAIL PROTECTED]> wrote:
> On Wed, 5 May 2004, sdv mailer wrote:
> 
> > Forking is quite fast on Linux but creating a new
> > process is still 10x more expensive than creating
> a
> > thread and is even worse on Win32 platform. CPU
> load
> > goes up because the OS needs to
> allocate/deallocate
> > memory making it difficult to get a steady state
> > resource consumption.
> 
> Just a nit to pick here.  In Linux, the difference
> between forking and 
> spawning a new thread is almost nothing.  Definitely
> less than a factor of 
> 2, and most assuredly less than the quoted factor of
> 10 here.
> 
> The fact that windows has a heavy process /
> lightweight thread design 
> means little to me, since I'll likely never deploy a
> production postgresql 
> server on it that needs to handle any serious load.
> 

__
Do you Yahoo!?
Win a $20,000 Career Makeover at Yahoo! HotJobs  
http://hotjobs.sweepstakes.yahoo.com/careermakeover 

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match

Re: [HACKERS] PostgreSQL pre-fork speedup

2004-05-05 Thread sdv mailer


I've already tried pooling (SQLRelay) and persistent
connection (PHP). They may work for other people but
they do not work for me. I have already separated
static from database driven codes but you can never
balance web server load with database server load.
Pre-fork scales with database load and not with web
server load. This point is crucial.

Most people paying $5.99/mo for web hosting don't have
access to persistent connection or connection pooling
under PHP. Maybe this is why MySQL is favored among
them. I'm not saying this is my case, but there is a
general need for speedier connections. If you can
satisfy the needs of the mass, then you practically
won their vote. Currently MySQL connects 10x faster
than PostgreSQL. See my last benchmark.






__
Do you Yahoo!?
Win a $20,000 Career Makeover at Yahoo! HotJobs  
http://hotjobs.sweepstakes.yahoo.com/careermakeover 

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match

Re: [HACKERS] ALTER TABLE TODO items

2004-05-05 Thread Bruce Momjian

Tom Lane wrote:
> Richard Huxton <[EMAIL PROTECTED]> writes:
> > Does that mean I'll want to disable triggers while I do this?
> 
> Hrm.  Right now the code does not fire triggers at all, but that seems
> wrong.  However, I doubt that very many triggers could cope with update
> events in which the old and new rows have different rowtypes :-(.
> Any thoughts what to do about that?

If triggers exist, I think we should just throw a warning that triggers
will not be fired.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 8: explain analyze is your friend

Re: [HACKERS] Multiple Xids in PGPROC?

2004-05-05 Thread Bruce Momjian

Manfred Koizar wrote:
> With subtransactions we'd have to fall back to checking pg_clog (and
> pg_subtrans) in certain cases.  There are lots of possible
> implementations.  Here are some ideas (just brainstorming):
> 
>  .  We could first scan the PGPROC array.  If the xid is an active main
> transaction, we're finished.
> 
>  .  If xid is older than RecentGlobalXmin, it cannot be active.
> 
>  .  We could include a small number of subtransaction xids in PGPROC.
> 
>  .  For additional subtransactions not fitting into this small array
> there could be minsubxid and maxsubxid fields in PGPROC.  If the xid we
> are looking for is outside all these ranges, it cannot be an active
> subtransaction.
> 
>  .  If all these tests fail, we fall back to checking pg_clog.

Could we add a boolean to each PROC structure indicating if that xid has
subtransactions?  If none have subtransactions, we don't need to look in
pg_clog.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html

Re: [HACKERS] Multiple Xids in PGPROC?

2004-05-05 Thread Bruce Momjian

Alvaro Herrera wrote:
> On Tue, May 04, 2004 at 11:21:18PM -0400, Bruce Momjian wrote:
> > Sorry I haven't gotten your patches in yet.  Tom is working on some
> > other back patches.
> 
> I've been sloppy lately with #ifdef, because it takes some time to get
> right and testing it takes even more time.  I don't know if it's worth
> it -- do you still have the idea of incremental, non disturbing patches?

No.  I think we should forget about the #ifdef's and just keep adding to
your patch.  I am not sure if we are going to apply it incrementally or
as one big patch, but We are going to review your existing posted patch
in a few days.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org

Re: [HACKERS] Multiple Xids in PGPROC?

2004-05-05 Thread Bruce Momjian

Tom Lane wrote:
> Manfred Koizar <[EMAIL PROTECTED]> writes:
> > The straightforward pg_clog lookup is still in transam.c,
> > but has been deactivated:
> >  * Now this func in shmem.c and gives quality answer by scanning
> >  * PGPROC structures of all running backend. - vadim 11/26/96
> 
> > What was the motivation for this change?  Consistency or speed?
> 
> Getting the right answer --- the other way can't tell the difference
> between an open transaction and a crashed one.
> 
> >  .  We could include a small number of subtransaction xids in PGPROC.
> 
> Yeah, I was just thinking that myself.  If we only need to show open
> subtrans xids, then the number you'd need would depend on nesting depth
> not the total number of subxacts used.  So half-a-dozen or so would
> probably suffice for 99% of situations.  You'd need a flag that could be
> set to show "I'm so deeply nested I can't fit all my subxacts here",
> but you'd only need to go to pg_subtrans when that happened.
> 
> On the other hand, I'm not sure how much that helps, considering you
> probably have to resolve the subtrans XID up to its parent anyway to
> check commit/abort status.

I am confused.  Don't we need to know about all subtransctions, not just
opened ones?

BEGIN; -- xid=100
BEGIN; -- xid=101
COMMIT;

At this point, don't backends need to know the parent of xid 101,
meaning we can't limit visibility to just the transactions that are
currently openly nested?

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org

Re: [HACKERS] ALTER TABLE TODO items

2004-05-05 Thread Christopher Kings-Lynne

Yeah, the USING is actually any arbitrary expression over the old table
row.  (Getting that to work was a tad tricky...)  So you can view this
as a full-table UPDATE operation that folds in possible column type changes.
All I can say is three cheers for Tom and Rod on this one
Chris
---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
 joining column's datatypes do not match

Re: [HACKERS] initdb failure in CVS

2004-05-05 Thread Tom Lane

Andrew Dunstan <[EMAIL PROTECTED]> writes:
> Did we actually find a current system where it broke with a straight 
> putenv("LC_ALL")?

Well, the Single Unix Spec does not say that that works, and neither
does the HPUX man page for putenv, so you're going to have a hard time
convincing me that it's a portable solution.  I think the real story is
simply that no one has tested the C version of initdb hard enough to
notice whether that line actually accomplished anything or not.

regards, tom lane

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]

Re: [HACKERS] COPY command - CSV files

2004-05-05 Thread Christopher Kings-Lynne

I believe this has already been implemented in CVS...
Chris
Umberto Zappi wrote:
I wish modify COPY command for support of CSV files (dump/load files in
CSV format).
I think this option is very important from import data from spreedsheet
as OpenOffice/calc or M$/excel.
I have found this task in TODO list, also.

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
 subscribe-nomail command to [EMAIL PROTECTED] so that your
 message can get through to the mailing list cleanly

Re: [HACKERS] Multiple Xids in PGPROC?

2004-05-05 Thread Tom Lane

Alvaro Herrera <[EMAIL PROTECTED]> writes:
> I think if we want to do nontrivial manipulations in PGPROC we should
> make sure it's properly locked.  Maybe it's a good time to implement the
> locking suggested here?  With a LWLock instead of a spinlock, of course;
> we would need MaxBackends extra LWLocks.

Given the performance issues we're currently seeing with spinlocks
on SMP machines, I'm not sure I want to turn GetSnapshot from a
get-one-lock operation into a get-one-lock-per-backend operation :-(
The comment you were looking at was written on the assumption that
grabbing a spinlock is cheap, but it seems it isn't ...

regards, tom lane

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match

Re: [HACKERS] PostgreSQL pre-fork speedup

2004-05-05 Thread Andrew Dunstan


Rod Taylor wrote:
As a temporary step these types of things help. But it's still doesn't
really fix the problem of Apache not using real connection pooling.
 

Rod,
In principle, this should not be enormously hard to do - at least for 
Unix where the methods of handing off file handles between processes are 
fairly well known ( I have no idea if this is even possible on Windows).

Maybe you'd like to start a pgFoundry project to do it? It would be a 
great feather in the postgresql cap, and I think it's well worth doing.

cheers
andrew

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
 subscribe-nomail command to [EMAIL PROTECTED] so that your
 message can get through to the mailing list cleanly

Re: [HACKERS] Multiple "selects" returned from a single stored procedure

2004-05-05 Thread Joe Conway

Shachar Shemesh wrote:
MS-SQL has the capacity for both out variables from stored procedures,
as well as running several "selects" inside the procedures, and then
giving the results for all selects to the caller. Fortunetly for me,
that specific application doesn't run more than one select per stored
procedure.
The way I handled out variables so far was to have the function return a
compound type, with the variables as rows. With embedded selects,
however, this will no longer work.
I guess what I would like to suggest is for the thus far unused "select"
command in PLPGSQL to be used, in some way, to return values outside the
scope of the strict "returns" context. I guess out variables will also
be nice, but that's besides the point.
This has come up before (search the archives). I think the answer is to 
implement actual stored procedures (as opposed to functions, which is 
what we now have). A stored procedure call, per SQL99/2003 would look 
something like:
  call sp_my_stored_proc();
(which in MSSQL looks like "exec sp_my_stored_proc()")

The difference between this and an SRF is that the stored procedure 
cannot be used in a FROM clause, and therefore cannot be joined with 
other data or filtered with WHERE criteria. But that fact also means 
that we should be able to deal with projecting multiple heterogenous 
result sets, and the structure of the sets does not need to be known in 
advance.

If anyone has any ideas on how to both modify called parameters, and
return a rowset, please let me know. Best I came up with so far was to
create a temporary table for the out vars or the selects. I can then rig
the OLE DB to make it look as if the function returned that.
I wonder if you could write an SRF that returns setof refcursor, and 
then expand the cursors one-by-one in the OLE DB layer. See:
http://www.postgresql.org/docs/7.4/interactive/plpgsql-cursors.html
(37.8.3.3. Returning Cursors)

HTH,
Joe
---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?
  http://www.postgresql.org/docs/faqs/FAQ.html

Re: [HACKERS] PostgreSQL pre-fork speedup

2004-05-05 Thread Rod Taylor

> > I cannot tell if mod_pg_pool works across Apache forked backends or is
> > still bound to a single process. They state it is intended for sharing
> > connections across modules, so it is probably still backend specific.
> 
> Have you looked at sqlrealy.sourceforge.net?  IT looks like it might do 
> what you need.

SQL Relay (and friends) do what I'm looking for in a round about way.

If you put it onto the webservers it would help -- but it would require
deployment of additional webservers to accommodate the increased load.
That can be accomplished if it helps drop the load on the DB machine.
But still uses resources unnecessarily.

I've not looked at sqlrelay but most of these things use a different
interface. That would work with the inhouse code but puts a damper on
the commercial software.

As a temporary step these types of things help. But it's still doesn't
really fix the problem of Apache not using real connection pooling.


---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings

Re: [HACKERS] PostgreSQL pre-fork speedup

2004-05-05 Thread scott.marlowe

On Wed, 5 May 2004, Rod Taylor wrote:

> > And, of course, most development environments (perl, php, java etc)
> > have their own language specific connection pooling solutions.
> 
> Yes, the one for php is what I was thinking of when I made my statement.
> They work on a per backend basis as Apache does not allow for the type
> of communication between processes that would otherwise be required. A
> connection created by Apache backend A cannot be used by Apache backend
> B.
> 
> Java is an example where it is done well, but the language decision was
> made long before I joined the firm.
> 
> I cannot tell if mod_pg_pool works across Apache forked backends or is
> still bound to a single process. They state it is intended for sharing
> connections across modules, so it is probably still backend specific.

Have you looked at sqlrealy.sourceforge.net?  IT looks like it might do 
what you need.


---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html

[HACKERS] database read/write from applications code Request for advice

2004-05-05 Thread terr

I am wondering if there is a simple way to implement a read() blah and write() blah 
function to a database.

The application I'm looking at is very simple and this is all that is required.  In 
the past I used a cursor - did the select, then retrieved the records one at a time.

Any suggestions?  I would prefer to not re-map each feild with each call and if I use 
the equivalent of pro*c it generates over 1000 parameters into the call - which is 
crasy.

A function that binds the variables from the calling program would be acceptable.  For 
instance we might have pgbind(&myCfield, fieldType, sizeof(myCfield) ...) or something 
along this line.

Thanx.


---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org

Re: [HACKERS] Multiple Xids in PGPROC?

2004-05-05 Thread Alvaro Herrera

On Wed, May 05, 2004 at 02:18:16PM -0400, Tom Lane wrote:
> Manfred Koizar <[EMAIL PROTECTED]> writes:

> >  .  We could include a small number of subtransaction xids in PGPROC.
> 
> Yeah, I was just thinking that myself.  If we only need to show open
> subtrans xids, then the number you'd need would depend on nesting depth
> not the total number of subxacts used.  So half-a-dozen or so would
> probably suffice for 99% of situations.  You'd need a flag that could be
> set to show "I'm so deeply nested I can't fit all my subxacts here",
> but you'd only need to go to pg_subtrans when that happened.

There is a comment in varsup.c, GetNewTransactionId():

 * XXX by storing xid into MyProc without acquiring SInvalLock, we are
 * relying on fetch/store of an xid to be atomic, else other backends
 * might see a partially-set xid here.  But holding both locks at once
 * would be a nasty concurrency hit (and in fact could cause a
 * deadlock against GetSnapshotData).  So for now, assume atomicity.
 * Note that readers of PGPROC xid field should be careful to fetch
 * the value only once, rather than assume they can read it multiple
 * times and get the same answer each time.
 *
 * A solution to the atomic-store problem would be to give each PGPROC
 * its own spinlock used only for fetching/storing that PGPROC's xid.
 * (SInvalLock would then mean primarily that PGPROCs couldn't be added/
 * removed while holding the lock.)

I think if we want to do nontrivial manipulations in PGPROC we should
make sure it's properly locked.  Maybe it's a good time to implement the
locking suggested here?  With a LWLock instead of a spinlock, of course;
we would need MaxBackends extra LWLocks.

-- 
Alvaro Herrera ()
"Tiene valor aquel que admite que es un cobarde" (Fernandel)

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html

Re: [HACKERS] PostgreSQL pre-fork speedup

2004-05-05 Thread Rod Taylor

> And, of course, most development environments (perl, php, java etc)
> have their own language specific connection pooling solutions.

Yes, the one for php is what I was thinking of when I made my statement.
They work on a per backend basis as Apache does not allow for the type
of communication between processes that would otherwise be required. A
connection created by Apache backend A cannot be used by Apache backend
B.

Java is an example where it is done well, but the language decision was
made long before I joined the firm.

I cannot tell if mod_pg_pool works across Apache forked backends or is
still bound to a single process. They state it is intended for sharing
connections across modules, so it is probably still backend specific.


---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly

Re: [HACKERS] Postgres Optimization: IO and Data Organization

2004-05-05 Thread Doug McNaught

[EMAIL PROTECTED] (James Thornton) writes:

> I want to understand how Postgres organizes data and handles IO
> operations so that I will better know how to optimize a Postgres
> database server. I am looking for answers to specific questions and
> pointers to where this stuff is documented.

If you haven't read the Postgres docs in detail that would be a good
place to start.  :)

> How does Postgres organize its data? For example, is it grouped
> together on the disk, or is it prone to be spread out over the disk?
> Does vacuum reorganize the data? (Seeking to minimize disk head
> movement.)

Tables and indexes are stored in disk files in the filesystem, so PG
relies on the OS to lay out data on the disk.

> How does Postgres handle sequential IO? Does it treat is specially
> such as issuing large IO operations that span block boundaries?

The WAL (write-ahead log), a sort of journal, is written sequentially.
I"m not too familiar with whether WAL writes are ganged together if
possible, but I would think so.

> How does Postgres handle direct IOs (operations directly to disk,
> bypassing the buffer cache)? Will it issue multiple asynchronous IO
> operations?

No direct I/O, no async I/O.  A background checkpoint process handles
a lot of the data writeback I/O.

> Is Postgres always one process per client, or can it spawn additional
> processes to parallelise some operations such as a nested loops join
> operation?

One process per client connection.  Right now there is no spawning of
additional "worker" processes.

> Is there a recommended file system to use for Postgres data, such as
> ext2 or another non-journaling FS?

You definitely want a journaled FS or the equivalent, since losing
filesystem metadata on a crash can ruin your whole day, not to mention
the fsck times...

There doesn't seem to be a clear winner in the "which FS" debate.  If
you use ext3, it's probably fastest to mount with 'data=writeback' for
your DB partition, since you can rely on PG to journal the data
writes.  Most other FS's only journal metadata anyway.

Hope this helps!

-Doug

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]

Re: [HACKERS] initdb failure in CVS

2004-05-05 Thread Bruce Momjian


Thanks, works.

---

Tom Lane wrote:
> Bruce Momjian <[EMAIL PROTECTED]> writes:
> > I am seeing the following failure of initdb in CVS:
> 
> Okay, I did some more work on handling unsetenv cleanly ... give it
> another try.
> 
>   regards, tom lane
> 
> ---(end of broadcast)---
> TIP 7: don't forget to increase your free space map settings
> 

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster

Re: [HACKERS] Multiple Xids in PGPROC?

2004-05-05 Thread Simon Riggs

On Wed, 2004-05-05 at 05:30, Rod Taylor wrote:

> Yup.. And some of us intend on wrapping every single statement in a
> subtransaction so we can rollback on an error without aborting the main
> transaction.
> 

That is exactly what is needed to achieve full Oracle & DB2
compatibility.

I suggest that this should be a session settable parameter, to allow
session transaction semantics to mimic particular DBMS. 

I want the behaviour but not the effort...

Best Regards, Simon Riggs

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster

Re: [HACKERS] PostgreSQL pre-fork speedup

2004-05-05 Thread Joshua D. Drake

The fact that windows has a heavy process / lightweight thread design 
means little to me, since I'll likely never deploy a production postgresql 
server on it that needs to handle any serious load.
Yes but Solaris also has a heavy process / lightweight thread design.
J


---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])

--
Command Prompt, Inc., home of Mammoth PostgreSQL - S/ODBC and S/JDBC
Postgresql support, programming shared hosting and dedicated hosting.
+1-503-667-4564 - [EMAIL PROTECTED] - http://www.commandprompt.com
Mammoth PostgreSQL Replicator. Integrated Replication for PostgreSQL
begin:vcard
fn:Joshua D. Drake
n:Drake;Joshua D.
org:Command Prompt, Inc.
adr:;;PO Box 215;Cascade Locks;Oregon;97014;USA
email;internet:[EMAIL PROTECTED]
title:Consultant
tel;work:503-667-4564
tel;fax:503-210-0034
note:Command Prompt, Inc. is the largest and oldest US based commercial PostgreSQL support provider. We  provide the only commercially viable integrated PostgreSQL replication solution, but also custom programming, and support. We authored  the book Practical PostgreSQL, the procedural language plPHP, and adding trigger capability to plPerl.
x-mozilla-html:FALSE
url:http://www.commandprompt.com/
version:2.1
end:vcard


---(end of broadcast)---
TIP 8: explain analyze is your friend

Re: [HACKERS] initdb failure in CVS

2004-05-05 Thread Tom Lane

Bruce Momjian <[EMAIL PROTECTED]> writes:
> I am seeing the following failure of initdb in CVS:

Okay, I did some more work on handling unsetenv cleanly ... give it
another try.

regards, tom lane

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings

Re: [HACKERS] PostgreSQL pre-fork speedup

2004-05-05 Thread Stephen Frost

* Steve Atkins ([EMAIL PROTECTED]) wrote:
> On Wed, May 05, 2004 at 02:12:58PM -0400, Rod Taylor wrote:
> > 
> > Most of it has been. It's the duty cycle. As stated in another email,
> > only about 20% of the work a script does is database related -- which
> > occurs all at one time. Even when all Apache backends are active, a
> > large number of connections will be idle but were used or will be used
> > at some point during the generation of that page.
> > 
> > It really is an Apache fault -- but I don't think it can be fixed within Apache 
> > itself.
> 
> http://apache.webthing.com/
> 
>   mod_pg_pool or mod_valet_sql - Apache modules to handle postgresql
>   connection pools

Looks like what we need are functions in PHP or something which use the
functions provided by these apache modules, if they don't exist already
(as far as I know they don't?).  Or whatever language it is that he's
using.

Stephen


signature.asc
Description: Digital signature

Re: [HACKERS] PostgreSQL pre-fork speedup

2004-05-05 Thread scott.marlowe

On Wed, 5 May 2004, sdv mailer wrote:

> Forking is quite fast on Linux but creating a new
> process is still 10x more expensive than creating a
> thread and is even worse on Win32 platform. CPU load
> goes up because the OS needs to allocate/deallocate
> memory making it difficult to get a steady state
> resource consumption.

Just a nit to pick here.  In Linux, the difference between forking and 
spawning a new thread is almost nothing.  Definitely less than a factor of 
2, and most assuredly less than the quoted factor of 10 here.

The fact that windows has a heavy process / lightweight thread design 
means little to me, since I'll likely never deploy a production postgresql 
server on it that needs to handle any serious load.

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])

Re: [HACKERS] initdb failure in CVS

2004-05-05 Thread Andrew Dunstan

Bruce Momjian wrote:
I am seeing the following failure of initdb in CVS:

The files belonging to this database system will be owned by user "postgres".
This user must also own the server process.

The database cluster will be initialized with locale C.

creating directory /u/pg/data ... ok
creating directory /u/pg/data/global ... ok
creating directory /u/pg/data/pg_xlog ... ok
creating directory /u/pg/data/pg_clog ... ok
creating directory /u/pg/data/base ... ok
creating directory /u/pg/data/base/1 ... ok
selecting default max_connections ... 100
selecting default shared_buffers ... 1000
creating configuration files ... ok
	creating template1 database in /u/pg/data/base/1 ... 

   FATAL:  invalid value for parameter "client_encoding": ""
initdb: child process exited with exit code 1
initdb: failed
initdb: removing data directory "/u/pg/data"
The problem seems to be related to a commit made to initdb a few days
ago.
revision 1.24
date: 2004/05/05 16:09:31;  author: tgl;  state: Exp;  lines: +23 -2
Use a more portable technique for unsetting environment variables,
and unset PGCLIENTENCODING to prevent backend from dying if it's set
to something incompatible with the -E option.
I don't have any encoding set in my system.
 

The change is based on this code from here (backend/commands/variable.c):
/*
* unsetenv() works fine, but is BSD, not POSIX, and is not available
* under Solaris, among others. Apparently putenv() called as below
* clears the process-specific environment variables. Other
* reasonable arguments to putenv() (e.g. "TZ=", "TZ", "") result in a
* core dump (under Linux anyway). - thomas 1998-01-26
*/
if (tzbuf[0] == 'T')
{
strcpy(tzbuf, "=");
if (putenv(tzbuf) != 0)
elog(LOG, "could not clear TZ environment variable");
tzset();
}
The Linux man page for putenv says this:
Description for libc4, libc5, glibc: If the argument string is of the
form name, and does not contain an ‘=’ character, then the variable
name is removed from the environment. If putenv() has to allocate a
new array environ, and the previous array was also allocated by
putenv(), then it will be freed. In no case will the old storage asso-
ciated to the environment variable itself be freed.
The libc4 and libc5 and glibc 2.1.2 versions conform to SUSv2: the
pointer string given to putenv() is used. In particular, this string
becomes part of the environment; changing it later will change the
environment. (Thus, it is an error is to call putenv() with an auto-
matic variable as the argument, then return from the calling function
while string is still part of the environment.) However, glibc
2.0-2.1.1 differs: a copy of the string is used. On the one hand this
causes a memory leak, and on the other hand it violates SUSv2. This has
been fixed in glibc2.1.2.
The BSD4.4 version, like glibc 2.0, uses a copy.
I suspect you have seen this latter effect, i.e. it in effect did
putenv("PGCLIENTENCODING=");
putenv("=");
leaving you with an empty string as the env value rather than unsetting it.
Did we actually find a current system where it broke with a straight 
putenv("LC_ALL")?

cheers
andrew


---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?
  http://www.postgresql.org/docs/faqs/FAQ.html

Re: [HACKERS] PostgreSQL pre-fork speedup

2004-05-05 Thread Steve Atkins

On Wed, May 05, 2004 at 02:12:58PM -0400, Rod Taylor wrote:
> 
> Most of it has been. It's the duty cycle. As stated in another email,
> only about 20% of the work a script does is database related -- which
> occurs all at one time. Even when all Apache backends are active, a
> large number of connections will be idle but were used or will be used
> at some point during the generation of that page.
> 
> It really is an Apache fault -- but I don't think it can be fixed within Apache 
> itself.

http://apache.webthing.com/

  mod_pg_pool or mod_valet_sql - Apache modules to handle postgresql
  connection pools

http://sqlrelay.sourceforge.net/
http://dbbalancer.sourceforge.net/

  Database connection pooling software

And, of course, most development environments (perl, php, java etc)
have their own language specific connection pooling solutions.

Cheers,
  Steve
  

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings

Re: [HACKERS] initdb failure in CVS

2004-05-05 Thread Rod Taylor

On Wed, 2004-05-05 at 13:48, Bruce Momjian wrote:
> I am seeing the following failure of initdb in CVS:
> 
>  FATAL:  invalid value for parameter "client_encoding": ""

I get the same thing.


---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html

Re: [HACKERS] initdb failure in CVS

2004-05-05 Thread Tom Lane

Bruce Momjian <[EMAIL PROTECTED]> writes:
> I am seeing the following failure of initdb in CVS:
>  FATAL:  invalid value for parameter "client_encoding": ""

Hmm.  Apparently the pg_unsetenv routine I put into initdb.c doesn't
work on your platform.  Which is odd, because we've used exactly the
same technique to unset TZ in variable.c for years and years, and not
had any reports of trouble.

We might have to put configure to work to figure out how to do unsetenv
properly.  Any thoughts?

regards, tom lane

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])

[HACKERS] Multiple "selects" returned from a single stored procedure

2004-05-05 Thread Shachar Shemesh

Hi list,
I'm in the process of porting an existing MS-SQL database to PostgreSQL.
The application uses OLE DB (and that's why I'm writing the OLE DB for
Postgresql). One of the requirements is that we will introduce as little
changes to the application. It has to be able to work with both
databases, as well as Access (which is not really a database).
Now the question:
MS-SQL has the capacity for both out variables from stored procedures,
as well as running several "selects" inside the procedures, and then
giving the results for all selects to the caller. Fortunetly for me,
that specific application doesn't run more than one select per stored
procedure.
The way I handled out variables so far was to have the function return a
compound type, with the variables as rows. With embedded selects,
however, this will no longer work.
I guess what I would like to suggest is for the thus far unused "select"
command in PLPGSQL to be used, in some way, to return values outside the
scope of the strict "returns" context. I guess out variables will also
be nice, but that's besides the point.
If anyone has any ideas on how to both modify called parameters, and
return a rowset, please let me know. Best I came up with so far was to
create a temporary table for the out vars or the selects. I can then rig
the OLE DB to make it look as if the function returned that.
Shachar
--
Shachar Shemesh
Lingnu Open Systems Consulting
http://www.lingnu.com/


---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster

Re: [HACKERS] ALTER TABLE TODO items

2004-05-05 Thread Tom Lane

Richard Huxton <[EMAIL PROTECTED]> writes:
> Does that mean I'll want to disable triggers while I do this?

Hrm.  Right now the code does not fire triggers at all, but that seems
wrong.  However, I doubt that very many triggers could cope with update
events in which the old and new rows have different rowtypes :-(.
Any thoughts what to do about that?

regards, tom lane

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings

Re: [HACKERS] ALTER TABLE TODO items

2004-05-05 Thread Richard Huxton

Tom Lane wrote:
Bruce Momjian <[EMAIL PROTECTED]> writes:
Wow, you can reference different column as part of the alter column.

Yeah, the USING is actually any arbitrary expression over the old table
row.  (Getting that to work was a tad tricky...)  So you can view this
as a full-table UPDATE operation that folds in possible column type changes.
Does that mean I'll want to disable triggers while I do this? Actually, 
if the structure's changing I presume I'll want to drop/recreate my 
triggers anyway (even if they get reparsed like view definitions).

--
  Richard Huxton
  Archonet Ltd
---(end of broadcast)---
TIP 6: Have you searched our list archives?
  http://archives.postgresql.org

Re: [HACKERS] PostgreSQL pre-fork speedup

2004-05-05 Thread Jonathan Gardner

On Wednesday 05 May 2004 07:24 am, Rod Taylor wrote:
> > And "preforking" makes this different, how ? Perhaps having a pool of
> > processes ready to be handed a query to a specific database, where you
> > configure N connections to db1, M to db2 etc. still means lots of
> > resource usage. In effect a preforked database server *is* an idle
> > connection, just without the TCP establishment and teardown sequence
> > which is negligable on modern platforms - and even if it were not
> > negligable, it would be effectively identical regardless of the chosen
> > DB platform.
>
> In theory, it should drastically reduce the number of idle connections
> for poor connection pooling on the other end.
>

If the client is poorly written, nothing on the server side can really 
prevent them from being poorly written.

> The problem are pools for Apache that establish 1 connection per Apache
> backend. 100 Apache backends means 100 backend connections (50 of which
> may be idle as not all pages use the database). Multiply that by 40
> webservers and you have a real mess of idle connections.
>

Or, you run several seperate Apache webservers. The ones that serve static 
content or don't need database connections don't run with the ones that do. 
And just like each idle Apache process uses memory and other resources, 
each idle PostgreSQL connection does to. So managing the number of Apache 
connections so that there aren't too many or too few solves the problem of 
having too many or too few idle database connections. This is all stuff 
that I personally have managed and planned for, and it is quite easy to do 
without any connection pooling on the server side.

It all comes down to management, which Apache does a reasonable job of. 
Either we duplicate the efforts of Apache (they are non-trivial), or we 
piggy-back on their success. And who's to say that the right solution for 
Apache is the right solution for another application? Are we going to 
implement a different flavor of management for each kind of application?

I suggest you implement server-side connection pooling and see for yourself:

(a) How much overhead there is for configuration (which databases? How many 
idle?)

(b) How much easier it is to do on the client side after all.

If you really believe that you are right and I am wrong, then prove it. I'll 
be happy to be shown the error of my thinking (and see an improvement to 
PostgreSQL in the process).

That's the great thing about Open Source. We can all talk the talk, but it 
comes down to whoever actually walks the walk. In the proprietary world, no 
one gets a chance to walk the walk.

-- 
Jonathan Gardner
[EMAIL PROTECTED]

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])

Re: [HACKERS] Multiple Xids in PGPROC?

2004-05-05 Thread Alvaro Herrera

On Tue, May 04, 2004 at 11:21:07PM -0400, Tom Lane wrote:
> Alvaro Herrera <[EMAIL PROTECTED]> writes:
> > So, the big question is, how do we do this?  The most obvious way (to
> > me) is to keep the whole array inside the PGPROC struct.
> > ...
> > The main downside is that it potentially
> > requires a lot of shared memory.  Can we afford that?
> 
> No.  Shared memory is fixed size, therefore the above is guaranteed to
> fail.
> 
> I thought we had devised a solution that did not require expansible
> shared memory for this.  Bruce, Manfred, do you recall how that went?

All right, here is how I think it should work.

Consider the following scenario:

create table foo (a int);
BEGIN;  -- Xid = 100
insert into foo values (1);
BEGIN;  -- Xid = 110
insert into foo values (2);
COMMIT;

BEGIN;  -- Xid = 120
update foo set a=1;
COMMIT;
COMMIT;

A backend starts just after Xid=120 has sub-committed.  Its snapshot
will be:

snapshot = {
xmax = 150
xmin = 90
xip = { 100, ... }
}

So everytime I see a tuple with Xmin/Xmax between 90 and 150 I have to
look it up in pg_subtrans up to the topmost transaction (which will have
pg_subtrans=0) and see if the result is in the xip list.

For example, the tuple with Xid=110 will have pg_subtrans=100; Xid=100
will have pg_subtrans=0, and xip contains 100, so the tuple has xmin in
progress.

(I'd like to avoid the pg_subtrans lookup in the non-subtransaction case,
but I don't see how to do that.)

-- 
Alvaro Herrera ()
"God is real, unless declared as int"

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly

Re: [HACKERS] Multiple Xids in PGPROC?

2004-05-05 Thread Tom Lane

Alvaro Herrera <[EMAIL PROTECTED]> writes:
> (I'd like to avoid the pg_subtrans lookup in the non-subtransaction case,
> but I don't see how to do that.)

Could we afford to make xids self-identifying?  For instance, odd
numbers are base xacts, even numbers are sub xacts.  This would in the
worst case cause us to cycle through the XID space twice as fast as we
need to, but I'm not convinced that's a big problem.

regards, tom lane

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster

[HACKERS] initdb failure in CVS

2004-05-05 Thread Bruce Momjian

I am seeing the following failure of initdb in CVS:

The files belonging to this database system will be owned by user "postgres".
This user must also own the server process.

The database cluster will be initialized with locale C.

creating directory /u/pg/data ... ok
creating directory /u/pg/data/global ... ok
creating directory /u/pg/data/pg_xlog ... ok
creating directory /u/pg/data/pg_clog ... ok
creating directory /u/pg/data/base ... ok
creating directory /u/pg/data/base/1 ... ok
selecting default max_connections ... 100
selecting default shared_buffers ... 1000
creating configuration files ... ok

creating template1 database in /u/pg/data/base/1 ... 

   FATAL:  invalid value for parameter "client_encoding": ""

initdb: child process exited with exit code 1
initdb: failed
initdb: removing data directory "/u/pg/data"

The problem seems to be related to a commit made to initdb a few days
ago.

revision 1.24
date: 2004/05/05 16:09:31;  author: tgl;  state: Exp;  lines: +23 -2
Use a more portable technique for unsetting environment variables,
and unset PGCLIENTENCODING to prevent backend from dying if it's set
to something incompatible with the -E option.

I don't have any encoding set in my system.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]

Re: [HACKERS] ALTER TABLE TODO items

2004-05-05 Thread Tom Lane

Robert Treat <[EMAIL PROTECTED]> writes:
> On Wed, 2004-05-05 at 10:36, Bruce Momjian wrote:
>> Do we still want this TODO?
>> 
>> o Allow columns to be reordered using ALTER ... POSITION i col1 [,col2];

> I think we should leave since it is still functionality that people will
> want.

It's not that no one would want it, it's that the code impact (and risk
of bugs) associated with separate logical and physical column numbers
seems very disproportionate to the value.  The main argument for it
AFAIR was to support column type substitution via drop col/add col/
reorder col.  Now that we have a better way I think the value of such a
feature wouldn't be worth the work/risk.

> recreating the entire table is likely to cause excessive i/o and disk
> space issues compared to a potentially much nicer add column/update
> column/drop column routine.

How you figure that?  The UPDATE step will in itself require 2X disk
space --- and after that you'll need a VACUUM FULL to get it back.
The implementation Rod came up with is much nicer.

> Hmm... upon further thought, if the above implementation stands up, istm
> that its machinations could also be used to implement the reordering
> functionality... ie. rewrite the table and fix up any dependencies as
> needed.  

True.  In fact, this example that I put into the regression tests may be
food for thought:

create table another (f1 int, f2 text);
insert into another values(1, 'one');
insert into another values(2, 'two');
insert into another values(3, 'three');

select * from another;
 f1 |  f2   
+---
  1 | one
  2 | two
  3 | three
(3 rows)

alter table another
  alter f1 type text using f2 || ' more',
  alter f2 type bigint using f1 * 10;

select * from another;
 f1 | f2 
+
 one more   | 10
 two more   | 20
 three more | 30
(3 rows)

regards, tom lane

---(end of broadcast)---
TIP 8: explain analyze is your friend

Re: [HACKERS] pg_autovacuum misinterpreting reltuples?

2004-05-05 Thread Matthew T. O'Connor

Jeff Boes wrote:
We noticed that one of our high-volume insert tables was being vacuumed 
every time pg_autovacuum woke up. (I"m running it with the default 
threshold values, and a 900-second sleep cycle.) The table has a few 
million rows in it. With "debug = 2" on, here's what the pg_autovacuum 
log reports for this table:

[2004-04-30 09:32:11 AM]   table name: nexcerpt."public"."links"
[2004-04-30 09:32:11 AM]  relid: 1014188;   relisshared: 0
[2004-04-30 09:32:11 AM]  reltuples: 3;  relpages: 11
Note that "reltuples" is being reported as "3", which seems a bit low. Aha!
# select reltuples from pg_class where relname = 'links';
 reltuples
-
3.32127e+06
(1 row)
Is it possible that pg_autovacuum sees this value as "3"? Or is it just 
mis-formatting the value in the log?
Yes.  This is a known bug inside of pg_autovacuum, it is fixed inside of 
CVS, and will be released with 7.4.3 (whenever that happens, hopefully 
soon since I'm getting lots of reports for this bug).  Please downlooad 
pg_autovacuum from CVS and compile by hand for the time being.

Matthew O'Connor
---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings

Re: [HACKERS] Multiple Xids in PGPROC?

2004-05-05 Thread Manfred Koizar

On Tue, 04 May 2004 23:21:07 -0400, Tom Lane <[EMAIL PROTECTED]> wrote:
>I thought we had devised a solution that did not require expansible
>shared memory for this.  Bruce, Manfred, do you recall how that went?

AFAIR we did not discuss TransactionIdIsInProgress() specifically.
Currently this function is special insofar as it does not consult
pg_clog but loops over the PGPROC array.  The current implementation is
in sinval.c.  The straightforward pg_clog lookup is still in transam.c,
but has been deactivated:
 * Now this func in shmem.c and gives quality answer by scanning
 * PGPROC structures of all running backend. - vadim 11/26/96

What was the motivation for this change?  Consistency or speed?

With subtransactions we'd have to fall back to checking pg_clog (and
pg_subtrans) in certain cases.  There are lots of possible
implementations.  Here are some ideas (just brainstorming):

 .  We could first scan the PGPROC array.  If the xid is an active main
transaction, we're finished.

 .  If xid is older than RecentGlobalXmin, it cannot be active.

 .  We could include a small number of subtransaction xids in PGPROC.

 .  For additional subtransactions not fitting into this small array
there could be minsubxid and maxsubxid fields in PGPROC.  If the xid we
are looking for is outside all these ranges, it cannot be an active
subtransaction.

 .  If all these tests fail, we fall back to checking pg_clog.

Servus
 Manfred

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match

Re: [HACKERS] Postgres Crashes

2004-05-05 Thread F Harvell

  This sounds very much like a memory problem.  I would replace all of 
the memory with another set of (preferably known good) memory and see 
if the problems persist.  Also look for other cores that may be 
dropped.  If there are several, memory is the likely cause.  Be aware 
that it will likely be active, large memory applications (of which 
PostgreSQL may be the only one on the server) that will materialize the 
issues.

  Memory testing application may also show the problem, however, they 
do not test like production use.  I have had test apps run for weeks 
where production use can cause failures in mere minutes.  Also, note 
that I have seen issues with bad CPU's (bad cache?) that have caused 
similar problems.

On 30 Apr 2004, at 15:24, Prem Gopalan wrote:
We run a multithreaded application that uses postgres 7.4 on Linux
2.4.18, dual cpu Xeon processor machine. We have occassional weird
crashes and have tried a lot of things to reproduce them in house, but
in vain. We do have coredumps and I have listed the backtraces and
their common characteristics here.
...

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
 subscribe-nomail command to [EMAIL PROTECTED] so that your
 message can get through to the mailing list cleanly

Re: [HACKERS] ALTER TABLE TODO items

2004-05-05 Thread Bruce Momjian

Tom Lane wrote:
> select * from another;
>  f1 |  f2   
> +---
>   1 | one
>   2 | two
>   3 | three
> (3 rows)
> 
> alter table another
>   alter f1 type text using f2 || ' more',
>   alter f2 type bigint using f1 * 10;
> 
> select * from another;
>  f1 | f2 
> +
>  one more   | 10
>  two more   | 20
>  three more | 30
> (3 rows)

Wow, you can reference different column as part of the alter column.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])

Re: [HACKERS] PostgreSQL pre-fork speedup

2004-05-05 Thread Andrew Dunstan

sdv mailer wrote:
[snip]
Pre-fork will give MySQL one less argument to throw at
PostgreSQL. 

I think optimizing is this area will speed up the
general case for everyone rather than optimizing a
feature that affects 10% of the users. On top of that,
it will make a strong marketing case because forking
will no longer become a speed issue when compared to
MySQL.
 

So when can we expect to see your proof of concept code and benchmarks 
to show the speedup achieved?

cheers
andrew
---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings

Re: [HACKERS] ALTER TABLE TODO items

2004-05-05 Thread Tom Lane

Bruce Momjian <[EMAIL PROTECTED]> writes:
> Wow, you can reference different column as part of the alter column.

Yeah, the USING is actually any arbitrary expression over the old table
row.  (Getting that to work was a tad tricky...)  So you can view this
as a full-table UPDATE operation that folds in possible column type changes.

regards, tom lane

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]

Re: [HACKERS] Postgres Crashes

2004-05-05 Thread Tom Lane

[EMAIL PROTECTED] (Prem Gopalan) writes:
> The dying process is postmaster. After these crashes the server is
> missing from ps and no more new connections are possible. The backend
> processes stay on till their connections close.

That behavior does sound like a postmaster crash --- but all the stack
traces you show are clearly in backend code.  A backend crash ought not
take out the postmaster.  So something fairly odd is going on here.

What if anything shows up in the postmaster's stderr log when this
happens?

regards, tom lane

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster

Re: [HACKERS] PostgreSQL pre-fork speedup

2004-05-05 Thread Rod Taylor

On Wed, 2004-05-05 at 11:57, Greg Stark wrote:
> Rod Taylor <[EMAIL PROTECTED]> writes:
> 
> > Cutting that count down to 10 idlers in total by having PostgreSQL
> > prefork a specific database would make a significant difference.
> 
> Well it would be 10 for each database. Since as has been pointed out before
> loading the database is most of the delay.
> 
> If that's enough why not just run 10 apache processes instead of 100? 

Because then we would need 10 times as many servers ;)

> I'm assuming the static non-database driven content is already separated onto
> other servers. In which case running 100 apache processes, most of which are
> idle is the source of the problem.

Most of it has been. It's the duty cycle. As stated in another email,
only about 20% of the work a script does is database related -- which
occurs all at one time. Even when all Apache backends are active, a
large number of connections will be idle but were used or will be used
at some point during the generation of that page.

It really is an Apache fault -- but I don't think it can be fixed within Apache itself.

---(end of broadcast)---
TIP 8: explain analyze is your friend

Re: [HACKERS] PostgreSQL pre-fork speedup

2004-05-05 Thread Richard Huxton

sdv mailer wrote:
I'm talking about connecting to multiple database
servers on separate machines. Schemas don't apply
here.
How much work would it take to make a pre-fork smart
enough to open different databases on incoming
connection? How much of it can be modeled after
Apache?
I've not used it but Tatsuo Ishii has just released pgpool v1.0. Quoting 
from its README:

1. What is pgpool
   pgpool is a connection server program for PostgreSQL. pgpool runs
   between PostgreSQL's client(frontend) and server(backend). Any
   PostgreSQL clients can connect to pgpool as if it's a real
   PostgreSQL server.
   pgpool caches the connection to PostgreSQL server to reduce the
   overhead to establish the connection to it.
   Also pgpool could use two PostgreSQL servers for fail over
   purpose. If the first server goes down, pgpool will automatically
   switch to the secondary server.
If that's not what you're after, then it must be fairly close. Check the 
mailing list archives - the download URL is on the "announce" list April 
2004.

I'm sure any real figures from your testing will be of much interest to 
all of us.

HTH
--
  Richard Huxton
  Archonet Ltd
---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
   (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])

Re: [HACKERS] PostgreSQL pre-fork speedup

2004-05-05 Thread Rod Taylor

> Or, you run several seperate Apache webservers. The ones that serve static 
> content or don't need database connections don't run with the ones that do. 
> And just like each idle Apache process uses memory and other resources, 
> each idle PostgreSQL connection does to. So managing the number of Apache 

Considered that, but it doesn't help much. The duty cycle of any given
page is about 20% database, 80% webserver work. So at any given time 80%
of the connections to the database will be idle in a best case scenario.

If Apache did decent connection pooling or PostgreSQL gave us a hand
then a given webserver would need 1/4 of the connections which could be
internally shared.

Page 1 start
Page 1 DB connect

Page 1 DB disconnect
.
. 
.
Page 1 transmit results

If we could really disconnect from the database and not suffer high
re-connection overhead OR have Apache recognize the connection is unused
and allow another Apache backend to use it there would not be a problem.

> It all comes down to management, which Apache does a reasonable job of.

> If you really believe that you are right and I am wrong, then prove it. I'll 
> be happy to be shown the error of my thinking (and see an improvement to 
> PostgreSQL in the process).

You wouldn't run into a problem like this on a system with good
connection pooling. JBoss comes to mind, once a connection is free it is
available to other threads to use. AOL Server is a webserver which
demonstrates proper connection pooling.

Apache is the problem we're trying to work around. It does everything
per backend, rather than having a common pool for the server. That can
be fixed by improving PostgreSQL or by doing something (I'm not sure
what) with apache.




---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])

Re: [HACKERS] ALTER TABLE TODO items

2004-05-05 Thread Bruce Momjian

Stephen Frost wrote:
> > I don't think so.  As I remember it was part of doing logical attribute
> > numbers as a way to add ALTER COLUMN TYPE, but because we now use table
> > recreate to implement this, it is unlikely we will ever add logical
> > attribute numbers (adds too much complexity to the code).
> 
> Does using table recreate break views built against the table?  I recall
> someone saying the old way did, it would be very nice if that could be
> fixed or at least added to the TODO.  I like the idea of being able to
> alter the positions of the columns too, but that's not as big of an
> issue as breaking views.

Yea, I think it will break views. You have to recreate them.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings

Re: [HACKERS] Multiple Xids in PGPROC?

2004-05-05 Thread Tom Lane

Manfred Koizar <[EMAIL PROTECTED]> writes:
> The straightforward pg_clog lookup is still in transam.c,
> but has been deactivated:
>  * Now this func in shmem.c and gives quality answer by scanning
>  * PGPROC structures of all running backend. - vadim 11/26/96

> What was the motivation for this change?  Consistency or speed?

Getting the right answer --- the other way can't tell the difference
between an open transaction and a crashed one.

>  .  We could include a small number of subtransaction xids in PGPROC.

Yeah, I was just thinking that myself.  If we only need to show open
subtrans xids, then the number you'd need would depend on nesting depth
not the total number of subxacts used.  So half-a-dozen or so would
probably suffice for 99% of situations.  You'd need a flag that could be
set to show "I'm so deeply nested I can't fit all my subxacts here",
but you'd only need to go to pg_subtrans when that happened.

On the other hand, I'm not sure how much that helps, considering you
probably have to resolve the subtrans XID up to its parent anyway to
check commit/abort status.

regards, tom lane

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly

Re: [HACKERS] PostgreSQL pre-fork speedup

2004-05-05 Thread Greg Stark

Rod Taylor <[EMAIL PROTECTED]> writes:

> Cutting that count down to 10 idlers in total by having PostgreSQL
> prefork a specific database would make a significant difference.

Well it would be 10 for each database. Since as has been pointed out before
loading the database is most of the delay.

If that's enough why not just run 10 apache processes instead of 100? 

I'm assuming the static non-database driven content is already separated onto
other servers. In which case running 100 apache processes, most of which are
idle is the source of the problem.

-- 
greg

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html

Re: [HACKERS] PostgreSQL pre-fork speedup

2004-05-05 Thread sdv mailer

Pre-fork does not equal to idle connections! Pre-fork
scales with database load where as persistent
connections scales with webserver load. A web server
that is heavily loaded but not necessarily performing
a lot of database activity will spawn hundreds of idle
database connections using persistent connection. With
pre-fork, you can potentially lower this down to even
10 open connections. 

Forking is quite fast on Linux but creating a new
process is still 10x more expensive than creating a
thread and is even worse on Win32 platform. CPU load
goes up because the OS needs to allocate/deallocate
memory making it difficult to get a steady state
resource consumption.

More importantly, solving the forking delay will have
a big impact on people's mind who have been given the
impression that forking is very very slow. Here's what
one site has to say about PostgreSQL's forking:

http://www.geocities.com/mailsoftware42/db/

"Postgres forks on every incoming connection - and the
forking process and backend setup is a bit slow, but
one can speed up PostgreSQL by coding things as stored
procedures"

Pre-fork will give MySQL one less argument to throw at
PostgreSQL. 

I think optimizing is this area will speed up the
general case for everyone rather than optimizing a
feature that affects 10% of the users. On top of that,
it will make a strong marketing case because forking
will no longer become a speed issue when compared to
MySQL.









__
Do you Yahoo!?
Win a $20,000 Career Makeover at Yahoo! HotJobs  
http://hotjobs.sweepstakes.yahoo.com/careermakeover 

---(end of broadcast)---
TIP 8: explain analyze is your friend

Re: [HACKERS] ALTER TABLE TODO items

2004-05-05 Thread Gaetano Mendola

Bruce Momjian wrote:
Do we still want this TODO?
o Allow columns to be reordered using ALTER ... POSITION i col1 [,col2];
  have SELECT * and INSERT honor such ordering
I don't think so.  As I remember it was part of doing logical attribute
numbers as a way to add ALTER COLUMN TYPE, but because we now use table
recreate to implement this, it is unlikely we will ever add logical
attribute numbers (adds too much complexity to the code).
Well, I manage a DB that is up and running 24/24 7/7 since 3 years now,
the only off working time was during the engine update.
At the beginning with few hundred record on each table, in order to add
a column in the desidered position I was performing the recreation table
adventure with the pain to reconstruct all views depending on it ( at
that time postgres didn't even had any dependencies information ), and
all foreign key refering the table.
Now with milion of record this is not feseable. What we do now is add,
in the development DB, the column at the end of the table, this just to
have the table in production and in the development environment with the
same definition.
I think that have a way to reorder the column inside a table definition
could save us some pains.
Regards
Gaetano Mendola




---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?
  http://www.postgresql.org/docs/faqs/FAQ.html

Re: [HACKERS] The features I'm waiting for.

2004-05-05 Thread Gaetano Mendola

Bruce Momjian wrote:
David Garamond wrote:
scott.marlowe wrote:
For me, the only features I'm likely to use in the upcoming releases are 
nested transactions.  While PITR is a great selling point, and the Windows 
Port is something I do look forward to, having to do half my job 
programming windows boxes, nested transactions are a feature I can 
genuinely use in my daily (maybe weekly??? :-) life.

While a focus on things that make postgresql more market acceptable are 
important, the things that make it more feature complete to me as a user 
are the things I'd gladly wait an extra month or two for.  

But I'm not programming any of the code, so I'm just sayin'...
I'm sure everybody has their own favorite feature. But I can say quite 
confidently that the upcoming release contains the most number of highly 
anticipated features ever. Nested transaction, 2-phase commit, Windows 
port... I mean these are all major stuffs. They are paving the way of 
deployments of Postgres in new areas and applications. Plus don't forget 
all the other sweet goodies like autovacuum and PITR.

But the next release could also be the buggies version ever, due to the 
number of these new features. :-)

The point is that if we stay to the June 1 feature freeze, you will not
have all those features in 7.5, only a few of them.
Am I wrong or if the 7.5 will have the 2-phase commit we will see pop up
finally robust tools in order to have postgresql in cluster ( mirror, 
fault tollerance, load balancer, ... ) ?

I'm looking forward to these tools indeed.
Regards
Gaetano Mendola

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]

Re: [HACKERS] PostgreSQL pre-fork speedup

2004-05-05 Thread sdv mailer

I'm talking about connecting to multiple database
servers on separate machines. Schemas don't apply
here.

How much work would it take to make a pre-fork smart
enough to open different databases on incoming
connection? How much of it can be modeled after
Apache?








__
Do you Yahoo!?
Win a $20,000 Career Makeover at Yahoo! HotJobs  
http://hotjobs.sweepstakes.yahoo.com/careermakeover 

---(end of broadcast)---
TIP 8: explain analyze is your friend

Re: [HACKERS] COPY command - CSV files

2004-05-05 Thread Andrew Dunstan

It's been done already.
see http://developer.postgresql.org/todo.php and 
http://developer.postgresql.org/docs/postgres/sql-copy.html

cheers
andrew
Umberto Zappi wrote:
I wish modify COPY command for support of CSV files (dump/load files in
CSV format).
I think this option is very important from import data from spreedsheet
as OpenOffice/calc or M$/excel.
I have found this task in TODO list, also.
I've begin my work, modify COPY syntax in:
COPY tablename [ ( column [, ...] ) ]
   FROM { 'filename' | STDIN }
   [ [ WITH ]
 [ BINARY ]
 [ OIDS ]
 [ DELIMITER [ AS ] 'delimiter' ]
 [ [ OPTIONALLY ] ENCLOSED [ BY ] 'delimiter1' [ AND
'delimiter2' ] ]
 [ NULL [ AS ] 'null string' ] ]
COPY tablename [ ( column [, ...] ) ]
   TO { 'filename' | STDOUT }
   [ [ WITH ]
 [ BINARY ]
 [ OIDS ]
 [ DELIMITER [ AS ] 'delimiter' ]
 [ ENCLOSED [ BY ] 'delimiter1' [ AND 'delimiter2' ] ]
 [ NULL [ AS ] 'null string' ] ]
Syntax is like to  control-file of Oracle's utility sql*load.
Enclosed define the first and second  delimiters (if are different)
which surround each field.
The delimiters may be optionally if keyword exists.
At soon!
Umberto Zappi.

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
   (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster

Re: [HACKERS] ALTER TABLE TODO items

2004-05-05 Thread Bruce Momjian

Tom Lane wrote:
> Stephen Frost <[EMAIL PROTECTED]> writes:
> > Does using table recreate break views built against the table?
> 
> Right now it just rejects the ALTER attempt:
> 
> regression=# create table t1 (f1 int);
> CREATE TABLE
> regression=# create view v1 as select * from t1;
> CREATE VIEW
> regression=# alter table t1 alter f1 type bigint;
> ERROR:  cannot alter type of a column used by a view or rule
> DETAIL:  rule _RETURN on view v1 depends on column "f1"
> regression=#
> 
> Improving this per the previous discussion probably ought to be
> mentioned in the TODO list.

Add to TODO?

Allow views to be auto-recreated based on table changes

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster

Re: [HACKERS] COPY command - CSV files

2004-05-05 Thread Bruno Wolff III

On Sun, May 02, 2004 at 17:03:35 +0200,
  Umberto Zappi <[EMAIL PROTECTED]> wrote:
> I wish modify COPY command for support of CSV files (dump/load files in
> CSV format).
> I think this option is very important from import data from spreedsheet
> as OpenOffice/calc or M$/excel.
> I have found this task in TODO list, also.

A lot of work has already been done on this. You should read through
the archives. I think most of the discussion was roughly a month ago.

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])

Re: [HACKERS] PostgreSQL pre-fork speedup

2004-05-05 Thread Rod Taylor

> I know the issue of pre-fork PostgreSQL has been discussed previously.
> Someone mentionned pre-fork can be implemented when schemas become available

> Any chance of that happening for 7.5?

0 chance unless you have a patch ready now.


---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings

Re: [HACKERS] COPY command - CSV files

2004-05-05 Thread Bruce Momjian

Umberto Zappi wrote:
> I wish modify COPY command for support of CSV files (dump/load files in
> CSV format).
> I think this option is very important from import data from spreedsheet
> as OpenOffice/calc or M$/excel.
> I have found this task in TODO list, also.
> 
> I've begin my work, modify COPY syntax in:
> 
> COPY tablename [ ( column [, ...] ) ]
> FROM { 'filename' | STDIN }
> [ [ WITH ]
>   [ BINARY ]
>   [ OIDS ]
>   [ DELIMITER [ AS ] 'delimiter' ]
>   [ [ OPTIONALLY ] ENCLOSED [ BY ] 'delimiter1' [ AND
> 'delimiter2' ] ]
>   [ NULL [ AS ] 'null string' ] ]
> 
> COPY tablename [ ( column [, ...] ) ]
> TO { 'filename' | STDOUT }
> [ [ WITH ]
>   [ BINARY ]
>   [ OIDS ]
>   [ DELIMITER [ AS ] 'delimiter' ]
>   [ ENCLOSED [ BY ] 'delimiter1' [ AND 'delimiter2' ] ]
>   [ NULL [ AS ] 'null string' ] ]
> 
> Syntax is like to  control-file of Oracle's utility sql*load.
> Enclosed define the first and second  delimiters (if are different)
> which surround each field.
> The delimiters may be optionally if keyword exists.

I guess you didn't notice that the TODO item has a dash next it, meaning
it is done and will be in 7.5.  We didn't use Oracle's syntax, but we do
allow for the escape character in the quotes to be specified if different:

COPY tablename [ ( column [, ...] ) ] ]' ]
FROM { 'filename' | STDIN }elimiter' ] ...] ]
[ [ WITH ] S ] AS ] 'null string' ]' ]
  [ BINARY ]  [ AS ] 'delimiter' ] ...] ]
  [ OIDS ] AS ] 'null string' ]' ]
  [ DELIMITER [ AS ] 'delimiter' ] ...] ]
  [ NULL [ AS ] 'null string' ]' ]
  [ CSV [ QUOTE [ AS ] 'quote' ] , ...] ]
[ ESCAPE [ AS ] 'escape' ]
[ FORCE NOT NULL column [, ...] ]
TO { 'filename' | STDOUT }delimiter' ]

COPY tablename [ ( column [, ...] ) ] ]' ]
TO { 'filename' | STDOUT }delimiter' ]
[ [ WITH ] S ] AS ] 'null string' ]' ]
  [ BINARY ]R [ AS ] 'delimiter' ]
  [ OIDS ] AS ] 'null string' ]' ]
  [ DELIMITER [ AS ] 'delimiter' ]
  [ NULL [ AS ] 'null string' ]' ]
  [ CSV [ QUOTE [ AS ] 'quote' ]
[ ESCAPE [ AS ] 'escape' ]

One interesting idea we had was for ,, to be null, and ,"", to be a
zero-length string.  You can control that with FORCE NOT NULL,
meaning ,, is a zero-length string too.

To get the full details, see the current docs on the developers web
page.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]

Re: [HACKERS] ALTER TABLE TODO items

2004-05-05 Thread Robert Treat

On Wed, 2004-05-05 at 10:36, Bruce Momjian wrote:
> 
> TODO items completed:
> 
> o -ALTER TABLE ADD COLUMN does not honor DEFAULT and non-CHECK CONSTRAINT
> o -ALTER TABLE ADD COLUMN column DEFAULT should fill existing
>   rows with DEFAULT value
> o -Allow ALTER TABLE to modify column lengths and change to binary
>   compatible types
> 
> Seems we didn't have ALTER COLUMN TYPE on the TODO list.
> 
> Do we still want this TODO?
> 
> o Allow columns to be reordered using ALTER ... POSITION i col1 [,col2];
>   have SELECT * and INSERT honor such ordering
> 
> I don't think so.  As I remember it was part of doing logical attribute
> numbers as a way to add ALTER COLUMN TYPE, but because we now use table
> recreate to implement this, it is unlikely we will ever add logical
> attribute numbers (adds too much complexity to the code).
> 

I think we should leave since it is still functionality that people will
want. Furthermore I am not sure we are done with ALTER COLUMN TYPE
completely. Granted I've not yet had time to take a thorough look at the
implementation so I could be off here, but as I understand it the
current code seems a little problematic on large tables; recreating the
entire table is likely to cause excessive i/o and disk space issues
compared to a potentially much nicer add column/update column/drop
column routine. 

Hmm... upon further thought, if the above implementation stands up, istm
that its machinations could also be used to implement the reordering
functionality... ie. rewrite the table and fix up any dependencies as
needed.  

way to back track on myself eh? 

Robert Treat
-- 
Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly

Re: [HACKERS] Postgres Crashes

2004-05-05 Thread Bruce Momjian

Prem Gopalan wrote:
> We run a multithreaded application that uses postgres 7.4 on Linux
> 2.4.18, dual cpu Xeon processor machine. We have occassional weird
> crashes and have tried a lot of things to reproduce them in house, but
> in vain. We do have coredumps and I have listed the backtraces and
> their common characteristics here.

Whether your client is multi-threaded or not should have no affect on
the postmaster and any crashes you see there.

This part of the backtrace seems significant:

> #3  0x081a767e in elog_finish (elevel=20, fmt=0x8235680 "invalid
> memory alloc request size %lu")

I wonder if you are allocating too much memory.  Looking at the 7.4 code
I see these all as ERROR, not FATAL (backend exits) or PANIC (postmaster
exits), so it shouldn't be crashing anything:

./backend/utils/mmgr/mcxt.c:elog(ERROR, "invalid memory
alloc request size %lu",
./backend/utils/mmgr/mcxt.c:elog(ERROR, "invalid memory
alloc request size %lu",
./backend/utils/mmgr/mcxt.c:elog(ERROR, "invalid memory
alloc request size %lu",
./backend/utils/mmgr/mcxt.c:elog(ERROR, "invalid memory
alloc request size %lu",

Would you send over a backtrace that shows more levels above this?

Can you reproduce this crash on demand?

I can't imagine why you would get this error.  I wonder if you have a
problem with bad memory on that machine?

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html

Re: [HACKERS] ALTER TABLE TODO items

2004-05-05 Thread Tom Lane

Stephen Frost <[EMAIL PROTECTED]> writes:
> Does using table recreate break views built against the table?

Right now it just rejects the ALTER attempt:

regression=# create table t1 (f1 int);
CREATE TABLE
regression=# create view v1 as select * from t1;
CREATE VIEW
regression=# alter table t1 alter f1 type bigint;
ERROR:  cannot alter type of a column used by a view or rule
DETAIL:  rule _RETURN on view v1 depends on column "f1"
regression=#

Improving this per the previous discussion probably ought to be
mentioned in the TODO list.

regards, tom lane

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match

[HACKERS] PostgreSQL pre-fork speedup

2004-05-05 Thread sdv mailer

Hi,

I know the issue of pre-fork PostgreSQL has been
discussed previously.
Someone mentionned pre-fork can be implemented when
schemas become available
in PostgreSQL because there will be less of the need
to run multiple
databases.

I think Oracle 7 uses pre-forking and it helps speed
up the startup time
considerably. Often, there are cases where connection
pooling or persistent
connection cannot be used efficiently (e.g. replicated
or splitted databases
over hundreds of machines or where persistent
connection opens up too many
idle connections). Instead, there's a big need to
create a new connection on
every query and with PostgreSQL needing to fork on
every incoming connection
can be quite slow.

Any chance of that happening for 7.5?

Thanks.






__
Do you Yahoo!?
Win a $20,000 Career Makeover at Yahoo! HotJobs  
http://hotjobs.sweepstakes.yahoo.com/careermakeover 

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org

[HACKERS] pg_autovacuum misinterpreting reltuples?

2004-05-05 Thread Jeff Boes

We noticed that one of our high-volume insert tables was being vacuumed 
every time pg_autovacuum woke up. (I"m running it with the default 
threshold values, and a 900-second sleep cycle.) The table has a few 
million rows in it. With "debug = 2" on, here's what the pg_autovacuum 
log reports for this table:

[2004-04-30 09:32:11 AM]   table name: nexcerpt."public"."links"
[2004-04-30 09:32:11 AM]  relid: 1014188;   relisshared: 0
[2004-04-30 09:32:11 AM]  reltuples: 3;  relpages: 11
Note that "reltuples" is being reported as "3", which seems a bit low. Aha!
# select reltuples from pg_class where relname = 'links';
 reltuples
-
3.32127e+06
(1 row)
Is it possible that pg_autovacuum sees this value as "3"? Or is it just 
mis-formatting the value in the log?

--
Jeff Boes  vox 269.226.9550 ext 24
Database Engineer fax 269.349.9076
Nexcerpt, Inc. http://www.nexcerpt.com
  ...Nexcerpt... Extend your Expertise
---(end of broadcast)---
TIP 8: explain analyze is your friend

Re: [HACKERS] Weird prepared stmt behavior

2004-05-05 Thread Oliver Jowett

(I'm not on -hackers, but saw this in the archives)
Alvaro Herrera wrote:
On Fri, Apr 30, 2004 at 09:44:52PM -0400, Tom Lane wrote:
Alvaro Herrera  writes:
> Is this expected?  If so, why?  I'd expect the prepared stmt to be
> deallocated.
prepare.c probably should have provisions for rolling back its state to
the start of a failed transaction ... but it doesn't.
Before jumping into doing that, though, I'd want to have some
discussions about the implications for the V3 protocol's notion of
prepared statements.  The protocol spec does not say anything that
would suggest that prepared statements are lost on transaction rollback,
and offhand it seems like they shouldn't be because the protocol is
lower-level than transactions.
Right now there is no distinction between a PREPARE prepared statement
and a protocol-level one.  If we want to have the v3proto's statements
behave different from PREPARE's, it's just a matter of adding a new
field into the PreparedStatement.  I can do that and make them behave
different if people think this is how it should be.
I don't really have an opinion on whether protocol-level should behave
different.  What do people think?
At least from the JDBC driver's point of view, having prepared 
statements roll back is more work for the driver. Currently it uses 
PREPARE/EXECUTE statements, but eventually it'll use the protocol-level 
messages.

When the JDBC driver is given a query to execute and decides to use 
server-side preparation, it sends a PREPARE (or eventually a Parse 
message). Thereafter, when that same query is executed it will send an 
EXECUTE (or Bind/Execute) instead of the full query. It does this by 
setting some state in the driver-side object representing the query to 
say "this query is prepared with name 'foo'".

If PREPARE can roll back, the driver must maintain a set of all 
statements that were sucessfully PREPAREd in the current transaction, 
and fix up the corresponding query object state whenever a transaction 
rolls back.

From that point of view, it's much simpler to keep PREPARE (or at least 
Parse) as it currently is. I suspect the same argument applies to any 
interface layer that uses PREPARE or Parse automatically.

-O
---(end of broadcast)---
TIP 8: explain analyze is your friend

[HACKERS] Postgres Optimization: IO and Data Organization

2004-05-05 Thread James Thornton

I want to understand how Postgres organizes data and handles IO
operations so that I will better know how to optimize a Postgres
database server. I am looking for answers to specific questions and
pointers to where this stuff is documented.

How does Postgres organize its data? For example, is it grouped
together on the disk, or is it prone to be spread out over the disk?
Does vacuum reorganize the data? (Seeking to minimize disk head
movement.)

How does Postgres handle sequential IO? Does it treat is specially
such as issuing large IO operations that span block boundaries?

How does Postgres handle direct IOs (operations directly to disk,
bypassing the buffer cache)? Will it issue multiple asynchronous IO
operations?

Is Postgres always one process per client, or can it spawn additional
processes to parallelise some operations such as a nested loops join
operation?

Is there a recommended file system to use for Postgres data, such as
ext2 or another non-journaling FS?

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html

[HACKERS] PostgreSQL pre-fork speedup

2004-05-05 Thread Private

Hi,

I know the issue of pre-fork PostgreSQL has been discussed previously.
Someone mentionned pre-fork can be implemented when schemas become available
in PostgreSQL because there will be less of the need to run multiple
databases.

I think Oracle 7 uses pre-forking and it helps speed up the startup time
considerably. Often, there are cases where connection pooling or persistent
connection cannot be used efficiently (e.g. replicated or splitted databases
over hundreds of machines or where persistent connection opens up too many
idle connections). Instead, there's a big need to create a new connection on
every query and with PostgreSQL needing to fork on every incoming connection
can be quite slow.

Any chance of that happening for 7.5?

Thanks.



---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly

[HACKERS] Postgres Crashes

2004-05-05 Thread Prem Gopalan

We run a multithreaded application that uses postgres 7.4 on Linux
2.4.18, dual cpu Xeon processor machine. We have occassional weird
crashes and have tried a lot of things to reproduce them in house, but
in vain. We do have coredumps and I have listed the backtraces and
their common characteristics here.

Briefly, the last frame is a call to a glibc (or rarely some other
shared lib) method.And the instruction pointer points to an indirect
jmp instruction to the shared lib method. Almost all coredumps show
this characteristic.

The dying process is postmaster. After these crashes the server is
missing from ps and no more new connections are possible. The backend
processes stay on till their connections close.

Any ideas appreciated.

core #1 (8706)
---

(gdb) bt
#0  0x0806f1c4 in snprintf () 
#1  0x081a7f50 in send_message_to_frontend (edata=0x826c1e0)
at /root/src/postgres/src/backend/utils/error/elog.c:1239
#2  0x081a6c85 in errfinish (dummy=0) at
/root/src/postgres/src/backend/utils/error/elog.c:359
#3  0x081a767e in elog_finish (elevel=20, fmt=0x8235680 "invalid
memory alloc request size %lu")
at /root/src/postgres/src/backend/utils/error/elog.c:853


(gdb) disassemble
Dump of assembler code for function snprintf:
0x0806f1c4 :jmp*0x823f02c
0x0806f1ca :push   $0x508
0x0806f1cf :   jmp0x806e7a4 <_init+24>
End of assembler dump.

(gdb) x/i $pc
0x806f1c4 :   jmp*0x823f02c

(gdb) x *0x823f02c
0x182de130 :  push   %ebp

(gdb) disassemble *0x823f02c
Dump of assembler code for function snprintf:
0x182de130 :push   %ebp
0x182de131 :mov%esp,%ebp
0x182de133 :push   %ebx

core #2 (5889)
---

(gdb) bt
#0  0x0806f0b4 in memcpy ()
#1  0x08103cee in pq_getbytes (s=0xbfffeb5c ".", len=4) at
/root/src/postgres/src/backend/libpq/pqcomm.c:748
#2  0x08103e04 in pq_getmessage (s=0xbfffec10, maxlen=0) at
/root/src/postgres/src/backend/libpq/pqcomm.c:837
#3  0x0814c98b in SocketBackend (inBuf=0xbfffec10) at
/root/src/postgres/src/backend/tcop/postgres.c:377

(gdb) disassemble
Dump of assembler code for function memcpy:
0x0806f0b4 :  jmp*0x823efe8
0x0806f0ba :  push   $0x480
0x0806f0bf : jmp0x806e7a4 <_init+24>
End of assembler dump.

(gdb) x/i $pc
0x806f0b4 : jmp*0x823efe8

(gdb) x *0x823efe8
0x18304f18 :push   %ebp

(gdb) disassemble *0x823efe8
Dump of assembler code for function memcpy:
0x18304f18 :  push   %ebp
0x18304f19 :  mov%esp,%ebp
0x18304f1b :  mov0x10(%ebp),%eax

core #3 (32662)
---

(gdb) bt
#0  0x0806f3c4 in strncpy ()
#1  0x081b22fa in set_ps_display (activity=0x4 )
at /root/src/postgres/src/backend/utils/misc/ps_status.c:282
#2  0x0814f3f5 in PostgresMain (argc=4, argv=0x8279838,
username=0x8279808 "postgres")
at /root/src/postgres/src/backend/tcop/postgres.c:2805
#3  0x0812f24b in BackendFork (port=0x82877a8) at
/root/src/postgres/src/backend/postmaster/postmaster.c:2558

(gdb) x/i $pc
0x806f3c4 :jmp*0x823f0ac

(gdb) disassemble *0x823f0ac
Dump of assembler code for function strncpy:
0x183033c0 : push   %ebp
0x183033c1 : mov%esp,%ebp
0x183033c3 : push   %edi

core #4 (28335)
---

(gdb) bt
#0  0x0806f0c1 in memcpy ()
#1  0x08103cee in pq_getbytes (s=0xbfffeb5c "\f", len=4) at
/root/src/postgres/src/backend/libpq/pqcomm.c:748
#2  0x08103e04 in pq_getmessage (s=0xbfffec10, maxlen=0) at
/root/src/postgres/src/backend/libpq/pqcomm.c:837
#3  0x0814c98b in SocketBackend (inBuf=0xbfffec10) at
/root/src/postgres/src/backend/tcop/postgres.c:377
 
(gdb) x/i $pc
0x806f0c1 :  idiv   %bh

(gdb) disassemble
Dump of assembler code for function memcpy:
0x0806f0b4 :  jmp*0x823efe8
0x0806f0ba :  push   $0x480
0x0806f0bf : jmp0x806e7a4 <_init+24>
End of assembler dump.

(gdb) disassemble *0x823efe8
Dump of assembler code for function memcpy:
0x18304f18 :  push   %ebp
0x18304f19 :  mov%esp,%ebp
0x18304f1b :  mov0x10(%ebp),%eax
  
core #5 (22375)


(gdb) bt
#0  0x0806f32c in SSL_CTX_use_certificate_file ()
#1  0x08103cee in pq_getbytes (s=0xbfffeb5c "\f", len=4) at
/root/src/postgres/src/backend/libpq/pqcomm.c:748
#2  0x08103e04 in pq_getmessage (s=0xbfffec10, maxlen=0) at
/root/src/postgres/src/backend/libpq/pqcomm.c:837
#3  0x0814c98b in SocketBackend (inBuf=0xbfffec10) at
/root/src/postgres/src/backend/tcop/postgres.c:377

(gdb) x/i $pc
0x806f32c : add   
$0x70e9,%eax

(gdb) disassemble
Dump of assembler code for function SSL_CTX_use_certificate_file:
0x0806f324 :jmp*0x823f084
0x0806f32a :push   $0x5b8
0x0806f32f :   jmp0x806e7a4
<_init+24>
End of assembler dump.

(gdb) disassemble *0x823f084
Dump of assembler code for function SSL_CTX_use_certificate_file:
0x0806f324 :jmp*0x823f084
0x0806f32a :push   $0x5b8
0x0806f32f :   jmp0x806e7a4
<_init+24>
End of assembler dump.

core #6 (22371)
---

(gdb) bt
#0  0x0806f2e0 in semop ()
#1  0x1f0812c0 in ?? ()
#2  0x08149778 in LWLockRelease (lockid=WALWriteLock) at
/root/src/postgres/src/back

[HACKERS] COPY command - CSV files

2004-05-05 Thread Umberto Zappi

I wish modify COPY command for support of CSV files (dump/load files in
CSV format).
I think this option is very important from import data from spreedsheet
as OpenOffice/calc or M$/excel.
I have found this task in TODO list, also.
I've begin my work, modify COPY syntax in:
COPY tablename [ ( column [, ...] ) ]
   FROM { 'filename' | STDIN }
   [ [ WITH ]
 [ BINARY ]
 [ OIDS ]
 [ DELIMITER [ AS ] 'delimiter' ]
 [ [ OPTIONALLY ] ENCLOSED [ BY ] 'delimiter1' [ AND
'delimiter2' ] ]
 [ NULL [ AS ] 'null string' ] ]
COPY tablename [ ( column [, ...] ) ]
   TO { 'filename' | STDOUT }
   [ [ WITH ]
 [ BINARY ]
 [ OIDS ]
 [ DELIMITER [ AS ] 'delimiter' ]
 [ ENCLOSED [ BY ] 'delimiter1' [ AND 'delimiter2' ] ]
 [ NULL [ AS ] 'null string' ] ]
Syntax is like to  control-file of Oracle's utility sql*load.
Enclosed define the first and second  delimiters (if are different)
which surround each field.
The delimiters may be optionally if keyword exists.
At soon!
Umberto Zappi.

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
   (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])

Re: [HACKERS] PostgreSQL pre-fork speedup

2004-05-05 Thread Andrew Sullivan

On Mon, May 03, 2004 at 11:59:45PM -0700, sdv mailer wrote:
> 
> Connection pooling (eg. SQLRelay) didn't work either
> because we needed to connect to hundreds of DB servers
> from each web server. Imagine having 200+ open
> connections on the web server and how many more of
> these connections remain idle. The situation gets

This sounds like a case where you probably ought to be using schema
support instead of many different databases, for the record.  I don't
see how pre forking is going to help you at all, because a connection
is to a database, so you're going to have to pick one, and it's
likely as not to be the wrong one.

A

-- 
Andrew Sullivan  | [EMAIL PROTECTED]

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org

Re: [HACKERS] ALTER TABLE TODO items

2004-05-05 Thread Stephen Frost

* Bruce Momjian ([EMAIL PROTECTED]) wrote:
> Do we still want this TODO?
> 
> o Allow columns to be reordered using ALTER ... POSITION i col1 [,col2];
>   have SELECT * and INSERT honor such ordering
> 
> I don't think so.  As I remember it was part of doing logical attribute
> numbers as a way to add ALTER COLUMN TYPE, but because we now use table
> recreate to implement this, it is unlikely we will ever add logical
> attribute numbers (adds too much complexity to the code).

Does using table recreate break views built against the table?  I recall
someone saying the old way did, it would be very nice if that could be
fixed or at least added to the TODO.  I like the idea of being able to
alter the positions of the columns too, but that's not as big of an
issue as breaking views.

Stephen

signature.asc
Description: Digital signature

Re: [HACKERS] Multiple Xids in PGPROC?

2004-05-05 Thread Rod Taylor

> Is there some solution whereby the common case (99.999% of transactions 
> won't be subtransactoins) is fast, and the uncommon case of being in a 
> subtransaction is slower?

I hope not, because for many of us there will be as many (if not more)
subtransactions than standard transactions.

-- 
Rod Taylor 

Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL
PGP Key: http://www.rbt.ca/signature.asc


signature.asc
Description: This is a digitally signed message part

Re: [HACKERS] inconsistent owners in newly created databases?

2004-05-05 Thread Greg Stark


Fabien COELHO <[EMAIL PROTECTED]> writes:

>   nspacl   = aclitems_switch_grantor(nspacl, datdba)

Instead of having a hard coded list of template1 objects that need to be
chowned to the database owner. Perhaps there should be a special user like
dbowner which owns the schema and whatever other objects are necessary. Then
createdb would chown over anything owned by dbowner but not by objects owned
by postgres.

This would have the advantage that a dba could add objects to template1 and
choose whether to set them to be owned by postgres or owned by dbowner. Then
create various databases owned by different users and automatically have the
selected template objects be owned by the database owner.

-- 
greg


---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster

[HACKERS] ALTER TABLE TODO items

2004-05-05 Thread Bruce Momjian


TODO items completed:

o -ALTER TABLE ADD COLUMN does not honor DEFAULT and non-CHECK CONSTRAINT
o -ALTER TABLE ADD COLUMN column DEFAULT should fill existing
  rows with DEFAULT value
o -Allow ALTER TABLE to modify column lengths and change to binary
  compatible types

Seems we didn't have ALTER COLUMN TYPE on the TODO list.

Do we still want this TODO?

o Allow columns to be reordered using ALTER ... POSITION i col1 [,col2];
  have SELECT * and INSERT honor such ordering

I don't think so.  As I remember it was part of doing logical attribute
numbers as a way to add ALTER COLUMN TYPE, but because we now use table
recreate to implement this, it is unlikely we will ever add logical
attribute numbers (adds too much complexity to the code).

---

Tom Lane wrote:
> CVSROOT:  /cvsroot
> Module name:  pgsql-server
> Changes by:   [EMAIL PROTECTED]   04/05/05 01:48:48
> 
> Modified files:
>   doc/src/sgml/ref: alter_table.sgml 
>   src/backend/bootstrap: bootparse.y 
>   src/backend/catalog: dependency.c heap.c index.c 
>   src/backend/commands: cluster.c indexcmds.c tablecmds.c 
>   src/backend/nodes: copyfuncs.c equalfuncs.c 
>   src/backend/parser: analyze.c gram.y 
>   src/backend/tcop: utility.c 
>   src/backend/utils/adt: ruleutils.c 
>   src/include/catalog: dependency.h heap.h index.h 
>   src/include/commands: cluster.h defrem.h tablecmds.h 
>   src/include/nodes: nodes.h parsenodes.h 
>   src/include/parser: analyze.h 
>   src/include/utils: builtins.h 
>   src/test/regress/expected: alter_table.out foreign_key.out 
>  inherit.out 
>   src/test/regress/sql: alter_table.sql foreign_key.sql 
> inherit.sql 
> 
> Log message:
>   ALTER TABLE rewrite.  New cool stuff:
>   
>   * ALTER ... ADD COLUMN with defaults and NOT NULL constraints works per SQL
>   spec.  A default is implemented by rewriting the table with the new value
>   stored in each row.
>   
>   * ALTER COLUMN TYPE.  You can change a column's datatype to anything you
>   want, so long as you can specify how to convert the old value.  Rewrites
>   the table.  (Possible future improvement: optimize no-op conversions such
>   as varchar(N) to varchar(N+1).)
>   
>   * Multiple ALTER actions in a single ALTER TABLE command.  You can perform
>   any number of column additions, type changes, and constraint additions with
>   only one pass over the table contents.
>   
>   Basic documentation provided in ALTER TABLE ref page, but some more docs
>   work is needed.
>   
>   Original patch from Rod Taylor, additional work from Tom Lane.
> 
> 
> ---(end of broadcast)---
> TIP 8: explain analyze is your friend
> 

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html

Re: [HACKERS] Multiple Xids in PGPROC?

2004-05-05 Thread Greg Stark

Christopher Kings-Lynne <[EMAIL PROTECTED]> writes:

> > Yup.. And some of us intend on wrapping every single statement in a
> > subtransaction so we can rollback on an error without aborting the main
> > transaction.
> 
> Point there being "main transaction".  What i'm saying is that the vast
> majority of your "transactions" will be single statements.  eg. single selects,
> single updates, etc.

And if autocommit mode is off, which is how I would strongly urge people to
work, these single statements will be a subtransaction within a main
transaction.

> > In fact, I would be surprised if tools like psql went very long without
> > doing the same thing so users can recover from spelling mistakes.
> 
> If the user does an explicit BEGIN, then perhaps we might, but how often does
> the user do an explicit BEGIN?

Well currently very rare since it's so infuriating to have to start all over
when you make a spelling error. As long as autocommit mode is on the same
thing would happen.

But in Oracle autocommit mode is OFF in the command line tool. You have to
type commit to commit any changes. At first this is surprising and annoying,
but after a while you find it's terribly useful and a much safer way to work.
You can do an update, then double-check that you did the right thing before
committing it.

-- 
greg

---(end of broadcast)---
TIP 8: explain analyze is your friend

Re: [HACKERS] PostgreSQL pre-fork speedup

2004-05-05 Thread Rod Taylor

> And "preforking" makes this different, how ? Perhaps having a pool of
> processes ready to be handed a query to a specific database, where you
> configure N connections to db1, M to db2 etc. still means lots of resource
> usage. In effect a preforked database server *is* an idle connection, just
> without the TCP establishment and teardown sequence which is negligable on
> modern platforms - and even if it were not negligable, it would be
> effectively identical regardless of the chosen DB platform.

In theory, it should drastically reduce the number of idle connections
for poor connection pooling on the other end.

The problem are pools for Apache that establish 1 connection per Apache
backend. 100 Apache backends means 100 backend connections (50 of which
may be idle as not all pages use the database). Multiply that by 40
webservers and you have a real mess of idle connections.

Cutting that count down to 10 idlers in total by having PostgreSQL
prefork a specific database would make a significant difference.

The other (preferable) alternative is to convince Apache to use a common
connection pool per server rather than per Apache backend.



---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly

Re: [HACKERS] Multiple Xids in PGPROC?

2004-05-05 Thread Alvaro Herrera

On Tue, May 04, 2004 at 11:21:18PM -0400, Bruce Momjian wrote:
> Alvaro Herrera wrote:

> > I've whacked the subtrans patch enough so that the simple tests (i.e.
> > non concurrent) for tuple visibility work.  I can create a table and
> > populate it in subtransactions, rollback or commit them selectively and
> > get the desired effect at the end.  Naturally, catalog entries also
> > behave [somewhat] sanely.  Oh, I made pg_subtrans work too.  (Though
> > whether it's relatively bug-free is yet to be proven.)
> 
> I remember going through this.  Other backends will use pg_subtrans to
> know what transactions are in progress. They have to do the standard
> lookups to find the status of the parent transaction.  The backend-local
> list of xids is needed so the commit can clean up those subtransaction
> xids so that later transactions don't have to use pg_subtrans.

Ok, this can be done with what I have so far.  I'm not sure how slow
will it be compared to checking the PGPROC struct, because it may
involve getting a pg_subtrans page from disk.  Currently I have 8
pg_subtrans buffers on shared memory, the same as pg_clog; maybe we want
more to reduce that probability.  8 kB each, 2k xacts each, 16k xacts
total.

I'll test this and will probably be submitting a patch shortly.

> Sorry I haven't gotten your patches in yet.  Tom is working on some
> other back patches.

I've been sloppy lately with #ifdef, because it takes some time to get
right and testing it takes even more time.  I don't know if it's worth
it -- do you still have the idea of incremental, non disturbing patches?

> Also, do you have a plan to handle some of the more complex issues
> like locking in subtransactions?

Certainly.  As soon as I have a concurrent scenario working, I'll pursue
the cleanup of all modules at subxact abort.  (I have some working, some
which I know don't work, and some which I haven't tried yet.)

-- 
Alvaro Herrera ()
"Cada quien es cada cual y baja las escaleras como quiere" (JMSerrat)

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster

Re: [HACKERS] PostgreSQL pre-fork speedup

2004-05-05 Thread Peter Galbavy

sdv mailer wrote:
> We used to run persistent connection until the DB
> servers got maxed out because of too many idle
> connections sucking up all the memory. Web servers run
> different loads than database servers and persistent
> connections are notorious for crashing your DB.

And this translates from your experiences with mysql to postgresql ? You
haven't made it clear which platforms and what level of concurrent
connections gave you this behaviour. Tom Lane has already explained that
most of the connection time is probably used in configuring the connection
based on the database required etc.

> Connection pooling (eg. SQLRelay) didn't work either
> because we needed to connect to hundreds of DB servers
> from each web server. Imagine having 200+ open
> connections on the web server and how many more of
> these connections remain idle. The situation gets
> worse when you multiply by an even greater number of
> web servers connected to all these database servers.
> Do the math! We're talking large server farm here, not
> 2 or 3 machines.

And "preforking" makes this different, how ? Perhaps having a pool of
processes ready to be handed a query to a specific database, where you
configure N connections to db1, M to db2 etc. still means lots of resource
usage. In effect a preforked database server *is* an idle connection, just
without the TCP establishment and teardown sequence which is negligable on
modern platforms - and even if it were not negligable, it would be
effectively identical regardless of the chosen DB platform.

> I think pre-forking can be beneficial and is a lot
> simpler than to rewrite a multi-threaded DB server.

This is open source, feel free to do a proof on concept (or pay someone to
do a proof of concept), run the numbers and see if your assertions work for
real. Many others here with more experience than myself of running thousands
of connections at once don't appear to think so. My limited expereience with
many hundreds of "idle" connections is that it is not particularly taxing at
all on any even semi-modern hardware (PIII/512MB etc).

Peter

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings

Re: [HACKERS] ERROR: heapgettup: failed ReadBuffer

2004-05-05 Thread jihuang

Thanks for your real-time respone!
the problem was sloved after I upgrade the postgreSQL from 7.3.4 to 7.4.2.
by the way, is there any bug-tracking website for postgreSQL ?
I follow the [HOMEPAGE] -> [DEVELOPERS] ->find nothing  relative to 
bugzilla-like items,
follow the [GBROG] ->>>   it's PostgreSQL related projects , but without 
PostgreSQL itself ?

let me show a advertisement...  quote from ORELLY's Developer Weblogs
> RT foundry is being developed in Taiwan as part of the Open Foundry 
Project, which is aimed at encouraging for
> FS/OSS development in Taiwan.  The foundry is a SF-like, expect using 
better technologies
> (RT for bug/request tracking, subversion for source control, etc ...

the following link is the issue and comments log for sloving this 
problem I said.
http://rt.openfoundry.org/Foundry/Project/Tracker/Display.html?Queue=90&id=2653

there are some chinese characters mixed,  but I just wanna to show that 
host a dedicate  issue/bug tracking system
may improve a software project evloution.

June-Yen
Tom Lane wrote:
jihuang <[EMAIL PROTECTED]> writes:

I put  36+ rows in a table , and now any select , update , analyze 
... command fail.
the log shows  "ERROR: heapgettup: failed ReadBuffer",

What Postgres version is this?  AFAICS that error has been impossible
for quite some time ...
			regards, tom lane

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?
  http://www.postgresql.org/docs/faqs/FAQ.html

Re: [HACKERS] Resource allocation?

2004-05-05 Thread Neil Conway

On 4-May-04, at 2:18 PM, David Fetter wrote:
I'd like to be able to run large batch-process jobs for caching, but
limit the total resource utilization on them, something like nice(1).
Better still would be some way to put a max on total resources & be
able to allocate from that.
Does PostgreSQL have facilities for such a thing?
Well, nice(1) does not limit resource allocation, it changes the 
scheduling priority of a process -- ISTM the two features are largely 
orthogonal.

Limits on resource allocation (i.e. essentially quotas, getrlimit() and 
so on) would be cool; some other people have expressed interest in them 
in the past.

Implementing "batch processes" would be trickier: it seems to me that a 
naive implementation would fall prey to priority inversion. It should 
definitely possible to do, though. If you're interested, have at it :-)

-Neil
---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?
  http://www.postgresql.org/docs/faqs/FAQ.html

84 matches

Mail list logo