Re: [pgsql-advocacy] [HACKERS] Changing the default configuration
Jason Hihn said: Pardon my ignorance, but there's no way to auto-tune? Ship it with a thread that gathers statistics and periodically re-tunes the database parameters. Of course, be able to turn it off. People who actually take the time to tune manually will turn it off so as not to have the overhead or interruption. Those that don't care about pg_tune shouldn't care about having a thread around retuning. Those that will care will tune manually. This is related to my proposition, but the trouble is, there is no such thing as a 'well tuned database' that will suit all queries. You can tune the database to the hardware, for example (still remember that old argument on random access and fast disks). It seems the system could 'self-tune' itself on minor choices. I believe it does this today for a number of things already. More significant changes require the DBA's consent and choice - but they need to be well informed of the current usage statistics when making the choice. Daniel ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
Re: [pgsql-advocacy] [HACKERS] Changing the default configuration
Josh Berkus said: How about we take this discussion to the Performance List, where it belongs? I believe the design and addition of code that collects and outputs the usage patterns of the database (statistics) belongs here. If we take the approach of providing information to tune PostgreSQL based on real-world usage, I guess we need at least the following:
- Usage statistics on a per-database or even per-table level. I believe we already collect some;
- A statistics analysis tool/routine/process to produce suggestions;
- Of course lots of real-world data to justify the suggestions;
- Can we provide more knobs for tunable parameters that can be applied on a per-database or even per-table level? One first candidate might be the FSM;
- Can some of these parameters (when available) be made to auto-tune?
Of course, this could move out of pgsql-hackers :) Daniel ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Brain dump: btree collapsing
Hannu Krosing [EMAIL PROTECTED] wrote: could we just not lock (for more than just to ensure atomic writes) the page but instead increment a page version on each write to detect changes? Sounds like Index MVCC..., very nice. ;-) (Of course I have no clue about feasibility, just liked the idea) Best Regards, Michael Paesold ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] location of the configuration files
Tom Lane wrote: Kevin Brown [EMAIL PROTECTED] writes: I agree with your assessment for the most part, except for PGDATA. There's no good reason I can think of for the postmaster to look at it. The other side of that coin is, what's the good reason to remove it? There's a long way between I don't want my setup to depend on PGDATA and I don't think your setup should be allowed to depend on PGDATA. If you don't want to use it, then don't use it. Why do you need to tell me how I'm allowed to run my installation? I'm not talking about getting rid of ALL dependency on PGDATA in our entire distribution, only postmaster's. Recall that the main purpose of making any of these changes at all is to make life easier for the guys who have to manage the systems that will be running PostgreSQL. Agreed? So: imagine you're the newly-hired DBA and your boss points you to the system and says administrate the database on that. You go over to the computer and start looking around. You do a ps and see a postmaster process running. You know that it's the process that is listening for connections. The ps listing only says /usr/bin/postmaster. No arguments to clue you in, nothing. Where do you look to figure out where the data is? How do you figure out what port it's listening on? Well, we're already agreed on how to deal with that question: you look in /etc/postgresql, and because this is a relatively new install (and the PostgreSQL maintainers, who are very wise and benevolent, made that the default location for configs :-), it has a postgresql.conf file with a line that says data_directory = /var/lib/pgsql. It doesn't mention a port to listen to so you know that it's listening on port 5432. As a DBA, you're all set. Now let's repeat that scenario, except that instead of seeing one postmaster process, you see five. And they all say /usr/bin/postmaster in the ps listing. No arguments to clue you in or anything, as before. 
You might be able to figure out where one of them is going by looking at /etc/postgresql, but what about the rest? Now you're stuck unless you want to do a find (time consuming and I/O intensive -- a good way to slow the production database down a bit), or you're knowledgeable enough to use 'lsof' or black magic like digging into kernel memory to figure out where the config files and data directories are, or you have enough knowledge to pore through the startup scripts and understand what they're doing. Lest you think that this is an unlikely scenario, keep in mind that most startup scripts, including pg_ctl, currently start the postmaster without arguments and rely on PGDATA, so a shop that hasn't already been bitten by this *will* be. Right now shops that wish to avoid the trap I described have to go to *extra* lengths: they have to make exactly the same kinds of changes to the scripts that I'm talking about us making (putting an explicit '-D $PGDATA' where none now exists) or they have to resort to tricks like renaming the postmaster executable and creating a shell script in its place that will invoke the (renamed) postmaster with '-D $PGDATA'. It's not so bad if only a few shops have to make those changes. But what if it's thousands? Yeah, the distribution guys can patch the scripts to do this, but why should they have to? They, and the shops that run PostgreSQL, are our customers. All of that is made possible because the postmaster can use an inherited PGDATA for the location of the config files and (if the config files don't say differently in our new scheme) the data directory, and pg_ctl takes advantage of that fact (as do most startup scripts that I've seen, that don't just invoke pg_ctl). I'm not arguing that we should remove the use of PGDATA *everywhere*, only in postmaster (and then, only postmaster's use of an *inherited* PGDATA. It should still set PGDATA so its children can use it). It means changing pg_ctl and the startup scripts we ship. 
The earlier we make these changes, the less overall pain there will be in the long run. ... people who are starting things by hand hopefully aren't so inflexible as to demand that PGDATA remain treated as-is. Yes, I could reconfigure my scripts to not depend on this. You have not given me an adequate argument why I should have to. [By this I'm assuming you're referring to the scripts you use for testing, and not the ones that ship with the distribution] I'm not arguing that you should get rid of all the references to PGDATA in your scripts or anything crazy like that. The changes I'm talking about are minor: where you see postmaster without any -D arguments, you simply add '-D $PGDATA' to it, before any other arguments that you might also be passing. That's it. Nothing else should be needed. The reason for removing postmaster's use of an inherited PGDATA is the same as the reason for making the other changes we already agree should be made: to make things easier for the guys in the field who have to manage these systems.
Re: [HACKERS] Offering tuned config files
On Fri, 14 Feb 2003 14:12:50 +0800, Christopher Kings-Lynne [EMAIL PROTECTED] wrote: Here's a stab at some extra conf files. Feel free to shoot them down. No intent to shoot anything down, just random thoughts: effective_cache_size = 20000 (~ 160 MB) should be more adequate for a 256 MB machine than the extremely conservative default of 1000. I admit that the effect of this change is hard to benchmark. A way too low (or too high) setting may lead the planner to wrong conclusions. More parameters affecting the planner: #cpu_tuple_cost = 0.01 #cpu_index_tuple_cost = 0.001 #cpu_operator_cost = 0.0025 Are these still good defaults? I have no hard facts, but ISTM that CPU speed is increasing more rapidly than disk access speed. In postgresql.conf.sample-writeheavy you have: commit_delay = 1 Is this still needed with ganged WAL writes? Tom? Servus Manfred ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
Re: [HACKERS] Offering tuned config files
Manfred Koizar said: effective_cache_size = 20000 (~ 160 MB) should be more adequate for a 256 MB machine than the extremely conservative default of 1000. I admit that the effect of this change is hard to benchmark. A way too low (or too high) setting may lead the planner to wrong conclusions. The default on BSD systems is 10% of the total RAM, so on a 256MB machine this would be ~26MB or effective_cache_size = 32000. One could always modify the kernel to support a much larger value, but I doubt this is done in many cases, and the usefulness of a larger buffer cache is not obvious in the presence of many fsync calls (which might be typical). I could be wrong, of course :) In any case, the default is indeed low and would prevent using indexes on larger tables, where they are most useful. Daniel ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [MLIST] Re: [HACKERS] location of the configuration files
In reference to determining what port postgres or any program is listening on: on my Red Hat Linux machines, netstat --inet -nlp, when run as root, produces a nice list of all programs listening on the network, with the IP and port number the process is listening on, the name of the process, and the pid. The environment used to start each of these postmasters can be found with cat /proc/${POSTMASTER_PID}/environ | tr '\000' '\n' I'm not arguing one way or the other on your issue, just hope these tips make the black magic a little easier to use. On Friday 14 February 2003 04:58 am, Kevin Brown wrote: Now let's repeat that scenario, except that instead of seeing one postmaster process, you see five. And they all say /usr/bin/postmaster in the ps listing. No arguments to clue you in or anything, as before. You might be able to figure out where one of them is going by looking at /etc/postgresql, but what about the rest? Now you're stuck unless you want to do a find (time consuming and I/O intensive -- a good way to slow the production database down a bit), or you're knowledgeable enough to use 'lsof' or black magic like digging into kernel memory to figure out where the config files and data directories are, or you have enough knowledge to pore through the startup scripts and understand what they're doing. ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
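A runnable sketch of the two diagnostics described above; it is Linux-only (it relies on /proc), and since no real postmaster pid is at hand here, the shell's own pid ($$) stands in for one:

```shell
# Stand-in PID for illustration; in practice use the postmaster's pid from ps.
PMPID=$$

# 1) As root, list every program listening on the network, with owning
#    pid/name, IP, and port (commented out: needs root and a live stack):
#      netstat --inet -nlp

# 2) Read the environment a process was started with.  /proc stores it
#    NUL-separated, so translate the NULs to newlines to make it readable:
tr '\000' '\n' < /proc/$PMPID/environ
```

For a postmaster started via PGDATA, piping the second command through grep '^PGDATA=' would reveal the data directory directly.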
Re: [HACKERS] Offering tuned config files
On Fri, 14 Feb 2003 14:24:23 +0200, Daniel Kalchev [EMAIL PROTECTED] wrote: The default [cache] on BSD systems is 10% of the total RAM, so on a 256MB machine this would be ~26MB or effective_cache_size = 32000. I was a bit too Linux-minded, where every piece of memory not needed for anything else can be used as cache. Thanks for the clarification. And sorry for my ignorance. BTW 26MB ~ effective_cache_size = 3200. Servus Manfred ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
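The corrected figure follows from simple arithmetic: effective_cache_size is measured in 8 kB disk pages, so the numbers in the exchange above work out as:

```shell
# Convert the BSD-style cache estimate from the example above into
# effective_cache_size pages (8 kB each).  Pure arithmetic, nothing
# PostgreSQL-specific is executed here.
ram_bytes=$((256 * 1024 * 1024))   # the 256 MB machine in the example
bsd_cache=$((ram_bytes / 10))      # BSD default: ~10% of RAM, about 26 MB
pages=$((bsd_cache / 8192))        # effective_cache_size counts 8 kB pages
echo "effective_cache_size = $pages"   # ~3276, i.e. roughly the 3200 above
```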
Re: [HACKERS] location of the configuration files
Bruce Momjian wrote: If you want ps to display the data dir, you should use -D. Remember, it is mostly important for multiple postmasters, so if you are doing that, just use -D, but don't prevent single-postmaster folks from using PGDATA. Perhaps the best compromise would be to change pg_ctl so that it uses -D explicitly when invoking postmaster. That's an easy change. Could you describe how you and other developers use PGDATA? I'm quite interested in knowing why there seems to be so much resistance to removing the potential_DataDir = getenv("PGDATA"); line from postmaster.c. -- Kevin Brown [EMAIL PROTECTED] ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [HACKERS] location of the configuration files
Bruce Momjian wrote: If you want ps to display the data dir, you should use -D. Remember, it is mostly important for multiple postmasters, so if you are doing that, just use -D, but don't prevent single-postmaster folks from using PGDATA. Perhaps another reasonable approach would be to put an #ifdef/#endif around the potential_DataDir = getenv("PGDATA"); line in postmaster.c and create a configure option to enable it. That way you guys get the behavior you want for testing but production builds could disable it if that's viewed as desirable. You'd want to make the error message that's produced when no data directory is specified depend on the same #ifdef variable, of course. Then the group would get to fight it out over whether the configure default should be enable or disable. :-) -- Kevin Brown [EMAIL PROTECTED] ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] location of the configuration files
If you are interested in reading a contrary position, you can read Bernstein's arguments for his recommended way to install services at: http://cr.yp.to/unix.html But since DJB is a class-A monomaniac, he may not be the best person to listen to. /var/qmail/control for qmail configuration files? Yeah, good one, DJB. -- Martin Coxall [EMAIL PROTECTED] ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] location of the configuration files
Generally things that live in /etc are owned and operated by the OS. Postgresql, by its definition, is a userspace program, not an OS-owned one. Partially true. The FHS specifies that the /etc top layer is for system-owned stuff, but the subdirectories off it are explicitly used for user space programs and, well, everything. (/etc/apache, /etc/postgres, /etc/tomcat3, /etc/tomcat4...) Martin Coxall ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] location of the configuration files
On Thu, 2003-02-13 at 20:28, Steve Crawford wrote: I don't see why we can't keep everyone happy and let the users choose the setup they want. To wit, make the following, probably simple, changes:
1) Have postgresql default to using /etc/postgresql.conf
(/etc/postgres/postgresql.conf, if we want to be proper FHS-bitches.)
2) Add a setting in postgresql.conf specifying the data directory
3) Change the meaning of -D to mean use this config file
4) In the absence of a specified data directory in postgresql.conf, use the location of the postgresql.conf file as the data directory
Shouldn't it in that case default to, say, /var/lib/postgres? -- Martin Coxall [EMAIL PROTECTED] ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
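As an illustration only (the file location and directory names here follow the proposal in this thread, not any released PostgreSQL), the resulting config-first layout might look like:

```
# /etc/postgres/postgresql.conf -- hypothetical default location (step 1)
data_directory = '/var/lib/postgres'   # step 2: data dir named in the config
port = 5432
```

With step 3, "postmaster -D /etc/postgres/postgresql.conf" would then mean "use this config file", and step 4 supplies the fallback when data_directory is absent.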
Re: [HACKERS] location of the configuration files
Bruce Momjian wrote: I just set PGDATA in my login and I don't have to deal with it again. Hmm...you don't use pg_ctl to start/stop/whatever the database? You invoke the postmaster directly (I can easily see that you would, just asking if you do)? -- Kevin Brown [EMAIL PROTECTED] ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] location of the configuration files
Bruce Momjian wrote: I just set PGDATA in my login and I don't have to deal with it again. Duh... I just realized a reason you guys might care about this so much. It's because you want to be able to start the postmaster from within a debugger (or profiler, or whatever), and you don't want to have to mess with command line options from there, right? Sounds like fixing pg_ctl to use -D explicitly when invoking the postmaster is the right change to make here, since that's probably how the majority of the production shops are going to be starting the database anyway. Takes care of the majority of the visibility problem and leaves PGDATA intact. Thoughts? -- Kevin Brown [EMAIL PROTECTED] ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [HACKERS] location of the configuration files
On 14 Feb 2003, Martin Coxall wrote: If you are interested in reading a contrary position, you can read Bernstein's arguments for his recommended way to install services at: http://cr.yp.to/unix.html But since DJB is a class-A monomaniac, he may not be the best person to listen to. /var/qmail/control for qmail configuration files? Yeah, good one, DJB. I'm guessing that rather than reading the above-mentioned link you chose to waste our time with this instead. Good one, MC. Vince. -- Fast, inexpensive internet service 56k and beyond! http://www.pop4.net/ http://www.meanstreamradio.com http://www.unknown-artists.com Internet radio: It's not file sharing, it's just radio. ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] location of the configuration files
On Fri, 2003-02-14 at 14:21, Vince Vielhaber wrote: On 14 Feb 2003, Martin Coxall wrote: If you are interested in reading a contrary position, you can read Bernstein's arguments for his recommended way to install services at: http://cr.yp.to/unix.html But since DJB is a class-A monomaniac, he may not be the best person to listen to. /var/qmail/control for qmail configuration files? Yeah, good one, DJB. I'm guessing that rather than reading the above-mentioned link you chose to waste our time with this instead. Good one, MC. Yeah, I've read it several times, and have often linked to it as an example of why one should be wary of DJB's software. It seems to me that since DJB doesn't follow his own advice regarding the filesystem hierarchy (see both qmail and djbdns), it'd be odd for him to expect anyone else to. *Especially* seeing as he's a bit mental. (I'm not going to take this any more. I demand cross-platform compatibility!) -- Martin Coxall [EMAIL PROTECTED] ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] location of the configuration files
Bruce Momjian wrote: OK, here is an updated proposal. I think we have decided: Moving postmaster.pid and postmaster.opts isn't worth it. We don't want per-file GUC variables, but assume it is in the same config directory as postgresql.conf. I don't see any valid reason they would want to put them somewhere different than postgresql.conf. So, we add data_dir to postgresql.conf, and add -C/PGCONFIG to postmaster. Agreed. One additional thing: when pg_ctl invokes the postmaster, it should explicitly specify -C on the postmaster command line, and if it doesn't find a data_dir in $PGCONFIG/postgresql.conf then it should explicitly specify a -D as well. Pg_ctl is going to have to be modified to take a -C argument anyway, so we may as well go all the way to do the right thing here. This way, people who start the database using the standard tools we supply will know exactly what's going on when they get a ps listing. -- Kevin Brown [EMAIL PROTECTED] ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] location of the configuration files
Bruce Momjian wrote: Give it up. As long as we have -D, we will allow PGDATA. If you don't want to use it, don't use it. Agreed. I'm not sure I see how this diminishes the argument for fixing pg_ctl so that it passes an explicit -D option to the postmaster when invoking it... -- Kevin Brown [EMAIL PROTECTED] ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] location of the configuration files
Bruce Momjian wrote: This way, people who start the database using the standard tools we supply will know exactly what's going on when they get a ps listing. No. If you want ps to display, don't use environment variables. Many don't care --- especially those with only one postmaster. You know that the code in pg_ctl doesn't send an explicit -D to the postmaster even if pg_ctl itself is invoked with a -D argument, right? The only way to make pg_ctl do that is by using the -o option. A typical vendor-supplied install is going to invoke pg_ctl to do the dirty work. That's why I'm focusing on pg_ctl. I completely understand your need for keeping PGDATA in postmaster. I don't understand why pg_ctl *shouldn't* be changed to invoke postmaster with an explicit -D option. It might be desirable for ps to not show any arguments to postmaster in some circumstances (I have no idea what those would be), but why in the world would you want that to be the *default*? Why would we want the default behavior to make things harder on administrators and not easier? -- Kevin Brown [EMAIL PROTECTED] ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
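The difference being argued here can be sketched concretely; the paths are illustrative, and the -o trick is the workaround mentioned above for making today's pg_ctl forward -D itself:

```shell
# Illustrative only; /var/lib/pgsql is a stand-in data directory.
PGDATA=/var/lib/pgsql

# Today: pg_ctl starts a bare "postmaster" and relies on the inherited
# PGDATA, so ps shows no arguments.  The -o workaround forwards -D manually:
#     pg_ctl -o "-D $PGDATA" start
# The change argued for above would have pg_ctl run this explicitly:
cmd="postmaster -D $PGDATA"
echo "$cmd"    # what an administrator would then see in the ps listing
```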
Re: [HACKERS] location of the configuration files
Kevin Brown [EMAIL PROTECTED] writes: Tom Lane wrote: The other side of that coin is, what's the good reason to remove it? There's a long way between I don't want my setup to depend on PGDATA and I don't think your setup should be allowed to depend on PGDATA. If you don't want to use it, then don't use it. Why do you need to tell me how I'm allowed to run my installation? I'm not talking about getting rid of ALL dependency on PGDATA in our entire distribution, only postmaster's. We're obviously talking past each other. You are arguing that under circumstances X, Y, or Z, depending on a PGDATA setting is a bad idea. You are then drawing the conclusion that I should not be allowed to depend on PGDATA, whether or not I care about X, Y, or Z. I am happy to design an arrangement that allows you not to depend on PGDATA if you don't want to. But I don't see why you need to break my configuration procedures in order to fix yours. As I outlined last night, it's possible to do what you want without breaking backwards compatibility for those that like PGDATA. regards, tom lane ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] location of the configuration files
On 14 Feb 2003, Martin Coxall wrote: On Fri, 2003-02-14 at 14:21, Vince Vielhaber wrote: On 14 Feb 2003, Martin Coxall wrote: If you are interested in reading a contrary position, you can read Bernstein's arguments for his recommended way to install services at: http://cr.yp.to/unix.html But since DJB is a class-A monomaniac, he may not be the best person to listen to. /var/qmail/control for qmail configuration files? Yeah, good one, DJB. I'm guessing that rather than reading the above-mentioned link you chose to waste our time with this instead. Good one, MC. Yeah, I've read it several times, and have often linked to it as an example of why one should be wary of DJB's software. It seems to me that since DJB doesn't follow his own advice regarding the filesystem hierarchy (see both qmail and djbdns), it'd be odd for him to expect anyone else to. *Especially* seeing as he's a bit mental. (I'm not going to take this any more. I demand cross-platform compatibility!) I seriously doubt your ability to judge anyone's mental stability. I can also see that you prefer cross-platform INcompatibility. Your position and mindset are now crystal clear. Vince. -- Fast, inexpensive internet service 56k and beyond! http://www.pop4.net/ http://www.meanstreamradio.com http://www.unknown-artists.com Internet radio: It's not file sharing, it's just radio. ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
Re: [HACKERS] Offering tuned config files
Manfred Koizar [EMAIL PROTECTED] writes: In postgresql.conf.sample-writeheavy you have: commit_delay = 1 Is this still needed with ganged WAL writes? Tom? I doubt that the current options for grouped commits are worth anything at the moment. Chris, do you have any evidence backing up using commit_delay with 7.3? regards, tom lane ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] location of the configuration files
Tom Lane wrote: I am happy to design an arrangement that allows you not to depend on PGDATA if you don't want to. But I don't see why you need to break my configuration procedures in order to fix yours. As I outlined last night, it's possible to do what you want without breaking backwards compatibility for those that like PGDATA. Yes, I agree. I hadn't really thought of all the possible benefits of PGDATA. Sorry. :-( Would you agree that it would be a beneficial change to have pg_ctl pass explicit arguments to postmaster? It would go a long way towards eliminating most of the situations I described. A warning in the documentation about the consequences of using PGDATA might not be a bad idea, either... -- Kevin Brown [EMAIL PROTECTED] ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] location of the configuration files
Kevin Brown [EMAIL PROTECTED] writes: I'm quite interested in knowing why there seems to be so much resistance to removing the potential_DataDir = getenv("PGDATA"); line from postmaster.c. Backwards compatibility. Also, you still haven't explained why "I don't want to use PGDATA" should translate to "no one should be allowed to use PGDATA". If you don't set PGDATA, what problem is there for you in that line being there? regards, tom lane ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Incremental backup
Bruce Momjian wrote: OK, once we have PITR, will anyone want incremental backups? None of my database references (Date's Introduction to Database Systems and Garcia-Molina's Database Systems - The Complete Book, in particular) seem to talk about PITR at all. At least, there's no index entry for it. And a google search for point in time recovery yields mostly marketing fluff. Is there a good reference for this that someone can point me to? I'm interested in exactly how it'll work, especially in terms of how logs are stored versus the main data store, effects on performance, etc. Thanks, and sorry for the newbie question. :-( -- Kevin Brown [EMAIL PROTECTED] ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] location of the configuration files
Martin Coxall [EMAIL PROTECTED] writes: Partially true. The FHS specifies that the /etc top layer is for system-own3d stuff, but the subdirectories off it are explicitly used for user space programs and, well, everything. (/etc/apache, /etc/postgres, /etc/tomcat3, /etc/tomcat4...) FHS or no FHS, I would think that the preferred arrangement would be to keep Postgres' config files in a postgres-owned subdirectory, not directly in /etc. That way you need not be root to edit them. (My idea of an editor, Emacs, always wants to write a backup file, so I dislike having to edit files that live in directories I can't write.) Here's a pretty topic for a flamewar: should it be /etc/postgres/ or /etc/postgresql/ ? regards, tom lane ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] location of the configuration files
On Thu, 2003-02-13 at 19:22, Adam Haberlach wrote: On Thu, Feb 13, 2003 at 05:59:17PM -0500, Robert Treat wrote: On Thu, 2003-02-13 at 15:08, mlw wrote: Stephan Szabo wrote: On Thu, 13 Feb 2003, mlw wrote: Personally I think a postgresql installation is much more like an apache installation, which generally contains all of the files (data and config) under /usr/local/apache. Maybe someone can dig more to see if that system is more appropriate a comparison than something like bind. I think you are making a pretty uninformed, if not just plain wrong generalization. I've run exactly one system with apache configuration files in /usr/local/apache, and even then, the data was not there. Uh... the last time I built apache from source, it stuck everything under /usr/local/apache. It uses a conf directory for the config files, and htdocs for the data files... That is its default configuration. snip stories of all the different ways people run apache You know, this is why I actually suggested looking closer at apache. By default, everything is crammed in one directory, but if you want to, you can configure it six different ways to Sunday. That seems to be a big plus, IMO. What does this mean? People will put things in different places, and there are typically very good reasons for this. This is ESPECIALLY true when one wants to have configuration files, at least the base ones, in a common place such as /etc or /usr/local/etc in order to make backup of configuration easy and clean, while leaving data somewhere else for performance or magnitude-of-partition reasons. It just makes sense to ME to have postgresql.conf reside in /etc, yet put my data in /var/data/postgresql, yet retain the option to put my data in /raid/data/postgresql at a later date, when the new hardware comes in. Is anyone arguing against this? I'm certainly not. But maybe my needs are more varied than yours. 
On my local development box, I run multiple versions of apache, compiled with different versions of php. It really helps to keep all of apache's stuff centralized, and using things like rpms actually overly complicates this. Now sure, that's a development machine, but on the phppgadmin demo server, which is essentially a production system, I run three different versions of postgresql. In fact, I need to upgrade two of those (to 7.2.4 and 7.3.2); I shudder to think about doing that if postgresql forced me to use the /etc/ directory for all of my config files. Now sure, this probably isn't typical use, but I would say that when it comes time to upgrade major versions, unless you're running an operation where you can have large amounts of downtime, postgresql needs to have the ability to have multiple versions installed that don't conflict with each other, and it needs to do this easily. The upgrade process is hard enough already. snip However, this seems, to me, to be a very small addition that has some real-world (and yes, we need to start paying attention to the real world) advantages. And finally, don't go telling me that I'm wrong to put my data and config files where I am. You can offer advice, but I'm probably going to ignore it because I like where they are and don't need to explain why. Have I wronged you in some former life? I've very little concern for where you put your data files, and have no idea why you'd think I'd criticize your setup. Robert Treat ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
Re: [HACKERS] Brain dump: btree collapsing
tom lane wrote: How does that help? The left-moving indexscan still has no way to recover. It can't go back to the page it was on before and try to determine which entries you added there, because it has no reliable reference point to do so. The entry it previously returned might not be there anymore, and in a non-unique index key comparisons won't help. And even more to the point, how would it know you've changed the left page? It has no idea what the previous page version on the left page was, because it was never there before. Another way this could be handled is by not merging onto one of the existing pages but to a brand new page, a kind of special case shadow index page. That way the sibling pointers, and leaf page pointer in the parent could all be updated atomically to point to the new page. In-process index scans would still reference the merged pages which would not be deleted but marked as dead using a mechanism like you proposed for marking empty pages dead with the next-transaction-ID counter. Merging could be done after a VACUUM pass that performs deletion of empty pages in order to provide a pool of empty pages to use for the merge. This would keep the index from temporarily growing during the merge process. A similar approach could be used to reorder the index pages in the background. An index that was reordered to fairly closely reflect the tree as a breadth first traversal would provide much faster index scans if the file is not heavily fragmented. - Curtis ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] location of the configuration files
Bruce Momjian [EMAIL PROTECTED] writes: My point is that folks with multiple postmasters may not want to use PGDATA, but for folks who have single postmasters, it makes things easier and less error-prone. Actually, for multiple postmasters too. I have little shell-environment config files that switch my entire world view between different source trees and installation trees; for example, this one sets me up to mess with the 7.2 branch:

STDPATH=${STDPATH:-$PATH}
STDMANPATH=${STDMANPATH:-$MANPATH}
PGSRCROOT=$HOME/REL7_2/pgsql
PGINSTROOT=$HOME/version72
PATH=$PGINSTROOT/bin:/opt/perl5.6.1/bin:$STDPATH
MANPATH=$PGINSTROOT/man:$STDMANPATH
PGLIB=$PGINSTROOT/lib
PGDATA=$PGINSTROOT/data
PMOPTIONS="-p 5472 -i -F"
PMLOGFILE=server72.log
export PGSRCROOT PGINSTROOT PATH MANPATH PGLIB PGDATA STDPATH STDMANPATH
export PMOPTIONS PMLOGFILE

After sourcing one of these, I can use pg_ctl as well as a half dozen other convenient little scripts that do things like remake and reinstall the backend:

#!/bin/sh
pg_ctl -w stop
cd $PGSRCROOT/src/backend
make install-bin
startpg

or this one that fires up gdb on a crashed backend:

#!/bin/sh
# Usage: gdbcore
cd $HOME
CORES=`find $PGDATA/base -name core -type f -print`
if [ x"$CORES" != x ]
then
	ls -l $CORES
fi
if [ `echo $CORES | wc -w` -eq 1 ]
then
	exec gdb $PGINSTROOT/bin/postgres $CORES
else
	exec gdb $PGINSTROOT/bin/postgres
fi

This is vastly less error-prone than keeping track of the various related elements in my head. Now, it's certainly true that I could still make this work if I had to explicitly say -D $PGDATA to the postmaster. But that would clutter my ps display. I am happy with -p as the ps indicator of which postmaster is which; I don't want more stuff in there. regards, tom lane ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] Brain dump: btree collapsing
On Wed, 12 Feb 2003 17:42:44 -0500, Tom Lane [EMAIL PROTECTED] wrote: Instead of an actively maintained freelist on disk as per Alvaro Herrera's patch, I plan to use the FSM to remember where recyclable pages are, much as we do for tables. Given that we have a mostly empty metapage per index, and the metapage is in memory most of the time, using it for the freelist looks almost like a free lunch. I've picked up Alvaro's patch and played around a bit. Reviewing my changes it turns out that most of them deal with the freelist. If you are interested I can send my version of the patch to you or to the list. Servus Manfred ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [pgsql-advocacy] [HACKERS] Changing the default configuration
On Fri, 2003-02-14 at 03:00, Daniel Kalchev wrote: Jason Hihn said: Pardon my ignorance, but there's no way to auto-tune? Ship it with a thread that gathers statistics and periodically re-tunes the database parameters. Of course, be able to turn it off. People that actually take the time to run tune manually will turn it off as to not have the overhead or interruption. Those that don't care about pg_tune shouldn't care about having a thread around retuning. Those that will care will tune manually. This is related to my proposition, but trouble is, there is not such thing as 'well tuned database' that will suit all queries. You can tune the database to the hardware for example (still remember that old argument on random access and fast disks). It seems the system could 'self-tune' itself on minor choices. I believe it does this today for a number of things already. More significant changes require the DBA consent and choice - but they need to be well informed of the current usage statistics when making the choice. I agree. Given that we don't have solid explanations on telling people how to tune the different parameters, nor do we have enough mechanisms for actually giving people the information they need to determine the changes they need, a complete auto-tune seems premature. Robert Treat ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
Re: [HACKERS] location of the configuration files
On Friday 14 Feb 2003 9:05 pm, you wrote: Martin Coxall [EMAIL PROTECTED] writes: Here's a pretty topic for a flamewar: should it be /etc/postgres/ or /etc/postgresql/ ? I vote for /etc/pgsql. Keeping in line with the unix philosophy of cryptic and short names. Who wants descriptive names anyway..:-) Seriously, the traffic over the last three days had a very high noise ratio. Especially the whole discussion of PGDATA stuff fails to register as significant IMO. Right now, I can do things the way I want to, and I guess it is pretty much the same with everyone else. Is this the last topic left to improve? Keep it simple and on topic, guys. This is hackers. Keep it low volume; otherwise, two years down the line, the archives will be unsearchable.. Shridhar ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] location of the configuration files
On Thu, 13 Feb 2003, mlw wrote: scott.marlowe wrote: These are not issues at all. You could put the configuration file anywhere, just as you can for any UNIX service. postmaster --config=/home/myhome/mydb.conf I deal with a number of PG databases on a number of sites, and it is a real pain in the ass to get to a PG box and hunt around for the data directory so as to be able to administer the system. What's really annoying is when you have to find the data directory when someone else set up the system. Really? I would think it's easier to do this: su - pgsuper cd $PGDATA pwd Than to try to figure out what someone entered when they ran ./configure --config=... Why do you think PGDATA would be set for root? Did you not notice the su - pgsuper line above? You know, the one where you become the account that runs that instance of the database. Again, I ask you, isn't that easier than trying to find out what someone typed when they typed ./configure --config=? Configuring postgresql via a configuration file which specifies all the data, i.e. data directory, name of other configuration files, etc. is the right way to do it. Even if you have reasons against it, even if you think it is a bad idea, a bad standard is almost always a better solution than an arcane work of perfection. Wrong, I strongly disagree with this sentiment. Conformity to standards for simple conformity's sake is as wrong as sticking to the old way because it's what we're all comfy with. It isn't conformity for conformity's sake. It is following an established practice, like driving on the same side of the road or stopping at red lights. But this isn't the same thing at all. Apache, when built from a tar ball, goes into /usr/local/apache/ and ALL its configuration files are there. When installed as a package, my OS manufacturer decides where that goes. Those are the two standard ways of doing things. I like that postgresql installs into the /usr/local/pgsql directory from a tar ball. 
I like the fact that it uses $PGDATA to tell it where the cluster is, so that all my scripts, like pg_ctl, just know where it is without a -D switch each time. ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [HACKERS] Brain dump: btree collapsing
Manfred Koizar [EMAIL PROTECTED] writes: Given that we have a mostly empty metapage per index, and the metapage is in memory most of the time, using it for the freelist looks almost like a free lunch. No, because of locking. Every time you write-lock the metapage to add or remove freelist entries, you are denying all other processes the ability to start an index scan. Check the btree literature --- exclusive locks near the root of the tree are death for concurrent performance, and must be avoided as much as possible. If I were planning to use a freelist I would keep it in a different page so as not to need to lock the metapage for freelist manipulations. But I don't see the value of having one at all. It just adds that much more disk traffic (and WAL traffic) for each page split or merge. There are also atomicity concerns --- is addition/removal of a freelist entry an atomic part of the page merge or split operation, or is it a separate atomic operation with its own WAL record? If the former, you have deadlocking problems; if the latter, post-crash-consistency problems. regards, tom lane ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
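Tom's concurrency argument can be reduced to a tiny lock-compatibility model (an illustrative sketch with invented names, not the actual buffer-locking code): because every scan must read the metapage to find the root, any exclusive lock taken there for freelist maintenance blocks all scans, whereas a freelist kept on its own page does not.

```python
from enum import Enum

class LockMode(Enum):
    SHARED = 1
    EXCLUSIVE = 2

def conflicts(held, requested):
    """Standard lock-compatibility rule: only shared/shared coexist."""
    return LockMode.EXCLUSIVE in (held, requested)

def blocked(op1, op2):
    """Two operations block each other only if they touch the same
    page in incompatible modes."""
    return op1[0] == op2[0] and conflicts(op1[1], op2[1])

# Every indexscan starts by reading the metapage to find the root.
scan_needs = ("metapage", LockMode.SHARED)

# Design A: the freelist lives in the metapage (Manfred's free lunch).
freelist_op_a = ("metapage", LockMode.EXCLUSIVE)

# Design B: the freelist lives on a page of its own (Tom's alternative).
freelist_op_b = ("freelist_page", LockMode.EXCLUSIVE)

assert blocked(scan_needs, freelist_op_a)       # A: every scan stalls
assert not blocked(scan_needs, freelist_op_b)   # B: scans proceed
```

The model ignores Tom's second objection (WAL atomicity of freelist updates), which holds regardless of which page the freelist lives on.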
Re: [HACKERS] Offering tuned config files
On Fri, 2003-02-14 at 07:41, Manfred Koizar wrote: On Fri, 14 Feb 2003 14:24:23 +0200, Daniel Kalchev [EMAIL PROTECTED] wrote: The default [cache] on BSD systems is 10% of the total RAM, so on a 256MB machine this would be ~26MB or effective_cache_size = 32000. I was a bit too Linux-minded, where every piece of memory not needed for anything else can be used as cache. Thanks for the clarification. And sorry for my ignorance. I think you're getting the two confused. I'm not sure about linux, but on BSD (FreeBSD) the cache and buffer are mostly unrelated. Cache: number of pages used for VM-level disk caching Buf: number of pages used for BIO-level disk caching -- Rod Taylor [EMAIL PROTECTED] PGP Key: http://www.rbt.ca/rbtpub.asc signature.asc Description: This is a digitally signed message part
Re: [HACKERS] location of the configuration files
On Fri, 14 Feb 2003, Curt Sampson wrote: On Thu, 13 Feb 2003, scott.marlowe wrote: If I do a .tar.gz install of apache, I get /usr/local/apache/conf, which is not the standard way you're listing. I'm going to stay out of this argument from now on, but this struck a sore point. /usr is designed to be a filesystem that can be shared. Is the stuff in /usr/local/apache/conf really supposed to be shared amongst all machines of that architecture on your site that run apache? Interesting. I've always viewed /usr EXCEPT for local this way. In fact, on most of my boxes I create a separate mount point for /usr/local so it's easier to back up and maintain, and it doesn't fill up the /usr directory. Asking for everything in a directory with the name local in it to be shared is kind of counterintuitive to me. ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
[HACKERS] postgresql 7.3 versus 7.2
Hi! Is there any way to adjust dumps from postgresql 7.3 for use by version 7.2 and vice versa? Or is there a way to transport a full database via csv or in any other form to 7.2 and vice versa? We are running a couple of servers/mirrors at various locations in the world, handling data based on postgresql, and now we have trouble transporting data from and to sites with different versions. (E.g. http://genealogy.math.ndsu.nodak.edu/ http://genealogy.mathematik.uni-bielefeld.de/ ) Some have upgraded and others cannot immediately follow, since an upgrade from 7.2 to 7.3 seems to require an upgrade of php4 and maybe even of apache, which takes time etc. So it would be helpful to have an easy tool to convert between data for both versions. Any help or recommendation is very much appreciated! Best regards, Ulf Rehmann ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
[HACKERS] Problems with insert rule called from plpython
Hello, I've noticed one problem after upgrading from postgresql-7.3.1 to 7.3.2. The example of this problem looks like this: create table test (a int, b text); create view testview as select * from test; create or replace rule testview_ins as on insert to testview do instead ( insert into test values (1, 'a'); insert into test values (2, 'b'); ); create or replace function testfun() returns text as ' plpy.execute("insert into testview values (3, ''c'')") return "test" ' language plpython; Now calling select testfun() shows this fatal error: FATAL: SPI: improper call to spi_dest_setup server closed the connection unexpectedly This probably means the server terminated abnormally before or while processing the request. The connection to the server was lost. Attempting reset: Succeeded. This example worked in postgresql-7.3.1. In postgresql-7.3.2, the instead rule works only with a single command: create or replace rule testview_ins as on insert to testview do instead ( insert into test values (1, 'a'); ); Can you explain this problem to me? Is it possible to reply also to my email address, since I am not a member of any postgresql mailing list. Thanks. -- Pavel Hanak ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
[HACKERS] Views and unique indicies optimisation
Here's an example: create view master_detail as select * from master left join detail on (master.detail_key=detail.key); I noticed that PostgreSQL performs the join even when I execute select some_master_fields from master_detail; but there exists a unique index on detail.key, so the above query gives the same result as select some_master_fields from master; but with much loss in performance, of course. This problem emerged when I replaced a redundant table in a poorly designed database with two tables, a view and a set of rules. I thought that I was very clever, because I didn't have to replace any code in applications using this database, but it seems that there's always a catch... May I have hope that such an optimisation will be included in the TODO list in the near future? -- [EMAIL PROTECTED] http://rainbow.mimuw.edu.pl/~robson ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
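The equivalence the poster relies on is easy to demonstrate, here with Python's bundled sqlite3 standing in for any SQL engine: when detail.key is unique, a left join can neither drop nor duplicate master rows, so a query that reads only master columns could in principle skip the join entirely.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
    CREATE TABLE master (id INTEGER, detail_key INTEGER);
    CREATE TABLE detail (key INTEGER UNIQUE, info TEXT);
    INSERT INTO master VALUES (1, 10);
    INSERT INTO master VALUES (2, 20);
    INSERT INTO master VALUES (3, NULL);
    INSERT INTO detail VALUES (10, 'x');
    INSERT INTO detail VALUES (20, 'y');
""")

# The view's underlying query, selecting master columns only.
with_join = cur.execute("""
    SELECT m.id FROM master m
    LEFT JOIN detail d ON m.detail_key = d.key
    ORDER BY m.id
""").fetchall()

without_join = cur.execute("SELECT id FROM master ORDER BY id").fetchall()

# Because detail.key is unique, the left join can neither drop nor
# duplicate master rows: the two queries are equivalent row for row.
assert with_join == without_join == [(1,), (2,), (3,)]
```

Note the optimization is only safe for an outer join (or with a matching NOT NULL foreign key): with a plain inner join, master rows whose detail_key has no match — or is NULL, like row 3 above — would disappear, so the join cannot be elided.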
Re: [HACKERS] Changing the default configuration (was Re: [pgsql-advocacy]
Nutshell: Easy to install but is horribly slow. or Took a couple of minutes to configure and it rocks! Since when is it easy to install on win32? The easiest way I know of is through Cygwin, then you have to worry about installing the IPC service (and getting the right version too!) I've installed versions 6.1 to 7.1, but I almost gave up on the windows install. At least in 6.x you had a very comprehensive installation guide with a TOC. Versus the competition, which are you going to choose if you're a wanna-be DBA? The one with all the hoops to jump through, or the one that comes with a setup.exe? Now I actually am in support of making it more aggressive, but it should wait until we too have a setup.exe for the native windows port. (Changing it on *n*x platforms is of little benefit because most benchmarks seem to run it on w32 anyway :-( ) Just my $.02. I reserve the right to be wrong. -J ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] Incremental backup
A long time ago, in a galaxy far, far away, [EMAIL PROTECTED] (Kevin Brown) wrote: Bruce Momjian wrote: OK, once we have PITR, will anyone want incremental backups? None of my database references (Date's Introduction to Database Systems and Garcia-Molina's Database Systems - The Complete Book, in particular) seem to talk about PITR at all. At least, there's no index entry for it. And a google search for point in time recovery yields mostly marketing fluff. Well, from an academic DBMS standpoint, it isn't terribly interesting, since it involves assumptions of messy imperfection that academics prefer to avoid. And that's not intended to insult the academics; it is often reasonable to leave that out of scope much as an academic OS researcher might prefer to try to avoid putting attention on things like binary linkers, text file editors, and SCM systems like CVS, which, while terribly important from a practical standpoint, don't make for interesting OS research. Is there a good reference for this that someone can point me to? I'm interested in exactly how it'll work, especially in terms of how logs are stored versus the main data store, effects on performance, etc. Thanks, and sorry for the newbie question. :-( Unfortunately, the best sources I can think of are in the O-Word literature, and the /practical/ answers require digging into really messy bits of the documentation. What it amounts to is that anyone that isn't a near-O*-guru would be strongly advised not to engage in PITR activity. It doesn't surprise me overly that the documentation is poor: those that can't figure it out despite the challenges almost surely shouldn't be using the functionality... What PITR generally consists of is the notion that you want to recover to the state at a particular moment in time. 
In O*-nomenclature, this means that you recover as at some earlier moment for which you have a good backup, and then re-apply changes, which, in their terms, are kept in archive logs, which are somewhat analogous to WAL files. -- If this was helpful, http://svcs.affero.net/rm.php?r=cbbrowne rate me http://www3.sympatico.ca/cbbrowne/x.html We blew it -- too big, too slow... - Bill Gates talking about NT, as noted by Steven McGeady of Intel during a meeting with Gates ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [HACKERS] location of the configuration files
Bruce Momjian wrote: The big question is whether PGDATA is still our driving config variable, and PGCONFIG/-C is just an additional option, or whether we are moving in a direction where PGCONFIG/-C is going to be the driving value, and data_dir is going to be read as part of that. I'm actually leaning towards PGCONFIG + PGDATA. Yeah, it may be a surprise given my previous arguments, but I can't help but think that the advantages you get with PGDATA will also exist for PGCONFIG. My previous arguments for removing PGDATA from postmaster can be dealt with by fixing pg_ctl to use explicit command line directives when invoking postmaster -- no changes to postmaster needed. PGCONFIG would be no different in that regard. Sorry if I seem a bit gung-ho on the administrator point of view, but as a system administrator myself I understand and feel their pain. :-) -- Kevin Brown [EMAIL PROTECTED] ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
Re: [HACKERS] location of the configuration files
On Thu, Feb 13, 2003 at 11:53:26 -0500, mlw [EMAIL PROTECTED] wrote: Where, specifically, are his arguments against a configuration file methodology? I don't think he is arguing against a configuration methodology, but rather against the methodology being used in Unix distributions. In particular he doesn't like the Linux File Standard because it puts the same software in different places depending on whether the vendor or the user installed it. ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] location of the configuration files
scott.marlowe [EMAIL PROTECTED] writes: But this isn't the same thing at all. Apache, when built from a tar ball, goes into /usr/local/apache/ and ALL its configuration files are there. Two comments: 1) Even in that case the config files go into /usr/local/apache/conf and the other kinds of files, like data, logs and cache files, all go in other subdirectories. 2) What you describe is only true if you configure with the default --with-layout=Apache. The naming should perhaps be a clue that this isn't a conventional layout. If you configure with --with-layout=GNU you get the conventional Unix layout in /usr/local; if you use --with-layout=RedHat you get the conventional layout in /usr directly, which is mainly useful for distribution packagers. Putting stuff in a subdirectory like /usr/local/apache or /usr/local/pgsql is unfortunately a widespread practice. It does have some advantages over the conventional layout in /usr/local/{etc,bin,...} directly. But the major disadvantage is that users can't run programs without adding dozens of entries to their paths, can't compile programs without dozens of -L and -I lines, etc. The GNU autoconf script makes it pretty easy to configure packages to work either way though, and /usr/local is the purview of the local admin. As long as it's easy to configure postgres to install properly with --prefix=/usr/local it won't be any more of an offender than lots of other packages like apache, kde, etc. Though I'll mention, please make it $prefix/etc not $prefix/conf. No need to be gratuitously non-standard on an arbitrary name, and no need to pollute /usr/local with multiple redundant directories. -- greg ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [HACKERS] Do we always need the socket file?
Michael Brusser [EMAIL PROTECTED] writes: We're trying to avoid creating any unnecessary files, especially outside of the product installation tree. Look at this as a policy. Experience shows that sooner or later some of your customers ask you: what is this /tmp/.s.PGSQL.xxx file? What do I need it for? Also some admins are known to periodically clean out /tmp, /var/run, etc. Well, if you clean out files programs are using, you should expect those programs to break. Other programs that create sockets in /tmp include screen and X for example. Unix domain sockets are sometimes more efficient and sometimes more secure than TCP/IP connections. So preferring TCP/IP just to avoid /tmp pollution might be a bit of a loss for aesthetic value. -- greg ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
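The file those customers ask about is nothing exotic: it is simply the server's listening socket, and any program can create one. A minimal Python demonstration, using a throwaway path rather than PostgreSQL's real /tmp/.s.PGSQL.<port>:

```python
import os, socket, stat, tempfile

# A unix-domain socket appears in the filesystem as a special file,
# just like PostgreSQL's /tmp/.s.PGSQL.<port> (the path below is a
# throwaway stand-in).  Removing it under a live server breaks new
# connections -- Greg's point about overeager /tmp cleaners.
path = os.path.join(tempfile.mkdtemp(), ".s.DEMO.5432")
srv = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
srv.bind(path)           # binding is what creates the file
srv.listen(1)
assert stat.S_ISSOCK(os.stat(path).st_mode)   # it really is a socket
srv.close()
os.unlink(path)          # only the owning program should remove it
```

(Requires a platform with AF_UNIX support, i.e. any Unix.)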
Re: [HACKERS] location of the configuration files
On Fri, Feb 14, 2003 at 02:58:49 -0800, Kevin Brown [EMAIL PROTECTED] wrote: Lest you think that this is an unlikely scenario, keep in mind that most startup scripts, including pg_ctl, currently start the postmaster without arguments and rely on PGDATA, so a shop that hasn't already been bitten by this *will* be. Right now shops that wish to avoid the trap I described have to go to *extra* lengths: they have to make exactly the same kinds of changes to the scripts that I'm talking about us making (putting an explicit '-D $PGDATA' where none now exists) or they have to resort to tricks like renaming the postmaster executable and creating a shell script in its place that will invoke the (renamed) postmaster with '-D $PGDATA'. On at least some systems ps will dump process' environment and could be easily used to check PGDATA. ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] Incremental backup
Christopher Browne wrote: What PITR generally consists of is the notion that you want to recover to the state at a particular moment in time. In O*-nomenclature, this means that you recover as at some earlier moment for which you have a good backup, and then re-apply changes, which in their terms, are kept in archive logs, which are somewhat analogous to WAL files. Yeah, that's pretty much what I figured. Oracle has something they call rollback segments which I assume are separate bits of data that have enough information to reverse changes that were made to the database during a transaction, and I figured PITR would (or could) apply particular saved rollback segments to the current state in order to roll back a table, tablespace, or database to the state it was in at a particular point in time. As it is, it sounds like PITR is a bit less refined than I expected. So the relevant question is: how is *our* PITR going to work? In particular, how is it going to interact with our WAL files and the table store? If I'm not mistaken, right now (well, as of 7.2 anyway) we round-robin through a fixed set of WAL files. For PITR, I assume we'd need an archivelog function that would copy the WAL files as they're checkpointed to some other location (with destination names that reflect their order in time), just for starters. It'd be *awfully* nice if you could issue a command to roll a table (or, perhaps, a tablespace, if you've got a bunch of foreign keys and such) back to a particular point in time, from the command line, with no significant advance preparation (so long as the required files are still around, and if they're not then abort the operation with the appropriate error message). But it doesn't sound like that's what we're talking about when we talk about PITR... I wouldn't expect the O* docs to be particularly revealing about how the database manages PITR at the file level, but if it does, would you happen to know where so I can look at it? 
What I've seen so far is very basic and not very revealing at all... -- Kevin Brown [EMAIL PROTECTED] ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
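The roll-forward mechanics Kevin describes — archive WAL as it fills, then restore a base backup and replay up to a target time — can be modeled in a few lines. This is a toy model with invented names, not a design proposal:

```python
from bisect import bisect_right

# A "WAL record" here is just (timestamp, key, value); the "archive"
# is the ordered list of records copied aside as WAL segments fill.
def recover_to(base_backup, archive, target_time):
    """Restore the base backup, then re-apply every archived change
    stamped at or before target_time -- point-in-time recovery in
    miniature."""
    db = dict(base_backup)                          # restore the backup
    times = [t for t, _, _ in archive]
    for _, key, value in archive[:bisect_right(times, target_time)]:
        db[key] = value                             # roll forward
    return db

archive = [(10, "a", 2), (20, "b", 5), (30, "a", 3)]
print(recover_to({"a": 1}, archive, 25))   # state as of t=25
```

Note this is pure roll-forward from a backup, with no undo: it recovers a state between the backup and the target time, which is exactly why it cannot, by itself, roll a single live table backward the way Kevin would like.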
Re: [HACKERS] location of the configuration files
Bruce Momjian wrote: I see your point --- pg_ctl does a PGDATA trick when passed -D: -D) shift # pass environment into new postmaster PGDATA=$1 export PGDATA It should pass -D just like it was given. Yes, exactly. Now, the more interesting question in my mind is: if pg_ctl isn't passed -D but inherits PGDATA, should it nonetheless pass -D explicitly to the postmaster? We can make it do that, and it would have the benefit of making transparent what would otherwise be opaque. I'm inclined to answer yes to that question, but only because someone who *really* doesn't want the postmaster to show up with a -D argument in ps can start the postmaster directly without using pg_ctl at all. Tom made a good argument for sometimes wanting to keep the ps output clean, but it's not clear to me that it should necessarily apply to pg_ctl. But you guys might have a different perspective on that. :-) -- Kevin Brown [EMAIL PROTECTED] ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
[HACKERS] help me!!
Hello, My name is rong xie. I am a student at TU-Munich. I have a question about PostgreSQL and Linux. e.g. for IBM DB2: I can write a test.sql file. --test.sql connect to database1; set schema xie; select * from table1; insert into table1 values('rong','xie',22); select * from table1; terminate; Then I can execute the file "test.sql" in a terminal: db2 -tvf test.sql In mysql I can execute it so: mysql < test.sql How can I execute it in PostgreSQL? Is there a similar command in linux for PostgreSQL? Thank you very much! Best wishes! rong xie
Re: [HACKERS] location of the configuration files
On Friday 14 February 2003 6:07 am, Martin Coxall wrote: On Thu, 2003-02-13 at 20:28, Steve Crawford wrote: I don't see why we can't keep everyone happy and let the users choose the setup they want. To wit, make the following, probably simple, changes: 1) Have postgresql default to using /etc/postgresql.conf /etc/postgres/postgresql.conf, if we want to be proper FHS-bitches. 2) Add a setting in postgresql.conf specifying the data directory 3) Change the meaning of -D to mean use this config file 4) In the absence of a specified data directory in postgresql.conf, use the location of the postgresql.conf file as the data directory Shouldn't it in that case default to, say /var/lib/postgres? Idea 4 was just a way to preserve current behaviour for those who desire. Moving postgresql.conf requires adding the data directory info into postgresql.conf or specifying it in some other way. If, in the absence of any specification in postgresql.conf, postgres just looks in the same directory as postgresql.conf then it will be almost identical to the current setup. Cheers, Steve ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
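Steve's four-point resolution order can be captured precisely. This is a sketch of the proposal with hypothetical names, not actual postmaster code:

```python
import os

def resolve_dirs(cli_config=None,
                 default_config="/etc/postgres/postgresql.conf",
                 settings=None):
    """Toy resolution of the proposal: -D (cli_config) names the config
    file, defaulting to /etc/postgres/postgresql.conf; the data
    directory comes from a data_directory setting in that file, else
    falls back to the config file's own directory -- which preserves
    today's everything-in-$PGDATA layout unchanged."""
    config = cli_config or default_config
    data_dir = ((settings or {}).get("data_directory")
                or os.path.dirname(config))
    return config, data_dir

# Old-style layout: the config file lives in the data directory, so
# nothing else needs to be specified.
assert resolve_dirs("/pg/data/postgresql.conf") == (
    "/pg/data/postgresql.conf", "/pg/data")
```

The fallback in the last branch is what makes point 4 work: a setup that keeps postgresql.conf in $PGDATA behaves exactly as it does today.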
Re: [HACKERS] Brain dump: btree collapsing
Tom Lane wrote: Hmmm ... that might be made to work, but it would complicate inserts. By the time an insert navigates to the page it should insert on, it might find the page is dead, and then it would have no easy way to get to the replacement page (unless you want to dedicate another link field in every index page for that one use). I suppose it could recover by starting the search over again. Inserts could reread just the parent page if they encountered a dead leaf, since the parent would have been correctly updated. Another problem is that the two dead pages are now outside of the btree structure and so their left and right links won't get updated in the face of subsequent splits and merges of the nearby pages. That seems like a showstopper that just defers the problem. A split of the left sibling would still screw up a scan that was using the old left leaf page and wanted to move left. Oh, well, the idea seemed worth exploring. - Curtis ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
[HACKERS] stats_command_string default?
One of the functions of the DBA is to monitor what people are doing to the database. My experience is that ps is often sorely lacking in this regard: its output is somewhat limited, from what I've seen, and in any case the DBA's domain is the database itself: that's the environment he's going to be most proficient in. The ability to select the list of current connections from pg_stat_activity is really handy for monitoring the database, but the default configuration disables stats_command_string -- so the current_query winds up being blank by default. That's not exactly the most useful configuration for the DBA. Would it make more sense to enable stats_command_string by default? It could be a problem if doing so would have a significant impact on performance, but that's the only reason I can think of for not doing it. Are there others? It would also be handy if users could see their own queries while the rest remain blank. That would require changing pg_stat_get_backend_activity() so that it returns a value if the user is the superuser or if the user asking for the answer is the same as the user who owns the backend entry being looked up. Are there any pitfalls to implementing that? -- Kevin Brown [EMAIL PROTECTED] ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
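The visibility rule Kevin proposes is simple to pin down. This is a sketch of the policy only, not of pg_stat_get_backend_activity() itself:

```python
def visible_query(viewer, viewer_is_superuser, backend_owner, query):
    """Return a backend's current query if the viewer may see it:
    superusers see everything, ordinary users see only their own
    backends; everyone else gets a blank current_query."""
    if viewer_is_superuser or viewer == backend_owner:
        return query
    return ""

assert visible_query("alice", False, "alice", "SELECT 1") == "SELECT 1"
assert visible_query("bob", False, "alice", "SELECT 1") == ""
assert visible_query("admin", True, "alice", "SELECT 1") == "SELECT 1"
```

Under this rule pg_stat_activity stays useful to ordinary users for self-diagnosis without leaking other sessions' SQL, and the superuser's monitoring view is unchanged.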
Re: [HACKERS] stats_command_string default?
Kevin Brown [EMAIL PROTECTED] writes: Would it make more sense to enable stats_command_string by default? I'd vote against it. If we turn it on by default, people are paying for a feature they may not even know exists. Once they find out about it and decide they want it, they can turn it on easily enough. If you can show that the overhead is unmeasurable, that'd indicate that this argument is bogus; but I suspect it's not negligible, at least on simple queries. It would also be handy if users could see their own queries while the rest remain blank. That would require changing pg_stat_get_backend_activity() so that it returns a value if the user is the superuser or if the user asking for the answer is the same as the user who owns the backend entry being looked up. Are there any pitfalls to implementing that? Seems reasonable offhand ... regards, tom lane ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] client_encoding directive is ignored in postgresql.conf
Tom Lane wrote: Tatsuo Ishii [EMAIL PROTECTED] writes: + /* Flag to we need to initialize client encoding info */ + static bool need_to_init_client_encoding = -1; Surely that should be int, not bool. ! if (!PQsendQuery(conn, begin; select pg_client_encoding(); commit)) Doesn't this break compatibility with pre-7.2 databases? AFAICT that function was introduced in 7.2. I haven't seen this patch applied. If this is the best way to fix the bug, we may as well break libpq for pre-7.2 clients. -- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] plpython trigger code is wrong (Re: [GENERAL] Potential bug -- script that drops postgres server)
* Tom Lane ([EMAIL PROTECTED]) [030214 19:35]: Lance Thomas [EMAIL PROTECTED] writes: Below is something that may be of interest -- a short, 7-statement script that seems to drop my postgres server. It appears that the plpython trigger implementation assumes that any given procedure will be used as a trigger for only one relation. The reason it crashes is it's trying to use the rowtype info of the relation it was first compiled for with the other relation. Probably the easiest fix is to include the relation OID as part of the Python name of a trigger procedure, so that a separate copy is compiled for each relation the procedure is used with. Interesting idea. I had been taking the approach of applying the cache to just the python compilation, and not the rowtype info. This has a substantial performance penalty, which I'd been addressing by eliminating some unneeded parameter processing that doesn't apply in the trigger case, and considering a separate cache for each rowtype. Any plpython users want to step forward and fix this? I have other things on my plate ... I'm looking at the bug right now. Patch in a day or so. -Brad ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Brain dump: btree collapsing
Curtis Faith [EMAIL PROTECTED] writes: Stored in the index? And how will you do that portably? Sorry for the lack of rigorous language. I meant that there would be one mutex per index stored in the header or internal data structures associated with each index somewhere. Probably in the same structure the root node reference for each btree is stored. Hm. A single lock that must be grabbed for operations anywhere in the index is a concurrency bottleneck. But maybe we could avoid that. regards, tom lane ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] location of the configuration files
I don't want to over-engineer this. Propagating -D into postmaster makes sense, but grabbing PGDATA doesn't to me. --- Kevin Brown wrote: Bruce Momjian wrote: I see your point --- pg_ctl does a PGDATA trick when passed -D:

    -D) shift
        # pass environment into new postmaster
        PGDATA=$1
        export PGDATA

It should pass -D just like it was given. Yes, exactly. Now, the more interesting question in my mind is: if pg_ctl isn't passed -D but inherits PGDATA, should it nonetheless pass -D explicitly to the postmaster? We can make it do that, and it would have the benefit of making transparent what would otherwise be opaque. I'm inclined to answer yes to that question, but only because someone who *really* doesn't want the postmaster to show up with a -D argument in ps can start the postmaster directly without using pg_ctl at all. Tom made a good argument for sometimes wanting to keep the ps output clean, but it's not clear to me that it should necessarily apply to pg_ctl. But you guys might have a different perspective on that. :-) -- Kevin Brown [EMAIL PROTECTED] ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster -- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
Re: [HACKERS] location of the configuration files
On Fri, 14 Feb 2003, scott.marlowe wrote: Asking for everything in a directory with the name local in it to be shared is kind of counter intuitive to me. Not really. If you install a particular program that doesn't come with the OS on one machine on your site, why would you not want to install it separately on all of the others? Typically, I want my favourite non-OS utilities on all machines, not just one. (Even if I don't use them on all machines.) Thus /usr/local is for site-local stuff. cjs -- Curt Sampson [EMAIL PROTECTED] +81 90 7737 2974 http://www.netbsd.org Don't you know, in this new Dark Age, we're all light. --XTC ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Berkeley and CMU classes adopt/extend PostgreSQL
Any chance of giving them all separate TODO items? That way, we would get more items completed; greedy request, I know. ;-) --- Anastassia Ailamaki wrote: Hi everyone, with some frequency. We have the usual 180 students we get every semester (yep: 180!), but this year we've instituted 2 changes: We're looking at 100 students taking the class here every year. Double cool. I'm just down the road, if Natassa needs a visiting lecturer. Tom - that's really super-cool! Tom, let's take it offline to schedule a visit. We will be delighted to have you lecture. Yes. As of CVS tip, we have hash-based grouping but it doesn't spill to disk. Want to ask them to start from CVS tip and fix that little detail? Or fix the various other loose ends that have been mentioned lately? (make it work with DISTINCT, improve the estimation logic, some other things I'm forgetting) As Joe said, this is what we are doing. We intend to use your todo-list to design projects for future semesters... so all such suggestions are greatly appreciated. Natassa ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly -- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
Re: [HACKERS] Brain dump: btree collapsing
I previously wrote: 5) A mutex/spinlock that was stored in the index could be acquired by the scan code like this:

    buf = ReadBuffer(rel, blkno);          /* pin next page */
    SpinLockAcquire( indexSpecificMutex ); /* lock the index reorg mutex */
    LockBuffer(buf, BUFFER_LOCK_UNLOCK);   /* release lock on current page */
    LockBuffer(buf, BT_READ);              /* lock next page */
    SpinLockRelease( indexSpecificMutex ); /* unlock the index reorg mutex */
    ReleaseBuffer(buf);                    /* now release pin on previously current page */

6) The same index-specific mutex/spinlock could be used for the merge code, surrounding only the acquisition of the four page locks. This would obviate any problems with scans and page merges, since the lock acquisition for the merge could never occur while a scan was between pages. Further, with the reordering, the spinlock for the scan code doesn't seem particularly onerous since it would be surrounding only two LWLock calls. To reduce the overhead to an absolute minimum for the scan case, these could be pushed down into a new LWLock call (probably necessary since the LockBuffer and ReleaseBuffer code does some error checking and such that one wouldn't want in code guarded by a mutex). I forgot to mention that the mutex would have to be released, in the event the next page lock could not be immediately acquired, just after the addition of the scan process to the lock waiters list, to avoid blocking all scans and probably causing severe deadlock problems. - Curtis ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
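The handoff ordering in point 5 can be modelled with a toy, single-threaded sketch: an index-wide "reorg mutex" is held only across the release-current/lock-next pair, so a merge that takes the same mutex can never catch a scan between pages. Locks here are plain flags and all names are illustrative, not PostgreSQL code:

```c
#include <assert.h>

/* Toy model of the proposed scan handoff.  acquire/release assert the
 * lock discipline; a real implementation would use spinlocks and
 * LWLocks as in the pseudocode above. */
typedef struct Lock { int held; } Lock;
typedef struct Page { Lock lock; int pinned; } Page;

static Lock index_reorg_mutex;

static void acquire(Lock *l) { assert(!l->held); l->held = 1; }
static void release(Lock *l) { assert(l->held);  l->held = 0; }

static void
scan_step(Page *cur, Page *next)
{
    next->pinned++;                 /* ReadBuffer: pin next page */
    acquire(&index_reorg_mutex);    /* SpinLockAcquire: reorg mutex */
    release(&cur->lock);            /* LockBuffer: release current page */
    acquire(&next->lock);           /* LockBuffer: lock next page */
    release(&index_reorg_mutex);    /* SpinLockRelease */
    cur->pinned--;                  /* ReleaseBuffer: unpin old page */
}
```

The key property is that the mutex's critical section contains only the two lock calls, so its hold time stays short even though every scan step passes through it.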
Re: [HACKERS] pg_dump and inserts
Attached is a patch that more clearly handles autocommit in pg_dump. I had already fixed pg_dump for doing autocommit while dumping, but didn't handle setting autocommit on restore. I focused on the initial script file startup, every \connect, pg_restore, and pg_dumpall. I think I got them all. New pg_dump output is:

    --
    -- PostgreSQL database dump
    --

    SET autocommit TO 'on';

    \connect - postgres
    SET autocommit TO 'on';

--- Tom Lane wrote: Bruce Momjian [EMAIL PROTECTED] writes: I just checked and pg_dump -d _doesn't_ place the INSERT's in a transaction. Seems it should, I think this is a bad idea. If one were after speed, one would have used the COPY format in the first place. If one uses INSERTs, there may be a reason for it --- like, say, wanting each row insertion to succeed or fail independently. Put a begin/end around it, and you lose that. and perhaps add a: SET autocommit TO 'on' as well. This is probably a good idea, since pg_dump scripts effectively assume that anyway. Of course, that SET would fail when restoring to prior releases, Irrelevant; current pg_dump scripts already issue a SET that pre-7.3 servers won't recognize (search_path). A failed SET is harmless anyway, or should be. (What we really need is for someone to fix pg_restore to not abort on SQL errors...) regards, tom lane -- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073

Index: src/bin/pg_dump/pg_backup_archiver.c
===================================================================
RCS file: /cvsroot/pgsql-server/src/bin/pg_dump/pg_backup_archiver.c,v
retrieving revision 1.67
diff -c -c -r1.67 pg_backup_archiver.c
*** src/bin/pg_dump/pg_backup_archiver.c	1 Feb 2003 22:06:59 -	1.67
--- src/bin/pg_dump/pg_backup_archiver.c	14 Feb 2003 19:39:08 -
***************
*** 206,212 ****
  	sav = SetOutput(AH, ropt->filename, ropt->compression);
  
  	ahprintf(AH, "--\n-- PostgreSQL database dump\n--\n\n");
! 
  	/*
  	 * Drop the items at the start, in reverse order
  	 */
--- 206,213 ----
  	sav = SetOutput(AH, ropt->filename, ropt->compression);
  
  	ahprintf(AH, "--\n-- PostgreSQL database dump\n--\n\n");
! 	ahprintf(AH, "SET autocommit TO 'on';\n\n");
! 
  	/*
  	 * Drop the items at the start, in reverse order
  	 */
***************
*** 2109,2115 ****
  				dbname ? fmtId(dbname) : "-");
  	appendPQExpBuffer(qry, "%s\n\n", fmtId(user));
  
! 	ahprintf(AH, qry->data);
  
  	destroyPQExpBuffer(qry);
--- 2110,2117 ----
  				dbname ? fmtId(dbname) : "-");
  	appendPQExpBuffer(qry, "%s\n\n", fmtId(user));
  
! 	appendPQExpBuffer(qry, "SET autocommit TO 'on';\n\n");
! 	ahprintf(AH, qry->data);
  
  	destroyPQExpBuffer(qry);
Index: src/bin/pg_dump/pg_backup_db.c
===================================================================
RCS file: /cvsroot/pgsql-server/src/bin/pg_dump/pg_backup_db.c,v
retrieving revision 1.45
diff -c -c -r1.45 pg_backup_db.c
*** src/bin/pg_dump/pg_backup_db.c	13 Feb 2003 04:54:15 -	1.45
--- src/bin/pg_dump/pg_backup_db.c	14 Feb 2003 19:39:09 -
***************
*** 213,218 ****
--- 213,233 ----
  	if (password)
  		free(password);
  
+ 	/* check for version mismatch */
+ 	_check_database_version(AH, true);
+ 
+ 	/* Turn autocommit on */
+ 	if (AH->public.remoteVersion >= 70300)
+ 	{
+ 		PGresult   *res;
+ 
+ 		res = PQexec(AH->connection, "SET autocommit TO 'on'");
+ 		if (!res || PQresultStatus(res) != PGRES_COMMAND_OK)
+ 			die_horribly(AH, NULL, "SET autocommit TO 'on' failed: %s",
+ 						 PQerrorMessage(AH->connection));
+ 		PQclear(res);
+ 	}
+ 
  	PQsetNoticeProcessor(newConn, notice_processor, NULL);
  
  	return newConn;
Index: src/bin/pg_dump/pg_dumpall.c
===================================================================
RCS file: /cvsroot/pgsql-server/src/bin/pg_dump/pg_dumpall.c,v
retrieving revision 1.13
diff -c -c -r1.13 pg_dumpall.c
*** src/bin/pg_dump/pg_dumpall.c	16 Jan 2003 15:27:59 -	1.13
--- src/bin/pg_dump/pg_dumpall.c	14 Feb 2003 19:39:09 -
***************
*** 190,195 ****
--- 190,196
Re: [HACKERS] Brain dump: btree collapsing
Curtis Faith [EMAIL PROTECTED] writes: 4) This could easily be reordered into:

    buf = ReadBuffer(rel, blkno);        /* pin next page */
    LockBuffer(buf, BUFFER_LOCK_UNLOCK); /* release lock on current page */
    LockBuffer(buf, BT_READ);            /* lock next page */
    ReleaseBuffer(buf);                  /* now release pin on previously current page */

without affecting the logic of the code or causing any deadlock problems since the release still occurs before the lock of the next page. Sorry, that *does* create deadlocks. Remember the deletion process is going to need superexclusive lock (not only a BT_WRITE buffer lock, but no concurrent pins) in order to be sure there are no scans stopped on the page it wants to delete. (In the above pseudocode, the fact that you still hold a pin on the previously-current page makes you look exactly like someone who's in the middle of scanning that page, rather than trying to leave it.) The same would be true of both pages if it's trying to merge. 5) A mutex/spinlock that was stored in the index could be acquired by the scan code like this: Stored in the index? And how will you do that portably? But it doesn't matter, because the approach deadlocks. regards, tom lane ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
Re: [HACKERS] location of the configuration files
Peter Eisentraut [EMAIL PROTECTED] writes: Tom Lane writes: I would favor a setup that allows a -C *directory* (not file) to be specified as a postmaster parameter separately from the -D directory; A directory is not going to satisfy people. Why not? Who won't it satisfy, and what's their objection? AFAICS, you can either set -C to /etc if you want your PG config files loose in /etc, or you can set it to /etc/postgresql/ if you want them in a privately-owned directory. Which other arrangements are needed? regards, tom lane ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] location of the configuration files
Tom Lane writes: I would favor a setup that allows a -C *directory* (not file) to be specified as a postmaster parameter separately from the -D directory; A directory is not going to satisfy people. I don't see any great value in a separate postgresql.conf parameter for each secondary config file; that just means clutter to me, Not to other people. 1. No -C switch, no -D switch, no PGDATA found in environment: seek postgresql.conf in the default -C directory established at configure time. Use the 'datadir' specified therein as -D. Fail if postgresql.conf doesn't define a datadir value. OK. 2. No -C switch, no -D switch, PGDATA found in environment: use $PGDATA as both -C and -D. This behavior would be pretty inconsistent. But maybe it's the best we can do. 3. No -C switch, -D switch on command line: use -D value as both -C and -D, proceed as in case 2. Same as above. 4. -C switch, no -D switch on command line: seek postgresql.conf in -C directory, use the datadir it specifies. OK. 5. -C and -D on command line: seek postgresql.conf in -C directory, use -D as datadir overriding what is in postgresql.conf (this is just the usual rule that command line switches override postgresql.conf). But that usual rule seems to be in conflict with cases 2 and 3 above. (The usual rule is that a command-line option overrides a postgresql.conf parameter. The rule in 3, for example is, that a command-line option (the same one!) overrides where postgresql.conf is in the first place.) I would venture that the configure-time-default for -C should be ${prefixdir}/etc if configure is not told differently, Yeah, we already have that as --sysconfdir. -- Peter Eisentraut [EMAIL PROTECTED] ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] location of the configuration files
Tom Lane wrote: Bruce Momjian [EMAIL PROTECTED] writes: I think the big question is whether we want the default to install the configs in a separate directory, pgsql/etc, or just allow the specification of a separate location. Advantages of pgsql/etc are initdb-safe, and easier backups. I don't see why we don't just let initdb install suggested config files into the new $PGDATA directory, same as it ever did. Then (as long as we don't use relative paths in the config files) people can move them somewhere else if they like, or not if they prefer not to. Adding more mechanism than that just adds complexity without buying much (except the possibility of initdb overwriting your old config files, which is exactly what I thought we wanted to avoid). The big question is whether PGDATA is still our driving config variable, and PGCONFIG/-C is just an additional option, or whether we are moving in a direction where PGCONFIG/-C is going to be the driving value, and data_dir is going to be read as part of that. I thought the idea was to allow both approaches. We are not moving in the direction of one or the other, we are giving people a choice of how they want to drive it. That's where I am unsure. Are the initdb-safety and backup advantages enough to start to migrate those out of data/? I need to hear comments on that. One new idea is to move the config files into data/etc. That makes it clear which are config files, and makes backup a little easier. It would make -C more logical because you are not moving a clear directory. -- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] psql and readline
On Thu, Feb 13, 2003 at 10:25:52AM -0500, Tom Lane wrote: Patrick Welche [EMAIL PROTECTED] writes: The sad thing is that my readline wrapper for libedit doesn't wrap replace_history_entry, Well, is that a bug in your wrapper? Or must we add a configure test for the presence of replace_history_entry()? Good question. Easiest for now for me would be add a configure test. Long term, libedit needs tweaking... In fact for now, I just comment out the call to replace_history_entry, as I am more than happy with the rest of the readline behaviour (as implemented in libedit). Patrick ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] Brain dump: btree collapsing
Curtis Faith [EMAIL PROTECTED] writes: Another way this could be handled is by not merging onto one of the existing pages but to a brand new page, a kind of special case shadow index page. Hmmm ... that might be made to work, but it would complicate inserts. By the time an insert navigates to the page it should insert on, it might find the page is dead, and then it would have no easy way to get to the replacement page (unless you want to dedicate another link field in every index page for that one use). I suppose it could recover by starting the search over again. Another problem is that the two dead pages are now outside of the btree structure and so their left and right links won't get updated in the face of subsequent splits and merges of the nearby pages. I spent quite a bit of time sweating over the recovery navigation details in my original proposal; I can assure you it's not easy to get right. You can chain right from a dead page with some reliability, but left is another matter. There's no good way to know, when following a left link, whether you've arrived at the proper place or need to chain right to make up for a subsequent split. The recovery procedure I proposed works for removing single pages, but it won't work with substituted pages. regards, tom lane ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [pgsql-advocacy] [HACKERS] Changing the default configuration
On Thu, 13 Feb 2003, Bruce Momjian wrote: To auto-tune, you would need to monitor swap usage and other stuff that may vary too much based on load from other systems. Only the admin knows how to answer some of those questions. No, to auto-tune many parameters that currently require manual configuration, you merely need to change things so that you work with the OS, rather than beside it. There are a lot of things we could mmap that would remove the need for tuning certain things altogether. The buffer cache is the biggest one; mmaping that would let the OS take care of it all, and a few hundred KB of shared memory would be all postgres would need. (Or none at all, if you used mmap'd memory regions for that part of your IPC as well.) You could probably also make sort_mem need a lot less tuning if you sorted in mmap'd memory and let the OS deal with paging only when and if it needed it (as well as asking the OS not to page memory you're randomly accessing, since it really destroys your performance when you do that). I'm not sure if you could get rid of semaphores or not, but if you could somehow do that, that would be another limited resource that you wouldn't have to deal with. cjs -- Curt Sampson [EMAIL PROTECTED] +81 90 7737 2974 http://www.netbsd.org Don't you know, in this new Dark Age, we're all light. --XTC ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
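For readers unfamiliar with the mechanism Curt is pointing at, here is a minimal sketch of mapping a file read-write with mmap() and letting the OS handle caching and write-back. It illustrates the system-call usage only; the path and function name are made up, and this is not a design for PostgreSQL's buffer cache:

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Map (or create) a file of the given length for shared read/write
 * access.  Writes through the returned pointer land in the OS page
 * cache and are written back by the kernel, with no private buffer
 * pool in between. */
static char *
map_file_rw(const char *path, size_t len)
{
    int   fd = open(path, O_RDWR | O_CREAT, 0600);
    char *p;

    if (fd < 0)
        return NULL;
    if (ftruncate(fd, (off_t) len) < 0)
    {
        close(fd);
        return NULL;
    }
    p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);                  /* the mapping outlives the descriptor */
    return (p == MAP_FAILED) ? NULL : p;
}
```

The "asking the OS not to page memory you're randomly accessing" part would correspond to madvise() hints such as MADV_RANDOM on the mapped region.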
Re: [HACKERS] psql and readline
Patrick Welche [EMAIL PROTECTED] writes: On Thu, Feb 13, 2003 at 10:25:52AM -0500, Tom Lane wrote: Well, is that a bug in your wrapper? Or must we add a configure test for the presence of replace_history_entry()? Good question. Easiest for now for me would be add a configure test. Okay with me --- Ross, can you handle that? regards, tom lane ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
Re: [HACKERS] Incremental backup
On Thu 13 Feb 2003 16:38, Bruce Momjian wrote: Patrick Macdonald wrote: Bruce Momjian wrote: Someone at Red Hat is working on point-in-time recovery, also known as incremental backups. PITR and incremental backup are different beasts. PITR deals with a backup + logs. Incremental backup deals with a full backup + X smaller/incremental backups. So... it doesn't look like anyone is working on incremental backup at the moment. But why would someone want incremental backups compared to PITR? The backup would be mixture of INSERTS, UPDATES, and DELETES, right? Seems pretty weird. :-) Good backup systems, such as Informix (it's the one I've used), don't do a query backup but a page backup. What I mean is that they look for pages in the system that have changed since the last full backup and back them up. That's how an incremental backup works. PITR is another thing, which is even more important. :-) -- Why use just any relational database, when you can use PostgreSQL? - Martín Marqués |[EMAIL PROTECTED] Programador, Administrador, DBA | Centro de Telematica Universidad Nacional del Litoral - ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
Re: [HACKERS] Incremental backup
On Thu, 13 Feb 2003 19:24:13 -0500, Patrick Macdonald [EMAIL PROTECTED] wrote: I know Oracle and DB2 have incremental backup in their arsenal (and iirc, SQL Server has something called differential backup). Whatever the name, it's a win at the enterprise level. A differential backup copies only the database pages that have been modified after the last full database backup. This could be doable using XLogRecPtr pd_lsn in the page headers, but I don't see an easy way to do it on a live database. Servus Manfred ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
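The pd_lsn approach Manfred describes reduces to a simple filter: record the WAL position (LSN) at the last full backup, then copy only pages whose LSN is newer. A minimal sketch, with an illustrative page struct rather than PostgreSQL's on-disk format:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative page: only the field the differential backup needs. */
typedef struct DemoPage
{
    uint64_t pd_lsn;            /* LSN of last WAL record touching page */
} DemoPage;

/* Collect the indexes of pages modified since the full backup's LSN.
 * Returns how many pages would go into the differential backup. */
static size_t
pages_to_back_up(const DemoPage *pages, size_t npages,
                 uint64_t full_backup_lsn, size_t *out_idx)
{
    size_t n = 0;

    for (size_t i = 0; i < npages; i++)
        if (pages[i].pd_lsn > full_backup_lsn)
            out_idx[n++] = i;   /* changed since the full backup */
    return n;
}
```

The hard part on a live database, as the message notes, is not this comparison but reading pages consistently while they are still being modified.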
Re: [HACKERS] location of the configuration files
Kevin Brown wrote: Tom Lane wrote: Kevin Brown [EMAIL PROTECTED] writes: I agree with your assessment for the most part, except for PGDATA. There's no good reason I can think of for the postmaster to look at it. The other side of that coin is, what's the good reason to remove it? There's a long way between I don't want my setup to depend on PGDATA and I don't think your setup should be allowed to depend on PGDATA. If you don't want to use it, then don't use it. Why do you need to tell me how I'm allowed to run my installation? I'm not talking about getting rid of ALL dependency on PGDATA in our entire distribution, only postmaster's. Recall that the main purpose of making any of these changes at all is to make life easier for the guys who have to manage the systems that will be running PostgreSQL. Agreed? So: imagine you're the newly-hired DBA and your boss points you to the system and says administrate the database on that. You go over to the computer and start looking around. You do a ps and see a postmaster process running. You know that it's the process that is listening for connections. The ps listing only says /usr/bin/postmaster. No arguments to clue you in, nothing. Where do you look to figure out where the data is? How do you figure out what port it's listening on? If you want ps to display the data dir, you should use -D. Remember, it is mostly important for running multiple postmasters, so if you are doing that, just use -D, but don't prevent single-postmaster folks from using PGDATA. -- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] location of the configuration files
OK, here is an updated proposal. I think we have decided: Moving postmaster.pid and postmaster.opts isn't worth it. We don't want per-file GUC variables, but assume it is in the same config directory as postgresql.conf. I don't see any valid reason they would want to put them somewhere different than postgresql.conf. So, we add data_dir to postgresql.conf, and add -C/PGCONFIG to postmaster. Regarding Tom's idea of replacing data_dir with a full path during initdb, I think we are better having it be relative to the config directory, that way if they move pgsql/, the system still works. However, if the config directory is in a different lead directory path, we should replace it with the full path, e.g. /usr/local/pgsql/data and /usr/local/pgsql/etc use relative paths, ../data, while /etc/postgresql and /usr/local/pgsql/data get an absolute path. My idea is to introduce the above capabilities in 7.4, and keep the config files in /data. This will allow package folks to move the config files in 7.4. I also think we should start telling people to use PGCONFIG rather than PGDATA. Then, in 7.5, we move the default config file location to pgsql/etc, and tell folks to point there rather than /data. I think there is major value to getting those config files out of the initdb creation area and for backups. I am now wondering if we should add PGCONFIG and move them out of data all in the same release. Not sure if delaying the split is valuable. --- Tom Lane wrote: Lamar Owen [EMAIL PROTECTED] writes: Ok, if we're going this far already, tell me exactly why we have three config files. Why not really Unify things and fulfil the full promise of Grand Unified Configuration by rolling hba and ident into postgresql.conf. Is there a compelling reason not to do so? Lack of backwards compatibility; unnecessary complexity. Unifying those files would create a big headache in terms of having to unify their syntax. And there are some basic semantic differences too. 
For instance, order matters in pg_hba.conf, but not in postgresql.conf. Another reason not to do it is that there are differences in the security requirements of these files. postgresql.conf probably doesn't contain anything that needs to be hidden from prying eyes, but I'd be inclined to want to keep the other two mode 600. --- Okay, I've been laying low all day, but here are my thoughts on the discussion: I do see the value in being able to (as opposed to being forced to, please) keep hand-edited config files in a separate location from the machine-processed data files. We have already gone some distance in that direction over the past few releases --- there's much less in the top $PGDATA directory than there once was. It makes sense to let people keep hand-edited files away from what initdb will overwrite. I would favor a setup that allows a -C *directory* (not file) to be specified as a postmaster parameter separately from the -D directory; then the hand-editable config files would be sought in -C not -D. In the absence of -C the config files should be sought in -D, same as they ever were (thus simplifying life for people like me who run many postmasters and don't give a darn about FHS ;-)). I don't see any great value in a separate postgresql.conf parameter for each secondary config file; that just means clutter to me, especially if we add more such files in future. I am also distinctly not in favor of eliminating the PGDATA environment variable; that reads to me as we are going to force you to do it our way rather than the way you've always done it, even if you like the old way. To make the RPM packagers happy, I guess that the default -C directory has to be settable via configure. We do not currently have a default -D directory, and I didn't hear anyone arguing in favor of adding one. So that leaves the following possible combinations that the postmaster might see at startup, for which I propose the following behaviors: 1. 
No -C switch, no -D switch, no PGDATA found in environment: seek postgresql.conf in the default -C directory established at configure time. Use the 'datadir' specified therein as -D. Fail if postgresql.conf doesn't define a datadir value. 2. No -C switch, no -D switch, PGDATA found in environment: use $PGDATA as both -C and -D. (Minor detail: if the postgresql.conf in the $PGDATA directory specifies a different directory as datadir, do we follow that or raise an error? I'd be inclined to say follow it but maybe there is an argument for erroring out.) (In all the following cases, any environment PGDATA value is ignored.) 3. No -C switch, -D switch on command line: use -D value as both -C and -D, proceed as in case 2. 4. -C switch, no -D switch on command line: seek postgresql.conf in -C directory, use the datadir it
Re: [HACKERS] PG_TEMP_FILES_DIR
It is under each database directory, per-database. --- Sailesh Krishnamurthy wrote: Quick question for the group I'm assuming that the PG_TEMP_FILES_DIR for BufFile temps is automatically under the PGDATA directory. Is that correct ? -- Pip-pip Sailesh http://www.cs.berkeley.edu/~sailesh ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED] -- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
Re: [HACKERS] Incremental backup
OK, once we have PITR, will anyone want incremental backups? --- Martin Marques wrote: On Thu 13 Feb 2003 16:38, Bruce Momjian wrote: Patrick Macdonald wrote: Bruce Momjian wrote: Someone at Red Hat is working on point-in-time recovery, also known as incremental backups. PITR and incremental backup are different beasts. PITR deals with a backup + logs. Incremental backup deals with a full backup + X smaller/incremental backups. So... it doesn't look like anyone is working on incremental backup at the moment. But why would someone want incremental backups compared to PITR? The backup would be mixture of INSERTS, UPDATES, and DELETES, right? Seems pretty weird. :-) Good backup systems, such as Informix (it's the one I've used), don't do a query backup but a page backup. What I mean is that they look for pages in the system that have changed since the last full backup and back them up. That's how an incremental backup works. PITR is another thing, which is even more important. :-) -- Why use just any relational database, when you can use PostgreSQL? - Martín Marqués |[EMAIL PROTECTED] Programador, Administrador, DBA | Centro de Telematica Universidad Nacional del Litoral - ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html -- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [HACKERS] Terrible performance on wide selects
Added to TODO: * Cache last known per-tuple offsets to speed long tuple access --- Tom Lane wrote: Hannu Krosing [EMAIL PROTECTED] writes: i.e. for a tuple with 100 cols, allocate an array of 100 pointers, plus keep count of how many are actually valid, Additionally, this should also make repeated determining of NULL fields faster - just put a NULL-pointer in and voila - no more bit-shifting and AND-ing to find out if the field is null. Right, the output of the operation would be a pair of arrays: Datum values and is-null flags. (NULL pointers don't work for pass-by-value datatypes.) I like the idea of keeping track of a last-known-column position and incrementally extending that as needed. I think the way to manage this is to add the overhead data (the output arrays and last-column state) to TupleTableSlots. Then we'd have a routine similar to heap_getattr except that it takes a TupleTableSlot and makes use of the extra state data. The infrastructure to manage the state data is already in place: for example, ExecStoreTuple would reset the last-known-column to 0, ExecSetSlotDescriptor would be responsible for allocating the output arrays using the natts value from the provided tupdesc, etc. This wouldn't help for accesses that are not in the context of a slot, but certainly all the ones from ExecEvalVar are. The executor always works with tuples stored in slots, so I think we could fix all the high-traffic cases this way. regards, tom lane ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org -- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
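Tom's slot-based caching scheme above can be sketched in miniature. CachedSlot, slot_store, and slot_getattr_cached are hypothetical names, and the "tuple" is just an array of longs rather than a real heap tuple; the point is only the incremental last-known-column logic, where fetching column N after column N-1 costs one step instead of re-walking the whole tuple.

```c
#include <assert.h>
#include <stdbool.h>

#define MAXATTS 100

/* Per-slot cache: output arrays plus a count of valid leading columns. */
typedef struct CachedSlot {
    int     natts;            /* number of columns in the tuple     */
    int     nvalid;           /* how many leading values are cached */
    long    values[MAXATTS];  /* extracted value for each column    */
    bool    isnull[MAXATTS];  /* is-null flag for each column       */
    const long *rawtuple;     /* stand-in for the on-disk tuple     */
} CachedSlot;

/* Reset cache state, roughly what ExecStoreTuple would do. */
static void slot_store(CachedSlot *slot, const long *tuple, int natts)
{
    slot->rawtuple = tuple;
    slot->natts = natts;
    slot->nvalid = 0;
}

/* Fetch attribute attnum (1-based), extending the cache incrementally. */
static long slot_getattr_cached(CachedSlot *slot, int attnum, bool *isnull)
{
    while (slot->nvalid < attnum)
    {
        int i = slot->nvalid;
        /* In the real code this walks variable-width fields; here the
         * "tuple" is a plain array, so extraction is trivial. */
        slot->values[i] = slot->rawtuple[i];
        slot->isnull[i] = false;
        slot->nvalid++;
    }
    *isnull = slot->isnull[attnum - 1];
    return slot->values[attnum - 1];
}
```

Note how a second fetch of any column at or below nvalid does no tuple walking at all, which is exactly the win for wide selects that touch many columns.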
Re: [HACKERS] location of the configuration files
On Fri, 2003-02-14 at 12:17, Bruce Momjian wrote: If you want ps to display the data dir, you should use -D. Remember, it is mostly important for multiple postmaster, so if you are doing that, just use -D, but don't prevent single-postmaster folks from using PGDATA. Could not the ps line be rewritten to show this, as the backend's ps lines are rewritten? -- Oliver Elphick[EMAIL PROTECTED] Isle of Wight, UK http://www.lfix.co.uk/oliver GPG: 1024D/3E1D0C1C: CA12 09E0 E8D5 8870 5839 932A 614D 4C34 3E1D 0C1C God be merciful unto us, and bless us; and cause his face to shine upon us. Psalms 67:1 ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] location of the configuration files
I am not sure if it is a good idea to be mucking with it. For backend, we do the entire thing, so it is clear we modified something. --- Oliver Elphick wrote: On Fri, 2003-02-14 at 12:17, Bruce Momjian wrote: If you want ps to display the data dir, you should use -D. Remember, it is mostly important for multiple postmaster, so if you are doing that, just use -D, but don't prevent single-postmaster folks from using PGDATA. Could not the ps line be rewritten to show this, as the backend's ps lines are rewritten? -- Oliver Elphick[EMAIL PROTECTED] Isle of Wight, UK http://www.lfix.co.uk/oliver GPG: 1024D/3E1D0C1C: CA12 09E0 E8D5 8870 5839 932A 614D 4C34 3E1D 0C1C God be merciful unto us, and bless us; and cause his face to shine upon us. Psalms 67:1 -- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
Re: [HACKERS] Incremental backup
On Fri 14 Feb 2003 09:52, Bruce Momjian wrote: OK, once we have PITR, will anyone want incremental backups? I will probably not need it, but I know of people who have databases whose dumps are more than 20GB. They are interested in live incremental backups. -- Why use just any relational database, when you can use PostgreSQL? - Martín Marqués |[EMAIL PROTECTED] Programmer, Administrator, DBA | Centro de Telematica Universidad Nacional del Litoral - ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] location of the configuration files
Kevin Brown wrote: Bruce Momjian wrote: If you want ps to display the data dir, you should use -D. Remember, it is mostly important for multiple postmaster, so if you are doing that, just use -D, but don't prevent single-postmaster folks from using PGDATA. Perhaps the best compromise would be to change pg_ctl so that it uses -D explicitly when invoking postmaster. That's an easy change. Could you describe how you and other developers use PGDATA? I'm quite interested in knowing why there seems to be so much resistance to removing the potential_DataDir = getenv("PGDATA"); line from postmaster.c. I just set PGDATA in my login and I don't have to deal with it again. -- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] location of the configuration files
Kevin Brown wrote: Bruce Momjian wrote: I just set PGDATA in my login and I don't have to deal with it again. Hmm...you don't use pg_ctl to start/stop/whatever the database? You invoke the postmaster directly (I can easily see that you would, just asking if you do)? I can use either to start/stop it. -- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
Re: [HACKERS] location of the configuration files
Give it up. As long as we have -D, we will allow PGDATA. If you don't want to use it, don't use it. --- Kevin Brown wrote: Bruce Momjian wrote: I just set PGDATA in my login and I don't have to deal with it again. Duh... I just realized a reason you guys might care about this so much. It's because you want to be able to start the postmaster from within a debugger (or profiler, or whatever), and you don't want to have to mess with command line options from there, right? Sounds like fixing pg_ctl to use -D explicitly when invoking the postmaster is the right change to make here, since that's probably how the majority of the production shops are going to be starting the database anyway. Takes care of the majority of the visibility problem and leaves PGDATA intact. Thoughts? -- Kevin Brown [EMAIL PROTECTED] ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED]) -- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
Re: [HACKERS] location of the configuration files
My point is that folks with multiple postmasters may not want to use PGDATA, but for folks who have single postmasters, it makes things easier and less error-prone. --- Bruce Momjian wrote: Give it up. As long as we have -D, we will allow PGDATA. If you don't want to use it, don't use it. --- Kevin Brown wrote: Bruce Momjian wrote: I just set PGDATA in my login and I don't have to deal with it again. Duh... I just realized a reason you guys might care about this so much. It's because you want to be able to start the postmaster from within a debugger (or profiler, or whatever), and you don't want to have to mess with command line options from there, right? Sounds like fixing pg_ctl to use -D explicitly when invoking the postmaster is the right change to make here, since that's probably how the majority of the production shops are going to be starting the database anyway. Takes care of the majority of the visibility problem and leaves PGDATA intact. Thoughts? -- Kevin Brown [EMAIL PROTECTED] ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED]) -- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html -- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [HACKERS] WAL replay logic (was Re: [PERFORM] Mount options for Ext3?)
Is there a TODO here, like Allow recovery from corrupt pg_control via WAL? --- Kevin Brown wrote: Tom Lane wrote: Kevin Brown [EMAIL PROTECTED] writes: One question I have is: in the event of a crash, why not simply replay all the transactions found in the WAL? Is the startup time of the database that badly affected if pg_control is ignored? Interesting thought, indeed. Since we truncate the WAL after each checkpoint, seems like this approach would no more than double the time for restart. Hmm...truncating the WAL after each checkpoint minimizes the amount of disk space eaten by the WAL, but on the other hand keeping older segments around buys you some safety in the event that things get really hosed. But your later comments make it sound like the older WAL segments are kept around anyway, just rotated. The win is it'd eliminate pg_control as a single point of failure. It's always bothered me that we have to update pg_control on every checkpoint --- it should be a write-pretty-darn-seldom file, considering how critical it is. I think we'd have to make some changes in the code for deleting old WAL segments --- right now it's not careful to delete them in order. But surely that can be coped with. Even that might not be necessary. See below. OTOH, this might just move the locus for fatal failures out of pg_control and into the OS' algorithms for writing directory updates. We would have no cross-check that the set of WAL file names visible in pg_xlog is sensible or aligned with the true state of the datafile area. Well, what we somehow need to guarantee is that there is always WAL data that is older than the newest consistent data in the datafile area, right? Meaning that if the datafile area gets scribbled on in an inconsistent manner, you always have WAL data to fill in the gaps. Right now we do that by using fsync() and sync(). But I think it would be highly desirable to be able to more or less guarantee database consistency even if fsync were turned off. 
The price for that might be too high, though. We'd have to take it on faith that we should replay the visible files in their name order. This might mean we'd have to abandon the current hack of recycling xlog segments by renaming them --- which would be a nontrivial performance hit. It's probably a bad idea for the replay to be based on the filenames. Instead, it should probably be based strictly on the contents of the xlog segment files. Seems to me the beginning of each segment file should have some kind of header information that makes it clear where in the scheme of things it belongs. Additionally, writing some sort of checksum, either at the beginning or the end, might not be a bad idea either (doesn't have to be a strict checksum, but it needs to be something that's reasonably likely to catch corruption within a segment). Do that, and you don't have to worry about renaming xlog segments at all: you simply move on to the next logical segment in the list (a replay just reads the header info for all the segments and orders the list as it sees fit, and discards all segments prior to any gap it finds. It may be that you simply have to bail out if you find a gap, though). As long as the xlog segment checksum information is consistent with the contents of the segment and as long as its transactions pick up where the previous segment's left off (assuming it's not the first segment, of course), you can safely replay the transactions it contains. I presume we're recycling xlog segments in order to avoid file creation and unlink overhead? Otherwise you can simply create new segments as needed and unlink old segments as policy dictates. Comments anyone? If there exists somewhere a reasonably succinct description of the reasoning behind the current transaction management scheme (including an analysis of the pros and cons), I'd love to read it and quit bugging you. :-) Not that I know of. Would you care to prepare such a writeup? 
There is a lot of material in the source-code comments, but no coherent presentation. Be happy to. Just point me to any non-obvious source files. Thus far on my plate: 1. PID file locking for postmaster startup (doesn't strictly need to be the PID file but it may as well be, since we're already messing with it anyway). I'm currently looking at how to do the autoconf tests, since I've never developed using autoconf before. 2. Documenting the transaction management scheme. I was initially interested in implementing the explicit JOIN reordering but based on your recent comments I think you have a much better handle on that than I. I'll be very interested to see what you do, to see if it's anything close to what I figure has to happen... -- Kevin
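Kevin's self-describing-segment idea might look roughly like this. SegmentHeader, seqno, and usable_segments are illustrative names, not PostgreSQL's actual WAL format: each segment carries a header with a logical sequence number (and a checksum over the payload), so replay can order segments by content rather than by filename, and stops at the first gap it finds.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical self-describing segment header. */
typedef struct SegmentHeader {
    uint64_t seqno;     /* logical position of this segment  */
    uint32_t checksum;  /* would cover the segment payload   */
} SegmentHeader;

static int cmp_seqno(const void *a, const void *b)
{
    const SegmentHeader *sa = a, *sb = b;
    return (sa->seqno > sb->seqno) - (sa->seqno < sb->seqno);
}

/* Sort segments by their header seqno and return how many form a
 * contiguous run starting from the oldest; replay would bail out (or
 * at least stop) at the first gap, since later segments can't be
 * trusted to follow on from the earlier ones. */
static size_t usable_segments(SegmentHeader *segs, size_t n)
{
    size_t i;
    if (n == 0)
        return 0;
    qsort(segs, n, sizeof(SegmentHeader), cmp_seqno);
    for (i = 1; i < n; i++)
        if (segs[i].seqno != segs[i - 1].seqno + 1)
            break;              /* gap found: stop here */
    return i;
}
```

With this scheme the recycling-by-rename trick stops mattering for correctness: a recycled file is simply rewritten with a new header, and replay order comes from the headers alone.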
Re: [HACKERS] Brain dump: btree collapsing
Hannu Krosing [EMAIL PROTECTED] writes: Tom Lane wrote on Fri, 14.02.2003 at 01:13: How is returning the same data twice not an ill effect? From earlier discussions I understood that there had been some work done on using btrees for indexing arrays by storing each separate element in a leaf node. Surely that work must deal with not returning the same tuple twice. The only mechanism that exists for that is to discard tuples that meet the qualification tests of previous indexscans. This cannot prevent returning the same tuple twice in one scan, if the index is so ill-behaved as to return the same pointer twice. I don't know what Vadim had in mind to support multiple index entries per tuple, but it's certainly not in the code yet. How will you wait for scans that you know nothing of to go past? Especially when they are going to be blocked by your own write lock on the left page? could we just not lock (for more than just to ensure atomic writes) the page but instead increment a page version on each write to detect changes? How does that help? The left-moving indexscan still has no way to recover. It can't go back to the page it was on before and try to determine which entries you added there, because it has no reliable reference point to do so. The entry it previously returned might not be there anymore, and in a non-unique index key comparisons won't help. And even more to the point, how would it know you've changed the left page? It has no idea what the previous page version on the left page was, because it was never there before. regards, tom lane ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [HACKERS] location of the configuration files
Kevin Brown wrote: Bruce Momjian wrote: OK, here is an updated proposal. I think we have decided: Moving postmaster.pid and postmaster.opts isn't worth it. We don't want per-file GUC variables, but assume it is in the same config directory as postgresql.conf. I don't see any valid reason they would want to put them somewhere different than postgresql.conf. So, we add data_dir to postgresql.conf, and add -C/PGCONFIG to postmaster. Agreed. One additional thing: when pg_ctl invokes the postmaster, it should explicitly specify -C on the postmaster command line, and if it doesn't find a data_dir in $PGCONFIG/postgresql.conf then it should explicitly specify a -D as well. Pg_ctl is going to have to be modified to take a -C argument anyway, so we may as well go all the way to do the right thing here. This way, people who start the database using the standard tools we supply will know exactly what's going on when they get a ps listing. No. If you want ps to display, don't use environment variables. Many don't care --- especially those with only one postmaster. -- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [HACKERS] Incremental backup
On Fri, 2003-02-14 at 06:52, Bruce Momjian wrote: OK, once we have PITR, will anyone want incremental backups? --- Martin Marques wrote: On Thu 13 Feb 2003 16:38, Bruce Momjian wrote: Patrick Macdonald wrote: Bruce Momjian wrote: Someone at Red Hat is working on point-in-time recovery, also known as incremental backups. PITR and incremental backup are different beasts. PITR deals with a backup + logs. Incremental backup deals with a full backup + X smaller/incremental backups. So... it doesn't look like anyone is working on incremental backup at the moment. But why would someone want incremental backups compared to PITR? The backup would be a mixture of INSERTS, UPDATES, and DELETES, right? Seems pretty weird. :-) Good backup systems, such as Informix (it's the one I used), don't do a query backup, but a page backup. What I mean is that they look for pages in the system that have changed since the last full backup and back them up. That's how an incremental backup works. PITR is another thing, which is even more important. :-) I do imagine for some people it will register high on their list. -- Greg Copeland [EMAIL PROTECTED] Copeland Computer Consulting ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] location of the configuration files
Bruce Momjian [EMAIL PROTECTED] writes: So, we add data_dir to postgresql.conf, and add -C/PGCONFIG to postmaster. Wait one second. You are blithely throwing in a PGCONFIG variable without any detailed proposal of exactly how it will work. Does that override a PGDATA environment variable? How do they interact? Also, please note Kevin Brown's nearby arguments against using PGDATA at all, which surely apply with equal force to a PGCONFIG variable. Now, I don't buy that Kevin's arguments are enough reason to break backwards compatibility by removing PGDATA --- but I think they are enough reason not to introduce a new environment variable. PGCONFIG wouldn't offer any backwards-compatibility value, and that tilts the scales against it. Regarding Tom's idea of replacing data_dir with a full path during initdb, I think we are better having it be relative to the config directory, that way if they move pgsql/, the system still works. Good thought, but you're assuming that initdb knows where the config files will eventually live. If we do that, then moving the config files breaks the installation. I think it will be fairly common to let initdb drop its proposed config files into $PGDATA, and then manually place them where they should go (or even more likely, manually merge them with a prior version). Probably better to force datadir to be an absolute path in the config file. (In fact, on safety grounds I'd argue in favor of rejecting a datadir value taken from the config file that wasn't absolute.) I also think we should start telling people to use PGCONFIG rather than PGDATA. Then, in 7.5, we move the default config file location to pgsql/etc, and tell folks to point there rather than /data. I agree with none of this. This is not improvement, this is only change for the sake of change. The packagers will do what they want to do (and are already doing, mostly) regardless. regards, tom lane ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] location of the configuration files
Oliver Elphick [EMAIL PROTECTED] writes: On Fri, 2003-02-14 at 12:17, Bruce Momjian wrote: If you want ps to display the data dir, you should use -D. Remember, it is mostly important for multiple postmaster, so if you are doing that, just use -D, but don't prevent single-postmaster folks from using PGDATA. Could not the ps line be rewritten to show this, as the backend's ps lines are rewritten? I for one would rather it didn't do that. I already set my postmaster command lines the way I want 'em, and I don't want the code overriding that. (I prefer to use explicit -p arguments to distinguish the various postmasters I have running --- shorter and easier to read than explicit -D would be. At least for me.) regards, tom lane ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [HACKERS] Someone's broken psql's connection-failure error reporting
On Thu, Feb 13, 2003 at 08:55:23PM -0500, Tom Lane wrote: Bruce Momjian [EMAIL PROTECTED] writes: Tom Lane wrote: I've done a little bit of cleanup, but that code is still a mess... someone should rewrite these routines. Yes, I looked at it and struggled to get both IPv4 and IPv6 cleanly working. Any ideas on how to improve it? The major problem is the huge amount of #ifdefs, most of which seem to come from the fact that we deal with a list of possible addresses in one case and not the other. It would help a lot if we fixed things so that we dealt with a list in either case --- only a one-element list, if we don't have getaddrinfo, but getaddrinfo2 could hide that and provide a uniform API regardless. I'm actually working on getting rid of all those #ifdef's, but it's going slowly. (I have very little free time.) Kurt ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
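Tom's suggestion, dealing with a one-element list when getaddrinfo is missing so the caller sees a uniform list-based API either way, could be sketched like this. AddrItem, resolve_all, and the HAVE_GETADDRINFO symbol are illustrative; real code would build the list from getaddrinfo's addrinfo chain on capable platforms, or from a single legacy lookup otherwise.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical uniform result node, one per resolved address. */
typedef struct AddrItem {
    char addr[64];
    struct AddrItem *next;
} AddrItem;

static AddrItem *make_item(const char *addr)
{
    AddrItem *it = calloc(1, sizeof(AddrItem));
    strncpy(it->addr, addr, sizeof(it->addr) - 1);
    return it;
}

/* Always return a linked list, so callers loop the same way whether
 * the platform gave us many addresses or just one. */
static AddrItem *resolve_all(const char *host)
{
#ifdef HAVE_GETADDRINFO
    /* ... walk getaddrinfo() results, one AddrItem per entry ... */
#endif
    /* Fallback path: wrap the single address in a one-element list.
     * (Here we just echo the input; no real resolution is done.) */
    return make_item(host);
}

static void free_all(AddrItem *it)
{
    while (it) { AddrItem *n = it->next; free(it); it = n; }
}
```

The payoff is exactly what Tom describes: the #ifdef lives inside resolve_all, and the calling code has a single loop over the list instead of two code paths.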
Re: [HACKERS] location of the configuration files
Tom Lane wrote: Bruce Momjian [EMAIL PROTECTED] writes: So, we add data_dir to postgresql.conf, and add -C/PGCONFIG to postmaster. Wait one second. You are blithely throwing in a PGCONFIG variable without any detailed proposal of exactly how it will work. Does that override a PGDATA environment variable? How do they interact? I am just throwing out ideas. I don't think we are near interaction issues yet. I think the big question is whether we want the default to install the configs in a separate directory, pgsql/etc, or just allow the specification of a separate location. Advantages of pgsql/etc are initdb-safe, and easier backups. I do think PGCONFIG would be helpful for the same reason that PGDATA is. However, there is clearly a problem of how data_dir interacts with PGDATA. The big question is whether PGDATA is still our driving config variable, and PGCONFIG/-C is just an additional option, or whether we are moving in a direction where PGCONFIG/-C is going to be the driving value, and data_dir is going to be read as part of that. Also, please note Kevin Brown's nearby arguments against using PGDATA at all, which surely apply with equal force to a PGCONFIG variable. Now, I don't buy that Kevin's arguments are enough reason to break backwards compatibility by removing PGDATA --- but I think they are enough reason not to introduce a new environment variable. PGCONFIG wouldn't offer any backwards-compatibility value, and that tilts the scales against it. Weren't you just showing how you set environment variables to easily configure stuff? If you use a separate configure dir, isn't PGCONFIG part of that? Regarding Tom's idea of replacing data_dir with a full path during initdb, I think we are better off having it be relative to the config directory, that way if they move pgsql/, the system still works. Good thought, but you're assuming that initdb knows where the config files will eventually live. If we do that, then moving the config files breaks the installation.
I think it will be fairly common to let initdb drop its proposed config files into $PGDATA, and then manually place them where they should go (or even more likely, manually merge them with a prior version). Probably better to force datadir to be an absolute path in the config file. (In fact, on safety grounds I'd argue in favor of rejecting a datadir value taken from the config file that wasn't absolute.) Maybe. Not sure. I also think we should start telling people to use PGCONFIG rather than PGDATA. Then, in 7.5, we move the default config file location to pgsql/etc, and tell folks to point there rather than /data. I agree with none of this. This is not improvement, this is only change for the sake of change. The packagers will do what they want to do (and are already doing, mostly) regardless. Well, it is a step forward in terms of initdb-safe and easier backups. Several people said they liked that. I thought you were one of them. This is back to the big question, who drives things in the default install, config file or pgdata. -- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] Brain dump: btree collapsing
tom lane wrote: How does that help? The left-moving indexscan still has no way to recover. It can't go back to the page it was on before and try to determine which entries you added there, because it has no reliable reference point to do so. The entry it previously returned might not be there anymore, and in a non-unique index key comparisons won't help. And even more to the point, how would it know you've changed the left page? It has no idea what the previous page version on the left page was, because it was never there before.

Revisiting the idea I proposed in a previous email after more research, I believe it is definitely possible to implement a mutex based scheme that will prevent scans from being polluted by merges of index pages and that does not result in the mutex being held for any significant duration.

1) Current scan code does this:

bool
_bt_step(IndexScanDesc scan, Buffer *bufP, ScanDirection dir)
{
    ... definitions go here ...

    if (ScanDirectionIsForward(dir))
    {
        if (!PageIsEmpty(page) && offnum < maxoff)
            offnum = OffsetNumberNext(offnum);
        else
        {
            /* walk right to the next page with data */
            for (;;)
            {
                /* if we're at end of scan, release the buffer and return */
                ... skip code here ...
                /* step right one page */
                blkno = opaque->btpo_next;
                _bt_relbuf(rel, *bufP);
                *bufP = _bt_getbuf(rel, blkno, BT_READ);
                ... skip rest of code ...

3) Note that the calls

    _bt_relbuf(rel, *bufP);
    *bufP = _bt_getbuf(rel, blkno, BT_READ);

a) appear adjacent to each other
b) relbuf calls:
    LockBuffer(buf, BUFFER_LOCK_UNLOCK);
    ReleaseBuffer(buf);
c) getbuf only calls:
    buf = ReadBuffer(rel, blkno);
    LockBuffer(buf, access);
in the case of an existing buffer, rather than a new one.
4) This could easily be reordered into:

    buf = ReadBuffer(rel, blkno);        /* pin next page */
    LockBuffer(buf, BUFFER_LOCK_UNLOCK); /* release lock on current page */
    LockBuffer(buf, BT_READ);            /* lock next page */
    ReleaseBuffer(buf);                  /* now release pin on previously current page */

without affecting the logic of the code or causing any deadlock problems since the release still occurs before the lock of the next page.

5) A mutex/spinlock that was stored in the index could be acquired by the scan code like this:

    buf = ReadBuffer(rel, blkno);            /* pin next page */
    SpinLockAcquire(indexSpecificMutex);     /* lock the index reorg mutex */
    LockBuffer(buf, BUFFER_LOCK_UNLOCK);     /* release lock on current page */
    LockBuffer(buf, BT_READ);                /* lock next page */
    SpinLockRelease(indexSpecificMutex);     /* unlock the index reorg mutex */
    ReleaseBuffer(buf);                      /* now release pin on previously current page */

6) The same index specific mutex/spinlock could be used for the merge code surrounding only the acquisition of the four page locks. This would obviate any problems with scans and page merges, since the lock acquisition for the merge could never occur while a scan was between pages. Further, with the reordering, the spinlock for the scan code doesn't seem particularly onerous since it would be surrounding only two LWLock calls. To reduce the overhead to an absolute minimum for the scan case these could be pushed down into a new LW call (probably necessary since the LockBuffer/ReleaseBuffer code does some error checking and such that one wouldn't want in code guarded by a mutex). - Curtis ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
[HACKERS] Berkeley and CMU classes adopt/extend PostgreSQL
Hi all: I emailed Marc Fournier on this topic some weeks back, but haven't heard from him. I am teaching the undergrad DB course at UC Berkeley, something I do with some frequency. We have the usual 180 students we get every semester (yep: 180!), but this year we've instituted 2 changes: 1) We changed the course projects to make the students hack PostgreSQL internals, rather than the minibase eduware 2) We are coordinating the class with a class at CMU being taught by Prof. Anastassia (Natassa) Ailamaki Our Homework 2, which is being passed out this week, will ask the students to implement a hash-based grouping that spills to disk. I understand this topic has been batted about the pgsql-hackers list recently. The TAs who've prepared the assignment (Sailesh Krishnamurthy at Berkeley and Spiros Papadimitriou at CMU) have also implemented a reference solution to the assignment. Once we've got the students' projects all turned in, we'll be very happy to contribute our code back to the PostgreSQL project. I'm hopeful this will lead to many good things: 1) Each year we can pick another feature to assign in class, and contribute back. We'll need to come up with well-scoped engine features that exercise concepts from the class -- eventually we'll run out of tractable things that PGSQL needs, but not in the next couple years I bet. 2) We'll raise a crop of good students who know Postgres internals. Roughly half the Berkeley EECS undergrads take the DB class, and all of them will be post-hackers! (Again, I don't know the stats at CMU.) So consider this a heads up on the hash-agg front, and on the future contributions front. I'll follow up with another email on PostgreSQL-centered research in our group at Berkeley as well. Another favor I'd ask is that people on the list be a bit hesitant about helping our students with their homework! We would like them to do it themselves, more or less :-) Regards, Joe Hellerstein -- Joseph M.
Hellerstein Professor, EECS Computer Science Division UC Berkeley http://www.cs.berkeley.edu/~jmh On Tuesday, February 11, 2003, at 06:54 PM, Sailesh Krishnamurthy wrote: From: Hannu Krosing [EMAIL PROTECTED] Date: Tue Feb 11, 2003 12:21:26 PM US/Pacific To: Tom Lane [EMAIL PROTECTED] Cc: Bruno Wolff III [EMAIL PROTECTED], Greg Stark [EMAIL PROTECTED], [EMAIL PROTECTED] Subject: Re: [HACKERS] Hash grouping, aggregates Tom Lane wrote on Tue, 11.02.2003 at 18:39: Bruno Wolff III [EMAIL PROTECTED] writes: Tom Lane [EMAIL PROTECTED] wrote: Greg Stark [EMAIL PROTECTED] writes: The neat thing is that hash aggregates would allow grouping on data types that have = operators but no useful '<' operator. Hm. Right now I think that would barf on you, because the parser wants to find the '<' operator to label the grouping column with, even if the planner later decides not to use it. It'd take some redesign of the query data structure (specifically SortClause/GroupClause) to avoid that. I think another issue is that for some = operators you still might not be able to use a hash. I would expect the discussion for hash joins in http://developer.postgresql.org/docs/postgres/xoper-optimization.html would apply to hash aggregates as well. Right, the = operator must be hashable or you're out of luck. But we could imagine tweaking the parser to allow GROUP BY if it finds a hashable = operator and no sort operator. The only objection I can see to this is that it means the planner *must* use hash aggregation, which might be a bad move if there are too many distinct groups. If we run out of sort memory, we can always bail out later, preferably with a descriptive error message. It is not as elegant as erring out at parse (or even plan/optimise) time, but the result is /almost/ the same.
Relying on hash aggregation will become essential if we are ever going to implement the other groupings (CUBE, ROLLUP, (), ...), so it would be nice if hash aggregation could also overflow to disk - I suspect that this will still be faster than running an independent scan for each GROUP BY grouping and merging the results. - Hannu

---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]

-- Pip-pip Sailesh http://www.cs.berkeley.edu/~sailesh
Re: [HACKERS] Berkeley and CMU classes adopt/extend PostgreSQL
Hi everyone,

with some frequency. We have the usual 180 students we get every semester (yep: 180!), but this year we've instituted 2 changes:

We're looking at 100 students taking the class here every year. Double cool. I'm just down the road, if Natassa needs a visiting lecturer. Tom

- that's really super-cool! Tom, let's take it offline to schedule a visit. We will be delighted to have you lecture.

Yes. As of CVS tip, we have hash-based grouping but it doesn't spill to disk. Want to ask them to start from CVS tip and fix that little detail? Or fix the various other loose ends that have been mentioned lately? (make it work with DISTINCT, improve the estimation logic, some other things I'm forgetting)

As Joe said, this is what we are doing. We intend to use your todo-list to design projects for future semesters... so all such suggestions are greatly appreciated. Natassa

---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] location of the configuration files
Kevin Brown wrote: Bruce Momjian wrote: The big question is whether PGDATA is still our driving config variable, and PGCONFIG/-C is just an additional option, or whether we are moving in a direction where PGCONFIG/-C is going to be the driving value, and data_dir is going to be read as part of that.

I'm actually leaning towards PGCONFIG + PGDATA. Yeah, it may be a surprise given my previous arguments, but I can't help but think that the advantages you get with PGDATA will also exist for PGCONFIG. My previous arguments for removing PGDATA from postmaster can be dealt with by fixing pg_ctl to use explicit command line directives when invoking postmaster -- no changes to postmaster needed. PGCONFIG would be no different in that regard.

I see your point --- pg_ctl does a PGDATA trick when passed -D:

    -D) shift
        # pass environment into new postmaster
        PGDATA=$1
        export PGDATA

It should pass -D along just as it was given.

Sorry if I seem a bit gung-ho on the administrator point of view, but as a system administrator myself I understand and feel their pain. Making things easy for sysadmins is an important feature of PostgreSQL.

-- Bruce Momjian | http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073

---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [HACKERS] location of the configuration files
Bruce Momjian [EMAIL PROTECTED] writes: I think the big question is whether we want the default to install the configs in a separate directory, pgsql/etc, or just allow the specification of a separate location. Advantages of pgsql/etc are that it is initdb-safe, and easier backups.

I don't see why we don't just let initdb install suggested config files into the new $PGDATA directory, same as it ever did. Then (as long as we don't use relative paths in the config files) people can move them somewhere else if they like, or not if they prefer not to. Adding more mechanism than that just adds complexity without buying much (except the possibility of initdb overwriting your old config files, which is exactly what I thought we wanted to avoid).

The big question is whether PGDATA is still our driving config variable, and PGCONFIG/-C is just an additional option, or whether we are moving in a direction where PGCONFIG/-C is going to be the driving value, and data_dir is going to be read as part of that.

I thought the idea was to allow both approaches. We are not moving in the direction of one or the other; we are giving people a choice of how they want to drive it.

Weren't you just showing how you set environment variables to easily configure stuff? If you use a separate configure dir, isn't PGCONFIG part of that?

I'm just pointing out that there's no backward-compatibility argument for PGCONFIG. It should only be put in if the people who want to use the -C-is-driver approach want it. Kevin clearly doesn't ;-).

regards, tom lane

---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] client_encoding directive is ignored in
Tatsuo Ishii wrote: Tatsuo Ishii [EMAIL PROTECTED] writes:

    + /* Flag that we need to initialize client encoding info */
    + static bool need_to_init_client_encoding = -1;

Surely that should be int, not bool. Oops. Will fix.

    ! if (!PQsendQuery(conn, "begin; select pg_client_encoding(); commit"))

Doesn't this break compatibility with pre-7.2 databases? AFAICT that function was introduced in 7.2. Yes, but there seems to be no other way to solve the problem, and I thought we do not guarantee compatibility between a 7.3 frontend and pre-7.3 servers.

Yep. Tatsuo, you should apply the patch to fix the problem. Shame this didn't make it into 7.3.2. Should we backpatch?

-- Bruce Momjian | http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073
Re: [HACKERS] Linux.conf.au 2003 Report
Is Linux.conf.au the event PostgreSQL should use for coverage in Australia next year?

--- Christopher Kings-Lynne wrote:

Linux.conf.au Report

Linux.conf.au is an international Linux/Open Source event that attracts lots of international speakers. Total conf attendance was around 360, maybe even 400 I think. Gavin Sherry was speaking at this particular conf, and I attended as a hobbyist. PostgreSQL got a reasonable amount of attention, particularly since there were no representatives from other database products there.

Some pics of our PostgreSQL BOF and the Perth Bell Tower: http://www.zip.com.au/~swm/lca2003 (Gavin is the beardy looking dude 3rd from the left :) I'm taking the photo.)

These are the main questions we were asked, or features that were requested:

* Replication, replication, replication! - We told them that there are a few solutions, but none of them are particularly great. Gavin got all sorts of ideas about log shipping.

* IPV6 data types - Apparently some ISPs in some countries have started to bill people for IPV6 bandwidth, and the lack of IPV6 address types is hurting them.

* Collisions in auto-generated names. - The standard table modification tactic (that I also use) of renaming the table to *_old and creating a new one breaks because the primary key of the new table is assigned the same name as the PK of the old, causing CREATE TABLE to fail. This is really annoying. I think that auto-generated names should never collide.

* Problem: a person has a large database with 4 or 5 humungous tables that they aren't interested in backing up. However, they want to back up the rest. - I suggested that if pg_dump could dump individual schemas, then they could move their 'don't backup' tables to another schema, and just dump the other one.

We found out all sorts of interesting places that PostgreSQL is being used: a large Australian Telco, several restaurants in the Perth area, the Debian inventory system and the Katie revision control system.
It is also being evaluated for process control analysis at a steel plant. Maybe we should chase some people for case studies?

Chris Kings-Lynne

-- Bruce Momjian | http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073