Re: [HACKERS] 7.4 logging bug.

2003-11-21 Thread Kurt Roeckx
On Thu, Nov 20, 2003 at 04:08:28PM -0500, Tom Lane wrote:
 Kurt Roeckx [EMAIL PROTECTED] writes:
  I just installed a 7.4 on windows/cygwin.  I restored a dump but
  ran out of disk space during the creation of an index.  In psql I
  saw the ERROR: could not extend relation .
  From that point on it seems to have stopped logging most things.
 
 Do you suppose that the stdio support under cygwin somehow permanently
 closes the stderr output descriptor after the first failure to write?
 That would be unpleasant.  (Are you sure there *is* now space to write?)

There is now space to write, yes.  I had just started to delete some
things at the moment it ran out of disk space.

It's still logging the recycled transaction log file.  Is that
sent to stdout instead of stderr, maybe?


Kurt




Re: [HACKERS] 7.4 logging bug.

2003-11-21 Thread Tom Lane
Kurt Roeckx [EMAIL PROTECTED] writes:
 It's still logging the recycled transaction log file.  Is that
 sent to stdout instead of stderr, maybe?

No, it all goes to stderr.  But that output comes from a different
subprocess.  Not sure why that subprocess would still have working
stderr if others don't ... any ideas anyone?

I'd suggest you restart the postmaster and see if things get better,
except I'm afraid of destroying the evidence.

Does cygwin have any equivalent of strace/ktrace?  It'd be useful
to see whether the write() calls are still being issued or not.

regards, tom lane



Re: [HACKERS] code question: rewriteDefine.c

2003-11-21 Thread Tom Lane
Neil Conway [EMAIL PROTECTED] writes:
 Under what circumstances do we convert a relation to a view? Is this
 functionality exposed to the user?

This is a backwards-compatibility hangover.  pg_dump scripts from
somewhere back in the Dark Ages (6.something) would represent a view
as
CREATE TABLE v (column-list);
CREATE RULE _RETURN AS ON SELECT TO v DO INSTEAD ...;
and the code you are looking at is intended to convert this locution
into a genuine view per the current representation.

I'm not sure how important it is to continue supporting that.  But I'd
not want to break it just because someone thinks the hack is ugly.
It was ugly from day one.

 Furthermore, it seems broken: it checks the pg_class.relhassubclass
 attribute for this relation to see if it has child tables, but this
 is wrong, as relhassubclass only indicates that the relation MAY have
 a subclass, not that it definitely does[1].  It also doesn't drop the
 relation's TOAST table, if any, as the code itself notes.

There could not be any child tables, either current or former, in the
intended application.  There could be a TOAST table, but getting rid of
it would only save some useless entries in pg_class etc, not prevent any
functional problems, so no one bothered.

regards, tom lane



Re: [HACKERS] tsearch2 patch for 7.4.1

2003-11-21 Thread Teodor Sigaev
No problem, a dictionary with support for compounds will be available as a
separate contrib module from our site until 7.5.

Hannu Krosing wrote:
Tom Lane wrote on Thu, 20.11.2003 at 17:18:

Oleg Bartunov [EMAIL PROTECTED] writes:

we have a patch for contrib/tsearch2 we'd like to commit for 7.4.1.
Is it ok ?

We added support for compound words using ispell dictionaries.
It's a rather important feature for agglutinative languages like
German and Norwegian (tested). This work was sponsored by ABC Startsiden
(www.startsiden.no).
This strikes me as a feature addition, and therefore not appropriate for
the 7.4 branch.  You should only commit bug fixes into stable branches,
not feature additions.


Is this so even for contrib?

I seem to remember that there have been new features popping up in
contrib in stable branches earlier.
But if the general direction is to stabilise contrib as well, what about
committing it as a separate patch file 
contrib/tsearch2/patches/compond-words-patch.1, so that those people
(or German/Norwegian Linux distributors) who consider the lack of support
for compound words a bug can apply it themselves before building?

---
Hannu
--
Teodor Sigaev  E-mail: [EMAIL PROTECTED]


Re: [HACKERS] tsearch2 patch for 7.4.1

2003-11-21 Thread Oleg Bartunov
On Fri, 21 Nov 2003, Teodor Sigaev wrote:

 No problem, a dictionary with support for compounds will be available as a
 separate contrib module from our site until 7.5.

Hmm, I think it's better not to introduce another dictionary, which would
require additional effort to configure tsearch2, but to maintain the whole
module with the new ispell dictionary built in.

Oleg



 Hannu Krosing wrote:
  Tom Lane wrote on Thu, 20.11.2003 at 17:18:
 
 Oleg Bartunov [EMAIL PROTECTED] writes:
 
 we have a patch for contrib/tsearch2 we'd like to commit for 7.4.1.
 Is it ok ?
 
 We added support for compound words using ispell dictionaries.
  It's a rather important feature for agglutinative languages like
  German and Norwegian (tested). This work was sponsored by ABC Startsiden
 (www.startsiden.no).
 
 This strikes me as a feature addition, and therefore not appropriate for
 the 7.4 branch.  You should only commit bug fixes into stable branches,
 not feature additions.
 
 
  Is this so even for contrib?
 
  I seem to remember that there have been new features popping up in
  contrib in stable branches earlier.
 
  But if the general direction is to stabilise contrib as well, what about
  committing it as a separate patch file
  contrib/tsearch2/patches/compond-words-patch.1, so that those people
  (or German/Norwegian Linux distributors) who consider the lack of support
  for compound words a bug can apply it themselves before building?
 
  ---
  Hannu
 



Regards,
Oleg
_
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83



Re: [HACKERS] tsearch2 patch for 7.4.1

2003-11-21 Thread Andreas Joseph Krogh

On Friday 21 November 2003 09:42, Oleg Bartunov wrote:
 On Fri, 21 Nov 2003, Teodor Sigaev wrote:
  No problem, a dictionary with support for compounds will be available as a
  separate contrib module from our site until 7.5.

 Hmm, I think it's better not to introduce another dictionary, which would
 require additional effort to configure tsearch2, but to maintain the whole
 module with the new ispell dictionary built in.

 Oleg

I would very much like to use this new feature, and understand if it cannot be
included in tsearch2 until 7.5. The easiest way for me as a user would be if,
like Oleg suggests, it were maintained as a whole module with the ispell
dictionary built in and made available for download from the tsearch2 web
site. This should be mentioned in future (pg-7.4.x) READMEs.

--
Andreas Joseph Krogh [EMAIL PROTECTED]
Managing Director, Senior Software Developer
OfficeNet AS

- Writing software is more fun than working.

gpg public_key: http://dev.officenet.no/~andreak/public_key.asc



Re: [HACKERS] tsearch2 patch for 7.4.1

2003-11-21 Thread Oleg Bartunov
On Fri, 21 Nov 2003, Andreas Joseph Krogh wrote:


 On Friday 21 November 2003 09:42, Oleg Bartunov wrote:
  On Fri, 21 Nov 2003, Teodor Sigaev wrote:
   No problem, a dictionary with support for compounds will be available as a
   separate contrib module from our site until 7.5.
 
  Hmm, I think it's better not to introduce another dictionary, which would
  require additional effort to configure tsearch2, but to maintain the whole
  module with the new ispell dictionary built in.
 
  Oleg

 I would very much like to use this new feature, and understand if it cannot be
 included in tsearch2 until 7.5. The easiest way for me as a user would be if,
 like Oleg suggests, it were maintained as a whole module with the ispell
 dictionary built in and made available for download from the tsearch2 web
 site. This should be mentioned in future (pg-7.4.x) READMEs.


OK, we decided to maintain the patch to the tsearch2 module on our web site.
This will require a little effort to compile, but is easy to maintain.

Oleg

 --
 Andreas Joseph Krogh [EMAIL PROTECTED]
 Managing Director, Senior Software Developer
 OfficeNet AS

 - Writing software is more fun than working.

 gpg public_key: http://dev.officenet.no/~andreak/public_key.asc


Regards,
Oleg
_
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83



Re: [HACKERS] logical column position

2003-11-21 Thread Andreas Pflug
Tom Lane wrote:

It's completely fallacious to imagine that we could make this change be
transparent to external applications.  To take two examples:
1. How many places do you think know that pg_attribute.attnum links to
pg_attrdef.adnum?  pg_dump, psql, and the JDBC driver all appear to
know that, in a quick search of the CVS tree; I haven't even bothered to
look at pgadmin and the other apps that are likely to have such
dependencies.
2. How about linking pg_attribute.attnum to entries in pg_index.indkey?
Lots of apps know about that too.
Unless you are going to change the meanings of pg_index.indkey and
pg_attrdef.adnum, you can't simply redefine attnum as a logical column
position.  And if you do make such a change you will break code
elsewhere.
If we add a *new* column attlogpos, without changing the semantics
of attnum, then I think we have a fighting chance of making this work
without an undue amount of effort.  I see no prospect that we can
change the meaning of attnum without breaking things far and wide.
 

I don't quite understand your argument.
Currently, attnum is used
1) to determine position (the concern),
2) as part of the unique identifier, as used by indexes, FKs, etc., and
3) as a pointer for data retrieval.
If only the retrieval functions used attstoragenum, or whatever you'd
call it, all other references to attnum could remain untouched. Actual
physical reordering would be hidden almost completely. This is a bit
like abstracting a primary key from the row's physical storage location.

Regards,
Andreas





Re: [HACKERS] Release cycle length

2003-11-21 Thread Alvaro Herrera
On Fri, Nov 21, 2003 at 09:38:50AM +0800, Christopher Kings-Lynne wrote:
 Yeah, I think the main issue in all this is that for real production
 sites, upgrading Postgres across major releases is *painful*.  We have
 to find a solution to that before it makes sense to speed up the
 major-release cycle.
 
 Well, I think one of the simplest is to do a topological sort of objects 
  in pg_dump (between object classes that need it), AND regression 
 testing for pg_dump :)

One of the most complex would be to avoid the need for pg_dump for
upgrades ...

-- 
Alvaro Herrera ([EMAIL PROTECTED])
I call it GNU/Linux. Except the GNU/ is silent. (Ben Reiter)



Re: [HACKERS] [PERFORM] More detail on settings for pgavd?

2003-11-21 Thread Robert Treat
On Thu, 2003-11-20 at 19:40, Matthew T. O'Connor wrote:
 I'm open to discussion on changing the defaults.  Perhaps it would
 be better to use some non-linear (perhaps logarithmic) scaling factor,
 so that you wound up with something roughly like this:
 
 #tuples   activity% for vacuum
 1k        100%
 10k        70%
 100k       45%
 1M         20%
 10M        10%
 100M        8%
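
A sketch of what such a logarithmic scale could look like, linearly
interpolating the table above (purely illustrative; the function and its
breakpoints are my own, not pg_autovacuum code):

    #include <math.h>

    /* Activity% thresholds from the table above, on a log10(tuples) scale:
     * 1k -> 100, 10k -> 70, 100k -> 45, 1M -> 20, 10M -> 10, 100M -> 8.
     */
    static double vacuum_activity_pct(double ntuples)
    {
        static const double logn[] = {3, 4, 5, 6, 7, 8};
        static const double pct[]  = {100, 70, 45, 20, 10, 8};
        double x = log10(ntuples);
        int i;

        if (x <= logn[0])
            return pct[0];
        if (x >= logn[5])
            return pct[5];
        for (i = 0; x > logn[i + 1]; i++)
            ;
        /* interpolate between adjacent breakpoints (one decade apart) */
        return pct[i] + (pct[i + 1] - pct[i]) * (x - logn[i]);
    }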
 


Just thinking out loud here, so disregard if you think it's chaff, but...
if we had a system table pg_avd_defaults that held what we generally
consider the best default percentages based on reltuples/pages, and
added a column to pg_class (there could be some place better, but...) which
could hold an overriding percentage, you could then have a column added
to pg_stat_all_tables called vacuum_percentage, which would be a
coalesce of the override percentage and the default percentage based on
rel_tuples (or rel_pages).  This would give autovacuum a place to look
for each table as to when it should vacuum, and gives administrators the
option to tweak it on a per-table basis if they find they need a
specific table to vacuum at a different rate than the standard.

Robert Treat
-- 
Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL




Re: [HACKERS] logical column position

2003-11-21 Thread Robert Treat
On Thu, 2003-11-20 at 23:27, Tom Lane wrote:
 Neil Conway [EMAIL PROTECTED] writes:
  Actually, I deliberately chose attpos rather than attlognum (which is
  what some people had been calling this feature earlier). My reasoning
  was that the logical number is really a nonsensical idea: we just
  invented it on the spot.
 
 True ...
 
  In contrast, a position is a fairly natural
  thing for an attribute to have -- it's a notion with some counterpart
  in the real world.
 
 But position could at least as logically be considered to mean the
 physical position in the tuple.  I still say that these names are ripe
 for confusion.
 
 I don't have a better choice of name offhand, but if we spend 1% of the
 time already spent arguing about these issues on finding a better name,
 I'm sure we can think of one ;-)
 

Seems merging the two would work... attlogpos, the attribute's logical
position.

Robert Treat
-- 
Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL




Re: [HACKERS] logical column position

2003-11-21 Thread Tom Lane
Andreas Pflug [EMAIL PROTECTED] writes:
 I don't quite understand your argumentation.

My point is that to change attnum into a logical position without
breaking client apps (which is the ostensible reason for doing it
that way), we would need to redefine all system catalog entries that
reference columns by attnum so that they also store logical rather than
physical position.  That has a number of serious problems, one big one
being the difficulty of updating them all correctly during a column
renumbering operation.  More, it turns what would otherwise be a
relatively localized patch into a massive and bug-prone backend
modification.

I think it is better to consider attnum as sort of a mini-OID: any one
column has a uniquely assigned attnum that will never change and can
be relied on to identify that column.  This is essentially how it is
being used now (remember attnum is part of the PK for pg_attribute)
and the fact that it is also the physical position is really rather
incidental as far as the system catalogs are concerned.

You're quite right that attnum is serving three purposes, but that
doesn't mean that we can choose at random which purpose(s) to decouple.
Abandoning the assumption that attnum is a permanent identifier would
break a lot of things --- probably not only in the backend, either.

regards, tom lane



[HACKERS] Transaction Rollback problen (3.0 Protocol)

2003-11-21 Thread Carlos Guzmán Álvarez
Hello:

I'm having a little problem with my .NET data provider for PostgreSQL 7.4.

I'm executing a little sample that does:

1. Connect to the server.
2. Start transaction.
3. Execution of an invalid SQL command.
4. Catch exception and rollback transaction.
After sending the rollback transaction command I'm not receiving any
response from the server.  However, if the SQL command is valid,
everything runs fine.  Any idea what the problem could be?

Thanks in advance.



--
Best regards
Carlos Guzmán Álvarez
Vigo-Spain




Re: [HACKERS] logical column position

2003-11-21 Thread Andreas Pflug
Tom Lane wrote:

Andreas Pflug [EMAIL PROTECTED] writes:
 

I don't quite understand your argument.
   

My point is that to change attnum into a logical position without
breaking client apps (which is the ostensible reason for doing it
that way), we would need to redefine all system catalog entries that
reference columns by attnum so that they also store logical rather than
physical position.  That has a number of serious problems, one big one
being the difficulty of updating them all correctly during a column
renumbering operation.  More, it turns what would otherwise be a
relatively localized patch into a massive and bug-prone backend
modification.
I think it is better to consider attnum as sort of a mini-OID: any one
column has a uniquely assigned attnum that will never change and can
be relied on to identify that column.  This is essentially how it is
being used now (remember attnum is part of the PK for pg_attribute)
and the fact that it is also the physical position is really rather
incidental as far as the system catalogs are concerned.
 

I agree with considering attrelid/attnum as a kind of OID, but a relation's
pg_class.oid won't change at ALTER TABLE either; I'd expect the same
from ALTER COLUMN.

You're quite right that attnum is serving three purposes, but that
doesn't mean that we can choose at random which purpose(s) to decouple.
Abandoning the assumption that attnum is a permanent identifier would
break a lot of things --- probably not only in the backend, either.
 

Maybe my proposal wasn't clear enough:
Just as an index references a pg_class entry by its OID, not by some value
identifying its physical storage, all objects might continue
referencing columns by attnum. Only tuple-handling functions like
heap_getattr and heap_formtuple need to know how to extract a Datum by
its attnum from a HeapTuple, or how to construct a HeapTuple correctly. If
reshuffling columns is done inside these functions, it would be
transparent to the rest of the backend and the clients. Hopefully, there
are not too many such functions, or fancy modules bypassing them...

Regards,
Andreas




Re: [HACKERS] logical column position

2003-11-21 Thread Tom Lane
Andreas Pflug [EMAIL PROTECTED] writes:
 Maybe my proposal wasn't clear enough:
 Just as an index references a pg_class entry by it's OID, not some value 
 identifying it's physical storage, all objects might continue 
 referencing columns by attnum.

That's exactly the same thing I am saying.  Your mistake is to assume
that this function can be combined with identification of a (changeable)
logical column position.  It can't.  Changeability and immutability are
just not compatible requirements.

regards, tom lane



Re: [HACKERS] Transaction Rollback problen (3.0 Protocol)

2003-11-21 Thread Tom Lane
Carlos Guzmán Álvarez [EMAIL PROTECTED] writes:
 After sending the rollback transaction command I'm not receiving any
 response from the server.  However, if the SQL command is valid,
 everything runs fine.  Any idea what the problem could be?

Are you using the extended query protocol?  If so you probably have
forgotten the need for a Sync message.  After an error, the server
discards frontend messages until it sees a Sync.  This is needed so that
you can fire off several component messages in an extended-query
operation without waiting to see whether each one succeeds.  You put the
Sync at the point where you want to wait for results.
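
For illustration, the message flow for one extended-query cycle might be
sketched like this (send_msg() is a hypothetical framing helper, not a real
libpq call; the type bytes are the v3 protocol's):

    /* Parse/Bind/Execute can be pipelined without waiting for replies. */
    send_msg(conn, 'P', parse_payload);    /* Parse: plan the statement */
    send_msg(conn, 'B', bind_payload);     /* Bind: supply parameters   */
    send_msg(conn, 'E', execute_payload);  /* Execute: run the portal   */

    /* Sync marks where we stop and collect results.  If any earlier
     * message failed, the server discards input up to this Sync, so a
     * missing Sync leaves the client waiting forever. */
    send_msg(conn, 'S', NULL);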

regards, tom lane



[HACKERS] Anyone working on pg_dump dependency ordering?

2003-11-21 Thread Tom Lane
I'm thinking about attacking pg_dump's lack of knowledge about using
dependencies to determine a safe dump order.  But if there's someone
out there actively working on the problem, I don't want to tread on
your toes ... anyone?

Also, if you've got uncommitted patches for pg_dump, please let me know.

regards, tom lane



Re: [HACKERS] logical column position

2003-11-21 Thread Andreas Pflug
Tom Lane wrote:

Andreas Pflug [EMAIL PROTECTED] writes:
 

Maybe my proposal wasn't clear enough:
Just as an index references a pg_class entry by its OID, not by some value
identifying its physical storage, all objects might continue
referencing columns by attnum.
   

That's exactly the same thing I am saying.  Your mistake is to assume
that this function can be combined with identification of a (changeable)
logical column position.  It can't.  Changeability and immutability are
just not compatible requirements.
 

In the mind of a programmer, an ALTER COLUMN doesn't create a new column,
but merely changes some attributes of an existing column. In this sense,
changeability and immutability are not in conflict.

Digging deeper:

TupleDesc contains an array of physical attribute descriptions, and to access
a column description attnum is used as an index into that array (taken from
fastgetattr):

    return fetchatt(tupleDesc->attrs[attnum - 1], ...)

The physical location can easily be reordered if there's an additional
array to translate attnum into the array index:

    return fetchatt(tupleDesc->attrs[tupleDesc->attrpos[attnum - 1]], ...)

Of course, reordering (i.e. changing the attrpos array) may only be
performed as long as the column isn't referenced.
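
A minimal self-contained sketch of that indirection (the names are mine,
not the backend's):

    #define MAX_ATTS 8

    /* attpos[] maps the immutable attnum (1-based) to the slot where the
     * column is physically stored; values[] is in physical order. */
    typedef struct ToyTupleDesc
    {
        int  attpos[MAX_ATTS];
        long values[MAX_ATTS];
    } ToyTupleDesc;

    static long toy_getattr(const ToyTupleDesc *t, int attnum)
    {
        /* callers keep using the stable attnum; only this lookup
         * knows about the physical layout */
        return t->values[t->attpos[attnum - 1]];
    }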

Regards,
Andreas




Re: [HACKERS] 7.4 logging bug.

2003-11-21 Thread Kurt Roeckx
On Fri, Nov 21, 2003 at 02:49:28AM -0500, Tom Lane wrote:
 Kurt Roeckx [EMAIL PROTECTED] writes:
  It's still logging the recycled transaction log file.  Is that
  sent to stdout instead of stderr, maybe?
 
 No, it all goes to stderr.  But that output comes from a different
 subprocess.  Not sure why that subprocess would still have working
 stderr if others don't ... any ideas anyone?

It seems quitting and restarting psql fixed it.  So I guess it's
the backend that had the problem, i.e. it was a different
subprocess.

 Does cygwin have any equivalent of strace/ktrace?  It'd be useful
 to see whether the write() calls are still being issued or not.

It has a strace, but when I use it I get a nice exception.  I
never really got either strace or gdb to work properly under
cygwin when attaching to an existing process.


Kurt




Re: [HACKERS] Sponsoring enterprise features

2003-11-21 Thread James Rogers
On Thu, 2003-11-20 at 22:20, Tom Lane wrote:
 It should be noted that "because Oracle does it that way" is a
 guaranteed nonstarter as a rationale for any Postgres feature proposal.


A method of doing something is not a feature; making something
possible that couldn't be done before is a feature.  I don't really
care how Oracle does something, though I am cognizant of *why* Oracle
does something.  s/Oracle/DB2/, and little changes.


 There are enough differences between Postgres and Oracle that you will
 need to do significant investigation before assuming that an Oracle-
 based feature design is appropriate for Postgres.  Aside from technical
 differences, we have fundamentally different priorities --- one of which
 is simplicity of administration.  You'll get no buyin on proposals that
 tend to create Oracle-like difficulties of installation and tuning.


I'm not sure what Oracle has to do with any of this.  If I wanted to use
Oracle, I would buy Oracle.  The thing is, I'm intimately familiar with
Oracle and there are a lot of things I despise about Oracle as a
consequence of this familiarity.  The features I'm talking about can be
added to any reasonable database engine, and are generically supported
features (or enterprise add-ons) in virtually all large commercial
databases.

As I stated previously, I/we are interested in adding features for
managing very large tables and working sets, and making Postgres scale
in general for these kinds of databases (currently, it does not).  These
kinds of features will be important for enterprise users, particularly
ones interested in migrating from Oracle/DB2/SQLServer/etc, and would be
invisible to people that don't need them.  This is a matter of adding
important functionality that can be supported by any reasonable database
engine.  In a nutshell, the features on my short list are all about heap
management (e.g. partitioning).  This is really important when databases
reach a certain size, but something for which Postgres has almost no
support.  

From a large-scale enterprise database standpoint, heap management is
almost as important a capability as replication.  Replication is being
aggressively worked on, heap management is not and so we are interested
in making sure this part gets developed.  When PostgreSQL has this,
there will be little reason for anyone to use the big commercial
database-du-jour.  I don't care how it's implemented specifically, just
as long as it is in there, and there is no technical reason that it
couldn't be implemented per previous discussions.

I've gotten the green light (and many responses from people interested
in doing it) to start writing up RFQs for specific features, which I
will post to the pg-hackers list.  It is all stuff previously determined
to be doable within the current PostgreSQL framework, and just requiring
some work that my company is willing to help pay for.

Cheers,

-James Rogers
 [EMAIL PROTECTED]






Re: [HACKERS] calling plpgsql from c

2003-11-21 Thread Tom Lane
Max Jacob [EMAIL PROTECTED] writes:
 I'm trying to call plpgsql functions from c functions directly through 
 the Oid, but i have a problem: it seems that the plpgsql interpreter 
 calls SPI_connect and fails even if the caller has already 
 spi-connected.

This is a safety check.  If you are connected to SPI, you need to call
SPI_push() and SPI_pop() around any operation that might involve
recursive use of SPI.  That helps delimit your calls versus their
calls versus no man's land.
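
A minimal sketch of the calling pattern, assuming fn_oid and the argument
Datum have been set up elsewhere:

    /* We are already SPI-connected at this point. */
    SPI_push();                             /* leave our SPI context       */
    result = OidFunctionCall1(fn_oid, arg); /* plpgsql can SPI_connect now */
    SPI_pop();                              /* back in our own SPI context */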

It does seem that this is quite undocumented though.  Jan?

regards, tom lane



Re: [HACKERS] Transaction Rollback problen (3.0 Protocol)

2003-11-21 Thread Carlos Guzmán Álvarez
Hello:

Are you using the extended query protocol?  If so you probably have
forgotten the need for a Sync message.  
You are right, thanks very much, it's working well now.





--
Best regards
Carlos Guzmán Álvarez
Vigo-España


Re: [HACKERS] Release cycle length

2003-11-21 Thread Jan Wieck
Alvaro Herrera wrote:

On Fri, Nov 21, 2003 at 09:38:50AM +0800, Christopher Kings-Lynne wrote:
Yeah, I think the main issue in all this is that for real production
sites, upgrading Postgres across major releases is *painful*.  We have
to find a solution to that before it makes sense to speed up the
major-release cycle.
Well, I think one of the simplest is to do a topological sort of objects 
 in pg_dump (between object classes that need it), AND regression 
testing for pg_dump :)
One of the most complex would be to avoid the need of pg_dump for
upgrades ...
We don't need a simple way, we need a way to create some sort of catalog 
diff and a safe way to apply that to an existing installation during 
the upgrade.

I think that a shutdown postmaster, a standalone backend used to check
that no conflicts exist in any DB, and then the new backend in
bootstrap mode to apply the changes, could be an idea to think about. It
would still require some downtime, but nobody can avoid that when
replacing the postgres binaries anyway, so that's not a real issue. As
long as it eliminates dump/initdb/reload it will be acceptable.

Jan

--
#==#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.  #
#== [EMAIL PROTECTED] #


Re: [HACKERS] initdb segfaults - latest cvs

2003-11-21 Thread Tom Lane
strk [EMAIL PROTECTED] writes:
 It seems that the build system is missing something
 (make distclean made it work)

If you aren't using configure --enable-depend, you should count on doing
at least make clean, preferably make distclean anytime you do a CVS
update.  The default behavior is not to spend any cycles on tracking
header-file dependencies.

regards, tom lane



Re: [HACKERS] Build farm

2003-11-21 Thread Tom Lane
Peter Eisentraut [EMAIL PROTECTED] writes:
 Andrew Dunstan writes:
 Maybe it wouldn't be of great value to PostgreSQL. And maybe it would. I
 have an open mind about it. I don't think incompleteness is an argument
 against it, though.

 If you want to do it, by all means go for it.  I'm sure it would give
 everyone a fuzzy feeling to see the green lights everywhere.  But
 realistically, don't expect any significant practical benefits, such as
 cutting beta time by 10%.

I think the main value of a build farm is that we'd get nearly immediate
feedback about the majority of simple porting problems.  Your previous
arguments that it wouldn't smoke everything out are certainly valid ---
but we wouldn't abandon the regression tests just because they don't
find everything.  Immediate feedback is good because a patch can be
fixed while it's still fresh in the author's mind.

I'm for it ...

regards, tom lane



Re: [HACKERS] logical column position

2003-11-21 Thread Tom Lane
Andreas Pflug [EMAIL PROTECTED] writes:
 To put it differently: a ALTER COLUMN command may never-ever change the 
 identifier of the column, i.e. attrelid/attnum.

If the ALTER is changing the column type, it's not really the same
column anymore; I see nothing wrong with assigning a new attnum in that
scenario.  It's not like you can simply change the type and not go visit
the references in such a case.

regards, tom lane



Re: [HACKERS] logical column position

2003-11-21 Thread Andreas Pflug
Tom Lane wrote:

If the ALTER is changing the column type, it's not really the same
column anymore;
That argument doesn't hold: "If the ALTER is changing the number of columns,
it's not really the same table anymore" is just as true as your statement.
Still, pg_class.oid remains the same for ADD and DROP COLUMN.

I see nothing wrong with assigning a new attnum in that
scenario.  It's not like you can simply change the type and not go visit
the references in such a case.
 

But this fix is about automatically updating references as well, making
the ALTER COLUMN appear to be a low-impact change to the user (which obviously
isn't true, unless my proposed shortcut for binary-compatible type
changes is implemented).

When dropping and recreating an object, nobody would expect to get the 
same identifier. When altering, I *do* expect the identifier to remain 
the same.

Regards,
Andreas




Re: [HACKERS] Release cycle length

2003-11-21 Thread Tom Lane
Jan Wieck [EMAIL PROTECTED] writes:
 Alvaro Herrera wrote:
 One of the most complex would be to avoid the need of pg_dump for
 upgrades ...

 We don't need a simple way, we need a way to create some sort of catalog 
 diff and a safe way to apply that to an existing installation during 
 the upgrade.

I still think that pg_upgrade is the right idea: load a schema dump from
the old database into the new one, then transfer the user data files and
indexes via cheating (doubly linking, if possible).  Obviously there is
a lot of work still to make this happen reliably, but we have seen
proof-of-concept some while ago, whereas catalog diffs are pie in the
sky IMHO.  (You could not use either the old postmaster version or the
new version to apply such a diff...)

A big advantage of the pg_upgrade concept in my mind is that if it fails
partway through, you need have made no changes to the original
installation.  Any mid-course problem with an in-place-diff approach
leaves you completely screwed :-(

regards, tom lane



Re: [HACKERS] Build farm

2003-11-21 Thread Andrew Dunstan
Tom Lane wrote:

I think the main value of a build farm is that we'd get nearly immediate
feedback about the majority of simple porting problems.  Your previous
arguments that it wouldn't smoke everything out are certainly valid ---
but we wouldn't abandon the regression tests just because they don't
find everything.  Immediate feedback is good because a patch can be
fixed while it's still fresh in the author's mind.
Yes, I seem to recall seeing several instances of things like "you mean
foonix version 97 1/2 has a bad frobnitz.h?" over the last 6 months.
Having that caught early is exactly the advantage, I believe.

I'm for it ...
 

I'm working on it :-)

Regarding "make distcheck", which Peter suggested I use: unless I'm
mistaken it carefully does its own configure, thus ignoring the
configure options set in the original directory. Perhaps we need either
to have the distcheck target pick up all the --with/--without and
--enable/--disable options, or to have a similar target that does that.

Thoughts?

cheers

andrew





Re: [HACKERS] Release cycle length

2003-11-21 Thread Oli Sennhauser
Hello hackers

Sorry if I am preaching to the gurus...

There is a database which has a concept called Transportable
Tablespaces (TTS). Would it not be a very easy and fast solution to
just do this with the tables, indexes, and all non-catalog-related files?
- You create a new db cluster (e.g. 8.0).
- Generate a TTS export script.
- Shut down the (old) db cluster (files should be consistent now; possibly
do something with the log files first).
- Move the files (possibly not needed) and
- plug them into the new db cluster (via the export script).

Expected downtime (without moving data files): 5-10 minutes.

Regards Oli

---

Oli Sennhauser
Database-Engineer (Oracle  PostgreSQL)
Rebenweg 6
CH - 8610 Uster / Switzerland
Phone (+41) 1 940 24 82 or Mobile (+41) 79 450 49 14
e-Mail [EMAIL PROTECTED]
Website http://mypage.bluewin.ch/shinguz/PostgreSQL/
Secure (signed/encrypted) e-Mail with a Free Personal SwissSign ID: http://www.swisssign.ch

Import the SwissSign Root Certificate: http://swisssign.net/cgi-bin/trust/import





[HACKERS] ObjectWeb/Clustered JDBC

2003-11-21 Thread Peter Eisentraut
I was at the ObjectWeb Conference today; ObjectWeb
(http://www.objectweb.org) being a consortium that has amassed quite an
impressive array of open-source, Java-based middleware under their
umbrella, including for instance our old friend Enhydra.  And they
regularly kept mentioning PostgreSQL in their presentations.

To those that are interested in distributed transactions/two-phase commit,
I recommend taking a look at Clustered JDBC
(http://c-jdbc.objectweb.org/).  While this is not exactly the same thing,
it looks to be a pretty neat solution for a similar class of applications.
In particular, it provides redundancy, load balancing, caching, and even
database independence.

-- 
Peter Eisentraut   [EMAIL PROTECTED]




Re: [HACKERS] conversion dumping patch

2003-11-21 Thread Tom Lane
Christopher Kings-Lynne [EMAIL PROTECTED] writes:
 When you get around to it, can you commit the patch I submitted that 
 dumps conversions in pg_dump.  I need that in to complete my COMMENT ON 
 patch.

Just for the record, this is committed as part of the COMMENT ON patch.

regards, tom lane



Re: [HACKERS] [PERFORM] More detail on settings for pgavd?

2003-11-21 Thread Greg Stark
Tom Lane [EMAIL PROTECTED] writes:

 Josh Berkus [EMAIL PROTECTED] writes:
  BTW, do we have any provisions to avoid overlapping vacuums?  That is, to 
  prevent a second vacuum on a table if an earlier one is still running?
 
 Yes, VACUUM takes a lock that prevents another VACUUM on the same table.

The second vacuum waits for the lock to become available. If the situation got
really bad there could end up being a growing queue of vacuums waiting.

I'm not sure how likely this is as the subsequent vacuums appear to finish
quite quickly though. But then the largest table I have to play with fits
entirely in memory.

-- 
greg




Re: [HACKERS] Anyone working on pg_dump dependency ordering?

2003-11-21 Thread Christopher Kings-Lynne
I'm thinking about attacking pg_dump's lack of knowledge about using
dependencies to determine a safe dump order.  But if there's someone
out there actively working on the problem, I don't want to tread on
your toes ... anyone?
I've done a whole lot of _thinking_, but basically no _doing_, so go 
right ahead :)

I may as well let you know my thoughts:

There are two levels (sort of) of dependency.  The first is that whole
classes of objects can be dependent on whole other classes, e.g.
databases depend on users, or ALL FKs can be dumped after ALL tables,
etc.  It would make the dump more readable if you dumped those definite
dependencies in that order, rather than shuffling everything up.

The second level of dependency is when a bunch of object types can 
depend on each other.  The current solution for that is to sort by OID, 
but this fails when it is possible to add a dependency to an object 
after it has been created.

e.g.:

- Adding a column (with a type) to a table
- All the CREATE OR REPLACE commands
- etc.

Hence, a full db-wide topological sort might not be necessary.

Lastly, I presume it's possible to create a system of circular
dependencies (e.g. CREATE OR REPLACE VIEW), which really cannot be solved
without a system of 'shells', similar to that needed to dump types and
their I/O functions.

Views seem to be by far the nastiest object.  They can be dependent on 
almost everything in the database.
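
To make the first level concrete, a bare-bones Kahn-style topological sort
over dump objects could look like this (DumpObj and its fields are
hypothetical, not pg_dump's actual structures; whatever is left unemitted
at the end belongs to a cycle and would need the 'shell' treatment):

    #include <stdlib.h>

    typedef struct DumpObj
    {
        const char *name;
        int   ndeps;         /* dependencies not yet satisfied     */
        int  *dependents;    /* indexes of objects depending on us */
        int   ndependents;
    } DumpObj;

    /* Returns how many objects were emitted into order[]; a result
     * smaller than n means circular dependencies remain. */
    static int dump_order(DumpObj *objs, int n, int *order)
    {
        int *queue = malloc(n * sizeof(int));
        int qhead = 0, qtail = 0, emitted = 0, i, j;

        for (i = 0; i < n; i++)
            if (objs[i].ndeps == 0)
                queue[qtail++] = i;

        while (qhead < qtail)
        {
            i = queue[qhead++];
            order[emitted++] = i;
            for (j = 0; j < objs[i].ndependents; j++)
                if (--objs[objs[i].dependents[j]].ndeps == 0)
                    queue[qtail++] = objs[i].dependents[j];
        }
        free(queue);
        return emitted;
    }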

Also, if you've got uncommitted patches for pg_dump, please let me know.
Yes, my 'COMMENT ON' mega patch in the queue contains dumping of 
conversions and comments on a bunch of objects.

BTW, if you commit that patch, you might want to change my comment-on-type
patch to put the quotes around "any", and change the results file
appropriately.  I noticed I accidentally included that in the patch, and
was about to mention it.

Chris





Re: [HACKERS] [PERFORM] More detail on settings for pgavd?

2003-11-21 Thread Matthew T. O'Connor
Robert Treat wrote:

Just thinking out loud here, so disregard if you think its chaff but...
if we had a system table pg_avd_defaults 

[snip]

As long as pg_autovacuum remains a contrib module, I don't think any
changes to the system catalogs will be made.  If pg_autovacuum is
deemed ready to move out of contrib, then we can talk about the above.



Re: [HACKERS] [PERFORM] More detail on settings for pgavd?

2003-11-21 Thread Matthew T. O'Connor
Josh Berkus wrote:

Matthew,
 

True, but I think it would be one hour once, rather than 30 minutes 4
times.
   

Well, generally it would be about 6-8 times at 2-4 minutes each.
 

Are you saying that you can vacuum a 1 million row table in 2-4 
minutes?  While a vacuum of the same table with an additional 1 million 
dead tuples would take an hour?

This is one of the things I had hoped to add to pg_autovacuum, but never
got to.  In addition to just the information from the stats collector on
inserts updates and deletes, pg_autovacuum should also look at the FSM,
and make decisions based on it.  Anyone looking for a project?
   

Hmmm ... I think that's the wrong approach.  Once your database is populated,
it's very easy to determine how to set the FSM for a given pg_avd level.  If
you're vacuuming after 20% updates, for example, just set fsm_pages to 20% of
the total database pages plus growth and safety margins.
 

Ok.

I'd be really reluctant to base pg_avd frequency on the FSM settings instead.
What if the user loads 8GB of data but leaves fsm_pages at the default of
10,000?  You can't do much with that; you'd have to vacuum if even 1% of the
data changed.

Ok, but as you said above it's very easy to set the FSM once you know 
your db size.

The other problem is that calculating data pages from a count of 
updates+deletes would require pg_avd to keep more statistics and do more math 
for every table.  Do we want to do this?
 

I would think the math is simple enough to not be a big problem.  Also, 
I did not recommend looking blindly at the FSM as our guide, but rather
consulting it as another source of information as to when it would be
useful to vacuum.  I don't have a good plan as to how to incorporate
this data, but to a large extent the FSM already tracks table activity
and gives us the most accurate answer about storage growth (short of
using something like contrib/pgstattuple, which takes nearly the same
amount of time as an actual vacuum).

But I can't imagine that 2% makes any difference on a large table.  In
fact I would think that 10-15% would hardly be noticable, beyond that
I'm not sure.
   

I've seen performance lag at 10% of records, especially in tables where both 
update and select activity focus on one subset of the table (calendar tables, 
for example).
 

Ok.

Valid points, and again I think this points to the fact that
pg_autovacuum needs to be more configurable.  Being able to set
different thresholds for different tables will help considerably.  In
fact, you may find that some tables should have a vac threshold much
larger than the analyze threshold, while other tables might want the
opposite.
   

Sure.  Though I think we can make the present configuration work with a little 
adjustment of the numbers.   I'll have a chance to test on production 
databases soon.
 

I look forward to hearing results from your testing.

I would be surprized if you can notice the difference between a vacuum
analyze and a vacuum, especially on large tables.
   

It's substantial for tables with high statistics settings.   A 1,000,000 row 
table with 5 columns set to statistics=250 can take 3 minutes to analyze on a 
medium-grade server.
 

In my testing, I never changed the default statistics settings.

I think you need two separate schedules.  There are lots of times where
a vacuum doesn't help, and an analyze is all that is needed
   

Agreed.  And I've just talked to a client who may want to use pg_avd's ANALYZE 
scheduling but not use vacuum at all.   BTW, I think we should have a setting 
for this; for example, if -V is -1, don't vacuum.
 

That would be nice.  Easy to add, and something I never thought of.

I'm open to discussion on changing the defaults.  Perhaps it would
be better to use some non-linear (perhaps logarithmic) scaling factor.
   

That would be cool, too.  Though a count of data pages would be a better
scale than a count of rows, and equally obtainable from pg_class.
 

But we track tuples because we can compare against the count given by
the stats system.  I don't know of a way (other than looking at the FSM
or contrib/pgstattuple) to see how many dead pages exist.





Re: [HACKERS] [PERFORM] More detail on settings for pgavd?

2003-11-21 Thread Shridhar Daithankar
Matthew T. O'Connor wrote:

But we track tuples because we can compare against the count given by 
the stats system.  I don't know of a way (other than looking at the FSM, 
or contrib/pgstattuple ) to see how many dead pages exist.
I think making pg_autovacuum dependent on pgstattuple is a very good idea.

It might also be a good idea to extend pgstattuple to return pages that are
excessively contaminated and clean them ASAP. Step by step, that gets closer
to a daemonized vacuum.

 Shridhar



Re: [HACKERS] [PERFORM] More detail on settings for pgavd?

2003-11-21 Thread Josh Berkus
Matthew,

 As long as pg_autovacuum remains a contrib module, I don't think any
 changes to the system catelogs will be make.  If  pg_autovacuum is
 deemed ready to move out of contrib, then we can talk about the above.

But we could create a config file that would store stuff in a flatfile table, 
OR we could add our own system table that would be created when one 
initializes pg_avd.

Just an idea.  Mind you, I'm not so sure that we want to focus immediately on 
per-table settings.   I think that we want to get the automatic settings 
working fairly well first; a lot of new DBAs would use the per-table settings 
to shoot themselves in the foot.  So we need to be able to make a strong 
recommendation to try the automatic settings first.

 Are you saying that you can vacuum a 1 million row table in 2-4
 minutes?  While a vacuum of the same table with an additional 1 million
 dead tuples would take an hour?

I'm probably exaggerating.  I do know that I can vacuum a fairly clean 1-5
million row table in less than 4 minutes.  I've never let such a table get
to 50% dead tuples, so I don't really know how long that takes.  Call me a
coward if you like ...

 I'd be really reluctant to base pv-avd frequency on the fsm settings
  instead. What if the user loads 8GB of data but leaves fsm_pages at the
  default of 10,000?  You can't do much with that; you'd have to vacuum if
  even 1% of the data changed.

 Ok, but as you said above it's very easy to set the FSM once you know
 your db size.

Actually, thinking about this I realize that PG_AVD and the Perl-based
postgresql.conf configuration script I was working on (darn, who was doing
that with me?) need to go together.  With pg_avd, setting max_fsm_pages is
very easy; without it it's a bit of guesswork.

So I think we can do this:  for 'auto' settings:

If max_fsm_pages is between 13% and 100% of the total database pages, then set 
the vacuum scale factor to match 3/4 of the fsm_pages setting, e.g.
database = 18,000,000 data pages;
max_fsm_pages = 3,600,000;
set vacuum scale factor = 3.6mil/18mil * 3/4 = 0.15

If max_fsm_pages is less than 13% of database pages, issue a warning to the 
user (log it, if possible) and set scale factor to 0.1.   If it's greater 
than 100% set it to 1 and leave it alone.
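
In code form, the proposed 'auto' rule is just this (a sketch; the variable
names are invented):

    /* db_pages: sum of pg_class.relpages over the cluster;
     * max_fsm_pages: the postgresql.conf setting. */
    double fsm_fraction = (double) max_fsm_pages / db_pages;
    double scale_factor;

    if (fsm_fraction < 0.13)
        scale_factor = 0.10;      /* and warn/log that the FSM is too small */
    else if (fsm_fraction > 1.0)
        scale_factor = 1.0;
    else
        scale_factor = fsm_fraction * 0.75;  /* e.g. 0.20 * 3/4 = 0.15 */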

 I don't have a good plan as to how to incorporate
 this data, but to a large extent the FSM already tracks table activity
 and gives us the most accurate answer about storage growth (short of
 using something like contrib/pgstattuple which takes nearly the same
 amount of time as an actual vacuum)

I don't really think we need to do dynamic monitoring at this point.   It 
would be a lot of engineering to check data page pollution without having 
significant performance impact.  It's doable, but something I think we
should hold off on until version 3.  It would mean hacking the FSM, which is a
little beyond me right now.

 In my testing, I never changed the default statistics settings.

Ah.  Well, a lot of users do to resolve query problems.

 But we track tuples because we can compare against the count given by
 the stats system.  I don't know of a way (other than looking at the FSM,
 or contrib/pgstattuple ) to see how many dead pages exist.

No, but for scaling you don't need the dynamic count of tuples or of dead 
tuples; pg_class holds a reasonably accurate count of pages per table as of
the last vacuum.

-- 
Josh Berkus
Aglio Database Solutions
San Francisco



Re: [HACKERS] [PERFORM] More detail on settings for pgavd?

2003-11-21 Thread Matthew T. O'Connor
Josh Berkus wrote:

Matthew,
 

But we could create a config file that would store stuff in a flatfile table, 
OR we could add our own system table that would be created when one 
initializes pg_avd.
 

I don't want to add tables to existing databases, as I consider that
clutter, and I never like using tools that clutter my production
databases.  I had considered using a pg_autovacuum database that, if
found, would store customized settings for individual tables /
databases.  Dunno if this is a good idea, but it might make a good
stopgap until people are comfortable modifying the system catalogs for
autovacuum.

Actually, this might be a necessary addition as pg_autovacuum currently 
suffers from the startup transients that the FSM used to suffer from, 
that is, it doesn't remember anything that happened the last time it 
ran.  A pg_autovacuum database could also be used to store thresholds 
and counts from the last time it ran.

Just an idea.  Mind you, I'm not so sure that we want to focus immediately on 
per-table settings.   I think that we want to get the automatic settings 
working fairly well first; a lot of new DBAs would use the per-table settings 
to shoot themselves in the foot.  So we need to be able to make a strong 
recommendation to try the automatic settings first.
 

I agree in principle; the question is what the best settings are.  I still
think it will be hard to find one size that fits all, but I'm sure we can
do better than what we have.

Actually, thinking about this I realize that PG_AVD and the Perl-based
postgresql.conf configuration script I was working on (darn, who was doing
that with me?) need to go together.  With pg_avd, setting max_fsm_pages is
very easy; without it it's a bit of guesswork.

So I think we can do this:  for 'auto' settings:

If max_fsm_pages is between 13% and 100% of the total database pages, then set 
the vacuum scale factor to match 3/4 of the fsm_pages setting, e.g.
database = 18,000,000 data pages;
max_fsm_pages = 3,600,000;
set vacuum scale factor = 3.6mil/18mil * 3/4 = 0.15
 

Where are you getting 13% from?  Do you know of an easy way to get a
count of the total pages used by a whole cluster?  I guess we can just
iterate over all the tables in all the databases and sum up the total
number of pages.  We already iterate over them all, we just don't sum it up.

If max_fsm_pages is less than 13% of database pages, issue a warning to the 
user (log it, if possible) and set scale factor to 0.1.   If it's greater 
than 100% set it to 1 and leave it alone.
 

Again I ask where 13% is coming from, and also where 0.1 is coming from?
I assume these are your best guesses right now, but not more than that.
I do like the concept, though, as long as we find good values for
min_fsm_percentage and min_autovac_scaling_factor.

But we track tuples because we can compare against the count given by
the stats system.  I don't know of a way (other than looking at the FSM,
or contrib/pgstattuple ) to see how many dead pages exist.
   

No, but for scaling you don't need the dynamic count of tuples or of dead
tuples; pg_class holds a reasonably accurate count of pages per table as of
the last vacuum.
 

Which we already keep a copy of inside of pg_autovacuum, and update 
after we issue a vacuum.





Re: [HACKERS] [PERFORM] More detail on settings for pgavd?

2003-11-21 Thread Josh Berkus
Matthew,

 Actually, this might be a necessary addition as pg_autovacuum currently 
 suffers from the startup transients that the FSM used to suffer from, 
 that is, it doesn't remember anything that happened the last time it 
 ran.  A pg_autovacuum database could also be used to store thresholds 
 and counts from the last time it ran.

I don't see how a separate database is better than a table in the databases,
except that it means scanning only one table and not one per database.  For
one thing, making it a separate database could make it hard to back up and
move your database+pg_avd config.

But I don't feel strongly about it.

 Where are you getting 13% from? 

13% * 3/4 ~~ 10%

And I think both of us agree that vacuuming tables with less than 10% changes
is excessive and could lead to problems on its own, like overlapping vacuums.

  Do you know of an easy way to get a 
 count of the total pages used by a whole cluster?

SELECT sum(relpages) FROM pg_class;

 I do like the concept though as long as we find good values for 
 min_fsm_percentage and min_autovac_scaling_factor.

See above.  I propose 0.13 and 0.1.

 Which we already keep a copy of inside of pg_autovacuum, and update 
 after we issue a vacuum.

Even easier then.

BTW, do we have any provisions to avoid overlapping vacuums?  That is, to 
prevent a second vacuum on a table if an earlier one is still running?

-- 
-Josh Berkus
 Aglio Database Solutions
 San Francisco




Re: [HACKERS] [PERFORM] More detail on settings for pgavd?

2003-11-21 Thread Matthew T. O'Connor
Josh Berkus wrote:

Matthew,

 

I don't see how a separate database is better than a table in the databases,
except that it means scanning only one table and not one per database.  For
one thing, making it a separate database could make it hard to back up and
move your database+pg_avd config.
 

Basically, I don't like the idea of modifying users' databases; besides,
in the long run most of what needs to be tracked will be moved to the
system catalogs.  I kind of consider the pg_autovacuum database to be
equivalent to the changes that will need to be made to the system catalogs.

I guess it could make it harder to back up if you are moving your
database between clusters.  Perhaps, if you create a pg_autovacuum
schema inside of your database, then we could use that.  I just
don't like tools that drop things into your database.

Where are you getting 13% from? 
   

13% * 3/4 ~~ 10%

And I think both of us agree that vacuuming tables with less than 10% changes
is excessive and could lead to problems on its own, like overlapping vacuums.

 

I certainly agree that less than 10% would be excessive, I still feel 
that 10% may not be high enough though.   That's why I kinda liked the 
sliding scale I mentioned earlier, because I agree that for very large 
tables, something as low as 10% might be useful, but most tables in a 
database would not be that large.

Do you know of an easy way to get a 
count of the total pages used by a whole cluster?
   

Select sum(relpages) from pg_class.

 

duh

BTW, do we have any provisions to avoid overlapping vacuums?  That is, to 
prevent a second vacuum on a table if an earlier one is still running?

 

Only that pg_autovacuum isn't smart enough to kick off more than one 
vacuum at a time.  Basically, pg_autovacuum issues a vacuum on a table 
and waits for it to finish, then checks the next table in its list to 
see if it needs to be vacuumed; if so, it does it and waits for that 
vacuum to finish.  There was some discussion of issuing concurrent 
vacuums against different tables, but it was decided that since vacuum is 
I/O bound, it would only make sense to issue concurrent vacuums that 
were on different spindles, which is not something I wanted to get 
into.  Also, given the recent talk about how vacuum is still such a 
performance hog, I can't imagine what multiple concurrent vacuums would 
do to performance.  Maybe as 7.5 develops and many of the vacuum 
performance issues are addressed, we can revisit this question.



---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?
  http://www.postgresql.org/docs/faqs/FAQ.html


Re: [HACKERS] [PERFORM] More detail on settings for pgavd?

2003-11-21 Thread Josh Berkus
Matthew,

 Basically, I don't like the idea of modifying users' databases; besides, 
 in the long run most of what needs to be tracked will be moved to the 
 system catalogs.  I kind of consider the pg_autovacuum database to be 
 equivalent to the changes that will need to be made to the system catalogs.

OK.  As I said, I don't feel strongly about it.

 I certainly agree that less than 10% would be excessive, I still feel 
 that 10% may not be high enough though.   That's why I kinda liked the 
 sliding scale I mentioned earlier, because I agree that for very large 
 tables, something as low as 10% might be useful, but most tables in a 
 database would not be that large.

Yes, but I thought that we were taking care of that through the threshold 
value?

A sliding scale would also be OK.   However, that would definitely require a 
leap to storing per-table pg_avd statistics and settings.

 Only that pg_autovacuum isn't smart enough to kick off more than one 
 vacuum at a time.  Basically, pg_autovacuum issues a vacuum on a table 
 and waits for it to finish, then checks the next table in its list to 
 see if it needs to be vacuumed; if so, it does it and waits for that 
 vacuum to finish. 

OK, then, we just need to detect the condition of the vacuums piling up 
because they are happening too often.

-- 
-Josh Berkus
 Aglio Database Solutions
 San Francisco


---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [HACKERS] [PERFORM] More detail on settings for pgavd?

2003-11-21 Thread Matthew T. O'Connor
Josh Berkus wrote:

Matthew,
 

I certainly agree that less than 10% would be excessive, I still feel 
that 10% may not be high enough though.   That's why I kinda liked the 
sliding scale I mentioned earlier, because I agree that for very large 
tables, something as low as 10% might be useful, but most tables in a 
database would not be that large.
   

Yes, but I thought that we were taking care of that through the threshold 
value?
 

Well, the threshold is a combination of the base value and the scaling 
factor, which you are proposing should be 0.1; that is, threshold = base + 
(scaling factor)(number of tuples).  With the default base of 1000 and 
your 0.1 you would have this:

 Num Rows    Threshold   Percent
    1,000        1,100      110%
   10,000        2,000       20%
  100,000       11,000       11%
1,000,000      101,000     10.1%

I don't like how that looks, hence the thought of some non-linear 
scaling factor that would still let the percentage approach 10%, but at a 
slower rate.  Perhaps just a larger base value would suffice, but I think 
small-table performance is going to suffer much above 1000.  Anyone else 
have an opinion on the table above?  Good / bad / indifferent?
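
To make the comparison concrete, a throwaway query that reproduces the
arithmetic above alongside one possible non-linear rule.  The sqrt term and
the constant 12.6 are invented purely for illustration; they give roughly 50%
at 1,000 rows while still closing in on 10% for very large tables:

    -- Linear rule from above:      threshold = 1000 + 0.1 * n
    -- Hypothetical sliding scale:  threshold = 0.1 * n + 12.6 * sqrt(n)
    SELECT n AS num_rows,
           (1000 + 0.1 * n)::bigint AS linear_threshold,
           round((1000 + 0.1 * n) * 100 / n, 1) AS linear_pct,
           (0.1 * n + 12.6 * sqrt(n))::bigint AS sliding_threshold,
           round(((0.1 * n + 12.6 * sqrt(n)) * 100 / n)::numeric, 1) AS sliding_pct
    FROM (SELECT 1000 AS n
          UNION ALL SELECT 10000
          UNION ALL SELECT 100000
          UNION ALL SELECT 1000000) AS t;

Any rule whose percentage decays more slowly than base/n would have the same
flavor; the exact curve is up for debate.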

A sliding scale would also be OK.   However, that would definitely require a 
leap to storing per-table pg_avd statistics and settings.

 

I don't think it would; it would tie the scaling factor to the 
number of tuples, so no per-table settings would be required.

Only that pg_autovacuum isn't smart enough to kick off more than one 
vacuum at a time.  Basically, pg_autovacuum issues a vacuum on a table 
and waits for it to finish, then checks the next table in its list to 
see if it needs to be vacuumed; if so, it does it and waits for that 
vacuum to finish. 
   

OK, then, we just need to detect the condition of the vacuums piling up 
because they are happening too often.

 

That would be good to look into at some point, especially if vacuum is 
going to get slower as a result of the page loop delay patch that has 
been floating around.



---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
 subscribe-nomail command to [EMAIL PROTECTED] so that your
 message can get through to the mailing list cleanly


Re: [HACKERS] [PERFORM] More detail on settings for pgavd?

2003-11-21 Thread Tom Lane
Josh Berkus [EMAIL PROTECTED] writes:
 BTW, do we have any provisions to avoid overlapping vacuums?  That is, to 
 prevent a second vacuum on a table if an earlier one is still running?

Yes, VACUUM takes a lock that prevents another VACUUM on the same table.
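
(For the curious: plain lazy VACUUM takes ShareUpdateExclusiveLock, which
conflicts with itself.  From a third session you can watch the second vacuum
queue up behind the first; a minimal check, assuming the table is named foo:

    SELECT l.pid, l.mode, l.granted
    FROM pg_locks l JOIN pg_class c ON c.oid = l.relation
    WHERE c.relname = 'foo';

The blocked VACUUM shows granted = f until the first one finishes.)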

regards, tom lane

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [HACKERS] with(isstrict) vs ISSTRICT

2003-11-21 Thread Tom Lane
strk [EMAIL PROTECTED] writes:
 Does with(isStrict) still work?

regression=# create function foo(int) returns int as
regression-# 'select $1' language sql with(isStrict);
CREATE FUNCTION
regression=# select version();
                               version
---------------------------------------------------------------------
 PostgreSQL 7.5devel on hppa2.0-hp-hpux10.20, compiled by GCC 2.95.3
(1 row)

Looks like it.
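
For anyone updating old functions, the same declaration in the newer
explicit syntax, which is what the WITH clause maps to (foo2 used here only
to avoid a name clash):

    create function foo2(int) returns int as
    'select $1' language sql strict;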

regards, tom lane

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


[HACKERS] First generic/redhatish RPM's uploaded to ftp.postgresql.org.

2003-11-21 Thread Lamar Owen
I have uploaded a first cut at the RPMs to ftp.postgresql.org.  While I am 
not 100% convinced of the need to do so, I have restructured the directories 
and await comment on that.

Currently the upload is for Fedora Core 1 only.  The source RPM should compile 
on most recent Red Hat releases and close cousins.

See ftp://ftp.postgresql.org/pub/binary/v7.4/fedora for the SRPMS and 
fedora-core-1 directory.  As I build the set on other distributions, or as 
others do so, I will create the appropriate directories and will link the 
SRPMS dir in each of those to the fedora/SRPMS dir, since that is the master 
source RPM.

Please read README.rpm-dist, found in the postgresql-7.4-0.1PGDG.i386.rpm 
file, or unpacked in pub/binary/v7.4/fedora, for more details.

This set is similar to previous sets in many respects.  That is not what I 
wanted; I wanted to restructure the whole shooting match in concert with 
Oliver's Debian package restructure.  The fewer differences the better, and 
I plan on implementing many parts of Oliver's proposal verbatim in the RPMs.  
However, when Oliver released the 7.4 deb without those changes, and given 
the release of the SuSE RPMs, I decided to go ahead with it.  Kaj's posting 
of the patches against the 7.3.4 specfile was a tremendous help in this 
regard; many thanks, Kaj!  There were problems, but it was an excellent 
starting point.

There are a few outstanding patches and bugs I need to fix; thus, this RPM set 
has a 0.1PGDG release tag.  I have been somewhat ill this week; maybe by 
next week I can close some bugs and get us to 1PGDG.

Even though the python client is no longer included in the main tarball, 
thanks to Kaj we have not lost the python subpackage.

I expect RH 7.3, RH 9, and RH 6.2 packages shortly from Sander, once he reads 
this mail and gets the time to build them, as he has already asked to help do 
this.  I have RH 8.0 at my disposal and will build those.  I will also be 
building Aurora 1.0 packages.
-- 
Lamar Owen
Director of Information Technology
Pisgah Astronomical Research Institute
1 PARI Drive
Rosman, NC  28772
(828)862-5554
www.pari.edu


---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Re: [HACKERS] [GENERAL] First generic/redhatish RPM's uploaded to ftp.postgresql.org.

2003-11-21 Thread Lamar Owen
On Friday 21 November 2003 01:13 pm, Lamar Owen wrote:
 I have uploaded a first cut at the RPMs to ftp.postgresql.org.  While I am
 not 100% convinced of the need to do so, I have restructured the
 directories and await comment on that.

 I expect RH 7.3, RH 9, and RH 6.2 packages shortly from Sander, once he
 reads this mail and gets the time to build them, as he has already asked to
 help do this.  I have RH 8.0 at my disposal and will build those.  I will
 also be building Aurora 1.0 packages.

Aurora 1.0 and Red Hat 8.0 source and binaries are uploaded.  The source RPM 
has changed a little for each of these, as noted in the release tag; I have 
some work to do on specfile portability, which I will do soon.
-- 
Lamar Owen
Director of Information Technology
Pisgah Astronomical Research Institute
1 PARI Drive
Rosman, NC  28772
(828)862-5554
www.pari.edu


---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match