Re: [PATCHES] [HACKERS] writing new regexp functions

2007-02-07 Thread Jeremy Drake
On Sun, 4 Feb 2007, David Fetter wrote:

 On Fri, Feb 02, 2007 at 07:01:33PM -0800, Jeremy Drake wrote:

  Let me know if you see any bugs or issues with this code, and I am
  open to suggestions for further regression tests ;)

I have not heard anything, so I guess at this point I should figure out
where to go next with this.  I see a couple options:

* Set this up as a pgfoundry project or contrib.  This would require
  merging the patch to expose some functions from regexp.c outside that
  file, which has raised some concerns about maintainability.

* Put together a patch to add these functions to core.  I could put them
  directly in regexp.c, so the support functions could stay static.  My
  concern here is that I don't know if there are any functions currently
  in core with OUT parameters.  I don't know the acceptable style for
  handling this: OUT parameters, a named composite type, ...?

Does anyone have any opinions either way, as to how I should proceed
from here?



  * maybe a join function that works as an aggregate
 SELECT join(',', col) FROM tbl
currently can be written as
 SELECT array_to_string(ARRAY(SELECT col FROM tbl), ',')

 The array_accum() aggregate in the docs works OK for this purpose.

I have decided not to pursue this function; I think the array construct,
or the array_accum option, is about the best possible currently.  If it
should become possible in the future to write aggregates with a non-SQL
state type (structs with pointers), it may be worthwhile to re-evaluate
this.
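
For reference, the array_accum aggregate from the docs is only a few
lines of SQL. A sketch of it, reusing the col/tbl names from the example
quoted above:

CREATE AGGREGATE array_accum (anyelement)
(
    sfunc = array_append,
    stype = anyarray,
    initcond = '{}'
);

-- join-like usage:
SELECT array_to_string(array_accum(col), ',') FROM tbl;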


-- 
The cost of living hasn't affected its popularity.

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] Proposal: TABLE functions

2007-02-07 Thread Pavel Stehule



 Hello,

 Currently PostgreSQL supports set-returning functions.

 ANSI SQL 2003 introduces a new type of function - table functions - with
 this syntax:

 CREATE FUNCTION foo() RETURNS TABLE (c1 t1, ... )

 The equivalent PostgreSQL statements are:

 CREATE TYPE tmptype AS (c1 t1, ...)
 CREATE FUNCTION ... RETURNS SETOF tmptype AS ...

or you can do
CREATE FUNCTION foo(OUT c1 t1, OUT ...) RETURNS SETOF record AS ...
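
For concreteness, a minimal runnable sketch of the OUT-parameter form on
current PostgreSQL (the table and function names are illustrative only):

CREATE TABLE person (id int, name text);

CREATE FUNCTION get_people(OUT id int, OUT name text)
RETURNS SETOF record AS $$
    SELECT id, name FROM person;
$$ LANGUAGE sql;

SELECT * FROM get_people();

-- under the proposal this could instead be declared as:
-- CREATE FUNCTION get_people() RETURNS TABLE (id int, name text) AS ...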

But I think this would be nice; the OUT parameters make less sense
than saying RETURNS TABLE(...).  But what about functions not returning
SETOF?



This feature doesn't change current behaviour, and using a TABLE function
means using SETOF.


Regards
Pavel Stehule

_
Find your love and new friends at Match.com. http://www.msn.cz/


---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [HACKERS] misread release notes

2007-02-07 Thread ohp
Hi Tom,
On Tue, 6 Feb 2007, Tom Lane wrote:

 Date: Tue, 06 Feb 2007 11:50:56 -0500
 From: Tom Lane [EMAIL PROTECTED]
 To: ohp@pyrenet.fr
 Cc: pgsql-hackers list pgsql-hackers@postgresql.org
 Subject: Re: [HACKERS] misread release notes

 ohp@pyrenet.fr writes:
  I read the "fix incorrect permission check in
  information_schema_key_column_usage_view" item in HISTORY far too fast
  and fed psql on each database with share/information_schema.sql.
  Too late to stop it!
  What did I do wrong, and how can I go backwards (I imagine I broke
  something!)

 I don't think you changed anything --- none of the commands say OR REPLACE
 so they'd just all fail.
Yes but I saw a lot of inserts...


   regards, tom lane


-- 
Olivier PRENANT         Tel: +33-5-61-50-97-00 (Work)
15, Chemin des Monges   +33-5-61-50-97-01 (Fax)
31190 AUTERIVE          +33-6-07-63-80-64 (GSM)
FRANCE                  Email: ohp@pyrenet.fr
--
Make your life a dream, make your dream a reality. (St Exupery)

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] Proposal: Commit timestamp

2007-02-07 Thread José Orlando Pereira
On Saturday 03 February 2007, Bruce Momjian wrote:
 Jan Wieck wrote:
  I don't have any such paper and the proof of concept will be the
  implementation of the system. I do however see enough resistance against
  this proposal to withdraw the commit timestamp at this time. The new
  replication system will therefore require the installation of a patched,
  non-standard PostgreSQL version, compiled from sources cluster wide in
order to be used. I am aware that this will dramatically reduce its
  popularity but it is impossible to develop this essential feature as an
  external module.
 
  I thank everyone for their attention.

 Going and working on it on your own doesn't seem like the proper
 solution.  I don't see people objecting to adding it, but they want it to
 work, which I am sure you want too.  You have to show how it will work
 and convince others of that, and then you have a higher chance it will
 work, and be in the PostgreSQL codebase.

Hi,

Would it be possible to solve the problem using the GORDA on-commit hook?

Jan would be able to reliably obtain a commit timestamp with the desired 
semantics and store it in a regular table within transaction boundaries.

PostgreSQL would not have to commit to specific timestamp semantics, and the 
patch is quite small.

Regards,

-- 
Jose Orlando Pereira

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


[HACKERS] Anyone going to the LinuxWorld Summit in NYC

2007-02-07 Thread Bruce Momjian
[ BCC to hackers list.]

Anyone going to the LinuxWorld Summit in New York City next week?

http://www.linuxworldsummit.com/live/14/

I am going on the 15th.  PostgreSQL doesn't have a booth at the event.

-- 
  Bruce Momjian [EMAIL PROTECTED]   http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] Logging functions executed by queries in 8.2?

2007-02-07 Thread korryd
  If anyone is interested, let me know and I'll add this to my ToDo
  list.
 
 The Sun benchmarking team needs this.  However, we need to be able to feed 
 the data into some kind of mass analysis à la pg_fouine so that we can do 
 overall performance analysis.


I've also included a PL/pgSQL profiler in the PL debugger project - this
plugin spits out an XML profile so you should be able to massage
it/aggregate it however you like.

-- Korry



Re: [HACKERS] [PATCHES] Fix database is ready race condition

2007-02-07 Thread Alvaro Herrera
Jim Nasby wrote:
 On Feb 5, 2007, at 8:19 AM, Tom Lane wrote:
 Simon Riggs [EMAIL PROTECTED] writes:
 My suggestions would be
 1. Database system has completed recovery and
 2. Database system is ready to accept connections
 
 The second was in fact the wording I had in mind, sorry for not being
 clear.  As to the first, the question is whether a log message at
 that specific point has any use.  It's not completion of recovery,
 exactly, since we go through that spot whether it's a normal startup
 or  recovery (and there is another log message when we actually do
 any WAL replay).  AFAICS it's a fairly randomly chosen place in a
 long sequence of  Things That Must Happen.  Part of the reason Markus
 is seeing a race  condition is that this isn't the last thing done
 before the startup subprocess exits --- see BootstrapMain.  So I'm
 for just getting rid of it.
 
 It is useful to know if the database had to do recovery, though, and 
 if it did, it would be useful to know how long it took when the 
 subsequent startup took a real amount of time.

That's already logged elsewhere, so there's no loss of functionality.

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


[HACKERS] Chatter on DROP SOMETHING IF EXISTS

2007-02-07 Thread Peter Eisentraut
What is the practical purpose of the notices emitted by DROP SOMETHING IF 
EXISTS when the object in fact does not exist?

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Anyone going to the LinuxWorld Summit in NYC

2007-02-07 Thread Douglas McNaught
Bruce Momjian [EMAIL PROTECTED] writes:

 [ BCC to hackers list.]

 Anyone going to the LinuxWorld Summit in New York City next week?

   http://www.linuxworldsummit.com/live/14/

 I am going on the 15th.  PostgreSQL doesn't have a booth at the event.

I'm not a hacker, just a happy user, but I'll be there.

-Doug

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PATCHES] [HACKERS] writing new regexp functions

2007-02-07 Thread Tom Lane
Jeremy Drake [EMAIL PROTECTED] writes:
 * Put together a patch to add these functions to core.  I could put them
   directly in regexp.c, so the support functions could stay static.  My
   concern here is that I don't know if there are any functions currently
   in core with OUT parameters.

As of 8.2 there are.

If we are going to include these I would vote for core not contrib
status, exactly to avoid having to export those functions.

regards, tom lane

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Chatter on DROP SOMETHING IF EXISTS

2007-02-07 Thread Andrew Dunstan

Peter Eisentraut wrote:
What is the practical purpose of the notices emitted by DROP SOMETHING IF 
EXISTS when the object in fact does not exist?


  

It was asked for ...

http://archives.postgresql.org/pgsql-patches/2005-11/msg00072.php

I realise that doesn't quite answer your question.

cheers

andrew

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

   http://www.postgresql.org/about/donate


Re: [HACKERS] Proposal: Commit timestamp

2007-02-07 Thread Jan Wieck

On 2/7/2007 2:37 AM, Markus Schiltknecht wrote:

Hi,

Jan Wieck wrote:
Whatever strategy one will use, in an async multimaster there are always 
cases that can be resolved by rules (last update being one of them), and 
some that I can't even imagine solving so far. I guess some of the cases 
will simply boil down to "the application has to make sure that ... 
never occurs". Think of a multi-item order, created on one node, while 
another node is deleting the long unused item (which would have to be 
backordered). Now while those two nodes figure out what to do to make 
this consistent again, a third node does a partial shipment of that 
order.


It helps to categorize these conflict types. There basically are:


Are we still discussing if the Postgres backend may provide support for
a commit timestamp, that follows the rules for Lamport timestamps in a
multi-node cluster? It seems more like we are drifting into what type of
replication system I should design to please most people.


Jan




* data conflicts: simple row data, i.e. update - update conflicts.

* uniqueness conflicts: two rows conflict because they'd violate a 
uniqueness constraint, i.e. insert - insert, update - insert or update - 
update.


* visibility conflicts: basically the remaining update - delete and 
delete - delete cases, but also SELECT FOR UPDATE candidates, etc... 
Everything having to do with a row not yet or no longer being visible 
to a transaction.


Your example certainly involves a visibility conflict (update - delete). 
Not even (sync) Postgres-R can guarantee consistency on the visibility 
level, i.e. a first transaction's SELECT FOR UPDATE might not see a 
just recently committed transaction's newly inserted rows (because that 
one isn't replayed yet on the node, thus the transaction is working on 
an 'old' snapshot of the database state). Another, simpler example: 
Postgres-R doesn't raise a serialization error on delete - delete 
conflicts; it simply deletes the row once, even if two transactions 
confirmed to have committed a delete of that row.


Luckily, most applications don't need that anyway, though.

The solution is simple: reinsert the deleted item ... 


...at which point timestamps certainly won't help :-)  Sorry, couldn't 
resist...


only that 
there were rather nasty ON DELETE CASCADEs on that item that removed 
all the consumer reviews, product descriptions, data sheets and what 
not. It's going to be an awful lot of undo.


Huh? Are you planning on aborting *parts* of a transaction? I didn't 
think about that, but my gut feeling is that you don't want to do that.


I haven't really made up my mind about a user defined rule based 
conflict resolution interface yet. I do plan to have a unique and 
foreign key constraint based, synchronous advisory locking system on top 
of my system in a later version (advisory key locks would stay in place 
until the transaction that placed them replicates).


You'd have to elaborate on that...

I guess you see by now why I wanted to keep the discussion about the 
individual, rather generic support features in the backend separate from 
the particular features I plan to implement in the replication system.


Sure. I know, discussions about replication can get endless, probably 
even are so by definition ;-)  But hey, they're fun!


Everyone has different needs and consequently an async multi-master 
must do a whole range of mutually exclusive things altogether ... 
because Postgres can never accept a partial solution. We want the 
egg-laying milk-wool-pig or nothing.


Like the one which would result from a merge of such an async 
replication with a sync one? Imagine being able to choose between sync 
and async per transaction...


Regards

Markus


---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org



--
#==#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.  #
#== [EMAIL PROTECTED] #


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [HACKERS] Modifying and solidifying contrib

2007-02-07 Thread Nikolay Samokhvalov

On 2/6/07, Andrew Dunstan [EMAIL PROTECTED] wrote:

If the extension installs everything in dedicated namespace(s), I think
we would want to have an option to add those namespaces easily to search
paths. Right now all we can do is to set a search path. It would be
nice, for example, to have support for appending or prepending something
to the search path.

I suspect most apps/extensions don't currently use namespaces much, or
we might well have seen more demand in this area.
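
For concreteness, the manual adjustment that append/prepend support would
streamline looks like this today (the schema name is illustrative):

SHOW search_path;                             -- e.g.: "$user", public
-- appending currently requires restating the whole list:
SET search_path = "$user", public, tsearch2;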


I still do not understand why it is so needed.
Your argument is "some apps aren't able to call functions as
schemaname.functionname(arg1, arg2, ...)", right? First of all, I do
not think that the number of such apps is huge. Second, this is really
the problem of those apps themselves.

I still think that separate namespaces for extensions are a good idea
while adjusting search_path is not. I've explained my POV in detail
several messages ago in this thread...

Separating extensions with fully specified names
(schemaname.functionname(...)) is a good improvement (for simplification
and clarity), while adjusting search_path should be the DBA/DBD's
decision.
--
Best regards,
Nikolay

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] Modifying and solidifying contrib

2007-02-07 Thread Nikolay Samokhvalov

On 2/7/07, Nikolay Samokhvalov [EMAIL PROTECTED] wrote:

I still think that separate namespaces for extensions are a good idea
while adjusting search_path is not. I've explained my POV in detail
several messages ago in this thread...

Separating extensions with fully specified names
(schemaname.functionname(...)) is a good improvement (for simplification
and clarity), while adjusting search_path should be the DBA/DBD's
decision.


Oh, I've just recalled a problem that could arise in this scenario...
We cannot use a schema name as a prefix when calling an operator
(tsearch2.ts_debug(...) works, while ... tsearch2.@@ ... doesn't).
This is one specific issue; maybe it's worth resolving? Or is it
impossible for some reason...

--
Best regards,
Nikolay

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [HACKERS] Modifying and solidifying contrib

2007-02-07 Thread Andrew Dunstan

Nikolay Samokhvalov wrote:


I still do not understand why it is so needed.
Your argument is "some apps aren't able to call functions as
schemaname.functionname(arg1, arg2, ...)", right? 


wrong.



I still think that separate namespaces for extensions are a good idea
while adjusting search_path is not. I've explained my POV in detail
several messages ago in this thread...



The difference between us is that I am less inclined to be prescriptive 
about such matters than you are. I think that as namespace use expands 
we should also probably provide better support for adding things to the 
search path (or indeed taking things away). If you don't want to use it 
then don't, but I don't see why you are so insistent on denying such 
facilities to others.



cheers

andrew

---(end of broadcast)---
TIP 6: explain analyze is your friend


[HACKERS] HOT for PostgreSQL 8.3

2007-02-07 Thread Simon Riggs
Heap Only Tuples (HOT) is a simplification of earlier proposals for
improving the way the server handles frequent updates, based upon what's
been learned and feedback received.

Heap Only Tuples


The basic idea is that when a tuple is UPDATEd we can, in certain
circumstances, avoid inserting index tuples for a tuple. Such tuples are
marked HEAP_ONLY_TUPLE, but their storage is otherwise identical to
other tuples. The pre-conditions for allowing a HOT UPDATE are
- UPDATE doesn't change any indexed columns
- there is space on the same block as the tuple being updated

There is no restriction on tuple length changes, nor any requirement for
an additional field in the tuple header; as a result this change does
not require activation by an additional WITH parameter and this
technique can be used on *all* tables. 

HOT will, in some cases, perform better in conjunction with the use of
the fillfactor storage parameter. For smaller tables, this will seldom
be required, so database tuning will not increase in complexity (in
comparison with carefully planned VACUUM strategies in earlier
releases). In many cases, the update rate will cause a steady state to
be reached, with on-block space being reused cyclically.
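
For illustration, the fillfactor storage parameter (new in 8.2) would be
set like this; the table here is hypothetical:

-- reserve 30% of each heap page for future on-block updates:
CREATE TABLE hits (id int PRIMARY KEY, counter int) WITH (fillfactor = 70);
-- or on an existing table:
ALTER TABLE hits SET (fillfactor = 70);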
 
At the same time we insert the HEAP_ONLY_TUPLE, the just-updated tuple
will be marked HEAP_UPDATE_ROOT. When we use an index to locate a heap
tuple, we start from this root tuple and hop forwards using the ctid
chain until we find the appropriate tuple. 

CREATE INDEX requires some careful work to allow it to identify and
correctly index HEAP_ONLY_TUPLEs that need to become ~HEAP_ONLY_TUPLE as
a result of the new index. This will cause additional work to be
required for those cases. CREATE INDEX on a newly loaded table will be
completely unaffected. There is some complexity there, though we don't
go into detail on those issues here. Please read on!

To allow HOT to work effectively we need to consider how we will VACUUM,
noting that in many cases we can remove HEAP_ONLY_TUPLEs much more
easily because they have no index tuples referencing them. There are
various options at this stage, but for clarity only one of those options
is presented here. 

When we try to UPDATE a tuple and the new tuple version doesn't fit on
the block, we get the BufferCleanupLock if possible and then perform a
single-block VACUUM. Any tuple that is both HEAP_DEAD & HEAP_ONLY_TUPLE
can be removed completely. This is possible by changing the t_ctid field
so that it points at the first visible-to-someone tuple in the chain, so
it points over the previous HOT tuples. If the root tuple is also dead,
it cannot be removed completely, so it is replaced with just a
TupleHeader, which is referred to as a TupleStub. (Credit to Itagaki
for this concept.)

e.g.

t1 (t_ctid: t2 ) - info HEAP_UPDATE_ROOT   status HEAPTUPLE_DEAD
t2 (t_ctid: t3 ) - info HEAP_ONLY          status HEAPTUPLE_DEAD
t3 (t_ctid:self) - info HEAP_ONLY          status HEAPTUPLE_LIVE

after single-page VACUUM

t1 (t_ctid: t3 ) - info HEAP_UPDATE_ROOT & HEAP_TUPLE_STUB
                 - status HEAPTUPLE_RECENTLY_DEAD
                 - t1 is now a TupleStub only
t3 (t_ctid:self) - info HEAP_ONLY          status HEAPTUPLE_LIVE

Status shown is the return value from HeapTupleSatisfiesVacuum()

The single-block VACUUM would alter *all* tuple chains on the block, not
just the one for the current tuple being UPDATEd.

This technique means that a tuple never changes its CTID, so everything
that currently uses CTID can continue normally. SeqScan would also work
identically to the way it works today.

It also means that we can't easily remove the root tuple, even if it is
now just a TupleStub (unless the whole chain is also removable because
of DELETE). Removing the root tuple will require a VACUUM *FULL*. Even
so, this option is still space-neutral in the worst-case, in comparison
with inserting index tuples. 

When we perform the single-block VACUUM we don't change the FSM, nor do
we try to check/increment the table's freezelimit. HOT would alter
slightly the way that UPDATEs are signalled to stats, so that these HOT
UPDATEs don't count towards the threshold for autovacuuming - so that a
frequently HOT-updated table may only very seldom require a normal
VACUUM.

The number of itempointers would increase in many cases, though this
would still be limited by current maximums.

Various tweaks on this basic idea exist, which can be considered in more
detail if the basic concept is accepted. 

- - - 

This design is aimed at being a no-frills version of the code that has
already been written. The existing version is available for testing now
and will be made available on community.enterprisedb.com shortly.

Four PostgreSQL developers have various amounts of time to contribute to
developing the above solution and customising it further according to
the wishes of the Community. That is myself, Heikki Linnakangas, Pavan
Deolasee and Nikhil Sontakke. Taken together, it seems 

Re: [HACKERS] Proposal: Commit timestamp

2007-02-07 Thread Markus Schiltknecht

Hi,

Jan Wieck wrote:

Are we still discussing if the Postgres backend may provide support for
a commit timestamp, that follows the rules for Lamport timestamps in a
multi-node cluster?


No. And I think you know my opinion about that by now. ;-)


It seems more like we are drifting into what type of
replication system I should design to please most people.


Nobody is telling you what you should do. You're free to do whatever you 
want to.


I'm only trying to get a discussion going, because a) I'm interested in 
how you plan to solve these problems and b) in the past, most people 
were complaining that all the different replication efforts didn't try 
to work together. I'm slowly trying to open up and discuss what I'm 
doing with Postgres-R on the lists.


Just yesterday at the SFPUG meeting, I've experienced how confusing it 
is for the users to have such a broad variety of (existing and upcoming) 
replication solutions. And I'm all for working together and probably 
even for merging different replication solutions.


Regards

Markus


---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [HACKERS] Modifying and solidifying contrib

2007-02-07 Thread Nikolay Samokhvalov

On 2/7/07, Andrew Dunstan [EMAIL PROTECTED] wrote:

Nikolay Samokhvalov wrote:

 I still do not understand why it is so needed.
 Your argument is "some apps aren't able to call functions as
 schemaname.functionname(arg1, arg2, ...)", right?

wrong.


 I still think that separate namespaces for extensions are a good idea
 while adjusting search_path is not. I've explained my POV in detail
 several messages ago in this thread...


The difference between us is that I am less inclined to be prescriptive
about such matters than you are. I think that as namespace use expands
we should also probably provide better support for adding things to the
search path (or indeed taking things away). If you don't want to use it
then don't, but I don't see why you are so insistent on denying such
facilities to others.


ok, looks like I've misunderstood your messages. Sorry for that. Surely
additional capabilities for manipulating search_path cannot
hinder anybody.

--
Best regards,
Nikolay

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [HACKERS] Status of autovacuum and the sporadic stats failures ?

2007-02-07 Thread Tom Lane
Andrew Dunstan [EMAIL PROTECTED] writes:
 Tom Lane wrote:
 It'd be interesting to try to gather stats on the length of the delay
 taken, but I don't see a good way to do that within the current
 regression-test infrastructure.

 Have it log something that will appear on the postmaster log but not the 
 client log? Buildfarm members mostly post their complete postmaster 
 logs, and we could postprocess those.

I've applied a patch along this line --- it'll emit LOG messages like

LOG:  wait_for_stats delayed 0.112799018621445 seconds

The patch itself is pretty ugly :-(.  I thought at first that we could
just have a plpgsql function loop until it saw a change in the stats,
but that does not work because the backend keeps its stats snapshot
until end of transaction --- so if the stats aren't updated when the
function first looks, they never will appear to.  My second try was to
watch the mod timestamp of pgstat.stat, but that didn't work real well
either because it has only one-second resolution.  As committed, the
patch is watching for a change in the size of pgstat.stat, which it
forces by making a new table.  Ugh.  I think it's worth doing as a means
of gathering information about what's happening in the buildfarm, but
I don't really want to leave it there for posterity.

We could make it cleaner by inventing a function to clear out the cached
statistics within a transaction, perhaps pg_stat_reset_snapshot() or
some such name.  If anyone thinks that that would be of general
usefulness, I'll see about making it happen.
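
A sketch of how such a function might be used, assuming the name and
behaviour proposed above (nothing is implemented yet):

BEGIN;
SELECT seq_scan FROM pg_stat_user_tables WHERE relname = 'tenk2';
-- ... later in the transaction the cached snapshot would normally persist ...
SELECT pg_stat_reset_snapshot();  -- hypothetical: discard the cached snapshot
SELECT seq_scan FROM pg_stat_user_tables WHERE relname = 'tenk2';  -- re-reads
COMMIT;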

regards, tom lane


-- save current stats-file size
CREATE TEMP TABLE prevfilesize AS
  SELECT size FROM pg_stat_file('global/pgstat.stat');

-- make and touch a previously nonexistent table
CREATE TABLE stats_hack (f1 int);
SELECT * FROM stats_hack;

-- wait for stats collector to update
create function wait_for_stats() returns void as $$
declare
  start_time timestamptz := clock_timestamp();
  oldsize bigint;
  newsize bigint;
begin
  -- fetch previous stats-file size
  select size into oldsize from prevfilesize;

  -- we don't want to wait forever; loop will exit after 30 seconds
  for i in 1 .. 300 loop

-- look for update of stats file
select size into newsize from pg_stat_file('global/pgstat.stat');

exit when newsize != oldsize;

-- wait a little
perform pg_sleep(0.1);

  end loop;

  -- report time waited in postmaster log (where it won't change test output)
  raise log 'wait_for_stats delayed % seconds',
extract(epoch from clock_timestamp() - start_time);
end
$$ language plpgsql;

SELECT wait_for_stats();

DROP TABLE stats_hack;

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Proposal: TABLE functions

2007-02-07 Thread Simon Riggs
On Tue, 2007-02-06 at 23:43 +0100, Pavel Stehule wrote:

 ANSI SQL 2003 introduces a new type of function - table functions - with this 
 syntax
...
 All necessary infrastructure is done. Implementation probably needs only 
 small changes in the parser.
...
 * conformance with ansi sql 2003

Sounds good to me.

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com



---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Chatter on DROP SOMETHING IF EXISTS

2007-02-07 Thread David Fetter
On Wed, Feb 07, 2007 at 02:13:48PM +0100, Peter Eisentraut wrote:
 What is the practical purpose of the notices emitted by DROP
 SOMETHING IF EXISTS when the object in fact does not exist?

DROP ... IF EXISTS is guaranteed not to throw an error.  This lets
people write idempotent scripts which run in a transaction :)
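
For illustration, a minimal idempotent script of that shape (the table
name is arbitrary):

BEGIN;
DROP TABLE IF EXISTS foo;    -- emits a NOTICE if foo is absent, never an ERROR
CREATE TABLE foo (id int);
COMMIT;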

Cheers,
D
-- 
David Fetter [EMAIL PROTECTED] http://fetter.org/
phone: +1 415 235 3778        AIM: dfetter666
  Skype: davidfetter

Remember to vote!

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] Proposal: Commit timestamp

2007-02-07 Thread Richard Troy

 Jan Wieck wrote:
  Are we still discussing if the Postgres backend may provide support for
  a commit timestamp, that follows the rules for Lamport timestamps in a
  multi-node cluster?

...I thought you said in this thread that you haven't and weren't going to
work on any kind of logical proof of its correctness, saw no value in
prototyping your way to a clear (convincing) argument, and were
withdrawing the proposal due to all the issues others raised which were,
in light of this, unanswerable beyond conjecture. I thought that the
thread was continuing because other people saw value in the kernel of the
idea, would support it if it could be shown to be correct/useful, were
disappointed you'd leave it at that and wanted to continue to see if
something positive might come of the dialogue. So, the thread weaved
around a bit. I think that if you want to nail this down, people here are
willing to be convinced, but that hasn't happened yet.

On Wed, 7 Feb 2007, Markus Schiltknecht wrote:
 I'm only trying to get a discussion going, because a) I'm interested in
 how you plan to solve these problems and b) in the past, most people
 were complaining that all the different replication efforts didn't try
 to work together. I'm slowly trying to open up and discuss what I'm
 doing with Postgres-R on the lists.

 Just yesterday at the SFPUG meeting, I've experienced how confusing it
 is for the users to have such a broad variety of (existing and upcoming)
 replication solutions. And I'm all for working together and probably
 even for merging different replication solutions.

In support of that idea, I offer this: when Randy Eash wrote the world's
first replication system for Ingres circa 1990, his work included ideas
and features that are right now, in the Postgres world, fragmented among
several existing replication / replication-related products, along with
some things that are only now in discussion in this group. As discussed at
the SFPUG meeting last night, real-world use cases are seldom if ever
completely satisfied with a one-size-fits-all replication strategy. For
example, a manufacturing company might want all factories to be capable of
being autonomous but also to report activities to and take direction from
corporate headquarters. To do this without having multiple databases at
each site, a single database instance would likely be both a master and
slave, but for differing aspects of the business's needs. Business
decisions would resolve the conflicts, say, the manufacturing node always
wins when it comes to data that pertains to its work, rather than
something like a last-timestamp/serialized-update-wins rule.

Like Markus, I would like to see the various replication efforts merged as
best they can be, because even if the majority of users don't use a little
bit of everything, surely the more interesting cases would like to, and the
entire community is better served if the various solutions are in
harmony.

Richard


-- 
Richard Troy, Chief Scientist
Science Tools Corporation
510-924-1363 or 202-747-1263
[EMAIL PROTECTED], http://ScienceTools.com/


---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Modifying and solidifying contrib

2007-02-07 Thread Tom Lane
Nikolay Samokhvalov [EMAIL PROTECTED] writes:
 We cannot use schema name as prefix for operator calling

You can, but it's kind of ugly:

regression=# SELECT 2 OPERATOR(pg_catalog.+) 2;
 ?column?
----------
        4
(1 row)


regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Status of autovacuum and the sporadic stats failures ?

2007-02-07 Thread Alvaro Herrera
Tom Lane wrote:

 We could make it cleaner by inventing a function to clear out the cached
 statistics within a transaction, perhaps pg_stat_reset_snapshot() or
 some such name.  If anyone thinks that that would be of general
 usefulness, I'll see about making it happen.

During the development of the launcher/worker autovac separation, I
thought a bit about how to handle stats invalidation.  The launcher
process needs to read the stat file every once in a while, but the
worker doesn't care if the stats are very up to date.

I came up with the idea of storing the stat hash in a particular memory
context, and have a flag in pgstat.c that's reset when that context is
reset.  So each caller can define what's the appropriate lifetime.  In
launcher, stats would be reset once every main loop (after which an
autovacuum_naptime sleep takes place, which means that the original
coding of reading it every autovac iteration is kept).  In worker, it's
read only once, at the start of the process.  And in backends (though I
didn't implement it), stats could be stored in TopTransationContext and
the flag reset in CommitTransaction and AbortTransaction.

It would be quite easy to provide a function to reset stats within a
transaction, by having the hash allocated in a context child of
TopTransaction.

One problem is how to make sure that the flag is reset when the context
is.  If we had cleanup callbacks for memory contexts this would be
trivial and robust.

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PATCHES] [HACKERS] writing new regexp functions

2007-02-07 Thread David Fetter
On Wed, Feb 07, 2007 at 09:23:58AM -0500, Tom Lane wrote:
 Jeremy Drake [EMAIL PROTECTED] writes:
  * Put together a patch to add these functions to core.  I could put them
directly in regexp.c, so the support functions could stay static.  My
concern here is that I don't know if there are any functions currently
in core with OUT parameters.
 
 As of 8.2 there are.
 
 If we are going to include these I would vote for core not contrib
 status, exactly to avoid having to export those functions.

+1 for core. :)

Cheers,
D
-- 
David Fetter [EMAIL PROTECTED] http://fetter.org/
phone: +1 415 235 3778        AIM: dfetter666
  Skype: davidfetter

Remember to vote!

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] HOT for PostgreSQL 8.3

2007-02-07 Thread Tom Lane
Simon Riggs [EMAIL PROTECTED] writes:
 The basic idea is that when a tuple is UPDATEd we can, in certain
 circumstances, avoid inserting index tuples for a tuple. Such tuples are
 marked HEAP_ONLY_TUPLE, but their storage is otherwise identical to
 other tuples.

What is VACUUM FULL going to do when it wants to move one of these things?

 CREATE INDEX requires some careful work to allow it to identify and
 correctly index HEAP_ONLY_TUPLEs that need to become ~HEAP_ONLY_TUPLE as
 a result of the new index.

I think you've glossed over the CREATE INDEX problem much too easily.
The difficulty with that is going to be that de-HOT-ifying a tuple
is going to require multiple updates that can't possibly be put into
a single WAL record, and I don't think that WAL replay can clean up
after an incomplete update (since it can't run user-defined functions
and hence cannot be expected to compute index entries for itself).
So I don't think you can do that while preserving crash safety.

 Removing the root tuple will require a VACUUM *FULL*.

That seems unacceptable ... it won't take too long for your table to
fill up with stubs, and we don't want to return to the bad old days
when periodic VACUUM FULL was unavoidable.

ISTM we could fix that by extending the index VACUUM interface to
include two concepts: aside from remove these TIDs when you find them,
there could be replace these TIDs with those TIDs when you find them.
This would allow pointer-swinging to one of the child tuples, after
which the old root could be removed.  This has got the same atomicity
problem as for CREATE INDEX, because it's the same thing: you're
de-HOT-ifying the child.  So if you can solve the former, I think you
can make this work too.

regards, tom lane

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] Ooops ... seems we need a re-release pronto

2007-02-07 Thread Neil Conway
On Tue, 2007-02-06 at 12:33 -0600, Bruno Wolff III wrote:
 Is a test going to get added to the regression tests to catch similar
 regressions in the future?

While we can modify the regression tests to catch this specific problem
in the future, I wonder if there ought to be more testing of security
releases in the future. When a problem is reported, fixed, tested, and
the resulting security fix is publicly distributed all without public
discussion (e.g. on the -hackers list), that sounds like an invitation
to introduce regressions to me.

-Neil



---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] Chatter on DROP SOMETHING IF EXISTS

2007-02-07 Thread Bruno Wolff III
On Wed, Feb 07, 2007 at 10:53:34 -0800,
  David Fetter [EMAIL PROTECTED] wrote:
 On Wed, Feb 07, 2007 at 02:13:48PM +0100, Peter Eisentraut wrote:
  What is the practical purpose of the notices emitted by DROP
  SOMETHING IF EXISTS when the object in fact does not exist?
 
 DROP ... IF EXISTS is guaranteed not to throw an error.  This lets
 people write idempotent scripts which run in a transaction :)

I don't think that's what his question was. I think it was more along the
lines of "why don't we get rid of the notices that are just cluttering
things up".

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Status of autovacuum and the sporadic stats failures ?

2007-02-07 Thread Tom Lane
Alvaro Herrera [EMAIL PROTECTED] writes:
 Tom Lane wrote:
 We could make it cleaner by inventing a function to clear out the cached
 statistics within a transaction, perhaps pg_stat_reset_snapshot() or
 some such name.  If anyone thinks that that would be of general
 usefulness, I'll see about making it happen.

 I came up with the idea of storing the stat hash in a particular memory
 context, and have a flag in pgstat.c that's reset when that context is
 reset.  So each caller can define what's the appropriate lifetime.

None of your use-cases require tracking multiple sets of stats within a
transaction, so I don't see why bother with that when we can just add a
"flush the stats" call.

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


[HACKERS] Subject supressed

2007-02-07 Thread Bruce Momjian
Neil Conway wrote:
 On Tue, 2007-02-06 at 12:33 -0600, Bruno Wolff III wrote:
  Is a test going to get added to the regression tests to catch similar
  regressions in the future?
 
 While we can modify the regression tests to catch this specific problem
 in the future, I wonder if there ought to be more testing of security
 releases in the future. When a problem is reported, fixed, tested, and
 the resulting security fix is publicly distributed all without public
 discussion (e.g. on the -hackers list), that sounds like an invitation
 to introduce regressions to me.

Uh, did you have to post with this subject line just as 8.2.3 was being
released:

  Subject: Re: [HACKERS] Ooops ... seems we need a re-release pronto

Trying to give us heart-attacks?  :-)

-- 
  Bruce Momjian  [EMAIL PROTECTED]  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Ooops ... seems we need a re-release pronto

2007-02-07 Thread Tom Lane
Neil Conway [EMAIL PROTECTED] writes:
 While we can modify the regression tests to catch this specific problem
 in the future, I wonder if there ought to be more testing of security
 releases in the future. When a problem is reported, fixed, tested, and
 the resulting security fix is publicly distributed all without public
 discussion (e.g. on the -hackers list), that sounds like an invitation
 to introduce regressions to me.

No doubt about it, but what else do you propose?  This patch was
reviewed by several people, none of whom caught the problem.  (Not that
I want to blame them, it was certainly my bug.)  And we normally don't
have indefinite amounts of time to spend before responding.  With
limited eyes and limited time you're going to have a greater chance of
screw-up; but unless we are willing to flout the conventional wisdom
about keeping security-related bugs secret, I think that's just
something that's got to be lived with.

regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Status of autovacuum and the sporadic stats failures ?

2007-02-07 Thread Alvaro Herrera
Tom Lane wrote:
 Alvaro Herrera [EMAIL PROTECTED] writes:
  Tom Lane wrote:
  We could make it cleaner by inventing a function to clear out the cached
  statistics within a transaction, perhaps pg_stat_reset_snapshot() or
  some such name.  If anyone thinks that that would be of general
  usefulness, I'll see about making it happen.
 
  I came up with the idea of storing the stat hash in a particular memory
  context, and have a flag in pgstat.c that's reset when that context is
  reset.  So each caller can define what's the appropriate lifetime.
 
 None of your use-cases require tracking multiple sets of stats within a
 transaction, so I don't see why bother with that when we can just add a
 flush the stats call.

Yeah, it's the same thing (and simpler), except that you move the
responsibility of resetting the memory context to pgstats instead of the
calling code.

In any case, if you were to provide a mechanism for stats flush,
autovacuum would have a use case for it, so that's a +1.

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] Status of autovacuum and the sporadic stats failures ?

2007-02-07 Thread Alvaro Herrera
 Tom Lane wrote:

  None of your use-cases require tracking multiple sets of stats within a
  transaction, so I don't see why bother with that when we can just add a
  "flush the stats" call.

FWIW I'm thinking that the corresponding code for handling the backends'
stats could be simplified, removing the hack that stores it in
TopTransactionContext, and just having a call to the stats flush
function in AbortTransaction and CommitTransaction.

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] Status of autovacuum and the sporadic stats failures ?

2007-02-07 Thread Alvaro Herrera
Tom Lane wrote:
 Andrew Dunstan [EMAIL PROTECTED] writes:
  Tom Lane wrote:
  It'd be interesting to try to gather stats on the length of the delay
  taken, but I don't see a good way to do that within the current
  regression-test infrastructure.
 
  Have it log something that will appear on the postmaster log but not the 
  client log? Buildfarm members mostly post their complete postmaster 
  logs, and we could postprocess those.
 
 I've applied a patch along this line --- it'll emit LOG messages like
 
 LOG:  wait_for_stats delayed 0.112799018621445 seconds

Beluga just failed:

http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=beluga&dt=2007-02-07%2019:30:01

The delay seems too short though:

LOG:  wait_for_stats delayed 0.000748 seconds

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] Modifying and solidifying contrib

2007-02-07 Thread Paul Ramsey

Jim Nasby wrote:

In addition to Martijn's tsearch case, there's also PostGIS. And I 
believe this is a pretty big pain for them.


Hear hear! It would be nice to dump from an old PostgreSQL/PostGIS 
combination and restore to a new version combination, without taking all 
the function definitions along for a ride in the dump process.  What we 
really want is just the data.


--

  Paul Ramsey
  Refractions Research
  http://www.refractions.net
  [EMAIL PROTECTED]
  Phone: 250-383-3022
  Cell: 250-885-0632

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Status of autovacuum and the sporadic stats failures ?

2007-02-07 Thread Tom Lane
Alvaro Herrera [EMAIL PROTECTED] writes:
 Beluga just failed:
 http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=beluga&dt=2007-02-07%2019:30:01

Wow, that is a really interesting failure, because it implies that the
stats collector had seen the seqscan report but not the indexscan report:

  WHERE st.relname='tenk2' AND cl.relname='tenk2';
   ?column? | ?column? | ?column? | ?column? 
  ----------+----------+----------+----------
!  t        | t        | t        | t
  (1 row)
  
  SELECT st.heap_blks_read + st.heap_blks_hit = pr.heap_blks + cl.relpages,
--- 105,111 ----
   WHERE st.relname='tenk2' AND cl.relname='tenk2';
   ?column? | ?column? | ?column? | ?column? 
  ----------+----------+----------+----------
!  t        | t        | f        | f
  (1 row)
  
  SELECT st.heap_blks_read + st.heap_blks_hit = pr.heap_blks + cl.relpages,

I haven't seen that too many times, if at all.

 The delay seems too short though:
 LOG:  wait_for_stats delayed 0.000748 seconds

This indicates there wasn't any delay, ie, on the first examination
pgstat.stat had a different size from what it had been at the CREATE
TEMP TABLE prevfilesize command.  [ thinks about that for awhile ]
Oh, I see the problem: at the instant of checking the file size the
first time, the stats collector must have been already in process of
writing a new version of the file, which had some but not all of the
updates we want.  And if that happened to be a different size from the
older version, we could fall through the wait as soon as it got
installed.  So this waiting mechanism isn't good enough: it proves that
a new set of stats has been *installed* since we started waiting, but
it doesn't provide any guarantee about when the computation of that set
started.  Back to the drawing board ...

If we had the suggested pg_stat_reset_snapshot function, then we could
wait until the indexscan count changes from the prior reading, which
would provide a more bulletproof synchronization approach.  So maybe I
should just go do that.  I had hoped to find a technique that was
potentially backpatchable into at least the last release or two, but
maybe there's no chance.

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PATCHES] [HACKERS] writing new regexp functions

2007-02-07 Thread Jeremy Drake
On Wed, 7 Feb 2007, Tom Lane wrote:

 Jeremy Drake [EMAIL PROTECTED] writes:
  * Put together a patch to add these functions to core.  I could put them
directly in regexp.c, so the support functions could stay static.  My
concern here is that I don't know if there are any functions currently
in core with OUT parameters.

 As of 8.2 there are.

Could you give me the name of one in pg_proc.h so I can see how I should
go about adding one there?

 If we are going to include these I would vote for core not contrib
 status, exactly to avoid having to export those functions.

OK, this patch will be my next project.


-- 
History is curious stuff
You'd think by now we had enough
Yet the fact remains I fear
They make more of it every year.

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PATCHES] [HACKERS] writing new regexp functions

2007-02-07 Thread Tom Lane
Jeremy Drake [EMAIL PROTECTED] writes:
 On Wed, 7 Feb 2007, Tom Lane wrote:
 As of 8.2 there are.

 Could you give me the name of one in pg_proc.h so I can see how I should
 go about adding one there?

select * from pg_proc where proargmodes is not null;

regards, tom lane

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] HOT for PostgreSQL 8.3

2007-02-07 Thread Simon Riggs
On Wed, 2007-02-07 at 14:17 -0500, Tom Lane wrote:
 Simon Riggs [EMAIL PROTECTED] writes:
  The basic idea is that when a tuple is UPDATEd we can, in certain
  circumstances, avoid inserting index tuples for a tuple. Such tuples are
  marked HEAP_ONLY_TUPLE, but their storage is otherwise identical to
  other tuples.
 
 What is VACUUM FULL going to do when it wants to move one of these things?

This question stands out from the others. I'm not sure which aspect
you're thinking of - do you see some failure cases?

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com



---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] HOT for PostgreSQL 8.3

2007-02-07 Thread Joshua D. Drake
Simon Riggs wrote:
 Heap Only Tuples (HOT) is a simplification of earlier proposals for
 improving the way the server handles frequent updates, based upon what's
 been learned and feedback received.
 
 Heap Only Tuples
 
 
 The basic idea is that when a tuple is UPDATEd we can, in certain
 circumstances, avoid inserting index tuples for a tuple. Such tuples are
 marked HEAP_ONLY_TUPLE, but their storage is otherwise identical to
 other tuples. The pre-conditions for allowing a HOT UPDATE are
 - UPDATE doesn't change any indexed columns

Uhmmm... how often is that the case? Don't get me wrong, bravo, but that
seems a rather large limitation.

Considering it, this would certainly be a boon in web space where you
have things like Rails doing:

UPDATE foo SET first_name = 'Barney' WHERE id = 1;

That would qualify, correct?
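
To make the distinction concrete (assuming only id is indexed; the second
statement is a made-up counterexample):

-- HOT-eligible: first_name is not indexed, so no new index tuples are needed
UPDATE foo SET first_name = 'Barney' WHERE id = 1;

-- not HOT-eligible: the indexed column id itself changes
UPDATE foo SET id = 2 WHERE id = 1;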

Sincerely,

Joshua D. Drake


-- 

  === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive  PostgreSQL solutions since 1997
 http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/


---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] HOT for PostgreSQL 8.3

2007-02-07 Thread Merlin Moncure

On 2/7/07, Joshua D. Drake [EMAIL PROTECTED] wrote:

Simon Riggs wrote:
 Heap Only Tuples (HOT) is a simplification of earlier proposals for
 improving the way the server handles frequent updates, based upon what's
 been learned and feedback received.



Uhmmm... how often is that the case? Don't get me wrong, bravo but that
seems a rather large limitation.

Considering it, this would certainly be a boon in web space where you
have things like Rails doing:


HOT is great for tables that are updated frequently via triggers or
cron, right?  So does this eliminate the need to vacuum foo after
executing:
update foo set v = 1 where id = 1;
where v is not indexed, right?

if so, it would be great for all kinds of things, especially
materialization techniques, or any situation where a table is updated
so frequently autovac can't keep up.  I can think of tons of places
where this would be useful.

merlin

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [HACKERS] Status of autovacuum and the sporadic stats failures ?

2007-02-07 Thread Tom Lane
Alvaro Herrera [EMAIL PROTECTED] writes:
 FWIW I'm thinking that the corresponding code for handling the backends'
 stats could be simplified, removing the hack that stores it in
 TopTransactionContext, and just having a call to the stats flush
 function in AbortTransaction and CommitTransaction.

Yeah, thanks for the idea.  The pgstats code itself is now decoupled
from transaction boundaries.  It still has one hack to know that
autovacuum wants to keep its stats for the whole run, but it'll probably
be a lot easier to whack it around the way you want now.

regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Proposal: Commit timestamp

2007-02-07 Thread Jan Wieck

On 2/7/2007 12:54 PM, Markus Schiltknecht wrote:

Hi,

Jan Wieck wrote:

Are we still discussing if the Postgres backend may provide support for
a commit timestamp, that follows the rules for Lamport timestamps in a
multi-node cluster?


No. And I think you know my opinion about that by now. ;-)


Then let me give you a little puzzle just for the fun of it.

A database containing customer contact information (among other things) 
is a two node multimaster system. One is serving the customer web 
portal, the other is used by the company staff including the call 
center. At 13:45 the two servers lose connectivity to each other, yet 
the internal staff can access the internal server while the web portal 
is accessible from the outside. At 13:50 customer A updates their credit 
card information through the web portal, while customer B does the same 
through the call center. At 13:55 both customers change their mind to 
use yet another credit card, now customer A phones the call center while 
customer B does it via the internet.


At 14:00 the two servers reconnect and go through the conflict 
resolution. How do you intend to solve both conflicts without using any 
"clock", because that seems to be a stopword causing instant rejection 
of whatever you propose? Needless to say, both customers will be 
dissatisfied if you charge the wrong credit card during your next 
billing cycle.






It seems more like we are drifting into what type of
replication system I should design to please most people.


Nobody is telling you what you should do. You're free to do whatever you 
want to.


I'm only trying to get a discussion going, because a) I'm interested in 
how you plan to solve these problems and b) in the past, most people 
were complaining that all the different replication efforts didn't try 
to work together. I'm slowly trying to open up and discuss what I'm 
doing with Postgres-R on the lists.


Which is a good discussion, because one of the reasons why I stopped 
looking into Postgres-R is the fact that it is based on the idea of 
pushing all the replication information through a system that generates 
a global serialized message queue. That by itself isn't the problem, but 
the fact that implementing a global serialized message queue has serious 
throughput issues that are (among other details) linked to the speed of 
light.


I am trying to start with a system, that doesn't rely on such a 
mechanism for everything. I do intend to add an option later, that 
allows to declare a UNIQUE NOT NULL constraint to be synchronous. What 
that means is, that any INSERT, UPDATE, DELETE and SELECT FOR UPDATE 
will require the node to currently be a member of the (quorum or 
priority defined) majority of the cluster. An advisory lock system, 
based on a total order group communication, will grant the lock to the 
unique key values on a first come, first serve base. Every node in the 
cluster will keep those keys as locked until the asynchronous 
replication stream reports the locking transaction as ended. If another 
remote transaction in the meantime requires updating such key, the 
incoming stream from that node will be on hold until the lock is 
cleared. This is to protect agains node B replicating a transaction from 
node A and a later update on node B arrives on C before C got the first 
event from A. A node that got disconnected from the cluster must rebuild 
the current advisory lock list upon reconnecting to the cluster.


I think that this will be a way to overcome Postgres-R's communication 
bottleneck, as well as allowing limited update activity even during a 
completely disconnected state of a node. Synchronous or group 
communication messages are reduced to the cases where the application 
cannot be implemented in a conflict-free way, like allocating a 
natural primary key. There is absolutely no need to synchronize, for 
example, creating a sales order. An application can use globally 
unique IDs for the order number. And everything possibly referenced by 
an order (items, customers, ...) is stored in a way that the 
references are never updated. Deletes of those possibly referenced 
objects are implemented in a two-step process, where they are first 
marked obsolete, and later on things that have been marked obsolete 
for X long are deleted. A REPLICA TRIGGER on inserting an order will 
simply reset the obsolete flag of the referenced objects. If a node is 
disconnected longer than X, you have a problem - hunt down the guy who 
defined X.
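
A minimal SQL sketch of that two-step delete and the reviving trigger 
(all table, column and function names here are invented; firing the 
trigger only on the replica is the part that would use the proposed 
REPLICA TRIGGER support):

CREATE TABLE customer (
    cust_id      bigint PRIMARY KEY,    -- globally unique ID
    name         text NOT NULL,
    obsolete     boolean NOT NULL DEFAULT false,
    obsoleted_at timestamptz
);

CREATE TABLE sales_order (
    order_no     bigint PRIMARY KEY,    -- globally unique ID
    cust_id      bigint NOT NULL        -- never updated after insert
);

-- step one: mark the row obsolete instead of deleting it
UPDATE customer
   SET obsolete = true, obsoleted_at = now()
 WHERE cust_id = 42;

-- step two: a periodic job deletes rows obsolete for longer than X
DELETE FROM customer
 WHERE obsolete
   AND obsoleted_at < now() - interval '30 days';

-- on the replica, an incoming order simply revives what it references
CREATE FUNCTION order_revives_customer() RETURNS trigger AS $$
BEGIN
    UPDATE customer
       SET obsolete = false, obsoleted_at = NULL
     WHERE cust_id = NEW.cust_id;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER order_replica
    AFTER INSERT ON sales_order
    FOR EACH ROW EXECUTE PROCEDURE order_revives_customer();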


Just yesterday at the SFPUG meeting, I experienced how confusing it 
is for the users to have such a broad variety of (existing and upcoming) 
replication solutions. And I'm all for working together and probably 
even for merging different replication solutions.


Merging certain ideas to come up with an async/sync hybrid? Seems to me 
we have similar enough ideas to need conflict resolution, because we had 
them simultaneously but communicated them asynchronously.



Jan

--

Re: [HACKERS] Proposal: Commit timestamp

2007-02-07 Thread Markus Schiltknecht

Hi,

Jan Wieck wrote:

Then let me give you a little puzzle just for the fun of it.

A database containing customer contact information (among other things) 
is a two node multimaster system. One is serving the customer web 
portal, the other is used by the company staff including the call 
center. At 13:45 the two servers lose connectivity to each other, yet 
the internal staff can access the internal server while the web portal 
is accessible from the outside. At 13:50 customer A updates their credit 
card information through the web portal, while customer B does the same 
through the call center. At 13:55 both customers change their mind to 
use yet another credit card, now customer A phones the call center while 
customer B does it via the internet.


Phew, a mind twister... one customer would already be enough to trigger 
that sort of conflict...


At 14:00 the two servers reconnect and go through the conflict 
resolution. How do you intend to solve both conflicts without using any 
"clock"? That word seems to be a stop word causing instant rejection of 
whatever you propose. Needless to say, both customers will be 
dissatisfied if you charge the wrong credit card during your next 
billing cycle.


Correct. But do these cases justify storing timestamps for each and 
every transaction you do? That's what I doubt, not the usefulness of 
time-based conflict resolution for certain cases.


You can always add time-based conflict resolution by adding a 
timestamp column and deciding upon that one. I'd guess that the 
overall costs are lower that way.
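
For instance, a minimal sketch, assuming a customer table with a 
cust_id key, a cc_info column, and an application-maintained change 
timestamp (all names invented):

ALTER TABLE customer
    ADD COLUMN last_changed timestamptz NOT NULL DEFAULT clock_timestamp();

-- last-update-wins: apply an incoming remote change only if it is
-- newer than what this node already has
UPDATE customer
   SET cc_info = 'new card data', last_changed = '2007-02-07 13:55+00'
 WHERE cust_id = 42
   AND last_changed < '2007-02-07 13:55+00';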


But you've withdrawn that proposal already, so...

Which is a good discussion, because one of the reasons why I stopped 
looking into Postgres-R is the fact that it is based on the idea of 
pushing all the replication information through a system that 
generates a global serialized message queue. That by itself isn't the 
problem; the problem is that implementing a global serialized message 
queue has serious throughput issues that are (among other details) 
linked to the speed of light.


Agreed. Nevertheless, there are use cases for such systems, because 
they put fewer limitations on the application. One could even argue 
that your above example would be one ;-)


I am trying to start with a system that doesn't rely on such a 
mechanism for everything. I do intend to add an option later that 
allows declaring a UNIQUE NOT NULL constraint to be synchronous. What 
that means is that any INSERT, UPDATE, DELETE and SELECT FOR UPDATE 
will require the node to currently be a member of the (quorum or 
priority defined) majority of the cluster.


Sounds reasonable.

An advisory lock system, 
based on total order group communication, will grant locks on the 
unique key values on a first come, first served basis. Every node in 
the cluster will keep those keys locked until the asynchronous 
replication stream reports the locking transaction as ended. If 
another remote transaction in the meantime requires updating such a 
key, the incoming stream from that node will be put on hold until the 
lock is cleared. This is to protect against node B replicating a 
transaction from node A while a later update on node B arrives on C 
before C got the first event from A. A node that got disconnected from 
the cluster must rebuild the current advisory lock list upon 
reconnecting to the cluster.


Yeah, this is a convenient way to replicate sequences via a GCS.

I think that this will be a way to overcome Postgres-R's communication 
bottleneck, as well as allowing limited update activity even during a 
completely disconnected state of a node. Synchronous or group 
communication messages are reduced to the cases where the application 
cannot be implemented in a conflict-free way, like allocating a 
natural primary key. There is absolutely no need to synchronize, for 
example, creating a sales order.


Agreed, such cases can easily be optimized. But you have to be aware of 
the limitations these optimizations cause. Postgres-R is much more 
targeted at very general use cases.


An application can use globally unique IDs for 
the order number. And everything possibly referenced by an order 
(items, customers, ...) is stored in a way that the references are 
never updated. Deletes of those possibly referenced objects are 
implemented in a two-step process, where they are first marked 
obsolete, and later on things that have been marked obsolete for X 
long are deleted. A REPLICA TRIGGER on inserting an order will simply 
reset the obsolete flag of the referenced objects. If a node is 
disconnected longer than X, you have a problem - hunt down the guy who 
defined X.


Yeah, that's another very nice optimization. Again, as long as you know 
the limitations, that's all well and fine.


Merging certain ideas to come up with an async/sync hybrid? Seems to me 
we have similar enough ideas to need conflict resolution, because we had 
them simultaneously but communicated them asynchronously.


Huh? 

[HACKERS] quick SRF question

2007-02-07 Thread Jeremy Drake
If I have a multi-call SRF with a user_fctx struct allocated in the
multi_call_memory_ctx, and in the if (SRF_IS_FIRSTCALL()) block, while
still switched to the multi_call_memory_ctx, I use PG_GETARG_TEXT_P(n)
to get an argument to my function and stash the result in my user_fctx
struct, am I guaranteed that this pointer will remain valid throughout
the remaining calls to this SRF, or should I use
PG_GETARG_TEXT_P_COPY(n) instead?

Here is an example of what I am talking about:

typedef struct testfunc_ctx {
    text   *txt;
} testfunc_ctx;

Datum testfunc(PG_FUNCTION_ARGS)
{
    FuncCallContext *funcctx;
    testfunc_ctx    *userctx;
    MemoryContext    oldcontext;

    if (SRF_IS_FIRSTCALL())
    {
        funcctx = SRF_FIRSTCALL_INIT();
        oldcontext = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx);
        userctx = palloc(sizeof(testfunc_ctx));
        /* XXX does this need to be PG_GETARG_TEXT_P_COPY, or is it ok like this? */
        userctx->txt = PG_GETARG_TEXT_P(0);
        MemoryContextSwitchTo(oldcontext);
        funcctx->user_fctx = userctx;
    }

    funcctx = SRF_PERCALL_SETUP();
    userctx = funcctx->user_fctx;
    /* do something with userctx->txt */

    /* "done" and "result" stand in for the real per-call logic */
    if (done)
        SRF_RETURN_DONE(funcctx);
    else
        SRF_RETURN_NEXT(funcctx, result);
}
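
(For context, the SQL-level declaration for such a function would look 
roughly like this; the return type and the library path are just 
placeholders, not my actual code:)

-- illustrative only; '$libdir/testfunc' is a made-up library path
CREATE FUNCTION testfunc(text) RETURNS SETOF text
    AS '$libdir/testfunc', 'testfunc'
    LANGUAGE C STRICT;

SELECT * FROM testfunc('some input');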


-- 
The New Testament offers the basis for modern computer coding theory,
in the form of an affirmation of the binary number system.

But let your communication be Yea, yea; nay, nay: for
whatsoever is more than these cometh of evil.
-- Matthew 5:37



Re: [HACKERS] Proposal: Commit timestamp

2007-02-07 Thread Jan Wieck

On 2/7/2007 2:15 PM, Richard Troy wrote:

Jan Wieck wrote:
 Are we still discussing if the Postgres backend may provide support for
 a commit timestamp that follows the rules for Lamport timestamps in a
 multi-node cluster?


...I thought you said in this thread that you haven't and weren't going to
work on any kind of logical proof of its correctness, saw no value in
prototyping your way to a clear (convincing) argument, and were
withdrawing the proposal [...]


I said I don't have any such documents. I was asked to continue this 
discussion in order to find people willing to help discover potential 
problems. I am prepared to continue this development in isolation, 
although I wouldn't like to.


The PostgreSQL developer community used to be good at throwing out 
ideas, brainstorming about the possibilities, adding more to them and 
coming up with unique and flexible solutions. I am a little 
disappointed that much of that got lost over the years, and please 
forgive me if I sound a little grumpy over that. The statement to 
withdraw the proposal was certainly premature - consider it not 
withdrawn at this time. However, comparing what used to be our process 
to what I see today, I must say that something like TOAST would never 
have happened. It was the result of a global brainstorming that I 
simply translated into C code. Many details and features of the 
implementation are purely mine, but the really big sparks, that got it 
to what it is, I'm not claiming for myself. Most importantly, "give me 
proof of concept before we can talk about changing backend code" was 
not part of the process at all. We were pretty eager to change things 
back then, when we needed to get better in almost every way possible 
... are we so good at replication that we need to be conservative in 
that respect now? We are certainly good at some things and have to be 
conservative with respect to them, but replication, in my not so very 
humble opinion, isn't one of them.


I do understand that we have a codebase used in production these days. 
And because of that, we have to maintain code and syntax stability to a 
degree we didn't have back in the glory days of introducing EXCEPT and 
INTERSECT (whose first incarnation was committed to the code base while 
completely destroying my entire work of fixing the rewriter). Maybe we 
need to introduce something entirely different, like the concept of an 
"experimental" feature: something that we add to the code but that is 
explicitly flagged as not final, not stable, not guaranteed to stay or 
work in this or any other form. This requires that the feature has very 
limited interference with other parts of the system, like (or especially 
like) the query parser. If it turns out to be a problem in x.y.0, it 
will be backed out and gone in x.y.1. Or, in a different way, we could 
create an experimental CVS branch off of every major release. That way, 
developers could more easily share experimental code, and if things 
settle there, they would be considered for adoption into HEAD.



Like Markus, I would like to see the various replication efforts merged as
best they can be because even if the majority of users don't use a little
bit of everything, surely the more interesting cases would like to and the
entire community is better served if the various solutions are in
harmony.


No doubt about that, and I was the one organizing the Afilias sponsored 
meeting in Toronto back then, where my reversed Postgres-R idea was 
taken apart because it wouldn't scale, due to the gigantic amount of 
synchronized group communication it would require. Again, it might be 
that experimental features will cause more of the efforts to converge by 
using the same base as a compromise, instead of having each and every 
support feature designed completely independently.


I still have a hard time understanding why someone would object to 
adding a feature, however useless it might seem to them, as long as it 
doesn't cost them anything. Admittedly, any feature causes maintenance 
costs on the side of the PostgreSQL development community (mainly 
those who actually contribute and maintain the code - fortunately that 
is a finite number - everyone please ask themselves if they are part 
of that). But aside from that, would anyone who is questioning the 
commit timestamp as I proposed it likewise vehemently object to yet 
another procedural language, or adding another log tuning switch? I 
don't think so. As long as it doesn't cost you unless you turn it on, 
why would you even care if it serves my purpose or not? The thing that 
kicked off this emotional spin was that multimaster replication is 
what so many people want, but nobody has a universal solution for. 
Everyone wants to see their problem solved as well, or "the solution 
isn't good". Tell you what: I can live with my problem solved even if 
it doesn't solve yours. Can you tell me what I have to modify in order 
to solve your problem as well, or are you asking 

Re: [HACKERS] Proposal: Commit timestamp

2007-02-07 Thread Jan Wieck

On 2/7/2007 9:27 PM, Markus Schiltknecht wrote:

Hi,

Jan Wieck wrote:

Then let me give you a little puzzle just for the fun of it.

A database containing customer contact information (among other things) 
is a two node multimaster system. One is serving the customer web 
portal, the other is used by the company staff including the call 
center. At 13:45 the two servers lose connectivity to each other, yet 
the internal staff can access the internal server while the web portal 
is accessible from the outside. At 13:50 customer A updates their credit 
card information through the web portal, while customer B does the same 
through the call center. At 13:55 both customers change their mind to 
use yet another credit card, now customer A phones the call center while 
customer B does it via the internet.


Phew, a mind twister... one customer would already be enough to trigger 
that sort of conflict...


At 14:00 the two servers reconnect and go through the conflict 
resolution. How do you intend to solve both conflicts without using any 
"clock"? That word seems to be a stop word causing instant rejection of 
whatever you propose. Needless to say, both customers will be 
dissatisfied if you charge the wrong credit card during your next 
billing cycle.


Correct. But do these cases justify storing timestamps for each and 
every transaction you do? That's what I doubt, not the usefulness of 
time-based conflict resolution for certain cases.


You can always add time-based conflict resolution by adding a 
timestamp column and deciding upon that one. I'd guess that the 
overall costs are lower that way.


Yes, yes, and yes ... but aside from the problem that you use the very 
ambiguous word "timestamp" (which somehow suggests using a clock of 
some sort), isn't the begin timestamp of a long-running transaction 
worse than the commit timestamp, given that all of its work becomes 
visible to the outside world instantaneously at commit?
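
To make the rule I have in mind concrete, here is a toy sketch of a 
Lamport-style commit timestamp in SQL (all names invented; the real 
thing would have to live inside the backend, not in a table):

CREATE TABLE commit_ts_state (last_ts timestamptz NOT NULL);
INSERT INTO commit_ts_state VALUES ('epoch');

CREATE FUNCTION next_commit_ts() RETURNS timestamptz AS $$
DECLARE
    prev timestamptz;
    ts   timestamptz := clock_timestamp();
BEGIN
    SELECT last_ts INTO prev FROM commit_ts_state FOR UPDATE;
    IF ts <= prev THEN
        -- Lamport rule: the timestamp never goes backwards
        ts := prev + interval '1 microsecond';
    END IF;
    UPDATE commit_ts_state SET last_ts = ts;
    RETURN ts;
END;
$$ LANGUAGE plpgsql;

-- when replaying a remote transaction, fold its commit timestamp in
-- as well, so that later local commits sort behind it:
UPDATE commit_ts_state SET last_ts = '2007-02-07 13:55+00'
 WHERE last_ts < '2007-02-07 13:55+00';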




But you've withdrawn that proposal already, so...

Which is a good discussion, because one of the reasons why I stopped 
looking into Postgres-R is the fact that it is based on the idea of 
pushing all the replication information through a system that 
generates a global serialized message queue. That by itself isn't the 
problem; the problem is that implementing a global serialized message 
queue has serious throughput issues that are (among other details) 
linked to the speed of light.


Agreed. Nevertheless, there are use cases for such systems, because 
they put fewer limitations on the application. One could even argue 
that your above example would be one ;-)


Now we're in sync :-)



I am trying to start with a system that doesn't rely on such a 
mechanism for everything. I do intend to add an option later that 
allows declaring a UNIQUE NOT NULL constraint to be synchronous. What 
that means is that any INSERT, UPDATE, DELETE and SELECT FOR UPDATE 
will require the node to currently be a member of the (quorum or 
priority defined) majority of the cluster.


Sounds reasonable.

An advisory lock system, 
based on total order group communication, will grant locks on the 
unique key values on a first come, first served basis. Every node in 
the cluster will keep those keys locked until the asynchronous 
replication stream reports the locking transaction as ended. If 
another remote transaction in the meantime requires updating such a 
key, the incoming stream from that node will be put on hold until the 
lock is cleared. This is to protect against node B replicating a 
transaction from node A while a later update on node B arrives on C 
before C got the first event from A. A node that got disconnected from 
the cluster must rebuild the current advisory lock list upon 
reconnecting to the cluster.


Yeah, this is a convenient way to replicate sequences via a GCS.

I think that this will be a way to overcome Postgres-R's communication 
bottleneck, as well as allowing limited update activity even during a 
completely disconnected state of a node. Synchronous or group 
communication messages are reduced to the cases where the application 
cannot be implemented in a conflict-free way, like allocating a 
natural primary key. There is absolutely no need to synchronize, for 
example, creating a sales order.


Agreed, such cases can easily be optimized. But you have to be aware of 
the limitations these optimizations cause. Postgres-R is much more 
targeted at very general use cases.


I am, if for no other reason than that I have been familiar with the 
concepts underneath Postgres-R for more than 3 years. What I realized 
is that the "general use case" (for arbitrarily complex applications) 
is very likely to be in conflict with any kind of good default 
performance.




An application can use globally unique IDs for 
the order number. And everything possibly referenced by an order 
(items, customers, ...) is stored in a way that the references are 
never updated. Deletes of those possibly referenced objects are 

Re: [HACKERS] Proposal: Commit timestamp

2007-02-07 Thread Bruce Momjian

I find the term "logical proof of its correctness" too restrictive.  It
sounds like some formal academic process that really doesn't work well
for us.

What I did want to hear is a layout of how the system would work, and an
exchange of ideas until almost everyone was happy.

Also, I saw the trigger patch with no explanation of why it was
important or who would use it --- that also isn't going to fly well.

So, to add something, the community needs to hear how it is going to
help users, because every code addition has a cost, and we don't want to
add things unless they have general utility.  If someone can't explain
the utility of an addition, I question whether the person has fully
thought through where they are going.

As far as adding a language, no, we would not just add any language.  We
would judge whether the language has usefulness to our users.  I think
APL would be cool, but I am not sure it is usable, so there is a hurdle
even there.

As far as TOAST goes, there is no question in my mind that TOAST
development would happen the same way today as it did when we did it in
2001 --- "we have a problem, how can we fix it?"



---

Jan Wieck wrote:
 On 2/7/2007 2:15 PM, Richard Troy wrote:
  Jan Wieck wrote:
   Are we still discussing if the Postgres backend may provide support for
   a commit timestamp that follows the rules for Lamport timestamps in a
   multi-node cluster?
  
  ...I thought you said in this thread that you haven't and weren't going to
  work on any kind of logical proof of its correctness, saw no value in
  prototyping your way to a clear (convincing) argument, and were
  withdrawing the proposal [...]
 
 I said I don't have any such documents. I was asked to continue this 
 discussion in order to find people willing to help discover potential 
 problems. I am prepared to continue this development in isolation, 
 although I wouldn't like to.
 
 The PostgreSQL developer community used to be good at throwing out 
 ideas, brainstorming about the possibilities, adding more to them and 
 coming up with unique and flexible solutions. I am a little 
 disappointed that much of that got lost over the years, and please 
 forgive me if I sound a little grumpy over that. The statement to 
 withdraw the proposal was certainly premature - consider it not 
 withdrawn at this time. However, comparing what used to be our process 
 to what I see today, I must say that something like TOAST would never 
 have happened. It was the result of a global brainstorming that I 
 simply translated into C code. Many details and features of the 
 implementation are purely mine, but the really big sparks, that got it 
 to what it is, I'm not claiming for myself. Most importantly, "give me 
 proof of concept before we can talk about changing backend code" was 
 not part of the process at all. We were pretty eager to change things 
 back then, when we needed to get better in almost every way possible 
 ... are we so good at replication that we need to be conservative in 
 that respect now? We are certainly good at some things and have to be 
 conservative with respect to them, but replication, in my not so very 
 humble opinion, isn't one of them.
 
 I do understand that we have a codebase used in production these days. 
 And because of that, we have to maintain code and syntax stability to a 
 degree we didn't have back in the glory days of introducing EXCEPT and 
 INTERSECT (whose first incarnation was committed to the code base while 
 completely destroying my entire work of fixing the rewriter). Maybe we 
 need to introduce something entirely different, like the concept of an 
 "experimental" feature: something that we add to the code but that is 
 explicitly flagged as not final, not stable, not guaranteed to stay or 
 work in this or any other form. This requires that the feature has very 
 limited interference with other parts of the system, like (or especially 
 like) the query parser. If it turns out to be a problem in x.y.0, it 
 will be backed out and gone in x.y.1. Or, in a different way, we could 
 create an experimental CVS branch off of every major release. That way, 
 developers could more easily share experimental code, and if things 
 settle there, they would be considered for adoption into HEAD.
 
  Like Markus, I would like to see the various replication efforts merged as
  best they can be because even if the majority of users don't use a little
  bit of everything, surely the more interesting cases would like to and the
  entire community is better served if the various solutions are in
  harmony.
 
 No doubt about that, and I was the one organizing the Afilias sponsored 
 meeting in Toronto back then, where my reversed Postgres-R idea was 
 taken apart because it wouldn't scale, due to the gigantic amount of 
 synchronized group communication it would require. Again, it might be 
 that experimental features will cause more of the efforts to converge 

Re: [HACKERS] Proposal: Commit timestamp

2007-02-07 Thread Jan Wieck

On 2/7/2007 10:35 PM, Bruce Momjian wrote:

I find the term "logical proof of its correctness" too restrictive.  It
sounds like some formal academic process that really doesn't work well
for us.


Thank you.


Also, I saw the trigger patch with no explanation of why it was
important or who would use it --- that also isn't going to fly well.


You didn't respond to my explanation of how the current Slony 
implementation could improve and evolve using it. Are you missing 
something? I am discussing this very issue with our own QA department, 
and thus far, I think I have a majority of "would use a pg_trigger 
backpatched PostgreSQL" vs. "No, I prefer a system that knows exactly 
how it corrupted my system catalog."



As far as TOAST goes, there is no question in my mind that TOAST
development would happen the same way today as it did when we did it in
2001 --- "we have a problem, how can we fix it?"


Looking at what did happen back then and what happens in this case, I 
do see a difference. There were concerns about the compression 
algorithm used ... it is still today what the first incarnation was, 
and nobody ever bothered to investigate whether there could possibly 
be anything better. Do you think lzcompress is the best we can come up 
with? I don't! So why is it still the thing used? Maybe it is good 
enough?



Jan

--
#==#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.  #
#== [EMAIL PROTECTED] #



[HACKERS] Fwd: [webmaster] Its has been a great db experience with pg.

2007-02-07 Thread Robert Treat
Just thought this ought to be shared with everyone. :-)

--  Forwarded Message  --

Subject: [webmaster] Its has been a great db experience with pg.
Date: Monday 01 January 2007 01:46
From: [EMAIL PROTECTED]
To: [EMAIL PROTECTED]

Dear Sir,
We at Indian Express Newspapers (India), Online Dept., have been using
Postgres since 2002, and it has been a wonderful experience since then.
Our cost of db maintenance was kept low by Postgres, as programmers
easily learnt and implemented the system, and we never felt the need
for a dedicated DBA. Thanks for this wonderful DB; as users of this DB,
we feel proud to express that we use this cool open source.
Amit Kumar


---

-- 
Robert Treat
Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL



Re: [HACKERS] Proposal: Commit timestamp

2007-02-07 Thread Bruce Momjian
Jan Wieck wrote:
 On 2/7/2007 10:35 PM, Bruce Momjian wrote:
  I find the term "logical proof of its correctness" too restrictive.  It
  sounds like some formal academic process that really doesn't work well
  for us.
 
 Thank you.
 
  Also, I saw the trigger patch with no explanation of why it was
  important or who would use it --- that also isn't going to fly well.
 
 You didn't respond to my explanation of how the current Slony 
 implementation could improve and evolve using it. Are you missing 
 something? I am discussing this very issue with our own QA department, 
 and thus far, I think I have a majority of "would use a pg_trigger 
 backpatched PostgreSQL" vs. "No, I prefer a system that knows exactly 
 how it corrupted my system catalog."

No, I _now_ understand the use case, but when the patch was posted, the
use case was missing.  I would like to see a repost with the patch, and
a description of its use so we can all move forward on that.

  As far as TOAST goes, there is no question in my mind that TOAST
  development would happen the same way today as it did when we did it
  in 2001 --- "we have a problem, how can we fix it?"
 
 Looking at what did happen back then and what happens in this case, I 
 do see a difference. There were concerns about the compression 
 algorithm used ... it is still today what the first incarnation was, 
 and nobody ever bothered to investigate whether there could possibly 
 be anything better. Do you think lzcompress is the best we can come up 
 with? I don't! So why is it still the thing used? Maybe it is good enough?

It is simple/stupid enough, I would say, and the compression space is a
minefield of patents.

-- 
  Bruce Momjian  [EMAIL PROTECTED]  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [PATCHES] [HACKERS] elog(FATAL)ing non-existent roles during client

2007-02-07 Thread Bruce Momjian

Patch applied.  Thanks.

---


Gavin Sherry wrote:
 On Tue, 5 Dec 2006, Gavin Sherry wrote:
 
  On Thu, 30 Nov 2006, Tom Lane wrote:
 
   Gavin Sherry [EMAIL PROTECTED] writes:
I wonder if we should check if the role exists for the other
authentication methods too? get_role_line() should be very cheap and it
would prevent unnecessary authentication work if we did it before
contacting, for example, the client ident server. Even with trust, it
would save work because otherwise we do not check if the user exists
until InitializeSessionUserId(), at which time we've set up our proc
entry etc.
  
   This only saves work if the supplied ID is in fact invalid, which one
   would surely think isn't the normal case; otherwise it costs more.
 
  Yes.
 
   I could see doing this in the ident path, because contacting a remote
   ident server is certainly expensive on both sides.  I doubt it's a good
   idea in the trust case.
 
  Agreed. How about Kerberos too, applying the same logic?
 
 Attached is a patch which adds the checks.
 
 Gavin

[ Attachment, skipping... ]

 

-- 
  Bruce Momjian  [EMAIL PROTECTED]  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +
