Re: [HACKERS] PQnotifies() in 7.3 broken?

2002-12-15 Thread Kevin Brown
Bruce Momjian wrote:
 
 OK, I have updated the libpq major number in 7.3.X, and updated major
 and minor in HEAD. Do I need to increment the other interfaces that
 _use_ libpq, like ecpg?  I think so.

You'll only need to increment the major number of those other
interfaces that are no longer binary-compatible with their immediate
predecessors.

That is, if the version of libecpg that ships with 7.3.1 is
incompatible with the version of libecpg that ships with 7.3, then you
should bump its major number (the one that shipped with 7.3 should
have had its major number bumped if it was incompatible with the one
that shipped with 7.2.3).  But its binary interface should be
independent of the binary interface of libpq.

When you build a shared library, you link it against any shared
libraries it depends on.  The resulting dependency list is contained
within the shared library the same way it's contained in an
application.  You can run ldd on shared libraries just as you can on
applications.


It might be helpful to think of the major number as being the same as
a protocol description number.  You change the protocol description
number when you make changes to the protocol that would make a server
using the new version of the protocol incompatible with a client using
the old version of the protocol.

In fact, there's really very little conceptual difference between a
protocol and an API: both are a means for two entities to communicate
with one another.  Change the API and you have to do something to make
it clear that the API has changed.  Same deal with a protocol.
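
To make the analogy concrete, here's a minimal sketch (hypothetical
names, nothing from the actual PostgreSQL sources) of the rule that both
sonames and protocol versions follow: two parties interoperate only when
their major numbers match, while minor numbers may differ freely.

    /* The major number marks incompatible changes; the minor number
     * marks compatible additions. */
    #define PROTO_MAJOR 3
    #define PROTO_MINOR 1

    /* A client and server (or a library and its caller) can talk
     * only if their major numbers agree. */
    static int versions_compatible(int major_a, int major_b)
    {
        return major_a == major_b;
    }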


I don't know if what I'm saying here makes much sense to you, but I
hope it helps...



-- 
Kevin Brown   [EMAIL PROTECTED]




Re: [mail] Re: [HACKERS] Big 7.4 items - Replication

2002-12-15 Thread Al Sutton
Many thanks for the explanation. Could you explain to me how the ordering
of the writesets is determined for the following scenario?

If a transaction takes 50ms to reach one database from another, then for a
specific data element (called X) the following timeline occurs:

at 0ms, T1(X) is written to system A.
at 10ms, T2(X) is written to system B.

Where T1(X) and T2(X) conflict.

My concern is that if the Group Communication Daemon (gcd) is operating on
each database, a successful result for T1(X) will be returned to the client
talking to database A because T2(X) has not reached it, and thus no
conflict is known about; likewise, a successful result is returned to the
client submitting T2(X) to database B because it is not aware of T1(X).
This would mean the two clients believe both T1(X) and T2(X) completed
successfully, yet they cannot both have done so due to the conflict.

Thanks,

Al.

- Original Message -
From: Darren Johnson [EMAIL PROTECTED]
To: Al Sutton [EMAIL PROTECTED]
Cc: Bruce Momjian [EMAIL PROTECTED]; Jan Wieck
[EMAIL PROTECTED]; [EMAIL PROTECTED];
PostgreSQL-development [EMAIL PROTECTED]
Sent: Saturday, December 14, 2002 6:48 PM
Subject: Re: [mail] Re: [HACKERS] Big 7.4 items - Replication


 
 
 
 b) The Group Communication blob will consist of a number of processes
 which need to talk to all of the others to interrogate them for changes
 which may conflict with the current write that is being handled, and then
 issue the transaction response. This is basically the two phase commit
 solution with the phases moved into the group communication process.
 
 I can see the possibility of using solution b and having fewer group
 communication processes than databases as an attempt to simplify things,
 but this would mean the loss of a number of databases if the machine
 running the group communication process for the set of databases is lost.
 
 The group communication system doesn't just run on one system.  For
 postgres-r using spread, there is actually a spread daemon that runs on
 each database server.  It has nothing to do with detecting the conflicts.
 Its job is to deliver messages in a total order for writesets, or simple
 order for commits, aborts, joins, etc.

 The detection of conflicts will be done at the database level, by backend
 processes.  The basic concept is that if all databases get the writesets
 (changes) in the exact same order, apply them in a consistent order, and
 avoid conflicts, then one-copy serialization is achieved (one copy of the
 database replicated across all databases in the replica).

 I hope that explains the group communication system's responsibility.

 Darren


 









Re: [MLIST] Re: [mail] Re: [HACKERS] Big 7.4 items - Replication

2002-12-15 Thread David Walker
Another concern I have with multi-master systems is what happens if the
network splits in 2 so that 2 master systems are taking commits for 2
separate sets of clients.  It seems to me that re-syncing the 2 databases
upon the network healing would be a very complex, or even impossible, task.

On Sunday 15 December 2002 04:16 am, Al Sutton wrote:
 Many thanks for the explanation. Could you explain to me how the ordering
 of the writesets is determined for the following scenario?

 If a transaction takes 50ms to reach one database from another, then for a
 specific data element (called X) the following timeline occurs:

 at 0ms, T1(X) is written to system A.
 at 10ms, T2(X) is written to system B.

 Where T1(X) and T2(X) conflict.

 My concern is that if the Group Communication Daemon (gcd) is operating on
 each database, a successful result for T1(X) will be returned to the client
 talking to database A because T2(X) has not reached it, and thus no
 conflict is known about; likewise, a successful result is returned to the
 client submitting T2(X) to database B because it is not aware of T1(X).
 This would mean the two clients believe both T1(X) and T2(X) completed
 successfully, yet they cannot both have done so due to the conflict.

 Thanks,

 Al.

 - Original Message -
 From: Darren Johnson [EMAIL PROTECTED]
 To: Al Sutton [EMAIL PROTECTED]
 Cc: Bruce Momjian [EMAIL PROTECTED]; Jan Wieck
 [EMAIL PROTECTED]; [EMAIL PROTECTED];
 PostgreSQL-development [EMAIL PROTECTED]
 Sent: Saturday, December 14, 2002 6:48 PM
 Subject: Re: [mail] Re: [HACKERS] Big 7.4 items - Replication

  b) The Group Communication blob will consist of a number of processes
  which need to talk to all of the others to interrogate them for changes
  which may conflict with the current write that is being handled, and then
  issue the transaction response. This is basically the two phase commit
  solution with the phases moved into the group communication process.

  I can see the possibility of using solution b and having fewer group
  communication processes than databases as an attempt to simplify things,
  but this would mean the loss of a number of databases if the machine
  running the group communication process for the set of databases is lost.
 
  The group communication system doesn't just run on one system.  For
  postgres-r using spread, there is actually a spread daemon that runs on
  each database server.  It has nothing to do with detecting the conflicts.
  Its job is to deliver messages in a total order for writesets, or simple
  order for commits, aborts, joins, etc.

  The detection of conflicts will be done at the database level, by backend
  processes.  The basic concept is that if all databases get the writesets
  (changes) in the exact same order, apply them in a consistent order, and
  avoid conflicts, then one-copy serialization is achieved (one copy of the
  database replicated across all databases in the replica).
 
  I hope that explains the group communication system's responsibility.
 
  Darren
 
 
 
 
 
 






Re: [MLIST] Re: [mail] Re: [HACKERS] Big 7.4 items - Replication

2002-12-15 Thread Al Sutton
David,

This can be resolved by requiring that, for any transaction to succeed, the
entrypoint database must receive acknowledgements from n/2 + 0.5 (rounded up
to the nearest integer) databases, where n is the total number in the
replicant set. The following cases are shown as an example:

Total Number of databases: 2
Number required to accept transaction: 2

Total Number of databases: 3
Number required to accept transaction: 2

Total Number of databases: 4
Number required to accept transaction: 3

Total Number of databases: 5
Number required to accept transaction: 3

Total Number of databases: 6
Number required to accept transaction: 4

Total Number of databases: 7
Number required to accept transaction: 4

Total Number of databases: 8
Number required to accept transaction: 5

This would prevent two replicant sub-sets forming, because it is impossible
for both sets to have over 50% of the databases.
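
To make the arithmetic explicit: ceil(n/2 + 0.5) is just n/2 + 1 in
integer arithmetic. A minimal sketch (my own illustration, not code from
any patch) that reproduces the table above:

    #include <stdio.h>

    /* Smallest number of acknowledgements forming a strict majority of
     * n databases: ceil(n/2 + 0.5) == n/2 + 1 in integer math. */
    static int majority(int n)
    {
        return n / 2 + 1;
    }

    int main(void)
    {
        int n;

        for (n = 2; n <= 8; n++)
            printf("Total: %d, required: %d\n", n, majority(n));
        return 0;
    }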

Applications would be able to detect when a database has dropped out of the
replicant set because the database could report a state of "Unable to obtain
majority consensus". This would allow applications to differentiate between a
database that is out of the set, where writing to other databases in the set
could yield a successful result, and "Unable to commit due to conflict",
where trying other databases is pointless.

Al

- Original Message -
From: David Walker [EMAIL PROTECTED]
To: Al Sutton [EMAIL PROTECTED]; Darren Johnson
[EMAIL PROTECTED]
Cc: Bruce Momjian [EMAIL PROTECTED]; Jan Wieck
[EMAIL PROTECTED]; [EMAIL PROTECTED];
PostgreSQL-development [EMAIL PROTECTED]
Sent: Sunday, December 15, 2002 2:29 PM
Subject: Re: [MLIST] Re: [mail] Re: [HACKERS] Big 7.4 items - Replication


 Another concern I have with multi-master systems is what happens if the
 network splits in 2 so that 2 master systems are taking commits for 2
 separate sets of clients.  It seems to me that re-syncing the 2 databases
 upon the network healing would be a very complex, or even impossible, task.

 On Sunday 15 December 2002 04:16 am, Al Sutton wrote:
  Many thanks for the explanation. Could you explain to me how the ordering
  of the writesets is determined for the following scenario?
 
  If a transaction takes 50ms to reach one database from another, then for a
  specific data element (called X) the following timeline occurs:
 
  at 0ms, T1(X) is written to system A.
  at 10ms, T2(X) is written to system B.
 
  Where T1(X) and T2(X) conflict.
 
  My concern is that if the Group Communication Daemon (gcd) is operating on
  each database, a successful result for T1(X) will be returned to the client
  talking to database A because T2(X) has not reached it, and thus no
  conflict is known about; likewise, a successful result is returned to the
  client submitting T2(X) to database B because it is not aware of T1(X).
  This would mean the two clients believe both T1(X) and T2(X) completed
  successfully, yet they cannot both have done so due to the conflict.
 
  Thanks,
 
  Al.
 
  - Original Message -
  From: Darren Johnson [EMAIL PROTECTED]
  To: Al Sutton [EMAIL PROTECTED]
  Cc: Bruce Momjian [EMAIL PROTECTED]; Jan Wieck
  [EMAIL PROTECTED]; [EMAIL PROTECTED];
  PostgreSQL-development [EMAIL PROTECTED]
  Sent: Saturday, December 14, 2002 6:48 PM
  Subject: Re: [mail] Re: [HACKERS] Big 7.4 items - Replication
 
   b) The Group Communication blob will consist of a number of processes
   which need to talk to all of the others to interrogate them for changes
   which may conflict with the current write that is being handled, and then
   issue the transaction response. This is basically the two phase commit
   solution with the phases moved into the group communication process.

   I can see the possibility of using solution b and having fewer group
   communication processes than databases as an attempt to simplify things,
   but this would mean the loss of a number of databases if the machine
   running the group communication process for the set of databases is lost.
  
   The group communication system doesn't just run on one system.  For
   postgres-r using spread, there is actually a spread daemon that runs on
   each database server.  It has nothing to do with detecting the conflicts.
   Its job is to deliver messages in a total order for writesets, or simple
   order for commits, aborts, joins, etc.

   The detection of conflicts will be done at the database level, by backend
   processes.  The basic concept is that if all databases get the writesets
   (changes) in the exact same order, apply them in a consistent order, and
   avoid conflicts, then one-copy serialization is achieved (one copy of the
   database replicated across all databases in the replica).
  
   I hope that explains the group communication system's responsibility.
  
   Darren
  
  
  
  
  
  
 

Re: [HACKERS] PQnotifies() in 7.3 broken?

2002-12-15 Thread Peter Eisentraut
Bruce Momjian writes:

 Do I need to increment the other interfaces that
 _use_ libpq, like ecpg?

If and only if the libpq API is part of their documented API.  For ecpg I
think this is not the case, but for libpq++ it would seem to be the case.

-- 
Peter Eisentraut   [EMAIL PROTECTED]





Re: [HACKERS] Information schema now available

2002-12-15 Thread Peter Eisentraut
Hannu Krosing writes:

 Could you also post it somewhere as a plain SQL script for 7.3 ?

It depends on SQL features not present in 7.3.  But you can download it
from the CVS web interface and make the adjustments yourself if you want
to use it.

-- 
Peter Eisentraut   [EMAIL PROTECTED]





Re: [HACKERS] PQnotifies() in 7.3 broken?

2002-12-15 Thread Tom Lane
Peter Eisentraut [EMAIL PROTECTED] writes:
 Bruce Momjian writes:
 Do I need to increment the other interfaces that
 _use_ libpq, like ecpg?

 If and only if the libpq API is part of their documented API.  For ecpg I
 think this is not the case, but for libpq++ it would seem to be the case.

However, an app linked against libpq++ would also be linked against
libpq, and so the incompatibility will be flagged by the linker anyway.
I can see no need to bump libpq++'s own number.

regards, tom lane




Re: [mail] Re: [HACKERS] Big 7.4 items - Replication

2002-12-15 Thread Al Sutton
Jonathan,

How do the group communication daemons on system A and B agree that T2 is
after T1?

As I understand it, the operation is performed locally before being passed on
to the group for replication, so when T2 arrives at system B, system B has no
knowledge of T1 and can perform T2 successfully.

I am guessing that system B performs T2 locally, sends it to the group
communication daemon for ordering, and then receives it back from the group
communication order queue, after its position in the queue has been decided,
before it is written to the database.

This would indicate to me that there is a single central point which decides
that T2 is after T1.

Is this true?

Al.

- Original Message -
From: Jonathan Stanton [EMAIL PROTECTED]
To: Al Sutton [EMAIL PROTECTED]
Cc: Darren Johnson [EMAIL PROTECTED]; Bruce Momjian
[EMAIL PROTECTED]; Jan Wieck [EMAIL PROTECTED];
[EMAIL PROTECTED]; PostgreSQL-development
[EMAIL PROTECTED]
Sent: Sunday, December 15, 2002 5:00 PM
Subject: Re: [mail] Re: [HACKERS] Big 7.4 items - Replication


 The total order provided by the group communication daemons guarantees
 that every member will see the transactions/writesets in the same order.
 So both A and B will see that T1 is ordered before T2 BEFORE writing
 anything back to the client. So for both servers T1 will be completed
 successfully, and T2 will be aborted because of conflicting writesets.

 Jonathan

 On Sun, Dec 15, 2002 at 10:16:22AM -, Al Sutton wrote:
  Many thanks for the explanation. Could you explain to me how the ordering
  of the writesets is determined for the following scenario?
 
  If a transaction takes 50ms to reach one database from another, then for a
  specific data element (called X) the following timeline occurs:
 
  at 0ms, T1(X) is written to system A.
  at 10ms, T2(X) is written to system B.
 
  Where T1(X) and T2(X) conflict.
 
  My concern is that if the Group Communication Daemon (gcd) is operating on
  each database, a successful result for T1(X) will be returned to the client
  talking to database A because T2(X) has not reached it, and thus no
  conflict is known about; likewise, a successful result is returned to the
  client submitting T2(X) to database B because it is not aware of T1(X).
  This would mean the two clients believe both T1(X) and T2(X) completed
  successfully, yet they cannot both have done so due to the conflict.
 
  Thanks,
 
  Al.
 
  - Original Message -
  From: Darren Johnson [EMAIL PROTECTED]
  To: Al Sutton [EMAIL PROTECTED]
  Cc: Bruce Momjian [EMAIL PROTECTED]; Jan Wieck
  [EMAIL PROTECTED]; [EMAIL PROTECTED];
  PostgreSQL-development [EMAIL PROTECTED]
  Sent: Saturday, December 14, 2002 6:48 PM
  Subject: Re: [mail] Re: [HACKERS] Big 7.4 items - Replication
 
 
   
   
   
    b) The Group Communication blob will consist of a number of processes
    which need to talk to all of the others to interrogate them for changes
    which may conflict with the current write that is being handled, and then
    issue the transaction response. This is basically the two phase commit
    solution with the phases moved into the group communication process.

    I can see the possibility of using solution b and having fewer group
    communication processes than databases as an attempt to simplify things,
    but this would mean the loss of a number of databases if the machine
    running the group communication process for the set of databases is lost.
   
    The group communication system doesn't just run on one system.  For
    postgres-r using spread, there is actually a spread daemon that runs on
    each database server.  It has nothing to do with detecting the conflicts.
    Its job is to deliver messages in a total order for writesets, or simple
    order for commits, aborts, joins, etc.

    The detection of conflicts will be done at the database level, by backend
    processes.  The basic concept is that if all databases get the writesets
    (changes) in the exact same order, apply them in a consistent order, and
    avoid conflicts, then one-copy serialization is achieved (one copy of the
    database replicated across all databases in the replica).
  
   I hope that explains the group communication system's responsibility.
  
   Darren
  
  
   
  
  
  
 
 
 

 --
 ---
 Jonathan R. Stanton [EMAIL PROTECTED]
 Dept. of Computer Science
 Johns Hopkins University
 ---





Re: [HACKERS] [GENERAL] PostgreSQL Global Development Group

2002-12-15 Thread cbbrowne
Kevin Brown wrote:
 Devrim GÜNDÜZ wrote:
  I do NOT like hearing about MySQL in this (these) list(s).
  
  PostgreSQL is not in the same category as MySQL. MySQL is for
  *dummies*, not database admins. I do not even call it a database. I
  have never forgotten my data loss 2.5 years ago, when I had used MySQL
  for just 2 months!!!
 
 I think you're on to something here, but it's obscured by the way you
 said it.
 
 There's no question in my mind that PostgreSQL is superior in almost
 every way to MySQL.  For those of us who are technically minded, it
 boggles the mind that people would choose MySQL over PostgreSQL.  Yet
 they do.  And it's important to understand why.
 
 Simply saying MySQL has better marketing isn't enough.  It's too
 simple an answer and obscures some issues that should probably be
 addressed.

I think it /is/ a significant factor, the point being that the MySQL company
has been quite activist in pressing MySQL as "the answer", to the point that
there's a development strategy called LAMP (Linux + Apache + MySQL +
(Perl|Python|PHP)).

 People use MySQL because it's very easy to set up, relatively easy to
 maintain (when something doesn't go wrong, that is), is very well
 documented and supported, and is initially adequate for the task they
 have in mind (that the task may change significantly such that MySQL
 is no longer adequate is something only those with experience will
 consider).

... And the consistent marketing pressure that in essence claims:

 - It's easier to use than any alternative;
 - It's much faster than any other DBMS;
 - It's plenty powerful and robust enough.

As near as I can tell, /none/ of these things are true outside of very 
carefully selected application domains.  But the claims have been presented 
enough times that people actually believe them to be true.

 PostgreSQL has come a long way and, with the exception of a few minor
 things (the need to VACUUM, for instance.  The current version makes
 the VACUUM requirement almost a non-issue as regards performance and
 availability, but it really should be something that the database
 takes care of itself), is equivalent to MySQL in the above things
 except for documentation and support.

I would point to a third thing: tools to support hands-off administration.
My web hosting provider has a set of tools to let me administer various
aspects of my site, complete with a pretty GUI, that covers:
 - Configuring email accounts, including mailing lists, Spam Assassin, and 
such;
 - Configuring subdomains;
 - Managing files/directories, doing backups;
 - Apache configuration;
 - Cron jobs;
 - A couple of shopping cart systems;
 - A chat room system;
 - Last, but certainly not least, the ability to manage MySQL databases.

There is no canned equivalent for PostgreSQL, which means that ISPs that 
don't have people with DBMS expertise will be inclined to prefer MySQL.  It's 
a better choice for them.

 MySQL's documentation is very, very good.  My experience with it is
 that it's possible, and relatively easy, to find information about
 almost anything you might need to know.
 
 PostgreSQL's documentation is good, but not quite as good as MySQL's.
 It's not quite as complete.  For instance, I didn't find any
 documentation at all in the User's Guide or Administrator's Guide on
 creating tables (if I missed it, then that might illustrate that the
 documentation needs to be organized slightly differently).  I did find
 a little in the tutorial (about the amount that you'd want in a
 tutorial), but to find out more I had to go to the SQL statement
 reference (in my case I was looking for the means by which one could
 create a constraint on a column during table creation time).
 
 The reason this is important is that the documentation is *the* way
 people are going to learn the database.  If it's too sparse or too
 disorganized, people who don't have a lot of time to spend searching
 through the documentation for something may well decide that a
 different product (such as MySQL) would suit their needs better.
 
 The documentation for PostgreSQL improves all the time, largely in
 response to comments such as this one, and that's a very good thing.
 My purpose in bringing this up is to show you what PostgreSQL is up
 against in terms of widespread adoption.

That's probably pretty fair.  I'm using the word "fair" advisedly, too.

If someone objects, saying that PostgreSQL docs /are/ good, keep in mind that 
new users are not mandated to be fair about this.  If they have trouble 
finding what they were looking for, they couldn't care less that you think the 
docs are pretty good: /they/ didn't find what /they/ were looking for, and 
that's all they care about.

  If we want to sell PostgreSQL, we should talk about, maybe, Oracle.
  I have never took care of what MySQL said. I just know that I have been
  running PostgreSQL for 2.5 years and I only stopped it JUST before
  upgrades of PostgreSQL. It's just *working*; which is unfamiliar 

Re: [HACKERS] PQnotifies() in 7.3 broken?

2002-12-15 Thread Jeroen T. Vermeulen
On Sun, Dec 15, 2002 at 11:21:07AM -0500, Tom Lane wrote:
 
 However, an app linked against libpq++ would also be linked against
 libpq, and so the incompatibility will be flagged by the linker anyway.
 I can see no need to bump libpq++'s own number.

Plus, of course, libpq++ being a C++ library will break compatibility at
some compiler upgrades--which isn't even necessarily visible to anyone
compiling a postgres release.


Jeroen





Re: [mail] Re: [HACKERS] Big 7.4 items - Replication

2002-12-15 Thread Al Sutton
Jonathan,

Many thanks for clarifying the situation some more. With token passing, I
have the following concerns:

1) What happens if a server holding the token should die whilst it is in
possession of the token?

2) If I have n servers, and the time to pass the token between each server
is x milliseconds, I may have to wait up to n times x milliseconds for a
transaction to be processed. If a server is limited to a single transaction
per possession of the token (in order to ensure no system hogs the token),
and the server develops a queue of length y, I will have to wait n times x
times y for the transaction to be processed.  I believe neither scenario
would scale well beyond a small set of servers with low network latency
between them.

If we consider the following situation, I can illustrate why I'm still in
favour of a two-phase commit:

Imagine, for example, credit card details about the status of an account
replicated in real time between databases in London, Moscow, Singapore,
Sydney, and New York. If any server can talk to any other server with a
guaranteed packet transfer time of 150ms, a two-phase commit could complete
in 600ms in its worst case (assuming that the two phases consist of
request/response pairs, and that each server talks to all the others in
parallel). A token passing system may have to wait for the token to pass
through every other server before reaching the one that has the transaction
committed to it, which could take about 750ms.

If you then expand the network to allow for a primary and a disaster recovery
database at each location, the two-phase commit still maintains its 600ms
response time, but the token passing system doubles to 1500ms.
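
For what it's worth, here's the back-of-the-envelope arithmetic behind
those figures (my sketch, under the assumptions stated above):

    #include <stdio.h>

    int main(void)
    {
        int hop_ms = 150;  /* guaranteed one-way transfer time      */
        int sites = 5;     /* London, Moscow, Singapore, Sydney, NY */

        /* Two-phase commit: two request/response rounds run in
         * parallel, i.e. four one-way delays, regardless of the
         * number of sites. */
        printf("2PC worst case:      %d ms\n", 4 * hop_ms);

        /* Token ring: in the worst case the token visits every other
         * site before reaching the one holding the transaction. */
        printf("token worst case:    %d ms\n", sites * hop_ms);

        /* Doubling the sites (primary + disaster recovery at each
         * location) doubles the token's round trip. */
        printf("token with DR sites: %d ms\n", 2 * sites * hop_ms);
        return 0;
    }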

Allowing disjointed segments to continue executing is also a concern because
any split in the replication group could effectively double the accepted
card limit for any card holder should they purchase items from various
locations around the globe.

I can see an idea that the token may be passed to the system with the most
transactions in a wait state, but this would cause low-volume databases to
lose out on response times to higher-volume ones, which is, again,
undesirable.

Al.

- Original Message -
From: Jonathan Stanton [EMAIL PROTECTED]
To: Al Sutton [EMAIL PROTECTED]
Cc: Darren Johnson [EMAIL PROTECTED]; Bruce Momjian
[EMAIL PROTECTED]; Jan Wieck [EMAIL PROTECTED];
[EMAIL PROTECTED]; PostgreSQL-development
[EMAIL PROTECTED]
Sent: Sunday, December 15, 2002 9:17 PM
Subject: Re: [mail] Re: [HACKERS] Big 7.4 items - Replication


 On Sun, Dec 15, 2002 at 07:42:35PM -, Al Sutton wrote:
  Jonathan,
 
  How do the group communication daemons on system A and B agree that T2
is
  after T1?,

 Lets split this into two separate problems:

 1) How do the daemons totally order a set of messages (abstract
 messages)

 2) How do database transactions get split into writesets that are sent
 as messages through the group communication system.

 As to question 1, the set of daemons (usually one running on each
 participating server) runs a distributed ordering algorithm, as well as
 distributed algorithms to provide message reliability, fault detection,
 and membership services. These are completely distributed algorithms; no
 central controller node exists, so even if network partitions occur
 the group communication system keeps running and providing ordering and
 reliability guarantees to messages.

 A number of different algorithms exist for providing a total order on
 messages. Spread currently uses a token algorithm that involves passing a
 token between the daemons, with a counter attached to each message, but
 other algorithms exist and we have implemented some other ones in our
 research. You can find lots of details in the papers at
 www.cnds.jhu.edu/publications/ and www.spread.org.
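
 A minimal sketch of the token idea as just described (hypothetical
 structures only; the real protocol also handles retransmission,
 membership, and flow control, see the papers above):

     /* The token circulates among the daemons carrying a global
      * counter; whichever daemon holds it stamps its pending messages,
      * so every message gets a unique, totally ordered sequence number. */
     typedef struct { long next_seq; } Token;
     typedef struct { long seq; const char *payload; } Message;

     /* Called by a daemon while it holds the token, once per pending
      * message, before passing the token to the next daemon. */
     static void stamp_message(Token *tok, Message *msg)
     {
         msg->seq = tok->next_seq++;  /* position in the total order */
     }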

 As to question 2, there are several different approaches to how to use
 such a total order for actual database replication. They all use the gcs
 total order to establish a single sequence of events that all the
 databases see. Then each database can act on the events as they are
 delivered by the gcs and be guaranteed that no other database will see a
 different order.

 In the postgres-R case, the action received from a client is performed
 partially at the originating postgres server; the writesets are then
 sent through the gcs to order them and determine conflicts. Once they
 are delivered back, if no conflicts occurred in the meantime, the
 original transaction is completed and the result returned to the client.
 If a conflict occurred, the original transaction is rolled back and
 aborted, and the abort is returned to the client.
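
 In outline, the decision each server makes when a writeset is delivered
 back might look like this (a sketch of the logic just described, not
 actual Postgres-R code; every name here is hypothetical):

     typedef struct Writeset Writeset;  /* stand-in for the internals */

     extern int  conflicts_with_earlier(const Writeset *ws);
     extern void commit_local(Writeset *ws);
     extern void abort_local(Writeset *ws);
     extern void reply_to_client(const Writeset *ws, int ok);

     /* Runs in total-order delivery sequence on every server alike. */
     static void on_writeset_delivered(Writeset *ws, int originated_here)
     {
         int ok = !conflicts_with_earlier(ws);

         if (ok)
             commit_local(ws);  /* apply the writeset in delivery order */
         else
             abort_local(ws);   /* roll back the original transaction */

         if (originated_here)
             reply_to_client(ws, ok);  /* only the origin answers */
     }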

 
  As I understand it, the operation is performed locally before being
  passed on to the group for replication, so when T2 arrives at system B,
  system B has no knowledge of T1 and can perform T2 successfully.
 
  I am guessing that system B performs T2 locally, sends it to the group
  

Re: [HACKERS] [GENERAL] PostgreSQL Global Development Group

2002-12-15 Thread Christopher Kings-Lynne
 You can't sell into the ISP appliance market until there's something as
 ubiquitous as PHPMyAdmin for PostgreSQL.  And note that the ISP appliance
 market only cares about this in a very indirect way.  They don't actually
 use the database; their /customers/ do.  And their customers are likely to
 be fairly unsophisticated souls who will use whatever database is given to
 them.

Hey!  What about phpPgAdmin?

We're actually working on a next-generation version atm, which is a total
rewrite featuring:

1. modern PHP
2. register_globals off, full error checking
3. themeable
4. easy support for all versions
5. etc.

However, even with repeated calls for developers, it's just me and Rob
Treat!

phpPgAdmin does not work with 7.3, so this is an increasingly important
project.

Anyone wanna help? :)

http://phppgadmin.sourceforge.net/

Maybe we should move to gborg?

Chris






[HACKERS] PageGetMaxOffsetNumber

2002-12-15 Thread Alvaro Herrera
Hello hackers,

I wonder what the PageGetMaxOffsetNumber macro is supposed to do (in a
btree index page)?

The scenario is the following:  I need to get the pointer to a btree
page (leaf or internal) in its parent page, to call _bt_itemdel on it.
The only thing I can think of is to use its parent link and then iterate
over every OffsetNumber to see which item points down to the page I'm
looking for.  I think the gotcha here is that the parent page can split
and I may have to walk right in order to find the new parent (Is this
right?  Can the parent go someplace else?)

I iterate over the elements of the parent page in a for loop, and the
upper bound is rarely reached because the item is found.  However
sometimes the item isn't found, and PageGetItem fails its assertion
because the item isn't used (LP_USED).  I have found that
PageGetMaxOffsetNumber (the upper bound) returns a consistent value
that's far too high (4294967291, i.e. 0xFFFFFFFB), and therefore the for loop
eventually falls out of bounds.
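
For reference, the macro (approximately as defined in
src/include/storage/bufpage.h; quoting from memory, so check your tree)
just derives the line-pointer count from the page header, which means it
must be handed the Page itself:

    /* Number of line pointers on the page, computed from pd_lower.
     * Passing anything other than a Page (e.g. the special-space
     * pointer) makes it read garbage. */
    #define PageGetMaxOffsetNumber(page) \
        ((((PageHeader) (page))->pd_lower - SizeOfPageHeaderData) \
         / sizeof(ItemIdData))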


The btree freelist patch is almost ready (which means it works in the
trivial cases I've tested and there are some corner cases I haven't
even covered); however, I am stuck on this.  Can anyone shed some light?
Maybe there's another way to get at the pointer to a page in its parent?

-- 
Alvaro Herrera (alvherre[a]dcc.uchile.cl)
Crear es tan dificil como ser libre (Elsa Triolet)




Re: [HACKERS] PageGetMaxOffsetNumber

2002-12-15 Thread Alvaro Herrera
On Sun, Dec 15, 2002 at 11:49:57PM -0300, Alvaro Herrera wrote:

 I iterate over the elements of the parent page in a for loop, and the
 upper bound is rarely reached because the item is found.  However
 sometimes the item isn't found, and PageGetItem fails its assertion
 because the item isn't used (LP_USED).  I have found that
 PageGetMaxOffsetNumber (the upper bound) returns a consistent value
 that's far too high (4294967291, 0xFFFB) and therefore the for loop
 eventually falls out of bounds.

FWIW, the code that is supposed to do this (and appears to work fine in
most cases) is the following.  buf is the page I'm going to free and
pblkno is the BlockNumber of its parent as seen in btpo_parent.

/*
 * Delete the pointer to a child page.  If the parent page is empty after
 * the deletion, delete the pointer from its parent too.
 */
static void
_bt_deletefromparent(Relation rel, BlockNumber pblkno, Buffer buf)
{
    Buffer          pbuf;
    OffsetNumber    offnum;
    OffsetNumber    max;
    Page            ppage;
    BTPageOpaque    pop;
    BlockNumber     blkno = BufferGetBlockNumber(buf);

    pbuf = _bt_getbuf(rel, pblkno, BT_WRITE);
    Assert(!BufferIsInvalid(pbuf));

    ppage = BufferGetPage(pbuf);
    pop = (BTPageOpaque) PageGetSpecialPointer(ppage);

    /*
     * Repeat until the correct parent page is found.  Splits may
     * cause the parent page to move right.
     */
    for (;;)
    {
        BlockNumber next;

        /* Make sure no one else tries to look at the page. */
        LockBuffer(pbuf, BUFFER_LOCK_UNLOCK);
        LockBufferForCleanup(pbuf);
        max = PageGetMaxOffsetNumber(pop);

        /*
         * Look at every offset of the page for the item pointing to the
         * dead page.
         */
        for (offnum = FirstOffsetNumber; offnum <= max; offnum++)
        {
            BTItem      item;
            ItemPointer iptr;

            item = (BTItem) PageGetItem(ppage, PageGetItemId(ppage, offnum));
            iptr = &(item->bti_itup.t_tid);
            if (ItemPointerGetBlockNumber(iptr) == blkno)
            {
                /* Ok, we found the page.  Now delete the pointer. */
                ItemPointer iptrd = palloc(SizeOfIptrData);

                ItemPointerSet(iptrd, pblkno, offnum);
                _bt_itemdel(rel, pbuf, iptrd);
                _bt_wrtnorelbuf(rel, pbuf);
                LockBuffer(pbuf, BUFFER_LOCK_UNLOCK);

                /*
                 * If the parent page is empty after this deletion,
                 * mark it dead and free it too.
                 */
                if (_bt_pageisempty(ppage))
                {
                    pop->btpo_flags |= BTP_DEAD;
                    _bt_processdead(rel, pbuf);
                }
                ReleaseBuffer(pbuf);
                return;
            }
        }

        /*
         * If we just finished scanning the rightmost page of the upper
         * level, something is wrong.
         */
        if (P_RIGHTMOST(pop))
            elog(ERROR, "Unable to find parent page!");

        /* Oops, the parent was split.  Check its right sibling. */
        next = pop->btpo_next;
        _bt_relbuf(rel, pbuf);
        pbuf = _bt_getbuf(rel, next, BT_WRITE);
        ppage = BufferGetPage(pbuf);
        pop = (BTPageOpaque) PageGetSpecialPointer(ppage);
    }
}

-- 
Alvaro Herrera (alvherre[a]dcc.uchile.cl)
Aprende a avergonzarte mas ante ti que ante los demas (Democrito)




[HACKERS] about table function

2002-12-15 Thread Jinqiang Han
hello, Joe Conway. 

It seems the crosstab function you wrote in contrib/tablefunc doesn't work.
It says "A column definition list is required for functions returning RECORD".
How can I make it work?
Thank you very much.

Best regards,

Jinqiang Han








Re: [HACKERS] about table function

2002-12-15 Thread Joe Conway
Jinqiang Han wrote:
 It seems the crosstab function you wrote in contrib/tablefunc doesn't work.
 It says "A column definition list is required for functions returning RECORD".
 How can I make it work?

From contrib/tablefunc/README.tablefunc:

Outputs

  Returns setof record, which must be defined with a column definition
  list in the FROM clause of the SELECT statement, e.g.:

SELECT *
FROM crosstab(sql, 2)
AS ct(row_name text, category_1 text, category_2 text);

the example crosstab function produces a set something like:

            == values columns ==
   row_name | category_1 | category_2
  ----------+------------+------------
   row1     | val1       | val2
   row2     | val5       | val6


You are apparently forgetting the
  AS ct(row_name text, category_1 text, category_2 text)
part.

hope this helps,

Joe





Re: [HACKERS] PQnotifies() in 7.3 broken?

2002-12-15 Thread Greg Copeland
But it's something they should have already had to do.  We're just paying
late for old sins.  ;)

Greg


On Thu, 2002-12-12 at 23:34, Bruce Momjian wrote:
 Tom Lane wrote:
  Bruce Momjian [EMAIL PROTECTED] writes:
   OK, so what do we do with 7.3.1.  Increment major or minor?
  
  Major.  I thought you did it already?
 
 I did only minor, which I knew was safe.  Do folks realize this will
 require recompile of applications by 7.3 users moving to 7.3.1?  That
 seems very drastic, and there have been very few problem reports about
 the NOTIFY change.
-- 
Greg Copeland [EMAIL PROTECTED]
Copeland Computer Consulting





Re: [mail] Re: [HACKERS] Big 7.4 items - Replication

2002-12-15 Thread Jan Wieck
Darren Johnson wrote:

 The group communication system doesn't just run on one system.  For
 postgres-r using spread

The reason why group communication software is used is simply because
this software is designed with two goals in mind:

1) optimize bandwidth usage

2) make many-to-many communication easy

Number one is done by utilizing things like multicasting where
available.

Number two is done by using global scoped queues.

I add this only to avoid reading that pushing some PITR log snippets via
FTP or worse over a network would do the same. It did not in the past,
it does not now, and it will not in the future.


Jan

-- 

#==#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.  #
#== [EMAIL PROTECTED] #




Re: [HACKERS] [GENERAL] PostgreSQL Global Development Group

2002-12-15 Thread Kevin Brown
[EMAIL PROTECTED] wrote:
 Kevin Brown wrote:
  Simply saying MySQL has better marketing isn't enough.  It's too
  simple an answer and obscures some issues that should probably be
  addressed.
 
 I think it /is/ a significant factor, the point being that the MySQL company
 has been quite activist in pressing MySQL as "the answer", to the point that
 there's a development strategy called LAMP (Linux + Apache + MySQL +
 (Perl|Python|PHP)).

Oh, I'll certainly not dispute that marketing has had a significant
effect, but I don't think it's the only reason for MySQL's success.

History has a lot to do with it, because it's through history that
momentum gets built up, as it has with MySQL.

  People use MySQL because it's very easy to set up, relatively easy to
  maintain (when something doesn't go wrong, that is), is very well
  documented and supported, and is initially adequate for the task they
  have in mind (that the task may change significantly such that MySQL
  is no longer adequate is something only those with experience will
  consider).
 
 ... And the consistent marketing pressure that in essence claims:
 
  - It's easier to use than any alternative;
  - It's much faster than any other DBMS;
  - It's plenty powerful and robust enough.
 
 As near as I can tell, /none/ of these things are true outside of very 
 carefully selected application domains.  But the claims have been presented 
 enough times that people actually believe them to be true.

I agree with you -- now.  But the situation as it is now has not always
been so.  Consider where PostgreSQL was 4 years ago.  I believe it
was at version 6 at that time, if I remember correctly.  And as I
recall, many people had very significant issues with it in the key
areas of performance and reliability.  Now, I didn't experience these
things firsthand because I wasn't using it at the time, but it is the
general impression I got when reading the accounts of people who
*were* using it.

MySQL at the time wasn't necessarily any more reliable, but it had one
thing going for it that PostgreSQL didn't: myisamchk.  Even if the
database crashed, you stood a very good chance of being able to
recover your data without having to restore from backups.  PostgreSQL
didn't have this at all: either you had to be a guru with the
PostgreSQL database format or you had to restore from backups.  That
meant that *in practice* MySQL was easier to maintain, even if it crashed
more often than PostgreSQL, because the amount of administrative effort
to deal with a MySQL crash was so much less.

  PostgreSQL has come a long way and, with the exception of a few minor
  things (the need to VACUUM, for instance.  The current version makes
  the VACUUM requirement almost a non-issue as regards performance and
  availability, but it really should be something that the database
  takes care of itself), is equivalent to MySQL in the above things
  except for documentation and support.
 
 I would point to a third thing: Tools to support hands-off
 administration.  My web hosting provider has a set of tools to let
 me administer various aspects of my site complete with pretty GUI
 that covers:

  - Configuring email accounts, including mailing lists, Spam
Assassin, and such;
  - Configuring subdomains;
  - Managing files/directories, doing backups;
  - Apache configuration;
  - Cron jobs;
  - A couple of shopping cart systems;
  - A chat room system;
  - Last, but certainly not least, the ability to manage MySQL
databases.
 
 There is no canned equivalent for PostgreSQL, which means that
 ISPs that don't have people with DBMS expertise will be inclined to
 prefer MySQL.  It's a better choice for them.

This is true, but the only way to combat that is to get PostgreSQL
more widely deployed.  Network effects such as that are common in the
computing world, so it doesn't come as much surprise that the most
popular database engine in the webhosting world is the best supported
one for that role.

It's only because of the relative popularity of MySQL that it has so
much support.  The only way to grow PostgreSQL's popularity is to get
it deployed in situations where the tools available for it are
sufficient.

  But you're mistaken if you believe that MySQL isn't competition for
  PostgreSQL.  It is, because it serves the same purpose: a means of
  storing information in an easily retrievable way.
 
 Indeed.  People with modest data storage requirements that came in
 with /no/ comprehension of what a relational database is may find
 the limited functionality of MySQL perfectly reasonable for their
 purposes.

This is true, but the biggest problem is that the requirements of a
project often balloon over time, and the demands on the database
backend will also tend to increase.  Because MySQL is rather limited
in its functionality, it doesn't take much until you'll be forced to
use a different database backend.

This is why I view PostgreSQL as a much wiser choice in almost all
cases where you