date:20040517

Re: [HACKERS] email data type first release

2004-05-17 Thread Greg Stark


Gaetano Mendola [EMAIL PROTECTED] writes:

 Comments are welcomed.

Well as long as you're asking...

Email domains are case insensitive, but the left hand side is case sensitive.
That's the only part that's hard to handle using a text data type, it would be
kind of neat if the email operators got it right.

Another thing is that it might make more sense to sort email addresses by
domain first (case insensitively of course), then by left hand side (case
sensitively). Since the domain is really the most significant bit. This is
also convenient for many systems like email since they perform better when
they can handle data in that order.

Note that this would make the type sort differently from its text
representation. This shouldn't really be a problem but occasionally you see
poorly written queries that introduce extra type conversions that the user
doesn't expect. But then if it behaves just like the text datatype then there
wouldn't be much point in using it.


-- 
greg


---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org

[HACKERS] Clean-up callbacks for non-SR functions

2004-05-17 Thread James William Pye

Greets,

Is there a way for me to clean up fn_extra in flinfo when the function is not
a Set Returning Function(SRF)? 

I know I can use RegisterExprContextCallback and the RSI's econtext to register a 
callback
for SRFs, but this--or similar functionality--does not appear to be available for 
non-SRFs.

Am I missing something?

(This question comes from trying to keep non-SRF generator support in PL/Py
from leaking all over the floor..)

Regards,
James William Pye


pgpLEIU51G2Ic.pgp
Description: PGP signature

Re: [HACKERS] Rough draft for Unicode-aware UPPER()/LOWER()/INITCAP()

2004-05-17 Thread Marko Karppinen

Tatsuo Ishii wrote:
initdb could even emit a warning if the --encoding option was
used without also specifying --no-locale.
Please don't do that. Most Asian chasets does not work with locale
enabled PostgreSQL installation. i.e. it returns WRONG SELECT
results. I've been telling this to Japanese users for hundreds of
times when they ask me why SELECT returns results. If that kind of
wanings are installed, I think more Japanese users will be confused.
You've advocated a default of --no-locale yourself for this reason.
If using a Japanese --encoding setting without --no-locale emitted
a warning suggesting the use of --no-locale, I'd imagine you wouldn't
have had to give human support to most of those hundreds of people?
Wouldn't that be a halfway point to your goal?
mk
---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]

Re: [HACKERS] Call for 7.5 feature completion

2004-05-17 Thread Hans-Jürgen Schönig

Marc G. Fournier wrote:
On Mon, 17 May 2004, Bruce Momjian wrote:

Marc G. Fournier wrote:
Agreed, but you are a me too, not a huge percentage of our userbase.
How do you know?  Have you polled our complete userbase?

Basically, after 6-7 months of development, I want more than a vacuum
patch and a new cache replacement policy.  I want something big, in
fact, several big things.
Most likely won't happen, since what is considered big by you isn't
necessarily what is considered big by someone else ... as Hannu, and I
believe, Jan, have so far pointed out to you ...
I can't poll for everything.  I make my own educated guesses.

Based on what though?
All the clients that I deal with on a daily basis generally care about is
performance ... that is generally what they upgrade for ... so, my
'educated guess' based on real world users is that Win32, PITR and nested
transactions are not important ... tablespaces, I have one client that has
asked about something *similar* to it, but tablespaces, for him, doesn't
come close to what they would like to see ...
So, my 'educated guess' is different then yours is ... does that make
yours wrong?  Nope ... just means we have different sample sets to work
with ...

Interesting.
We have made COMPLETELY different experiences.
There is one question people ask me daily: When can we have sychronous 
replication and PITR?.
Performance is not a problem here. People are more interested in 
stability and enterprise features such as those I have mentioned above.

I am still wondering about two things:
Somebody has posted a 2PC patch - I haven't seen too many comments
Somebody has posted sync multimaster replication (PgCluster) - nobody 
has commented on that. Maybe I am the only one who has ever tried it ...

Most likely this is not very encourageing for the developers involved ...
Regards,
Hans
--
Cybertec Geschwinde u Schoenig
Schoengrabern 134, A-2020 Hollabrunn, Austria
Tel: +43/720/10 1234567 or +43/664/233 90 75
www.cybertec.at, www.postgresql.at, kernel.cybertec.at
---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]

Re: [HACKERS] enabling tcpip_socket by default

2004-05-17 Thread Tatsuo Ishii

  Tatsuo Ishii wrote:
  Is there any security risk if we enable tcpip_socket by default? We
  restrict connection from localhost only by default so I think enabling
  tcpip_socket adds no security risk. Please correct me if I am wrong.
 
 Bruce Momjian wrote:
  Right, and 7.5 will ship with tcp and localhost enabled.
 
 If the default will be to listen on all interfaces, not just 127.0.0.1,
 then this IS a security risk. And if that's not the plan, what good does
 this change do? Any real use of tcp would still require a 
 configuration
 change anyway.

Consider a program using JDBC on localhost. It can only reach to
PostgreSQL via TCP/IP.
--
Tatsuo Ishii

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]

Re: [HACKERS] enabling tcpip_socket by default

2004-05-17 Thread Philip Yarra

On Mon, 17 May 2004 05:29 pm, Marko Karppinen wrote:
 If the default will be to listen on all interfaces, not just 127.0.0.1,
 then this IS a security risk. And if that's not the plan, what good does
 this change do? Any real use of tcp would still require a 
 configuration
 change anyway.

From what I have read today, localhost is the only one planned to be on by 
default.

Listening on a TCP/IP socket on localhost will allow JDBC connections to work 
out of the box (this exact problem tripped me up, and I'm sure this is the 
case for other users). 

I suspect other cross-platform APIs (PHP for example) need TCP/IP sockets to 
connect too. 

Regards, Philip.

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly

Re: [HACKERS] enabling tcpip_socket by default

2004-05-17 Thread Marko Karppinen

On 17. touko 2004, at 10:40, Tatsuo Ishii wrote:
Consider a program using JDBC on localhost. It can only reach to
PostgreSQL via TCP/IP.
Ah! Of course. That makes sense, and listening on 127.0.0.1 never
hurt anyone (except, of course, the tinfoil hat crowd nmapping
localhost in a frenzy...)
mk
---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster

Re: [HACKERS] enabling tcpip_socket by default

2004-05-17 Thread Marko Karppinen

Tatsuo Ishii wrote:
Is there any security risk if we enable tcpip_socket by default? We
restrict connection from localhost only by default so I think enabling
tcpip_socket adds no security risk. Please correct me if I am wrong.
Bruce Momjian wrote:
Right, and 7.5 will ship with tcp and localhost enabled.
If the default will be to listen on all interfaces, not just 127.0.0.1,
then this IS a security risk. And if that's not the plan, what good does
this change do? Any real use of tcp would still require a 
configuration
change anyway.

Listening on public network interfaces by default would multiply by
orders of magnitude the number of machines vulnerable to potential
future remote exploits.
I gather that the pre-authentication code paths are pretty well known,
and that the chances of such an attack are slim. Nevertheless I cannot
help but note that it is exactly this default setting that caused
Microsoft SQL Server to lose a big, big chunk of its reputation, and
gain notoriety as a launchpad for Windows worms.
mk
---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]

Re: [HACKERS] enabling tcpip_socket by default

2004-05-17 Thread Peter Eisentraut

Marko Karppinen wrote:
 If the default will be to listen on all interfaces, not just
 127.0.0.1, then this IS a security risk. And if that's not the plan,
 what good does this change do? Any real use of tcp would still
 require a
 configuration
 change anyway.

Some interfaces, most notably JDBC, require the use of TCP/IP but will 
often be perfectly happy with a locally bound socket only.


---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings

Re: [HACKERS] MS SQL features for new version

2004-05-17 Thread Andreas Pflug

Shridhar Daithankar wrote:
Hi all,
Just stumbled upon this. just an FYI,
http://www.microsoft.com/sql/yukon/productinfo/top30features.asp
 

Notice the Snapshot Isolation. Sounds like MVCC for MSSQL?
Regards,
Andreas

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?
  http://www.postgresql.org/docs/faqs/FAQ.html

Re: [HACKERS] add server include files to default installation?

2004-05-17 Thread Fabien COELHO


Dear Tom,

  I wish to submit a small patch so that server includes and
  all necessary configuration files could be installed *by default*.

 There is a reason why install-all-headers is not the default.

Sure. I hope so!

I'm questionning the pros and cons. I'm arguing that the cons against the
current situation are strong. Thanks for providing the pros, so it can be
discussed.

 It is that it shouldn't be the default: most people do not need it,

How will people know that they will not need it afterwards?

As far as I'm concerned, I noticed it afterwards:-(
I'm a pretty stupid user, but I'm sure I'm not the only one.

For instance several weeks after installing a pg, I wanted to test
tsearch2 on one database. What do I need to do? Well, basically I need a
full recompile, or at least a full extract of the sources, re-configure
and compilation of the relevant contrib part. When reconfiguring, I need
to remember what options I used.

Now, as someone trying to develop extensions, I need to rely on some sound
and extendable default installation.

 because they will never build any C extensions for themselves.

Hey, how do they know that before hand? They might need extensions. If
they need extensions, it is likely that they may have a C part. Most
contrib have a C part. External projects are likely to have a C part.
Some in user land (e.g. SPI_*), but not necessarilly.

Say aclitem accessor functions or new aggregates that may be needed for
some project that may be of interest to people. If there are not in, you
have to compile them. If the project cannot rely on the fact that
everything needed will be provided, it is no good.

 It would just be a waste of disk space for them.

2MB... that is less than 0.002 EUR.

Moreover, I do not think that it is wasted, especially as postgresql is
moving things outside the main project (pgfoundry, contrib, gborg). You
must help external projects to live if you want them to leave;-)

As a compromise, we can have two makefile targets, say install and
light-install.

I argue that the default should be the one which enable later extensions.

 If we did do such a thing it would have zero impact on the majority of
 users anyway, because the RPM and other packagers will still put these
 files into separate postgresql-devel packages, which would still not be
 installed by default.

Thanks for this argument! If packagers do it (I know of BSD ports) it mean
that they think it is useful. So it should be the default for people who
*bother* to install pg by hand, so that they don't have to do it twice
because they did not notice this installation subtlety.

IMHO, it is the core business of postgresql to help external projects by
providing a usable default infrastructure for them. As another example of
this bad behavior, having external projects (pgadmin or phppgadmin) to
parse aclitem descriptors because you want to keep things opaque and
they need it anyway is damaging.

Thanks for you answer, have a nice day,

-- 
Fabien Coelho - [EMAIL PROTECTED]

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html

Re: [HACKERS] email data type first release

2004-05-17 Thread Tommi Maekitalo

...

 Another thing is that it might make more sense to sort email addresses by
 domain first (case insensitively of course), then by left hand side (case
 sensitively). Since the domain is really the most significant bit. This
 is also convenient for many systems like email since they perform better
 when they can handle data in that order.

 Note that this would make the type sort differently from its text
 representation. This shouldn't really be a problem but occasionally you see
 poorly written queries that introduce extra type conversions that the user
 doesn't expect. But then if it behaves just like the text datatype then
 there wouldn't be much point in using it.

Sorting should then be done by top-level-domain first. Then 2nd, 3rd... and 
last by user.

[EMAIL PROTECTED]  [EMAIL PROTECTED]
and
[EMAIL PROTECTED]  [EMAIL PROTECTED]

we get then the order:
[EMAIL PROTECTED]  [EMAIL PROTECTED]  [EMAIL PROTECTED]  [EMAIL PROTECTED]

rather than (in normal text-order):
[EMAIL PROTECTED]  [EMAIL PROTECTED]  [EMAIL PROTECTED]  [EMAIL PROTECTED]

Tommi

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster

[HACKERS] Table Spaces

2004-05-17 Thread Gavin Sherry

Hi all,

Attached is a patch against HEAD implementing tablespaces.

I've done some testing on Linux and BSD. I've also compiled without
HAVE_SYMLINK defined -- which determines whether or not tablespaces are
available.

The reason for this is that symlinks are used extensively to simplify
access to relations in hairy places.

This is extensive discussion of this in the archives.

There are some outstanding TODOs. These include

pg_dump
documentation
regression tests
comment on

Chris KL has offered to tackle these.

Other outstanding things which need to be resolved:

XLog entries:

Within a tablespace, the files for different databases are isolated into
directories. Those directory names are the database oid for those files.
Since we create those files, we should log them. The question is, what is
the best way: appending it to an xlrec and storing its length in the xlrec
or adding another rdata with that in it?

The same goes for symlinks, where we need to store the oldpath and
newpath.

Alternative database location:

Should this code be removed now?

Drop tablespace:

I haven't added an interlock here (as suggested by Tom) to handle
situations where one transaction schedules the removal of a tablespace and
concurrently another transaction makes use of that tablespace. I'd
appreciate some advice on what this would look like: should a command
creating an object touch a CREATE a file and a DROP TABLESPACE block until
the file goes away, then check if it can drop the tablespace (ie, that the
tablespace is empty)? Likewise when DROP TABLESPACE proceeds, should it
block creates?

The other thing is handling DROP TABLESPACE in a transaction where the
tablespace has been emptied (using DROP TABLE, etc) during that
transaction. Because we only schedule the unlink() of the relation, the
data files will (and must) still be around. Is it reasonable to look
inside PendingRelDeletes to see if the tablespace will be empty at COMMIT?

Basic usage rundown

initdb sets up a new directory pg_tablespaces with some symlinks with in
it. To create a tablespace, you must first up create a tablespace
directory. Then you can create a tablespace inside postgres using:

CREATE TABLESPACE name LOCATION '/path/to/tbl/dir'

Then, the DDL commands can have a tablespace associatged with it:

CREATE DATABASE ... TABLESPACE name
CREATE SCHEMA ... TABLESPACE name
CREATE TABLE ... TABLESPACE name
CREATE INDEX ... TABLESPACE name
CREATE SEQUENCE ... TABLESPACE name

Comments, criticisms, etc, all welcome, of course.

Thanks,

Gavin

tablespace-9.diff.gz
Description: GNU Zip compressed data

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org

Re: [HACKERS] enabling tcpip_socket by default

2004-05-17 Thread Andrew Dunstan

Marko Karppinen said:
 Tatsuo Ishii wrote:
 Is there any security risk if we enable tcpip_socket by default? We
 restrict connection from localhost only by default so I think
 enabling tcpip_socket adds no security risk. Please correct me if I
 am wrong.

 Bruce Momjian wrote:
 Right, and 7.5 will ship with tcp and localhost enabled.

 If the default will be to listen on all interfaces, not just 127.0.0.1,
 then this IS a security risk. And if that's not the plan, what good
 does this change do? Any real use of tcp would still require a
 configuration
 change anyway.

This is demonstrably not true.


 Listening on public network interfaces by default would multiply by
 orders of magnitude the number of machines vulnerable to potential
 future remote exploits.

 I gather that the pre-authentication code paths are pretty well known,
 and that the chances of such an attack are slim. Nevertheless I cannot
 help but note that it is exactly this default setting that caused
 Microsoft SQL Server to lose a big, big chunk of its reputation, and
 gain notoriety as a launchpad for Windows worms.


Why are we having a discussion about a change that went in 2 months ago
and was fully debated back then?

Of course we did not enable listening on any all addresses by default.
Maybe you think we are not security conscious?

The change was done because
 - Windows does not have Unix sockets, and can only talk via TCP
 - Some interfaces (notably JDBC) also require it
 - It is probably the number one thing that trips up new pg users.

We listen by default on localhost (note, not 127.0.0.1).

Read the dev version of the docs for details.

cheers

andrew



---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]

Re: [HACKERS] add server include files to default installation?

2004-05-17 Thread Fabien COELHO


Dear hackers,

 Agreed.  I would also like to see Makefile.global installed.
 pg_config.h has C-level configs, and Makefile.global has the
 Makefile-level configs.

There is also config.status which is definitely of interest as it
allows to recreate the build tree, and which is not installed by default.

Independently of the actual file list to be included, where
could these makefiles be installed?

 - share is for architecture independent files. Not really the case.

 - include is rather for C files... but Makefile.global and others are
   actually included, so it may make sense? If so, should it be directly
   in the include/postgresql subdir, or some special subdir, say
   include/postgresql/config?

 - some other directory?

I would tend to put everything in include/postgresql/config:
 - Makefile.global
 - a copy of Makefile.port (the actual file, not the link)
 - config.status
 - possibly other included files (NLS? win32? cygwin?)

Another issue is that Makefile.global seems hardwired to be in
the src subdirectory of a postgresql source tree, and to find other
files there:

# Pull in platform-specific magic
include $(top_builddir)/src/Makefile.port

This is an issue for external tools that would like to include
Makefile.global so as to be in the same environment as the server
compilation.

Any idea? The best I have would be to create a src subdir in the
installation, so that top_builddir could be set to
.../include/postgresql/config and everything would work from that
point of view.

-- 
Fabien COELHO _ http://www.cri.ensmp.fr/~coelho _ [EMAIL PROTECTED]
   CRI-ENSMP, 35, rue Saint-Honoré, 77305 Fontainebleau cedex, France
   phone: (+33|0) 1 64 69 {voice: 48 52, fax: 47 09, standard: 47 08}
     All opinions expressed here are mine  _

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly

Re: [HACKERS] Call for 7.5 feature completion

2004-05-17 Thread Bruce Momjian

Hans-Jürgen Schönig wrote:
  All the clients that I deal with on a daily basis generally care about is
  performance ... that is generally what they upgrade for ... so, my
  'educated guess' based on real world users is that Win32, PITR and nested
  transactions are not important ... tablespaces, I have one client that has
  asked about something *similar* to it, but tablespaces, for him, doesn't
  come close to what they would like to see ...
  
  So, my 'educated guess' is different then yours is ... does that make
  yours wrong?  Nope ... just means we have different sample sets to work
  with ...
 
 
 
 Interesting.
 We have made COMPLETELY different experiences.
 
 There is one question people ask me daily: When can we have sychronous 
 replication and PITR?.
 Performance is not a problem here. People are more interested in 
 stability and enterprise features such as those I have mentioned above.
 
 I am still wondering about two things:
 Somebody has posted a 2PC patch - I haven't seen too many comments

He is waiting for nested transactions to be committed so he can merge
his work in.

 Somebody has posted sync multimaster replication (PgCluster) - nobody 
 has commented on that. Maybe I am the only one who has ever tried it ...

I think it should be on gborg.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]

Re: [HACKERS] Table Spaces

2004-05-17 Thread Bruce Momjian

Gavin Sherry wrote:
 Alternative database location:
 
 Should this code be removed now?

Yes, I believe we agreed on this.  One of the committers will take care
of that.

The only downside to removal is that folks without symlinks (I believe
Win32 only) will loose that functionality with nothing to replace it. 
However, I think the clarity of removing it is worth it.  Also, I think
someone had a special way to do symlinks on Win32 and we should look
into that.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match

Re: [HACKERS] Call for 7.5 feature completion

2004-05-17 Thread Jan Wieck

Marc G. Fournier wrote:
On Sun, 16 May 2004, Bruce Momjian wrote:
I personally don't think Win32 is enough of a new feature either, but
others disagree.
Jan, correct me if I'm wrong ... Jan's point is that we have enough
already to warrant a beta on June 1st, even without Win32 ... Win32 (or
any of the other stuff, like PITR/tablespaces) would be icing on the cake
...
I agree that we don't have many of the big marketing bang for the buck
features done for 7.5. But that is no reason to use wordings like nifty
enhancements or for a small percentage in an attempt to make what we 
have look uninteresting for the average user so that it looks wise to 
wait, and wait, and wait. Just that someone doesn't understand the 
importance of these issues because he doesn't deal with the type of 
customer that values a good standard deviation on response times doesn't 
make Win32 more important. Judging by those standards, we probably 
shouldn't have had 7.4 at all, and we probably shouldn't claim that we 
want to attract typical Oracle users either.

Jan
--
#==#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.  #
#== [EMAIL PROTECTED] #

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
 subscribe-nomail command to [EMAIL PROTECTED] so that your
 message can get through to the mailing list cleanly

Re: [HACKERS] Call for 7.5 feature completion

2004-05-17 Thread Bruce Momjian

Jan Wieck wrote:
 Christopher Kings-Lynne wrote:
 
  Jan, correct me if I'm wrong ... Jan's point is that we have enough
  already to warrant a beta on June 1st, even without Win32 ... Win32 (or
  any of the other stuff, like PITR/tablespaces) would be icing on the cake
  ...
  
  I think we're close enough on win32 to wait for it.  It would look bad 
  for us to miss inclusion of win32 two releases in a row...
 
 Yes, unless this is forever. Is there a clear commitment that Win32 
 will be done if we push from June to July?

It is too late to think about pushing back another month.  We had this
discussion already.  June 1 is it.

I do think Win32 will make it, though.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings

Re: [HACKERS] Call for 7.5 feature completion

2004-05-17 Thread Zeugswetter Andreas SB SD


 It is too late to think about pushing back another month.  We had this
 discussion already.  June 1 is it.

I thought the outcome of that discussion was June 15 ?

Can we try to do the 2PC patch now instead of waiting for subtransactions ?

Andreas

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match

Re: [HACKERS] Multiple Xids in PGPROC?

2004-05-17 Thread Jan Wieck

Bruce Momjian wrote:
Tom Lane wrote:
Manfred Koizar [EMAIL PROTECTED] writes:
 The straightforward pg_clog lookup is still in transam.c,
 but has been deactivated:
  * Now this func in shmem.c and gives quality answer by scanning
  * PGPROC structures of all running backend. - vadim 11/26/96
 What was the motivation for this change?  Consistency or speed?
Getting the right answer --- the other way can't tell the difference
between an open transaction and a crashed one.
  .  We could include a small number of subtransaction xids in PGPROC.
Yeah, I was just thinking that myself.  If we only need to show open
subtrans xids, then the number you'd need would depend on nesting depth
not the total number of subxacts used.  So half-a-dozen or so would
probably suffice for 99% of situations.  You'd need a flag that could be
set to show I'm so deeply nested I can't fit all my subxacts here,
but you'd only need to go to pg_subtrans when that happened.
On the other hand, I'm not sure how much that helps, considering you
probably have to resolve the subtrans XID up to its parent anyway to
check commit/abort status.
I am confused.  Don't we need to know about all subtransctions, not just
opened ones?
BEGIN; -- xid=100
BEGIN; -- xid=101
COMMIT;
At this point, don't backends need to know the parent of xid 101,
meaning we can't limit visibility to just the transactions that are
currently openly nested?
That's right. You can only forget about nested rolled back ones, because 
no matter what, their changes will never become valid. But every 
committed subtransaction is from the outside still in progress until the 
outmost transaction commits/rolls back.

Jan
--
#==#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.  #
#== [EMAIL PROTECTED] #
---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]

Re: [HACKERS] Email data type

2004-05-17 Thread Gaetano Mendola

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1
Bruno Wolff III wrote:
| On Sun, May 16, 2004 at 04:36:55 +0200,
|   Gaetano Mendola [EMAIL PROTECTED] wrote:
|
|The type is indexable and provide also conversion methods:
|
|text -- email
|and the operator , is possible use it in select like:
|
|
| When you are converting between text and email data, what format are the
| text version of the address going to be? For example you might be using
| rfc2821 encoding, rfc2822 encoding or concatenating the local part,
| an @ sign and the domain name. Don't forget about domain literals.
Actually I use local_part@domain_name
Also the validator will validate emails in this form, if you are thinking to
validate emails as:
Gaetano M. Public(junior)gmendola@(new account)bigfoot.com
that are perfectly valid I think that it's a valid option to consider.
About the domain literals, I think to validate it in the near future,
rejecting private subnet according to this list:
10.0.0.0 - 10.255.255.255
172.16.0.0 - 172.31.255.255
192.168.0.0 - 192.168.255.255
169.254.0.0 -169.254.255.255
I think I have to discard also the addresses with last octet equal to 256.
Any comments ?
Regards
Gaetano Mendola

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.2.4 (MingW32)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org
iD8DBQFAqNPP7UpzwH2SGd4RAgqxAJ9UCPOOTpPN6Tl9w1748inn7t4a/QCg7pl1
8O3lHDwEKsCuCOJYQAo0vHM=
=wFo+
-END PGP SIGNATURE-
---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
 joining column's datatypes do not match

Re: [HACKERS] email data type first release

2004-05-17 Thread Gaetano Mendola

Greg Stark wrote:
Gaetano Mendola [EMAIL PROTECTED] writes:

Comments are welcomed.

Well as long as you're asking...
Email domains are case insensitive, but the left hand side is case sensitive.
That's the only part that's hard to handle using a text data type, it would be
kind of neat if the email operators got it right.
Another thing is that it might make more sense to sort email addresses by
domain first (case insensitively of course), then by left hand side (case
sensitively). Since the domain is really the most significant bit. This is
also convenient for many systems like email since they perform better when
they can handle data in that order.
Note that this would make the type sort differently from its text
representation. This shouldn't really be a problem but occasionally you see
poorly written queries that introduce extra type conversions that the user
doesn't expect. But then if it behaves just like the text datatype then there
wouldn't be much point in using it.
That's true, I will order as Tommi Maekitalo suggest.

Regards
Gaetano Mendola

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?
  http://www.postgresql.org/docs/faqs/FAQ.html

Re: [HACKERS] Call for 7.5 feature completion

2004-05-17 Thread Bruce Momjian

Robert Treat wrote:
 On Monday 17 May 2004 08:21, Bruce Momjian wrote:
  Hans-J?rgen Sch?nig wrote:
   I am still wondering about two things:
   Somebody has posted a 2PC patch - I haven't seen too many comments
 
  He is waiting for nested transactions to be committed so he can merge
  his work in.
 
 
 I was thinking about this over the weekend... if nested transactions doesn't 
 make it into 7.5, does this mean that we won't get 2PC either? Seems that 
 would be a shame if we have a 2PC patch that is ready to go...

No idea.  The 2PC author wanted to wait for subtransactions.  It wasn't
our idea.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings

Re: [HACKERS] Call for 7.5 feature completion

2004-05-17 Thread Robert Treat

On Monday 17 May 2004 08:21, Bruce Momjian wrote:
 Hans-Jürgen Schönig wrote:
  I am still wondering about two things:
  Somebody has posted a 2PC patch - I haven't seen too many comments

 He is waiting for nested transactions to be committed so he can merge
 his work in.


I was thinking about this over the weekend... if nested transactions doesn't 
make it into 7.5, does this mean that we won't get 2PC either? Seems that 
would be a shame if we have a 2PC patch that is ready to go...

Robert Treat
-- 
Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL

---(end of broadcast)---
TIP 8: explain analyze is your friend

Re: [HACKERS] Email data type

2004-05-17 Thread Dave Page

 

 -Original Message-
 From: Gaetano Mendola [mailto:[EMAIL PROTECTED] 
 Sent: 17 May 2004 16:02
 To: Bruno Wolff III
 Cc: [EMAIL PROTECTED]
 Subject: Re: [HACKERS] Email data type
 
 About the domain literals, I think to validate it in the near 
 future, rejecting private subnet according to this list:
 
 10.0.0.0 - 10.255.255.255
 172.16.0.0 - 172.31.255.255
 192.168.0.0 - 192.168.255.255
 169.254.0.0 -169.254.255.255

Why? What's wrong with private addresses?

Regards, Dave.

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings

Re: [HACKERS] Bogus permissions display in 7.4

2004-05-17 Thread Fabien COELHO


Dear Tom,

  (2) Although I subscribe your first 3 points, I do not like the 4th point.

 I didn't either.  After working on it some more, what I want to do now
 is keep the ACL representation the same as it is, but implicitly assume
 that the owner has all grant options whether the ACL says so or not.

Mmmm... So you still want to stick to exceptionnal rights that are
managed somewhere explicitely in the backend code. I would much prefer
something explicit in the acl, because it would make the path to roles
easier, and I'm a little bit interested in this path.

What I was implicitely suggesting thru questions in my mail was an
approach where:

(0) all rights are always explicit, null means no rights.
So the algorithm to check accesses would be:

- if (it is a super-user) access granted;
  else interpret explicitly ACL;

Otherwise you have everywhere:

- if (it is a super-user) access granted;
  elsif (it is the owner and we just need grant options) access granted;
  else interpret acl explicitly;

As I've pointed out with some bug reports, the current middle section
is quite buggy at the time, and I think I've found another one not yet
reported bugs in this area.

As we're dealing with security, the simpler the better. Having
implicit things just make the code harder to understand and check
because there is always a special case, and looking at the acl
from userland needs some interpretation of things that are not there.

(1) grant option rights are given at creation time explicitly, maybe
with your special user 0. They could then be revoked by the owner.

(2) if they are revoked, they could be given back but only by the
super user, with something like:

GRANT ALL ON ... TO calvin WITH GRANT OPTIONS FROM GRANTOR _SYSTEM;

It is a point that the super user should be able to create and
manage all rights, including explicit grantors.

So only the superuser is special in this approach, and owners are
only managed especially when creating an object, but not after.

This should make the actual code simpler and more explicit, so I feel that
it would less likely be buggy, and adding roles after that would be more
straightforward.

Well, all this is just my little opinion, and I'm not the one coding.

Have a nice day,

-- 
Fabien Coelho - [EMAIL PROTECTED]

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org

Re: [HACKERS] Email data type

2004-05-17 Thread Steve Atkins

On Mon, May 17, 2004 at 05:01:36PM +0200, Gaetano Mendola wrote:
 -BEGIN PGP SIGNED MESSAGE-
 Hash: SHA1
 
 Bruno Wolff III wrote:
 
 | On Sun, May 16, 2004 at 04:36:55 +0200,
 |   Gaetano Mendola [EMAIL PROTECTED] wrote:
 |
 |The type is indexable and provide also conversion methods:
 |
 |text -- email
 |and the operator , is possible use it in select like:
 |
 |
 | When you are converting between text and email data, what format are the
 | text version of the address going to be? For example you might be using
 | rfc2821 encoding, rfc2822 encoding or concatenating the local part,
 | an @ sign and the domain name. Don't forget about domain literals.
 
 Actually I use local_part@domain_name
 
 Also the validator will validate emails in this form, if you are thinking to
 validate emails as:
 
 Gaetano M. Public(junior)gmendola@(new account)bigfoot.com
 
 that are perfectly valid I think that it's a valid option to consider.
 
 About the domain literals, I think to validate it in the near future,
 rejecting private subnet according to this list:
 
 10.0.0.0 - 10.255.255.255
 172.16.0.0 - 172.31.255.255
 192.168.0.0 - 192.168.255.255
 169.254.0.0 -169.254.255.255

Bad idea. If I'm testing I'm likely to be testing with addresses on my
local network. My local network is in 10/8. The middleware or the
application should be making those decisions. But if you are going to
filter on IP space, also consider class D  E space.

Also, [EMAIL PROTECTED] is a syntactically valid email address, in the .13
TLD. It does not deliver to 10.11.12.13, or anywhere else, as of
today, unless the MTA or local recursive resolver is broken (a common
case). [EMAIL PROTECTED] is a whole other thing. As is [EMAIL PROTECTED]::10.11.12.13]
and various other IPv6 variants.

[EMAIL PROTECTED] is syntactically valid. [EMAIL PROTECTED] is syntactically valid, but
should be immediately rejected. [EMAIL PROTECTED] is valid, even if known to
be non-deliverable.

 I think I have to discard also the addresses with last octet equal to 256.

Or equal to 4872014, come to that. Any reason you're looking at 256
in particular? If you mean .255 (or .0) then don't fall into that trap -
there are perfectly valid, routable addresses ending in both .0 and
.255, despite what some folks in Redmond would have you believe.

 Any comments ?

Parsing email addresses is a significant part of my day job, and email
address validation is a lot harder than it looks at first sight.

Don't forget quoting, whitespace, escaping and nesting parenthetical
comments. Also, remember that [EMAIL PROTECTED] and [EMAIL PROTECTED] are different 
email
addresses, while [EMAIL PROTECTED] and [EMAIL PROTECTED] are the same email
address. [EMAIL PROTECTED] and [EMAIL PROTECTED] are the same email
address. [EMAIL PROTECTED] and [EMAIL PROTECTED] may be the same address or
different email addresses, depending on which religious faction you
belong to.

In some contexts the empty string is a valid email address. In some
contexts Postmaster is a valid email address.

Also, one persons definition of a valid email address will be very
different from another persons definition of such. Many of those
definitions require some DNS resolution to make the decision.

I'm not entirely convinced that an email address is a simple and
well-defined enough datatype to handle comprehensively within the
DB. The validation decisions are complex and vary from application to
application.

(I use two text columns - localpart and domainpart, with an index
 on reverse(lower(domainpart)) and leave validation to the application,
 myself).

Cheers,
  Steve


---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html

Re: [HACKERS] Email data type

2004-05-17 Thread Bort, Paul


 From: Gaetano Mendola [mailto:[EMAIL PROTECTED]
 
 I think I have to discard also the addresses with last octet 
 equal to 256.
 
 Any comments ?
 

Any octet that contains a number less than 0 or greater than 255 should be
suspect. 

Assuming you really meant 255:

It would be perfectly legal for an entity on the internet to have a block of
addresses with a subnet mask of less than 24 bits, which leads to legal
addresses that end in 255.

For example, if your company/university/black helicopter squad needed about
500 servers with direct presence, you might be assigned a block like
123.45.6.0/23. So the network address would be 123.45.6.0, the broadcast
address would be 123.45.7.255, and everything in between, including
123.45.6.255 and 123.45.7.0, would be available for your servers.

HTH, 

Paul



---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]

Re: [HACKERS] Email data type

2004-05-17 Thread Gaetano Mendola

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1
Dave Page wrote:
|-Original Message-
|From: Gaetano Mendola [mailto:[EMAIL PROTECTED]
|Sent: 17 May 2004 16:02
|To: Bruno Wolff III
|Cc: [EMAIL PROTECTED]
|Subject: Re: [HACKERS] Email data type
|
|About the domain literals, I think to validate it in the near
|future, rejecting private subnet according to this list:
|
|10.0.0.0 - 10.255.255.255
|172.16.0.0 - 172.31.255.255
|192.168.0.0 - 192.168.255.255
|169.254.0.0 -169.254.255.255
|
|
| Why? What's wrong with private addresses?
Well I think that accept an email like:
[EMAIL PROTECTED]
is a risky.
It's true that any RFC forbid this kind of emails,
so I can accept it and leave to the final user the
opportunity to extract the domain an validate it.
Regards
Gaetano Mendola





-BEGIN PGP SIGNATURE-
Version: GnuPG v1.2.4 (MingW32)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org
iD8DBQFAqOHm7UpzwH2SGd4RAp4dAJwNqDSws1cVwKr/QMXG/JvV/8M94QCbBfIS
alixKQYe/VoLpwXrBFHszDs=
=BAML
-END PGP SIGNATURE-
---(end of broadcast)---
TIP 8: explain analyze is your friend

Re: [HACKERS] Email data type

2004-05-17 Thread Dave Page

 -Original Message-
 From: Steve Atkins [mailto:[EMAIL PROTECTED] 
 Sent: 17 May 2004 16:46
 To: [EMAIL PROTECTED]
 Subject: Re: [HACKERS] Email data type

 [EMAIL PROTECTED] is syntactically valid. [EMAIL PROTECTED] is 
 syntactically valid, but should be immediately rejected. 

I disagree - just because the database server cannot verify the the
existence of a domain does not mean that it isn't valid. I think only
addresses that are syntactically incorrect should be rejected.

Regards, Dave.

---(end of broadcast)---
TIP 8: explain analyze is your friend

Re: [HACKERS] Email data type

2004-05-17 Thread Steve Atkins

On Mon, May 17, 2004 at 05:01:05PM +0100, Dave Page wrote:

  [EMAIL PROTECTED] is syntactically valid. [EMAIL PROTECTED] is 
  syntactically valid, but should be immediately rejected. 
 
 I disagree - just because the database server cannot verify the the
 existence of a domain does not mean that it isn't valid. I think only
 addresses that are syntactically incorrect should be rejected.

I think you're missing the point. Go and take a glance at RFC 2606 -
it's a BCP which defines the .test, .invalid, .example and .localhost
TLDs.

Of course, some users may want to use .invalid email addresses in
their database, despite their being, by definition, invalid. It's
something that will vary depending on their needs.

Cheers,
  Steve


---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html

Re: [HACKERS] Email data type

2004-05-17 Thread Dave Page

 -Original Message-
 From: Gaetano Mendola [mailto:[EMAIL PROTECTED] 
 Sent: 17 May 2004 17:02
 To: Dave Page
 Cc: Bruno Wolff III; [EMAIL PROTECTED]
 Subject: Re: [HACKERS] Email data type

 Well I think that accept an email like:

 [EMAIL PROTECTED]

 is a risky.

 It's true that any RFC forbid this kind of emails, so I can 
 accept it and leave to the final user the opportunity to 
 extract the domain an validate it.

Forbidden on the 'net with good reason, but useful internally on private
networks for testing, and probably various other reasons where you might
want to direct a message to a specific interface rather than a hostname
or MX.

Regards, Dave

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html

Re: [HACKERS] Email data type

2004-05-17 Thread Steve Atkins

On Mon, May 17, 2004 at 09:21:54AM -0700, Steve Crawford wrote:

 Along those lines [EMAIL PROTECTED] and [EMAIL PROTECTED] are valid but they 
 don't necessarily refer to the same mailbox (depends on the mx for 
 foo.bar.com).

I don't believe the latter is actually valid, as it has to be an
address literal inside the square brackets. Until that's extended
by RFC that means IPv4 or IPv6 literals.

It's another example of this is neither trivial nor well-defined,
though.

  I'm not entirely convinced that an email address is a simple and
  well-defined enough datatype to handle comprehensively within the
  DB. The validation decisions are complex and vary from application
  to application.
 
  (I use two text columns - localpart and domainpart, with an index
   on reverse(lower(domainpart)) and leave validation to the
  application, myself).
 
 Indeed. A problem with the email address datatype is that it hinders 
 normalization:

Yup. That's a bigger concern than the overall vagueness of the problem.

 Joe User [EMAIL PROTECTED] is valid but a database designer would 
 probably prefer columns for name and email, or if the addresses were 
 all people, firstname, middlename, lastname, email.
 
 As you mentioned, the email can be broken into localpart and 
 domainpart but if the app requires it, the domainpart could be 
 further rendered into toplevel (so you could find all the .gov or 
 .edu), secondlevel (at least corresponds to a registrant) and 
 subdomain(s) as necessary.

That's why I index the domainpart on reverse(lower()) - that way I
can say WHERE reverse(lower(domainpart)) LIKE reverse('%.com')

Cheers,
  Steve

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly

Re: [HACKERS] add server include files to default installation?

2004-05-17 Thread Fabien COELHO



 I would tend to put everything in include/postgresql/config:

On second thought, I would hesitate with lib/..., as config file are
plateform specific so they cannot be shared with other plateforms,
although it could be the case for include files that they could be shared.

-- 
Fabien Coelho - [EMAIL PROTECTED]

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match

Re: [HACKERS] Email data type

2004-05-17 Thread Steve Crawford

On Monday 17 May 2004 8:45 am, Steve Atkins wrote:
 Also, [EMAIL PROTECTED] is a syntactically valid email address, in the
 .13 TLD. It does not deliver to 10.11.12.13, or anywhere else, as
 of today, unless the MTA or local recursive resolver is broken (a
 common case). [EMAIL PROTECTED] is a whole other thing. As is
 [EMAIL PROTECTED]::10.11.12.13] and various other IPv6 variants.

Along those lines [EMAIL PROTECTED] and [EMAIL PROTECTED] are valid but they 
don't necessarily refer to the same mailbox (depends on the mx for 
foo.bar.com).

 Parsing email addresses is a significant part of my day job, and
 email address validation is a lot harder than it looks at first
 sight.

Yes, indeed.

 Don't forget quoting, whitespace, escaping and nesting
 parenthetical comments

The just looking for an @ breaks pretty quickly, say with 
joe@some.dom or a myriad of other variations.

 In some contexts the empty string is a valid email address. In some
 contexts Postmaster is a valid email address.

As are postmaster and pOsTmaSTeR and POSTmaster and they are all the 
same address.

 I'm not entirely convinced that an email address is a simple and
 well-defined enough datatype to handle comprehensively within the
 DB. The validation decisions are complex and vary from application
 to application.

 (I use two text columns - localpart and domainpart, with an index
  on reverse(lower(domainpart)) and leave validation to the
 application, myself).

Indeed. A problem with the email address datatype is that it hinders 
normalization:

Joe User [EMAIL PROTECTED] is valid but a database designer would 
probably prefer columns for name and email, or if the addresses were 
all people, firstname, middlename, lastname, email.

As you mentioned, the email can be broken into localpart and 
domainpart but if the app requires it, the domainpart could be 
further rendered into toplevel (so you could find all the .gov or 
.edu), secondlevel (at least corresponds to a registrant) and 
subdomain(s) as necessary.

Cheers,
Steve


---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster

Re: [HACKERS] Call for 7.5 feature completion

2004-05-17 Thread Bruce Momjian

Jan Wieck wrote:
  I am still wondering about two things:
  Somebody has posted a 2PC patch - I haven't seen too many comments
  Somebody has posted sync multimaster replication (PgCluster) - nobody 
  has commented on that. Maybe I am the only one who has ever tried it ...
 
 Do you really need someone commenting on query based replication? I 
 get goosebumps from just thinking someone would voluntarily push all 
 sequence- or timestamp-generation and other not strictly deterministic 
 functionality into the application to be able to use such a solution. 
 This is exactly how people work around all the MySQL idiosyncrasies.
 
  
  Most likely this is not very encourageing for the developers involved ...
 
 Most hopefully this is very discouraging! Connection pools are a nice 
 thing and I have used pgpool recently with great success, for pooling 
 connections. But attempting to deliver multimaster replication as a 
 byproduct of a connection pool isn't going to become an enterprise 
 feature. And the more half-baked, half-functional and half-reliable 
 replication attempts there are, the harder it will be to finally get a 
 real solution being recognized.

Well, considering we offer _nothing_ for multi-master right now, I think
it is a valuable project.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]

Re: [HACKERS] Table Spaces

2004-05-17 Thread Manfred Spraul

Bruce Momjian wrote:
The only downside to removal is that folks without symlinks (I believe
Win32 only) will loose that functionality with nothing to replace it. 
However, I think the clarity of removing it is worth it.  Also, I think
someone had a special way to do symlinks on Win32 and we should look
into that.
 

Windows 2000 and later support mount points - you can attach a new 
partition as C:\pgsql\data\xlog instead of D:\. That might be enough for 
most users. IIRC there was a tool to create arbitrary links, but it was 
removed just before W2K final.

--
   Manfred

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster

Re: [HACKERS] Call for 7.5 feature completion

2004-05-17 Thread Jan Wieck

Hans-Jürgen Schönig wrote:
Marc G. Fournier wrote:
On Mon, 17 May 2004, Bruce Momjian wrote:

Marc G. Fournier wrote:
Agreed, but you are a me too, not a huge percentage of our userbase.
How do you know?  Have you polled our complete userbase?

Basically, after 6-7 months of development, I want more than a vacuum
patch and a new cache replacement policy.  I want something big, in
fact, several big things.
Most likely won't happen, since what is considered big by you isn't
necessarily what is considered big by someone else ... as Hannu, and I
believe, Jan, have so far pointed out to you ...
I can't poll for everything.  I make my own educated guesses.

Based on what though?
All the clients that I deal with on a daily basis generally care about is
performance ... that is generally what they upgrade for ... so, my
'educated guess' based on real world users is that Win32, PITR and nested
transactions are not important ... tablespaces, I have one client that has
asked about something *similar* to it, but tablespaces, for him, doesn't
come close to what they would like to see ...
So, my 'educated guess' is different then yours is ... does that make
yours wrong?  Nope ... just means we have different sample sets to work
with ...

Interesting.
We have made COMPLETELY different experiences.
There is one question people ask me daily: When can we have sychronous 
replication and PITR?.
Performance is not a problem here. People are more interested in 
stability and enterprise features such as those I have mentioned above.

I am still wondering about two things:
Somebody has posted a 2PC patch - I haven't seen too many comments
Somebody has posted sync multimaster replication (PgCluster) - nobody 
has commented on that. Maybe I am the only one who has ever tried it ...
Do you really need someone commenting on query based replication? I 
get goosebumps from just thinking someone would voluntarily push all 
sequence- or timestamp-generation and other not strictly deterministic 
functionality into the application to be able to use such a solution. 
This is exactly how people work around all the MySQL idiosyncrasies.

Most likely this is not very encourageing for the developers involved ...
Most hopefully this is very discouraging! Connection pools are a nice 
thing and I have used pgpool recently with great success, for pooling 
connections. But attempting to deliver multimaster replication as a 
byproduct of a connection pool isn't going to become an enterprise 
feature. And the more half-baked, half-functional and half-reliable 
replication attempts there are, the harder it will be to finally get a 
real solution being recognized.

Jan
--
#==#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.  #
#== [EMAIL PROTECTED] #
---(end of broadcast)---
TIP 8: explain analyze is your friend

Re: [HACKERS] Call for 7.5 feature completion

2004-05-17 Thread Bruce Momjian

Jan Wieck wrote:
 Marc G. Fournier wrote:
  On Sun, 16 May 2004, Bruce Momjian wrote:
  
  I personally don't think Win32 is enough of a new feature either, but
  others disagree.
  
  Jan, correct me if I'm wrong ... Jan's point is that we have enough
  already to warrant a beta on June 1st, even without Win32 ... Win32 (or
  any of the other stuff, like PITR/tablespaces) would be icing on the cake
  ...
 
 I agree that we don't have many of the big marketing bang for the buck
 features done for 7.5. But that is no reason to use wordings like nifty

That was my only point, that we don't have any big marketing bang for
the buck features.  I don't mean to minimize the good work we have
already done.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match

Re: [HACKERS] Call for 7.5 feature completion

2004-05-17 Thread Bruce Momjian

Marc G. Fournier wrote:
 On Mon, 17 May 2004, Bruce Momjian wrote:
 
   Most hopefully this is very discouraging! Connection pools are a nice
   thing and I have used pgpool recently with great success, for pooling
   connections. But attempting to deliver multimaster replication as a
   byproduct of a connection pool isn't going to become an enterprise
   feature. And the more half-baked, half-functional and half-reliable
   replication attempts there are, the harder it will be to finally get a
   real solution being recognized.
 
  Well, considering we offer _nothing_ for multi-master right now, I think
  it is a valuable project.
 
 Connection pooling is *not* multi master ... it doesn't even simulate
 multi-master ... multi-master, at least as far as I'm aware, means no
 point of failure, and connection pooling creates a *single* point of
 failure ... the pgpool process dies, you've lost all connections to the
 database ...

I think people are confusing pgpool with pgcluster.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]

Re: [HACKERS] Call for 7.5 feature completion

2004-05-17 Thread Marc G. Fournier

On Mon, 17 May 2004, Bruce Momjian wrote:

  Most hopefully this is very discouraging! Connection pools are a nice
  thing and I have used pgpool recently with great success, for pooling
  connections. But attempting to deliver multimaster replication as a
  byproduct of a connection pool isn't going to become an enterprise
  feature. And the more half-baked, half-functional and half-reliable
  replication attempts there are, the harder it will be to finally get a
  real solution being recognized.

 Well, considering we offer _nothing_ for multi-master right now, I think
 it is a valuable project.

Connection pooling is *not* multi master ... it doesn't even simulate
multi-master ... multi-master, at least as far as I'm aware, means no
point of failure, and connection pooling creates a *single* point of
failure ... the pgpool process dies, you've lost all connections to the
database ...


Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email: [EMAIL PROTECTED]   Yahoo!: yscrappy  ICQ: 7615664

---(end of broadcast)---
TIP 8: explain analyze is your friend

Re: [HACKERS] email data type first release

2004-05-17 Thread Bruno Wolff III

On Mon, May 17, 2004 at 17:11:42 +0200,
  Gaetano Mendola [EMAIL PROTECTED] wrote:
 
 That's true, I will order as Tommi Maekitalo suggest.

And how do domain literals fit into this?

[EMAIL PROTECTED] is a valid email address for me. (At least as
long as my server is at that IP address.)

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly

Re: [HACKERS] Email data type

2004-05-17 Thread Dave Page

It's rumoured that Steve Atkins once said:
 On Mon, May 17, 2004 at 05:01:05PM +0100, Dave Page wrote:

  [EMAIL PROTECTED] is syntactically valid. [EMAIL PROTECTED] is
  syntactically valid, but should be immediately rejected.

 I disagree - just because the database server cannot verify the the
 existence of a domain does not mean that it isn't valid. I think only
 addresses that are syntactically incorrect should be rejected.

 I think you're missing the point. Go and take a glance at RFC 2606 -
 it's a BCP which defines the .test, .invalid, .example and .localhost
 TLDs.

Ahh, yes - clearly I was missing the point :-)

Regards, Dave.



---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly

Re: [HACKERS] Table Spaces

2004-05-17 Thread Thomas Hallgren

Manfred Spraul [EMAIL PROTECTED] wrote in message
news:[EMAIL PROTECTED]
 Bruce Momjian wrote:

 The only downside to removal is that folks without symlinks (I believe
 Win32 only) will loose that functionality with nothing to replace it.
 However, I think the clarity of removing it is worth it.  Also, I think
 someone had a special way to do symlinks on Win32 and we should look
 into that.
 
 
 Windows 2000 and later support mount points - you can attach a new
 partition as C:\pgsql\data\xlog instead of D:\. That might be enough for
 most users. IIRC there was a tool to create arbitrary links, but it was
 removed just before W2K final.

If you run NTFS, it's still possible to use arbitrary links. In the Windows
world, they are called junctions. Microsoft does not provide a junction tool
for some reason (perhaps because it's limited to NTFS). A good tool, free
and with source, can be found here
http://www.sysinternals.com/ntw2k/source/misc.shtml#junction I use this tool
myself. Works like a charm.

Kind regards,

Thomas Hallgren


---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster

Re: [HACKERS] Call for 7.5 feature completion

2004-05-17 Thread Mario Weilguni


 
 Interesting.
 We have made COMPLETELY different experiences.
 
 There is one question people ask me daily: When can we have sychronous 
 replication and PITR?.
 Performance is not a problem here. People are more interested in 
 stability and enterprise features such as those I have mentioned above.

I doubt that. Having deployed several 7.4 databases, the first customers ask 
(of course not in technical speech, but in the meaning) when the problem with 
checkpoint hogging system down is solved. This is a really serious issue, 
especially when using drbd + ext3. The system will become really unresponsive 
when checkpoint is running.

I heavily await 7.5 because of the background writer.

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])

Re: [HACKERS] Table Spaces

2004-05-17 Thread Andrew Dunstan

Thomas Hallgren wrote:
Manfred Spraul [EMAIL PROTECTED] wrote in message
news:[EMAIL PROTECTED]
 

Bruce Momjian wrote:
   

The only downside to removal is that folks without symlinks (I believe
Win32 only) will loose that functionality with nothing to replace it.
However, I think the clarity of removing it is worth it.  Also, I think
someone had a special way to do symlinks on Win32 and we should look
into that.
 

Windows 2000 and later support mount points - you can attach a new
partition as C:\pgsql\data\xlog instead of D:\. That might be enough for
most users. IIRC there was a tool to create arbitrary links, but it was
removed just before W2K final.
   

If you run NTFS, it's still possible to use arbitrary links. In the Windows
world, they are called junctions. Microsoft does not provide a junction tool
for some reason (perhaps because it's limited to NTFS). A good tool, free
and with source, can be found here
http://www.sysinternals.com/ntw2k/source/misc.shtml#junction I use this tool
myself. Works like a charm.
 

We've looked at it before. Apart from anything else I don't think its 
license is compatible with PostgreSQL's.

Also, IIRC NTFS junctions also have some severe limitations.
cheers
andrew
---(end of broadcast)---
TIP 8: explain analyze is your friend

Re: [HACKERS] Call for 7.5 feature completion

2004-05-17 Thread Mike Mascari

Mario Weilguni wrote:
Interesting. We have made COMPLETELY different experiences.
There is one question people ask me daily: When can we have
sychronous replication and PITR?. Performance is not a problem
here. People are more interested in stability and enterprise
features such as those I have mentioned above.

I doubt that. Having deployed several 7.4 databases, the first
customers ask (of course not in technical speech, but in the
meaning) when the problem with checkpoint hogging system down is
solved. This is a really serious issue, especially when using
drbd + ext3. The system will become really unresponsive when
checkpoint is running.
I heavily await 7.5 because of the background writer.
This thread reminds me of Andrew Sullivan's signature:
The plural of anecdote is not data - Roger Brinner
Of course, once the sample size becomes sufficiently large, it does 
become data. Has the advocacy group performed any polling in this 
area that might shed some light as to what users and potential users 
might want?

Mike Mascari
---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?
  http://www.postgresql.org/docs/faqs/FAQ.html

Re: [HACKERS] pg_autovacuum next steps

2004-05-17 Thread Matthew T. O'Connor

Tom Lane wrote:
Matthew T. O'Connor [EMAIL PROTECTED] writes:
 

I probably said that wrong, but how do backends get their stats data?  
   

They read it out of a flat file that the stats collector rewrites every
so often.
 

Ok so that would be easy to do (if we decide we want to)
Is that really worth the loss of independence?  I'm not sure one way or
the other myself.  I suppose the autovac daemon could still be run by
hand for testing purposes, but it would have to run as user postgres and
on the same machine as the postmaster.
 

I'm not sure, it would allow autovacuum to check the stats more 
regularly.  I suppose it would be possible for pg_autovacuum to 
recognize if it's been launched by the postmaster or not and look up 
stats as appropriate, but as you say, I'm not sure it's worth it, and my 
first cut will work exactly as the current pg_autovacuum does.

Also, you didn't mention if I will be able to use the backend logging 
functions, I would guess that I can, but I'm not totally sure.
   

Yes, if you are launched by the postmaster then you'd have access to its
stderr etc, so you could just log via elog/ereport.  Again though this
puts you in a situation where the daemon *must* be launched by the
postmaster or it won't work.  How much arm's-length separation are you
willing to give up?
Well I think I would be more willing to give up the separation for 
logging purposes.  I would think that an error message issued by a 
postmaster sub-process should wind up in the same location as an error 
issued by the postmaster proper.  Also, people have requested lots of 
logging options such as syslog etc and I think it would be nice to get 
all this for free.

I would think that pg_autovacuum could determine if it's a postmaster 
sub-process and log appropriately, does sound reasonable?



---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster

Re: [HACKERS] Call for 7.5 feature completion

2004-05-17 Thread Simon Riggs

On Mon, 2004-05-17 at 15:10, Bruce Momjian wrote:

 It is too late to think about pushing back another month.  We had this
 discussion already.  June 1 is it.
 

I think I have to reiterate: PITR won't make 1 June. (I will be away
travelling soon). This has been said a number of times.

This is all rather disheartening, having laid out a time plan months
back, noting some of this. Yes, I am working on it, and no, I'm not hand
waving, but I do take time off every million keystrokes or so.

Mid to late June is a realistic time for the completion of PITR, given
the additional features I am now working on and the time allocation I
can reasonably give this (which in many ways is already unreasonable).
My last publicly stated completion estimate was right to within a few
days (mid-April, stated in early March).

I'm a very task focused person and I strongly support the notion of
deadlines and freezes. It's just in this instance, 1 June seems to have
been plucked from the air - unless there are other pressures that others
can see better than I. What are they again?

7.5dev is loads better than 7.4, though that was true in February.
Having waited until now, why not wait for the features? If somebody
suggested that we do 7.5 now and then 8.0 with Win32  PITR in
September, I'd think about that, but I'm worried that the next major
release is a long way out after this next one, not a few months. Beta is
gonna take ages anyhow.

Many people in the community are still using 7.3 or below. If the
releases come too frequently, running an initdb is just a pain,
especially with out a big reason to sell to the business folks as to
why they have to put up with the downtime of upgrading (or time spent on
clever plans). Is it running? Yeh. So why upgrade? for existing users,
and Does it have feature XYZ? No, but they think next release. OK,
well, we'll wait for that and then trial it for new adopters.

Shipping too early is a bad thing too. It's not clear to me why you
would ship in a hurry when the community has waited years to get some of
the features on the URGENT list. Honestly, how long has PITR been
brewing? And, who thinks that we'll get increased adoption without the
big ticket items? (Even if your opinion is that PITR isn't one of them).

I can't complete by 1 June. Think worse of me if you choose.

Best Regards, Simon Riggs


---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings

Re: [HACKERS] Table Spaces

2004-05-17 Thread Magnus Hagander

The only downside to removal is that folks without symlinks 
(I believe
Win32 only) will loose that functionality with nothing to 
replace it.
However, I think the clarity of removing it is worth it.  
Also, I think
someone had a special way to do symlinks on Win32 and we should look
into that.


  

Windows 2000 and later support mount points - you can attach a new
partition as C:\pgsql\data\xlog instead of D:\. That might 
be enough for
most users. IIRC there was a tool to create arbitrary links, 
but it was
removed just before W2K final.



If you run NTFS, it's still possible to use arbitrary links. 
In the Windows
world, they are called junctions. Microsoft does not provide 
a junction tool
for some reason (perhaps because it's limited to NTFS). A 
good tool, free
and with source, can be found here
http://www.sysinternals.com/ntw2k/source/misc.shtml#junction 
I use this tool
myself. Works like a charm.

  


We've looked at it before. Apart from anything else I don't think its 
license is compatible with PostgreSQL's.

Well, people can still use it. We just can't distribute it... We can
always link to it.
But unless there is a GUI tool (actually, unless it shows up in the
*default* GUI tool), expect there to be questions. An 


Also, IIRC NTFS junctions also have some severe limitations.

The main being they can't do files, and there are few tools for them.
Also, most win32 admins are *NOT* experienced with them. Sure, they are
used for the NETLOGON directory, but how many admins know that...

//Magnus

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly

Re: [HACKERS] Email data type

2004-05-17 Thread Bruno Wolff III

On Mon, May 17, 2004 at 17:01:36 +0200,
  Gaetano Mendola [EMAIL PROTECTED] wrote:
 
 Also the validator will validate emails in this form, if you are thinking to
 validate emails as:
 
 Gaetano M. Public(junior)gmendola@(new account)bigfoot.com

That appears to be an rfc 2822 address. RFC 2821 addresses are more limited
in that they can't have comments but still escape special characters.

 
 that are perfectly valid I think that it's a valid option to consider.

What you want to store depends on what you are going to do with the address.
If you don't need the comments, storing it as a local part and a domain
is probably best.

 About the domain literals, I think to validate it in the near future,
 rejecting private subnet according to this list:
 
 10.0.0.0 - 10.255.255.255
 172.16.0.0 - 172.31.255.255
 192.168.0.0 - 192.168.255.255
 169.254.0.0 -169.254.255.255

That would depend on whether or not sending to those addresses would make
sense in your case.

 
 I think I have to discard also the addresses with last octet equal to 256.

Email addresses are in ascii and hence bytes should only contain characters
with the codes 0 through 127.

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]

Re: [HACKERS] Email data type

2004-05-17 Thread Bruno Wolff III

On Mon, May 17, 2004 at 18:01:43 +0200,
  Gaetano Mendola [EMAIL PROTECTED] wrote:
 Well I think that accept an email like:
 
 [EMAIL PROTECTED]

Be careful about this. [EMAIL PROTECTED] is not the same as
[EMAIL PROTECTED] . 

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match

Re: [HACKERS] Call for 7.5 feature completion

2004-05-17 Thread Gaetano Mendola

Hans-Jürgen Schönig wrote:
Somebody has posted sync multimaster replication (PgCluster) - nobody 
has commented on that. Maybe I am the only one who has ever tried it ...
I didn't find it on pgFoundry, others place to look at it ?
Regards
Gaetano Mendola

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
 joining column's datatypes do not match

Re: [HACKERS] Call for 7.5 feature completion

2004-05-17 Thread Jan Wieck

Bruce Momjian wrote:
Marc G. Fournier wrote:
On Mon, 17 May 2004, Bruce Momjian wrote:
  Most hopefully this is very discouraging! Connection pools are a nice
  thing and I have used pgpool recently with great success, for pooling
  connections. But attempting to deliver multimaster replication as a
  byproduct of a connection pool isn't going to become an enterprise
  feature. And the more half-baked, half-functional and half-reliable
  replication attempts there are, the harder it will be to finally get a
  real solution being recognized.

 Well, considering we offer _nothing_ for multi-master right now, I think
 it is a valuable project.
Connection pooling is *not* multi master ... it doesn't even simulate
multi-master ... multi-master, at least as far as I'm aware, means no
point of failure, and connection pooling creates a *single* point of
failure ... the pgpool process dies, you've lost all connections to the
database ...
I think people are confusing pgpool with pgcluster.
And you wonder where that's coming from, eh? Tatsuo is advertising 
pgpool as a synchronous replication system suitable for failover. 
Quoting from the pgpool-1.0 README:

   pgpool could be used as a replication server. This allows real-time
   backuping of the database to avoid disk failures. pgpool sends
   exactly same query to each PostgreSQL servers to accomplish
   replication. So pgpool can be regarded as a synchronous
   replication server.
Don't get me wrong, as said pgpool works great for the purpose I tested, 
the pooling. But statements like that are causing the confusion here.

Jan
--
#==#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.  #
#== [EMAIL PROTECTED] #
---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]

Re: [HACKERS] Email data type

2004-05-17 Thread Andrew Dunstan





Bruno Wolff III wrote:

  On Mon, May 17, 2004 at 18:01:43 +0200,
  Gaetano Mendola [EMAIL PROTECTED] wrote:
  
  
Well I think that accept an email like:

[EMAIL PROTECTED]

  
  
Be careful about this. [EMAIL PROTECTED] is not the same as
[EMAIL PROTECTED]] . 

  


Not wanting to rain on this fascinating parade, but shouldn't this
discussion be carried out on a mailing list for the pgfoundry emailadt
project? (pgfoundry mailing lists were a bit broken but I think they
are fixed now).

cheers

andrew

Re: [HACKERS] Call for 7.5 feature completion

2004-05-17 Thread Joshua D. Drake

Simon Riggs wrote:
On Mon, 2004-05-17 at 15:10, Bruce Momjian wrote:

It is too late to think about pushing back another month.  We had this
discussion already.  June 1 is it.

Just to throw in my .02, plPerlNG won't be ready for testing until mid, 
later June either. Then there is also plPHP which although we haven't 
had any bug reports still needs some more peer review.

Also we would like to submit our ECPG which includes SET DESCRIPTOR and 
error handling but that too needs more peer review.

It just seems, considering the current state of 7.4.2 (stable, just now 
being deployed in production shops) that we should make a longer 
development time for 7.5.

Personally, Win32, subtransactions and PITR are what we are after. 
Second would be inclusion of plPHP and plPerlNG which are arguably the 
most widely used languages to connect to PostgreSQL.

Sincerely,
Joshua D. Drake



--
Command Prompt, Inc., home of Mammoth PostgreSQL - S/ODBC and S/JDBC
Postgresql support, programming shared hosting and dedicated hosting.
+1-503-667-4564 - [EMAIL PROTECTED] - http://www.commandprompt.com
Mammoth PostgreSQL Replicator. Integrated Replication for PostgreSQL
begin:vcard
fn:Joshua D. Drake
n:Drake;Joshua D.
org:Command Prompt, Inc.
adr:;;PO Box 215;Cascade Locks;Oregon;97014;USA
email;internet:[EMAIL PROTECTED]
title:Consultant
tel;work:503-667-4564
tel;fax:503-210-0034
note:Command Prompt, Inc. is the largest and oldest US based commercial PostgreSQL support provider. We  provide the only commercially viable integrated PostgreSQL replication solution, but also custom programming, and support. We authored  the book Practical PostgreSQL, the procedural language plPHP, and adding trigger capability to plPerl.
x-mozilla-html:FALSE
url:http://www.commandprompt.com/
version:2.1
end:vcard


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]

Re: [HACKERS] Relocatable installs

2004-05-17 Thread Jan Wieck

Bruce Momjian wrote:
Andrew Dunstan wrote:
I think we should use the relative-path method *unless* the configure
command called out specific installation directories (that is, not
just --prefix but --datadir and/or related options).  If you use one of
those then that absolute path should be used always, ie, you are
specifically asking for a nonrelocatable install and that's what you
should get.

  


I think we are making this way too complicated in a quest for 
flexibility that is of dubious value.

I think we could adopt a simple rule: if you configure it for relocation 
(and I think you should have to do that explicitly) then all paths are 
relative to the binary location. If not, then full hardcoded paths are 
used. No exceptions.

Most people won't need this at all, I suspect - people who make binary 
packages/installers for redistribution will find it a great boon.
I think if we go for the plan outlined, we will not need a special
configure flag.  (People might decide to move the install dir long after
they install it.)  By default, everything sits under pgsql as pgsql/bin,
pgsql/lib, etc.  I can't see how making it relative is going to bite us
unless folks move the binaries out of pgsql/bin.  Is that common for
installs that don't specify a special bindir?
Does that include a mechanism for -rpath?
Currently, if you have multiple installations of PostgreSQL on a server 
and call ones psql or whatever explicitly, it is not loading another 
ones libpq, but for sure the one belonging to its version. How does the 
plan you're talking about cover this?

Jan
--
#==#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.  #
#== [EMAIL PROTECTED] #
---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
 joining column's datatypes do not match

Re: [HACKERS] Call for 7.5 feature completion

2004-05-17 Thread Marc G. Fournier

On Mon, 17 May 2004, Joshua D. Drake wrote:

 Personally, Win32, subtransactions and PITR are what we are after.
 Second would be inclusion of plPHP and plPerlNG which are arguably the
 most widely used languages to connect to PostgreSQL.

plPHP and plPerlNG both belong on pgfoundry, not in the core distribution
...



Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email: [EMAIL PROTECTED]   Yahoo!: yscrappy  ICQ: 7615664

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])

Re: [HACKERS] Call for 7.5 feature completion

2004-05-17 Thread Joshua D. Drake

Mario Weilguni wrote:
Interesting.
We have made COMPLETELY different experiences.
There is one question people ask me daily: When can we have sychronous 
replication and PITR?.
Performance is not a problem here. People are more interested in 
stability and enterprise features such as those I have mentioned above.

I doubt that. Having deployed several 7.4 databases, the first customers ask 
(of course not in technical speech, but in the meaning) when the problem with 
checkpoint hogging system down is solved. This is a really serious issue, 
especially when using drbd + ext3. 
^^^
Well that seems to be part of the problem. ext3 does not scale well at 
all under load. You should probably upgrade to a better FS (like XFS). I 
am not saying that your point isn't valid (it is) but upgrading to a 
better FS will help you.

Sincerely,
Joshua D. Drake
The system will become really unresponsive
when checkpoint is running.
I heavily await 7.5 because of the background writer.
---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])

--
Command Prompt, Inc., home of Mammoth PostgreSQL - S/ODBC and S/JDBC
Postgresql support, programming shared hosting and dedicated hosting.
+1-503-667-4564 - [EMAIL PROTECTED] - http://www.commandprompt.com
Mammoth PostgreSQL Replicator. Integrated Replication for PostgreSQL
begin:vcard
fn:Joshua D. Drake
n:Drake;Joshua D.
org:Command Prompt, Inc.
adr:;;PO Box 215;Cascade Locks;Oregon;97014;USA
email;internet:[EMAIL PROTECTED]
title:Consultant
tel;work:503-667-4564
tel;fax:503-210-0034
note:Command Prompt, Inc. is the largest and oldest US based commercial PostgreSQL support provider. We  provide the only commercially viable integrated PostgreSQL replication solution, but also custom programming, and support. We authored  the book Practical PostgreSQL, the procedural language plPHP, and adding trigger capability to plPerl.
x-mozilla-html:FALSE
url:http://www.commandprompt.com/
version:2.1
end:vcard


---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org

Re: [HACKERS] Call for 7.5 feature completion

2004-05-17 Thread Joshua D. Drake


plPHP and plPerlNG both belong on pgfoundry, not in the core distribution
...
Uhhh?? Are you ripping out all core pls then? plPerlNG is supposed to 
replace plPerl, I was talking with Bruce and he seemed to think that (as 
long as the code was good enough) that we could incorporate plPHP???

Sincerely,
Joshua D. Drake



Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email: [EMAIL PROTECTED]   Yahoo!: yscrappy  ICQ: 7615664

--
Command Prompt, Inc., home of Mammoth PostgreSQL - S/ODBC and S/JDBC
Postgresql support, programming shared hosting and dedicated hosting.
+1-503-667-4564 - [EMAIL PROTECTED] - http://www.commandprompt.com
Mammoth PostgreSQL Replicator. Integrated Replication for PostgreSQL
begin:vcard
fn:Joshua D. Drake
n:Drake;Joshua D.
org:Command Prompt, Inc.
adr:;;PO Box 215;Cascade Locks;Oregon;97014;USA
email;internet:[EMAIL PROTECTED]
title:Consultant
tel;work:503-667-4564
tel;fax:503-210-0034
note:Command Prompt, Inc. is the largest and oldest US based commercial PostgreSQL support provider. We  provide the only commercially viable integrated PostgreSQL replication solution, but also custom programming, and support. We authored  the book Practical PostgreSQL, the procedural language plPHP, and adding trigger capability to plPerl.
x-mozilla-html:FALSE
url:http://www.commandprompt.com/
version:2.1
end:vcard


---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly

Re: [HACKERS] Call for 7.5 feature completion

2004-05-17 Thread Jan Wieck

Mario Weilguni wrote:
Interesting.
We have made COMPLETELY different experiences.
There is one question people ask me daily: When can we have sychronous 
replication and PITR?.
Performance is not a problem here. People are more interested in 
stability and enterprise features such as those I have mentioned above.
I doubt that. Having deployed several 7.4 databases, the first customers ask 
(of course not in technical speech, but in the meaning) when the problem with 
checkpoint hogging system down is solved. This is a really serious issue, 
especially when using drbd + ext3. The system will become really unresponsive 
when checkpoint is running.

I heavily await 7.5 because of the background writer.
Have you done some more extensive tests with 7.5 already and if so, what 
are your experiences with it so far?

Jan
--
#==#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.  #
#== [EMAIL PROTECTED] #
---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
 subscribe-nomail command to [EMAIL PROTECTED] so that your
 message can get through to the mailing list cleanly

Re: [HACKERS] email data type first release

2004-05-17 Thread Greg Stark


Tommi Maekitalo [EMAIL PROTECTED] writes:

 Sorting should then be done by top-level-domain first. Then 2nd, 3rd... and 
 last by user.

I thought of that but decided not to suggest it:

a) as far as email goes there's no relationship between [EMAIL PROTECTED] and
   [EMAIL PROTECTED] The .com doesn't mean the emails are any more related than
   [EMAIL PROTECTED] and [EMAIL PROTECTED] are. In fact in practice the latter two are
   more likely to be related.

b) it's a lot of extra work, whereas sorting by domain first is just as easy
   as sorting by lhs first.


-- 
greg


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]

Re: [HACKERS] enabling tcpip_socket by default

2004-05-17 Thread Greg Stark

Marko Karppinen [EMAIL PROTECTED] writes:

 On 17. touko 2004, at 10:40, Tatsuo Ishii wrote:
  Consider a program using JDBC on localhost. It can only reach to
  PostgreSQL via TCP/IP.

Huh? Why on earth would that be true? Is this a limitation of our JDBC
drivers?

 Ah! Of course. That makes sense, and listening on 127.0.0.1 never
 hurt anyone (except, of course, the tinfoil hat crowd nmapping
 localhost in a frenzy...)

Actually on many systems it was very possible to send packets to a machine
with a source address of 127.0.0.1 even over external networks or through
routers. Making an attack out of this on a TCP service would be difficult, but
it has been done.

Good OS distributions install network filters by default to refuse such
packets, but lots of OSes still don't do this.

-- 
greg


---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster

Re: [HACKERS] Call for 7.5 feature completion

2004-05-17 Thread Greg Stark


Simon Riggs [EMAIL PROTECTED] writes:

 I can't complete by 1 June. Think worse of me if you choose.

I'll mention another perspective as a user. I'm actually happier seeing a
relatively minor release come out just before the big changes hit. If 7.5 has
Windows, PITR, nested transactions, etc. especially if I see they went in just
before a feature freeze then I'm liable to wait before I suggest installing it
in production because it makes me fear the impact these major new features
will have on a system that's been running fine on 7.4.

Whereas if 7.5 comes out and is an incremental improvement over 7.4 with a
background writer, more knobs to control checkpoints and vacuum, better sorted
pg_dumps, etc, then I'm liable to install it right away. And I get to use
these features while the big changes settle.

From a user-perceptions point of view this is an even bigger factor when
people start bandying about 8.0. A LOT of sysadmins are going to hold off on
installing a release labelled 8.0 until they see 8.1 or 8.2. Sysadmins are an
awfully superstitious bunch and numerology is popular.

Moreover if PITR, the Windows port, nested transactions go into 7.6 or 8.0
right at the beginning of the development cycle and we have 3-4 months of
working with databases running with these features I strongly suspect that the
stability of the product will make a bigger long term impression than the
rapid pace of new features arriving.

So in my perfect world I picture 7.5 freezing June 1 and releasing in July or
so, giving a nice reliable simple upgrade for people who just want a safe 7.x
series to upgrade to even after 8.0 comes out. PITR, nested transactions going
into the CVS tree sometime in June or July and being frozen as 8.0 towards the
end of the year.

-- 
greg


---(end of broadcast)---
TIP 8: explain analyze is your friend

Re: [HACKERS] enabling tcpip_socket by default

2004-05-17 Thread Doug McNaught

Greg Stark [EMAIL PROTECTED] writes:

 Doug McNaught [EMAIL PROTECTED] writes:

 Java doesn't support Unix domain sockets.  If you want to use JDBC,
 you have to use TCP sockets.

 That doesn't follow. That just means you can't implement a unix domain socket
 driver using only Java. Is there some reason the driver has to be pure a Java
 driver?

Well, no.  But then you get into the horror that is JNI, and the need
to maintain that glue layer (there's a substantial impedance mismatch
between libpq and the JDBC API) on all platforms.

 I had always assumed the JDBC driver isn't currently pure java and is just an
 API wrapper around libpq. Writing and maintaining a pure java driver would be
 much more work and be much slower for no practical gain.

The JDBC driver is and always has been pure Java, and I'm damn glad.
If your app is all Java, all you have to do is add the driver jar to
the classpath and you're done--no libpq, no .so version mismatches,
etc.  AFAIK (I'm not one of the developers) there are no plans to
write a JNI layer just for Unix sockets.

And I doubt very much that it's that much slower in most cases,
especially with modern JIT compilers.  You'll take a bit of a hit from
the TCP overhead, but in practice for most apps it won't be an issue.

-Doug

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings

Re: [HACKERS] enabling tcpip_socket by default

2004-05-17 Thread Greg Stark


Doug McNaught [EMAIL PROTECTED] writes:

 Greg Stark [EMAIL PROTECTED] writes:
 
  Marko Karppinen [EMAIL PROTECTED] writes:
 
  On 17. touko 2004, at 10:40, Tatsuo Ishii wrote:
   Consider a program using JDBC on localhost. It can only reach to
   PostgreSQL via TCP/IP.
 
  Huh? Why on earth would that be true? Is this a limitation of our JDBC
  drivers?
 
 Java doesn't support Unix domain sockets.  If you want to use JDBC,
 you have to use TCP sockets.

That doesn't follow. That just means you can't implement a unix domain socket
driver using only Java. Is there some reason the driver has to be pure a Java
driver?

I had always assumed the JDBC driver isn't currently pure java and is just an
API wrapper around libpq. Writing and maintaining a pure java driver would be
much more work and be much slower for no practical gain.

-- 
greg


---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly

Re: [HACKERS] Call for 7.5 feature completion

2004-05-17 Thread Mike Mascari

Greg Stark wrote:
Simon Riggs [EMAIL PROTECTED] writes:
I can't complete by 1 June. Think worse of me if you choose.

...
So in my perfect world I picture 7.5 freezing June 1 and releasing in July or
so, giving a nice reliable simple upgrade for people who just want a safe 7.x
series to upgrade to even after 8.0 comes out. PITR, nested transactions going
into the CVS tree sometime in June or July and being frozen as 8.0 towards the
end of the year.
A quick google of 7.4 Win32 release will reveal that the above was 
precisely what was said about 7.4: it would be released to not hold 
up important features like the IN optimization and a quick 7.5 would 
have Win32 and PITR. It's almost as if a cron job reposts this 
thread every 6 - 12 months. For those of us that are desirous of 
PITR, it's a 6 month reposting that is becoming painful to read...

Mike Mascari

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]

Re: [HACKERS] Call for 7.5 feature completion

2004-05-17 Thread Marc G. Fournier

On Mon, 17 May 2004, Joshua D. Drake wrote:


 
  plPHP and plPerlNG both belong on pgfoundry, not in the core distribution
  ...

 Uhhh?? Are you ripping out all core pls then? plPerlNG is supposed to
 replace plPerl, I was talking with Bruce and he seemed to think that (as
 long as the code was good enough) that we could incorporate plPHP???

That is the plan ... unless someone knows a reason why they can't be built
independently of the core?  ecpg relies on the grammar files in core, but
as far as I knew (please correct me if I'm wrong) the pls only rely on
headers and libraries that get installed ...


Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email: [EMAIL PROTECTED]   Yahoo!: yscrappy  ICQ: 7615664

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster

Re: [HACKERS] Table Spaces

2004-05-17 Thread Bruce Momjian

Magnus Hagander wrote:
 If you run NTFS, it's still possible to use arbitrary links. 
 In the Windows
 world, they are called junctions. Microsoft does not provide 
 a junction tool
 for some reason (perhaps because it's limited to NTFS). A 
 good tool, free
 and with source, can be found here
 http://www.sysinternals.com/ntw2k/source/misc.shtml#junction 
 I use this tool
 myself. Works like a charm.
 
   
 
 
 We've looked at it before. Apart from anything else I don't think its 
 license is compatible with PostgreSQL's.
 
 Well, people can still use it. We just can't distribute it... We can
 always link to it.
 But unless there is a GUI tool (actually, unless it shows up in the
 *default* GUI tool), expect there to be questions. An 

I assume we can just look at the source and write our own version
bypassing any license.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings

Re: [HACKERS] Relocatable installs

2004-05-17 Thread Bruce Momjian

Jan Wieck wrote:
  I think if we go for the plan outlined, we will not need a special
  configure flag.  (People might decide to move the install dir long after
  they install it.)  By default, everything sits under pgsql as pgsql/bin,
  pgsql/lib, etc.  I can't see how making it relative is going to bite us
  unless folks move the binaries out of pgsql/bin.  Is that common for
  installs that don't specify a special bindir?
  
 
 Does that include a mechanism for -rpath?
 
 Currently, if you have multiple installations of PostgreSQL on a server 
 and call ones psql or whatever explicitly, it is not loading another 
 ones libpq, but for sure the one belonging to its version. How does the 
 plan you're talking about cover this?

Someone asked about rpath, and I didn't deal with it.  How many
platforms use rpath?  I am not sure.

I assume folks are going to have to modify their ld.so.conf to point to
the proper library, or for non-root, set an environment variable like
LD_LIBRARY_PATH.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster

Re: [HACKERS] Call for 7.5 feature completion

2004-05-17 Thread Greg Stark


Mike Mascari [EMAIL PROTECTED] writes:

 Greg Stark wrote:
  Simon Riggs [EMAIL PROTECTED] writes:
 
 I can't complete by 1 June. Think worse of me if you choose.
 
 ...
  So in my perfect world I picture 7.5 freezing June 1 and releasing in July or
  so, giving a nice reliable simple upgrade for people who just want a safe 7.x
  series to upgrade to even after 8.0 comes out. PITR, nested transactions going
  into the CVS tree sometime in June or July and being frozen as 8.0 towards the
  end of the year.
 
 A quick google of 7.4 Win32 release will reveal that the above was precisely
 what was said about 7.4: it would be released to not hold up important features
 like the IN optimization and a quick 7.5 would have Win32 and PITR. It's almost
 as if a cron job reposts this thread every 6 - 12 months. For those of us that
 are desirous of PITR, it's a 6 month reposting that is becoming painful to
 read...

I'm not sure what your point is though. It's not like people with my attitude
made the people writing code take any longer. In fact had we held off 7.4 for
any of these features it would have been a disaster.

Incidentally, I'm not suggesting rushing 7.6/8.0 out the door. I'm imagining a
regular release cycle. My comments are more geared to the idea that having
PITR, Nested Transactions, etc hit the tree early in the development cycle
would be smoother than having them hit right at the end of the development
cycle.

Think of all the added bells and whistles PITR will be able to grow over the
course of a whole release cycle. Instead of having a barebones see it works
implementation we'll have a really polished system with lots of optional but
appreciated features. Maybe better integration in third-party backup tools,
maybe even standby databases or replication based on it.

-- 
greg


---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])

Re: [HACKERS] Call for 7.5 feature completion

2004-05-17 Thread Marc G. Fournier

On Mon, 17 May 2004, Mike Mascari wrote:

 Greg Stark wrote:
  Simon Riggs [EMAIL PROTECTED] writes:
 
 I can't complete by 1 June. Think worse of me if you choose.
 
 ...
  So in my perfect world I picture 7.5 freezing June 1 and releasing in July or
  so, giving a nice reliable simple upgrade for people who just want a safe 7.x
  series to upgrade to even after 8.0 comes out. PITR, nested transactions going
  into the CVS tree sometime in June or July and being frozen as 8.0 towards the
  end of the year.

 A quick google of 7.4 Win32 release will reveal that the above was
 precisely what was said about 7.4: it would be released to not hold
 up important features like the IN optimization and a quick 7.5 would
 have Win32 and PITR. It's almost as if a cron job reposts this
 thread every 6 - 12 months. For those of us that are desirous of
 PITR, it's a 6 month reposting that is becoming painful to read...

k, let's think this through ... 7.4 was released, what, 6 months ago?  And
6 months later, PITR still isn't ready?  Is there some logic here that if
7.4 wasn't released, PITR would have been done any sooner?


Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email: [EMAIL PROTECTED]   Yahoo!: yscrappy  ICQ: 7615664

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings

Re: [HACKERS] Call for 7.5 feature completion

2004-05-17 Thread Jan Wieck

Marc G. Fournier wrote:
On Mon, 17 May 2004, Joshua D. Drake wrote:
Personally, Win32, subtransactions and PITR are what we are after.
Second would be inclusion of plPHP and plPerlNG which are arguably the
most widely used languages to connect to PostgreSQL.
plPHP and plPerlNG both belong on pgfoundry, not in the core distribution
...
When are you going to pull PL/pgSQL, PL/python and PL/Tcl?
Jan
--
#==#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.  #
#== [EMAIL PROTECTED] #
---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?
  http://www.postgresql.org/docs/faqs/FAQ.html

Re: [HACKERS] Call for 7.5 feature completion

2004-05-17 Thread Tatsuo Ishii

 Bruce Momjian wrote:
  Marc G. Fournier wrote:
  On Mon, 17 May 2004, Bruce Momjian wrote:
  
Most hopefully this is very discouraging! Connection pools are a nice
thing and I have used pgpool recently with great success, for pooling
connections. But attempting to deliver multimaster replication as a
byproduct of a connection pool isn't going to become an enterprise
feature. And the more half-baked, half-functional and half-reliable
replication attempts there are, the harder it will be to finally get a
real solution being recognized.
  
   Well, considering we offer _nothing_ for multi-master right now, I think
   it is a valuable project.
  
  Connection pooling is *not* multi master ... it doesn't even simulate
  multi-master ... multi-master, at least as far as I'm aware, means no
  point of failure, and connection pooling creates a *single* point of
  failure ... the pgpool process dies, you've lost all connections to the
  database ...

I think multi-master does nothing with
free-from-single-point-of-failure. A multi-master replication system
could have its own single point of failure (for example, some systems
have single coordinate server). On the other hand single-master
replication system could avoid single point of failure using some
external mechanism (for example UltraMonkey).

  I think people are confusing pgpool with pgcluster.
  
 
 And you wonder where that's coming from, eh? Tatsuo is advertising 
 pgpool as a synchronous replication system suitable for failover. 
 Quoting from the pgpool-1.0 README:

Please do not use the word failover for pgpool relication
functionality. Failover means it could continue replication
operation with alternative database. pgpool does not do that in
replication mode. Instead it disconnect the failed DB and continues
operation with healthy DB (with no replication, of course). That's why
I use the word degeneration in pgpool's replication mode.

 pgpool could be used as a replication server. This allows real-time
 backuping of the database to avoid disk failures. pgpool sends
 exactly same query to each PostgreSQL servers to accomplish
 replication. So pgpool can be regarded as a synchronous
 replication server.
 
 Don't get me wrong, as said pgpool works great for the purpose I tested, 
 the pooling. But statements like that are causing the confusion here.

Could you tell me why above is confusing? If it's really confusing,
I'm glad to enhance it. Or are you saying pgpool should not be regrard
as having replicaton facility?

Or you are saying that pgpool is too similar to PGCluster?

PGCluster is a multi-master/multi-slave/sync relication system. pgpool
is a single-master/single-slave/sync replication. There's a clear
distinction. single vs. multi-master is a BIG difference and I have
never stated that pgpool is a multi-master replication system.

BTW, the reason why I developed pgpool with replication functionality
is that there's no single perfect replication solution in the
world. Here are my comments for officially released replicatin systems
(from my own point of view, of course):

1) DBMirror:
   good: simple and easy to use.
   bad:  can not handle too much traffic. cannot replicate large objects.

2) PGCluster:
   good: can handle failover and recovery. SELECT load balancing is
  really nice.

   bad: requries many PCs. update performance is not good. cannot
  replicate large objects.

3) pgpool:
   good: simple and easy to use. can replicate large objects. update
  performance is not too bad.
   bad:  no load balancing, no failover. 

I'm interested in if Slony-I solves all these bad. I will try it when
I have spare time.
--
Tatsuo Ishii

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match

Re: [HACKERS] Call for 7.5 feature completion

2004-05-17 Thread Jan Wieck

Marc G. Fournier wrote:
On Mon, 17 May 2004, Joshua D. Drake wrote:

 plPHP and plPerlNG both belong on pgfoundry, not in the core distribution
 ...
Uhhh?? Are you ripping out all core pls then? plPerlNG is supposed to
replace plPerl, I was talking with Bruce and he seemed to think that (as
long as the code was good enough) that we could incorporate plPHP???
That is the plan ... unless someone knows a reason why they can't be built
independently of the core?  ecpg relies on the grammar files in core, but
as far as I knew (please correct me if I'm wrong) the pls only rely on
headers and libraries that get installed ...
They are not as independant as one might think. The core support for set 
returning functions is required before a PL can do it. Same was with 
cursors and same will be with subtransactions being the base for 
exception handling. People have been struggling with unloadable shared 
objects from another version due to elog changes, I can't imagine what 
kind of support horror we're creating with this now.

The much I am for pulling stuff that does not belong into core, doing it 
just for the fun of cleaning up or trimming doesn't do. One of the major 
functions of CVS is that one can tag collections of revisions that 
together build a release, a known to be working snapshot of file 
revisions. If we completely lose the ability to tell what version of 
what PL, client interface or extension works with what version of the 
backend, we're losing some important detail here.

Jan
--
#==#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.  #
#== [EMAIL PROTECTED] #
---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?
  http://www.postgresql.org/docs/faqs/FAQ.html

Re: [HACKERS] Call for 7.5 feature completion

2004-05-17 Thread Joshua D. Drake


The much I am for pulling stuff that does not belong into core, doing 
it just for the fun of cleaning up or trimming doesn't do. One of the 
major functions of CVS is that one can tag collections of revisions 
that together build a release, a known to be working snapshot of file 
revisions. If we completely lose the ability to tell what version of 
what PL, client interface or extension works with what version of the 
backend, we're losing some important detail here.

Also, one of the best features of PostgreSQL is that you can, at will 
write a procedure in just about anything... It seems that keeping at least
the most popular pl implementations would be an important step.
























]

Jan

--
Command Prompt, Inc., home of Mammoth PostgreSQL - S/ODBC and S/JDBC
Postgresql support, programming shared hosting and dedicated hosting.
+1-503-667-4564 - [EMAIL PROTECTED] - http://www.commandprompt.com
PostgreSQL Replicator -- production quality replication for PostgreSQL
---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]

Re: [HACKERS] Call for 7.5 feature completion

2004-05-17 Thread Bruce Momjian

Joshua D. Drake wrote:
 Just to throw in my .02, plPerlNG won't be ready for testing until mid, 
 later June either. Then there is also plPHP which although we haven't 
 had any bug reports still needs some more peer review.
 
 Also we would like to submit our ECPG which includes SET DESCRIPTOR and 
 error handling but that too needs more peer review.

I assume your ecpg will be a patch to the existing ecpg rather than a
new verion, right?

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])

Re: [HACKERS] Call for 7.5 feature completion

2004-05-17 Thread Christopher Kings-Lynne


He is waiting for nested transactions to be committed so he can merge
his work in.

Somebody has posted sync multimaster replication (PgCluster) - nobody 
has commented on that. Maybe I am the only one who has ever tried it ...

I think it should be on gborg.
You mean pgFoundry :)
Chris
---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?
  http://www.postgresql.org/docs/faqs/FAQ.html

Re: [HACKERS] Table Spaces

2004-05-17 Thread Christopher Kings-Lynne

Alternative database location:
Should this code be removed now?
I think that this:
CREATE DATABASE blah LOCATION 'xyz';
Should now be interpreted to mean:
CREATE TABLESPACE blah_tbsp LOCATION 'xyz';
CREATE DATABSE blah TABLESPACE blah_tbsp;
Or something like that...
Chris
---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings

Re: [HACKERS] Call for 7.5 feature completion

2004-05-17 Thread Bruce Momjian

Marc G. Fournier wrote:
 On Mon, 17 May 2004, Joshua D. Drake wrote:
 
 
  
   plPHP and plPerlNG both belong on pgfoundry, not in the core distribution
   ...
 
  Uhhh?? Are you ripping out all core pls then? plPerlNG is supposed to
  replace plPerl, I was talking with Bruce and he seemed to think that (as
  long as the code was good enough) that we could incorporate plPHP???
 
 That is the plan ... unless someone knows a reason why they can't be built
 independently of the core?  ecpg relies on the grammar files in core, but
 as far as I knew (please correct me if I'm wrong) the pls only rely on
 headers and libraries that get installed ...

Server-side languages are tied into the backend even closer than the
user data types.  They are best in the core distribution.  We didn't put
plR in core because it had a conflicting license.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org

Re: [HACKERS] Call for 7.5 feature completion

2004-05-17 Thread Joshua D. Drake







  
I assume your ecpg will be a patch to the existing ecpg rather than a
new verion, right?

  

Yes it is a patch against 7.4.2



J



-- 
Command Prompt, Inc., home of Mammoth PostgreSQL - S/ODBC and S/JDBC
Postgresql support, programming shared hosting and dedicated hosting.
+1-503-667-4564 - [EMAIL PROTECTED] - http://www.commandprompt.com
PostgreSQL Replicator -- production quality replication for PostgreSQL

Re: [HACKERS] Call for 7.5 feature completion

2004-05-17 Thread Marc G. Fournier

On Mon, 17 May 2004, Bruce Momjian wrote:

 Server-side languages are tied into the backend even closer than the
 user data types.  They are best in the core distribution.  We didn't put
 plR in core because it had a conflicting license.

So, they can live on their own, which is a good thing to know ...


Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email: [EMAIL PROTECTED]   Yahoo!: yscrappy  ICQ: 7615664

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]

Re: [HACKERS] Call for 7.5 feature completion

2004-05-17 Thread Marc G. Fournier

On Mon, 17 May 2004, Jan Wieck wrote:

 They are not as independant as one might think. The core support for set
 returning functions is required before a PL can do it. Same was with
 cursors and same will be with subtransactions being the base for
 exception handling. People have been struggling with unloadable shared
 objects from another version due to elog changes, I can't imagine what
 kind of support horror we're creating with this now.

k, how is this different then any other software package that has to be
extended to make use of new features?  For instance, when subtransactions
get added, is that same person going to extended the various pls as well?
Or, more likely, when subtransactions are added, will someone responsible
for each of the pls submit patches to extend them?

Having pl/pgsql included as a 'reference implementation' is reasonable ...
I just think that pl/pick your language here should be on pgfoundry ...

 If we completely lose the ability to tell what version of what PL,
 client interface or extension works with what version of the backend,
 we're losing some important detail here.

Why is it our responsibility to ensure that though?  Shouldn't the
developer (or group of developers) responsible for the
PL/interface/extension be responsible for that?

Let's use plPHP as an example here ... I'm going to guess that it supports
PHP4, which is the 'standard' right now ... what about PHP5?  If not, what
happens in 3 months if/when that support is added?  Do ppl using PHP5 have
to wait until the next release of PostgreSQL before they can use it?

If, instead, plPHP is on pgfoundry, there is nothing that stops them
adopting a release numbering in parallel to the core distribution, at
least in so far as major.minor ... but they could release a
major.minor.minor release as required seperate from our release cycle that
still matches our latest stable, but extends itself to working with PHP5,
as an example ...

The thing is, whether as part of core, or as a seperate project, *any*
pl/interface/extension has to be maintained in order to be in sync ... if
done as a seperate  project, in parallel with core, it is at least
possible to release on their own timelines in order to correct bugs, or
add features ... as part of core, new features/bug fixes have to wait for
all of core to be released ...


Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email: [EMAIL PROTECTED]   Yahoo!: yscrappy  ICQ: 7615664

---(end of broadcast)---
TIP 8: explain analyze is your friend

Re: [HACKERS] Call for 7.5 feature completion

2004-05-17 Thread Mike Mascari

Marc G. Fournier wrote:
On Mon, 17 May 2004, Mike Mascari wrote:
A quick google of 7.4 Win32 release will reveal that the above was
precisely what was said about 7.4: it would be released to not hold
up important features like the IN optimization and a quick 7.5 would
have Win32 and PITR. It's almost as if a cron job reposts this
thread every 6 - 12 months. For those of us that are desirous of
PITR, it's a 6 month reposting that is becoming painful to read...
k, let's think this through ... 7.4 was released, what, 6 months ago?  And
6 months later, PITR still isn't ready?  Is there some logic here that if
7.4 wasn't released, PITR would have been done any sooner?
Not being the author, I don't know. And in the case of PITR, the 
pre-7.4 author is different than the post-7.4 author. However, if I 
was personally responsible for holding up the release of a project 
due to a feature that I had vowed to complete, I would feel morally 
compelled to get it done. If I had then asked for, and was granted, 
an extra 15-30 days I would feel even more personally responsible 
and under greater pressure.

If, however, the project made the release without waiting, I would 
feel simultaneously relieved and possibly a little bitter. Possibly 
a little bitter in that either what I was working on wasn't 
perceived as sufficiently valuable to hold up a release for 15-30 
days, or that my word regarding the completion status was 
insufficient for the project to trust me. Let me reiterate the words 
possibly and little. But in open source projects, a developer 
willing to contribute hundreds, possibly thousands of hours of his 
own time is particularly invaluable.

I can tell you that, in economic models that have studied human 
behavior with respect to unemployment insurance, for example, the 
re-employment rates are clustered at the tails: when someone is 
first unemployed and when the insurance is about to expire. It's an 
inappropriate analogy because the project lives on from release to 
release, instead of having a drop-dead date at which point no future 
changes would be made ad infinitum, but it paints a useful picture. 
I'm willing to bet that CVS commit rates mirror the above behavior.

Unlike unemployment benefits, releasing the software without the 
feature essentially just extends the development period another 6 
months, the work will intensify at the new perceived tails, and the 
process repeated. There are probably econometric papers that model 
the software development release cycle that could give quantitative 
arguments. I'm not arguing I'm right and your wrong, btw. I'm just 
pointing out some of the possibilities. In fact, for one developer 
it might be the code production maximizing condition to give them 
another 6 months and for another, creating the pressure associated 
with a 15-30 day extension where the world is standing still 
awaiting their patch...

Mike Mascari

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]

Re: [HACKERS] Call for 7.5 feature completion

2004-05-17 Thread Marc G. Fournier

On Mon, 17 May 2004, Joshua D. Drake wrote:


 
 I assume your ecpg will be a patch to the existing ecpg rather than a
 new verion, right?
 
 
 
 Yes it is a patch against 7.4.2

Will you have one against -HEAD?  I believe there have been changes since
7.5 was branched, no?  Or have they been mostly cosmetic/docs?



Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email: [EMAIL PROTECTED]   Yahoo!: yscrappy  ICQ: 7615664

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html

Re: [HACKERS] Call for 7.5 feature completion

2004-05-17 Thread Joshua D. Drake


Why is it our responsibility to ensure that though?  Shouldn't the
developer (or group of developers) responsible for the
PL/interface/extension be responsible for that?
Let's use plPHP as an example here ... I'm going to guess that it supports
PHP4, which is the 'standard' right now ... what about PHP5?  If not, what
happens in 3 months if/when that support is added?  Do ppl using PHP5 have
to wait until the next release of PostgreSQL before they can use it?
 

Actually this is a pretty good example. Yes right now it supports PHP4, 
it will support PHP5 when PHP5 is ready.
And of course, no they would not have to wait. They could download and 
patch against the current
source tree...

The thing is, whether as part of core, or as a seperate project, *any*
pl/interface/extension has to be maintained in order to be in sync ... if
done as a seperate  project, in parallel with core, it is at least
possible to release on their own timelines in order to correct bugs, or
add features ... as part of core, new features/bug fixes have to wait for
all of core to be released ...
 

Well actually no, because of the above mentioned. Even if plPHP is on 
pgFoundry... there is no
reason why a README couldn't be included in the src/pl/plphp directory 
that says: look here
for the latest release etc...


Sincerely,
Joshua D. Drake


Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email: [EMAIL PROTECTED]   Yahoo!: yscrappy  ICQ: 7615664
---(end of broadcast)---
TIP 8: explain analyze is your friend
 


--
Command Prompt, Inc., home of Mammoth PostgreSQL - S/ODBC and S/JDBC
Postgresql support, programming shared hosting and dedicated hosting.
+1-503-667-4564 - [EMAIL PROTECTED] - http://www.commandprompt.com
PostgreSQL Replicator -- production quality replication for PostgreSQL
---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
 joining column's datatypes do not match

Re: [HACKERS] Call for 7.5 feature completion

2004-05-17 Thread Bruce Momjian

Marc G. Fournier wrote:
 On Mon, 17 May 2004, Joshua D. Drake wrote:
 
 
  
  I assume your ecpg will be a patch to the existing ecpg rather than a
  new verion, right?
  
  
  
  Yes it is a patch against 7.4.2
 
 Will you have one against -HEAD?  I believe there have been changes since
 7.5 was branched, no?  Or have they been mostly cosmetic/docs?

Josh just sent me the patch and it looks good.  I encouraged him to send
it over to Michael Meskes, and soon, especially if he wants it in 7.5.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster

Re: [HACKERS] Call for 7.5 feature completion

2004-05-17 Thread Andrew Dunstan

Bruce Momjian said:
 Marc G. Fournier wrote:
 On Mon, 17 May 2004, Joshua D. Drake wrote:

 
  
   plPHP and plPerlNG both belong on pgfoundry, not in the core
   distribution ...
 
  Uhhh?? Are you ripping out all core pls then? plPerlNG is supposed
  to replace plPerl, I was talking with Bruce and he seemed to think
  that (as long as the code was good enough) that we could incorporate
  plPHP???

 That is the plan ... unless someone knows a reason why they can't be
 built independently of the core?  ecpg relies on the grammar files in
 core, but as far as I knew (please correct me if I'm wrong) the pls
 only rely on headers and libraries that get installed ...

 Server-side languages are tied into the backend even closer than the
 user data types.  They are best in the core distribution.  We didn't
 put plR in core because it had a conflicting license.


I would never have created the plperlNG project on pgfoundry if I had
thought it meant divorcing plperl from the core.

pgfoundry in my mind can be a home for projects that will eventually fold
into the core, as well as things that will always remain separate.

I agree with Bruce about the place of server-side PLs.

cheers

andrew




---(end of broadcast)---
TIP 8: explain analyze is your friend

Re: [HACKERS] enabling tcpip_socket by default

2004-05-17 Thread Bruno Wolff III

On Mon, May 17, 2004 at 18:00:48 -0400,
  Andrew Dunstan [EMAIL PROTECTED] wrote:
 
 But what we listen to relates to the destination address of the packets, 
 not the source address ...

There still is some small risk. If you OS doesn't reject packets destined
for 127.*.*.* that don't come from the loopback interface, it is possible
for someone on your local network to at least do a blind spoofing attack,
possibly they might also be able to get replies back as well.

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]

Re: [HACKERS] Call for 7.5 feature completion

2004-05-17 Thread Bruce Momjian

Marc G. Fournier wrote:
  A quick google of 7.4 Win32 release will reveal that the above was
  precisely what was said about 7.4: it would be released to not hold
  up important features like the IN optimization and a quick 7.5 would
  have Win32 and PITR. It's almost as if a cron job reposts this
  thread every 6 - 12 months. For those of us that are desirous of
  PITR, it's a 6 month reposting that is becoming painful to read...
 
 k, let's think this through ... 7.4 was released, what, 6 months ago?  And
 6 months later, PITR still isn't ready?  Is there some logic here that if
 7.4 wasn't released, PITR would have been done any sooner?

Even though I was for a later feature freeze, Marc argument is
powerful.

There was talk that Win32 and PITR would be available right after 7.5
started development, and they weren't.  Instead it took several months
for Patrick and Tom to get JR's PITR/WAL patch into CVS, and then
another month or two for someone to appear and do the work of archiving
the files, then after discussion of implementation issues, it now needs
even more work.  Win32 has been on a steady course thanks to Claudio and
Magnus --- without them we would be nowhere near finished.  Nested
transactions were started in April and tablespaces in February, both
funded by Fujitsu.

Basically, maybe Marc is right that these features have to span multiple
releases.  Win32 spanned two releases (some of it was in 7.4).  PITR WAL
was initially done by JR just before 7.3 feature freeze, I think, but it
took all this time to get this far.

Basically, my big concern is incremental improvement releases, which I
feel describe our past few releases.  Yes, I said it.  I see items
listed above as critical to allowing PostgreSQL to move into more
significant roles in enterprises, and I am frustrated that it is taking
so long to happen.

What can be done?  Well, money from Fujitsu and other companies
(Afilias/Sloney, Command Prompt/ecpg-plPHP), is allowing us to hit some
of these bigger items, so hopefully that will move us forward in these
complex areas.  I am not sure what could have been done to push some of
these projects along faster.  I am happy Win32 had a steady pace of
improvement, but even now we are finishing up to the wire rather than
having it done months ago, but in hindsight, I am not sure what we could
have done differently.

So, yea, I am frustrated.  I know these features are hard and complex,
but I want them for PostgreSQL, and I want them as soon as possible.  I
guess what really bugs me is that we are so close to having these few
remaining big features, and because they are so complex, they are taking
a lot longer to arrive than previous features, and sometimes see a year
pass without progress on some items, and that bugs me.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly

Re: [HACKERS] Call for 7.5 feature completion

2004-05-17 Thread Bruno Wolff III

On Mon, May 17, 2004 at 17:06:18 -0400,
  Greg Stark [EMAIL PROTECTED] wrote:
 
 I'll mention another perspective as a user. I'm actually happier seeing a
 relatively minor release come out just before the big changes hit. If 7.5 has
 Windows, PITR, nested transactions, etc. especially if I see they went in just
 before a feature freeze then I'm liable to wait before I suggest installing it
 in production because it makes me fear the impact these major new features
 will have on a system that's been running fine on 7.4.

7.5 has already had some significant changes made in how writes are done
to disk. If I had very valuable data to protect I would already consider
7.5 a risky upgrade (in comparison to the 7.3 to 7.4 upgrade).

If this is your situation, going to 7.4.3 and sticking with it for a while
waiting to see how 7.5 does, may be a good idea.

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly

Re: [HACKERS] Call for 7.5 feature completion

2004-05-17 Thread Joshua D. Drake


So, yea, I am frustrated.  I know these features are hard and complex,
but I want them for PostgreSQL, and I want them as soon as possible.  I
guess what really bugs me is that we are so close to having these few
remaining big features, and because they are so complex, they are taking
a lot longer to arrive than previous features, and sometimes see a year
pass without progress on some items, and that bugs me.
 

So why do we wait for some of these features? The bgwriter is done 
right? Why don't we backport to
7.4.x and release with 7.4.3? What about the vacuum stuff Jan was doing?

I guess what I am saying is, what features are in HEAD that can be 
backported to 7.4.x without the
requiring of an initdb?

Yes it would be breaking from the tradition of very little feature 
releases in incrementals but then again
maybe that would be a good thing...

Sincerely,
Joshua D. Drkae


--
Command Prompt, Inc., home of Mammoth PostgreSQL - S/ODBC and S/JDBC
Postgresql support, programming shared hosting and dedicated hosting.
+1-503-667-4564 - [EMAIL PROTECTED] - http://www.commandprompt.com
PostgreSQL Replicator -- production quality replication for PostgreSQL
---(end of broadcast)---
TIP 6: Have you searched our list archives?
  http://archives.postgresql.org

Re: [HACKERS] enabling tcpip_socket by default

2004-05-17 Thread Andrew Dunstan

Bruno Wolff III said:
 On Mon, May 17, 2004 at 18:00:48 -0400,
  Andrew Dunstan [EMAIL PROTECTED] wrote:

 But what we listen to relates to the destination address of the
 packets,  not the source address ...

 There still is some small risk. If you OS doesn't reject packets
 destined for 127.*.*.* that don't come from the loopback interface, it
 is possible for someone on your local network to at least do a blind
 spoofing attack, possibly they might also be able to get replies back
 as well.


For some value of small approaching 0 :-) .

The default configuration will only allow localhost-localhost connections
(via the combination of the default listening_addresses value and the
default pg_hba.conf settings). So to spoof it successfully you would have
to be able to get the host to accept a nonlocal packet addressed to
localhost AND get it to route the reply addressed to localhost to your
nonlocal machine.

If you have such an insecure OS you should
- throw it in the bin and get another with a sane network stack, and
- in the meantime set listening_addresses to  to turn of TCP altogether.

But then PostgreSQL is likely to be the least of your problems, I suspect.

Bear in mind that behaviour has not changed at all really, only *which*
behaviour is the default.

cheers

andrew






---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match

Re: [HACKERS] Call for 7.5 feature completion

2004-05-17 Thread Marc G. Fournier

On Mon, 17 May 2004, Joshua D. Drake wrote:


 So, yea, I am frustrated.  I know these features are hard and complex,
 but I want them for PostgreSQL, and I want them as soon as possible.  I
 guess what really bugs me is that we are so close to having these few
 remaining big features, and because they are so complex, they are taking
 a lot longer to arrive than previous features, and sometimes see a year
 pass without progress on some items, and that bugs me.
 
 
 

 So why do we wait for some of these features? The bgwriter is done
 right? Why don't we backport to 7.4.x and release with 7.4.3? What about
 the vacuum stuff Jan was doing?

 I guess what I am saying is, what features are in HEAD that can be
 backported to 7.4.x without the requiring of an initdb?

 Yes it would be breaking from the tradition of very little feature
 releases in incrementals but then again maybe that would be a good
 thing...

This has been put forward by me a couple of times in the past, and, to a
small extent, I do agree with the arguments against it ... namely, the
backporting, testing and release cycles that we'd have to adopt would
detract from forward development ...

As soon as we get to backporting features, then we have to get into
feature freezes on the stable branch, beta periods of testing leading to
a release, etc ...

I'd almost say that time would be better spent on coming up with an
effective upgrade method so that upgrading to new releases is easier ...

Please note that I'm not against the backporting, but do understand the
arguments against it in terms of time and manpower ...


Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email: [EMAIL PROTECTED]   Yahoo!: yscrappy  ICQ: 7615664

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly

Re: [HACKERS] Call for 7.5 feature completion

2004-05-17 Thread Alvaro Herrera Munoz

On Mon, May 17, 2004 at 04:55:50PM +0200, Zeugswetter Andreas SB SD wrote:
 
  It is too late to think about pushing back another month.  We had this
  discussion already.  June 1 is it.
 
 I thought the outcome of that discussion was June 15 ?

I think there was no outcome.  There was no official pronouncement, there
was no vote, there was no consensus.  People seemed to stick with whatever
was closer.

 Can we try to do the 2PC patch now instead of waiting for subtransactions ?

The last post from Heikki I read said that he discovered some serious
problems with his implementation and he wanted do rethink about them.  I
don't think he will be able to make it, mainly because if my patch gets
accepted it will be too close to feature freeze (if it is June 1st).

Personally I've been focused on getting subtransactions done and now I think
I'm very close to an acceptable patch, but what has slowed me down the last
time has been lack of feedback from core developers.  It was feedback I
needed to figure out the best ways to do things (I made several big mistakes
that I'm only now correcting thanks to invaluable comments from Tom Lane),
and without it the last steps were getting very difficult to me.
Fortunately now I've got it.

I have some confidence in that I will be able to deliver it maybe the last
week of May.  I can only hope, however, that it will not be rejected because
it's presented too close to feature freeze.  That would be a shame, because
I offered incremental patches a lot of time ago and they weren't even looked
at  (Hey, I'm not blaming anyone).

-- 
Alvaro Herrera ([EMAIL PROTECTED])
FOO MANE PADME HUM

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly

[HACKERS] proposal: be smarter about i/o patterns in index scan

2004-05-17 Thread Jeffrey W. Baker

Greetings all,

We have noticed a way to make a major improvement in Pg's performance on
our workload, and I would like to get your thoughts before I go off to
work up a patch.

The basic problem is that Pg seeks far too much when performing an index
scan.  I have saved an strace of a backend which is selecting 170,000
rows from a 240,000,000 row table via index scan.  The strace shows
134,000 seeks and reads, or almost one seek for every tuple in the
result set.  This would be fine normally except the seeks are in a
*very* suboptimal pattern, swinging back and forth across the device. 
The query requires 16 minutes, 30 seconds on our test hardware.

The proposal is to sort the block requests before they are issued. 
Because Pg only issues single seek/read pairs synchronously, the kernel
has no chance to apply its elevator and hence every seek/read Pg issues
results in actual movement of the disk head.  Pg's random pattern of
seeking also makes the kernel's readahead fairly useless.

To prove the proposal I did a simulation, using the recorded seek
positions in the strace.  We noted that Pg issued 403 seek/read pairs
for every 8192 bytes read from the index.  So we performed four
simulations: a straight playback of Pg's seek pattern, a playback where
each list of 403 seeks is sorted by byte offset, a playback of all the
seeks sorted by offset, and a sequential scan of the 13GB data file.

PostgreSQL 4.2.1:  16m30s
Sorted in chunks:  10m6s
Sorted altogether: 4m19s
Sequential scan:   6m7s

As you can see, there's a lot to be gained here.  If Pg was able to
issue its seeks in the optimal order, the query would return in 1/4th
the time.  Even the chunk-sorted scheme is a big win.

So the proposal is this: the offsets stored in the index ought to be
sorted.  Either A) they can be stored sorted in the first place (sorted
in VACUUM ANALYZE, or always sorted), or B) the backend can sort each
list of tuples it reads from the index, or C) the backend can read the
entire index result and sort that (this would be the best).

From just paging through the code, it doesn't seem terribly hard.  B
seems the easiest to implement but is also the least improvement.  Even
seqscan is better than B.  One improvement to B would be to read much
more than 8K from the index.  Reading e.g. 256KB would improve things
dramatically.  A might be easy but would either degrade over time or
cause slower inserts.  C is the best and hardest to implement (I am
guessing, because the size of sorted index subset is unbounded).

Your thoughts are appreciated.

-jwb


---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org

Re: [HACKERS] Call for 7.5 feature completion

2004-05-17 Thread Bruce Momjian

Alvaro Herrera Munoz wrote:
 Personally I've been focused on getting subtransactions done and now I think
 I'm very close to an acceptable patch, but what has slowed me down the last
 time has been lack of feedback from core developers.  It was feedback I
 needed to figure out the best ways to do things (I made several big mistakes
 that I'm only now correcting thanks to invaluable comments from Tom Lane),
 and without it the last steps were getting very difficult to me.
 Fortunately now I've got it.
 
 I have some confidence in that I will be able to deliver it maybe the last
 week of May.  I can only hope, however, that it will not be rejected because
 it's presented too close to feature freeze.  That would be a shame, because
 I offered incremental patches a lot of time ago and they weren't even looked
 at  (Hey, I'm not blaming anyone).

One huge problem is that many features are asking for our attention this
close to freeze date.  I spend some time on relative-path installs, and
now there are lots of patches that need to be reviewed and placed into
the queue, and we haven't been giving you enough feedback.

And I am thinking of helping with PITR because there isn't much work to
do except plugging into the backend and GUC.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster

Re: [HACKERS] proposal: be smarter about i/o patterns in index scan

2004-05-17 Thread Sailesh Krishnamurthy


Yes, fetching a RID list from an index scan, sorting 'em and then
fetching from the table would be a very appreciable speedup for many
queries. I would imagine that all the commercial engines do this (db2
calls it a sorted RID-list-fetch) .. and this has in fact been
discussed on -hackers before.

One issue for this is that there could be a slip between the cup and
the lip .. ie., between the fetch from the index, the sort, and the
fetch from the table a vaccuum could have taken place rendering TIDs
invalid. This should however not be a show-stopper .. surely all
that's needed is a lock against vaccuuming in the presence of a
tid-list-fetch. 

-- 
Pip-pip
Sailesh
http://www.cs.berkeley.edu/~sailesh



---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html

[HACKERS] FYI: away for the summer

2004-05-17 Thread Neil Conway

I'll be away from PostgreSQL development from approximately the end of 
May until August 20th. I won't be subscribed to any PG-related mailing 
lists for that period. However, I'll still be accessible via email to 
this address.

Have a great summer, everyone.
-Neil
---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
 joining column's datatypes do not match

1 2 >

1 - 100 of 105 matches

Mail list logo