Re: [HACKERS] [PATCHES] O_DIRECT for WAL writes

2005-06-22 Thread Curt Sampson

On Thu, 22 Jun 2005, Greg Stark wrote:


Tom Lane [EMAIL PROTECTED] writes:


Unfortunately, I cannot believe these numbers --- the near equality of
fsync off and fsync on means there is something very wrong with the
measurements.  What I suspect is that your ATA drives are doing write
caching and thus the fsyncs are not really waiting for I/O at all.


I wonder whether it would make sense to have an automatic test for this
problem. I suspect there are lots of installations out there whose admins
don't realize that their hardware is doing this to them.


But is it really a problem? I somewhere got the impression that some
drives, on power failure, will be able to keep going for long enough to
write out the cache and park the heads anyway. If so, the drive is still
guaranteeing the write.

But regardless, perhaps we can add some stuff to the various OSes'
startup scripts that could help with this. For example, in NetBSD you
can run "dkctl <device> setcache r" for most any disk device (certainly
all SCSI and ATA) to enable the read cache and disable the write cache.

cjs
--
Curt Sampson  [EMAIL PROTECTED]   +81 90 7737 2974   http://www.NetBSD.org
 Make up enjoying your city life...produced by BIC CAMERA

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [HACKERS] [PATCHES] O_DIRECT for WAL writes

2005-06-22 Thread Curt Sampson

On Wed, 22 Jun 2005, Tom Lane wrote:


[ shudder ]  I can see the complaints now: "Merely starting up Postgres
cut my overall system performance by a factor of 10!"


Yeah, quite the scenario.


This can *not* be default behavior, and unfortunately that limits its
value quite a lot.


Indeed. Maybe it's best just to document this stuff for the various
OSes, and let the admins deal with configuring their machines.

But you know, it might be a reasonable option switch, or something.

cjs
--
Curt Sampson  [EMAIL PROTECTED]   +81 90 7737 2974   http://www.NetBSD.org
 Make up enjoying your city life...produced by BIC CAMERA



Re: [HACKERS] [PATCHES] O_DIRECT for WAL writes

2005-06-22 Thread Curt Sampson

On Thu, 23 Jun 2005, Tom Lane wrote:


The bottom line here seems to be the same as always: you can't run an
industrial strength database on piece-of-junk consumer grade hardware.


Sure you can, though it may take several bits of piece-of-junk
consumer-grade hardware. It's far more about how you set up your system
and implement recovery policies than it is about hardware.

I ran an ISP back in the '90s on old PC junk, and we had far better
uptime than most of our competitors running on expensive Sun gear. One
ISP was completely out for half a day because the tech. guy bent and
broke a hot-swappable circuit board while installing it, bringing down
the entire machine. (Pretty dumb of them to be running everything on a
single, irreplaceable high-availability system.)


...they blame us when they don't get the same results as the guy
running Oracle on...


Now that phrase irritates me a bit. I've been using all this stuff for
a long time (Postgres on and off since the QUEL days, before SQL was
dropped in to replace it) and at this point, for the (perhaps slim)
majority of
applications, I would say that PostgreSQL is a better database than
Oracle. It requires much, much less effort to get a system and its test
framework up and running under PostgreSQL than it does under Oracle,
PostgreSQL has far fewer stupid limitations, and in other areas, such
as performance, it competes reasonably well in a lot of cases. It's a
pretty impressive piece of work, thanks in large part to efforts put in
over the last few years.

cjs
--
Curt Sampson  [EMAIL PROTECTED]   +81 90 7737 2974   http://www.NetBSD.org
 Make up enjoying your city life...produced by BIC CAMERA



[HACKERS] Multiple-statement Rules Incompatible With Constraints

2005-05-27 Thread Curt Sampson


I've got a table I've split into two, a portion that can be modified
under normal circumstances, and a portion that can't be. (In a testing
mode they both can be, but in production the user IDs doing the work
don't have update or delete access on the immutable portion. BTW, I'm
open to better ways of doing this.)

So it's along the lines of

CREATE TABLE offer_immutable (offer_id serial PRIMARY KEY );
CREATE TABLE offer_mutable (
offer_id int PRIMARY KEY REFERENCES offer_immutable);
ALTER TABLE offer_immutable ADD CONSTRAINT offer_immutable_offer_id_fkey
FOREIGN KEY (offer_id) REFERENCES offer_mutable
DEFERRABLE INITIALLY DEFERRED;
CREATE VIEW offer AS
SELECT * FROM offer_immutable NATURAL JOIN offer_mutable;

With this rule in place, deleting through the view in a transaction
fails when I try to commit:

CREATE OR REPLACE RULE offer_delete AS
ON DELETE TO offer DO INSTEAD (
DELETE FROM offer_mutable WHERE offer_id = OLD.offer_id;
DELETE FROM offer_immutable WHERE offer_id = OLD.offer_id;
);

It gives me:

ERROR:  insert or update on table "offer_immutable" violates foreign
key constraint "offer_immutable_offer_id_fkey"
DETAIL:  Key (offer_id)=(77) is not present in table "offer_mutable".

On the other hand, if I use this instead:

CREATE OR REPLACE FUNCTION offer_delete(int) RETURNS VOID AS $$
DELETE FROM offer_mutable WHERE offer_id = $1;
DELETE FROM offer_immutable WHERE offer_id = $1;
$$ LANGUAGE 'SQL' VOLATILE SECURITY INVOKER;

CREATE OR REPLACE RULE offer_delete AS
ON DELETE TO offer DO INSTEAD SELECT offer_delete(OLD.offer_id);

It works fine.

I can also just do the two separate deletes in a transaction, and it
works ok.

BEGIN;
DELETE FROM offer_mutable WHERE offer_id = 123;
DELETE FROM offer_immutable WHERE offer_id = 123;
COMMIT;

Bug?

cjs
--
Curt Sampson  [EMAIL PROTECTED]   +81 90 7737 2974   http://www.NetBSD.org
 Make up enjoying your city life...produced by BIC CAMERA



Re: [HACKERS] Multiple-statement Rules Incompatible With Constraints

2005-05-27 Thread Curt Sampson

On Fri, 27 May 2005, Tom Lane wrote:


Same old same old: as soon as you've deleted from offer_mutable, there
is no row in the view with the given offer_id; and since OLD is a macro
for the view, the second delete finds nothing to do.


Oh, now I'm starting to see how this behavior does make sense...unwanted
as it is in my case. Maybe I find it a bit weird that the sense of
OLD.something can change within a rule, though.


You might be able to fix this by making the view an outer join, and
deleting from the nullable side first.


Yuck. Though I'm already deep in yuck anyway.
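If I follow, the outer-join version would be something along these lines
(an untested sketch):

    CREATE VIEW offer AS
        SELECT *
        FROM offer_immutable LEFT JOIN offer_mutable USING (offer_id);

    CREATE OR REPLACE RULE offer_delete AS
        ON DELETE TO offer DO INSTEAD (
            -- Delete the nullable side first: the outer join keeps the
            -- view row visible, so OLD.offer_id still finds it when the
            -- second statement runs.
            DELETE FROM offer_mutable WHERE offer_id = OLD.offer_id;
            DELETE FROM offer_immutable WHERE offer_id = OLD.offer_id;
        );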


The whole data structure seems a tad weird though ...


More than a tad. I'm not happy with this thing I've come up with. I'm
really starting to think that I took the wrong path. What's a better way
to express my intention?

Perhaps the thing to do is go back to one table, and have a trigger that
throws an exception if you try to update the non-updatable columns.
(And only the test account can do deletes--if I need to do an update
during testing, delete and insert is the answer.) What do you think?
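Something along these lines, perhaps (untested, with immutable_col
standing in for whichever columns are supposed to be immutable):

    CREATE OR REPLACE FUNCTION offer_protect() RETURNS trigger AS $$
    BEGIN
        IF NEW.immutable_col IS DISTINCT FROM OLD.immutable_col THEN
            RAISE EXCEPTION 'offer.immutable_col may not be updated';
        END IF;
        RETURN NEW;
    END;
    $$ LANGUAGE plpgsql;

    CREATE TRIGGER offer_protect BEFORE UPDATE ON offer
        FOR EACH ROW EXECUTE PROCEDURE offer_protect();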

But this is also another instance of an area where functions work
differently from the same statements outside functions (and in most of
these cases I tend to think that the way things work inside functions
is right).

cjs
--
Curt Sampson  [EMAIL PROTECTED]   +81 90 7737 2974   http://www.NetBSD.org
 Make up enjoying your city life...produced by BIC CAMERA



[HACKERS] Time Zone Names Problem

2005-02-20 Thread Curt Sampson
I'm running PostgreSQL 8.0.1 from the NetBSD pkgsrc package. I'm getting some
odd behavior with time zone names:
SET timezone TO 'JST';
psql:z.sql:1: ERROR:  unrecognized time zone name: "JST"
SET timezone TO 'EST';
psql:z.sql:2: ERROR:  unrecognized time zone name: "EST"
SET timezone TO 'Asia/Tokyo';
psql:z.sql:3: ERROR:  unrecognized time zone name: "Asia/Tokyo"
SET timezone TO 'US/Eastern';
psql:z.sql:4: ERROR:  unrecognized time zone name: "US/Eastern"
SET timezone TO 'Asia/Tokyo-9';
SET
SELECT '2004-08-22 18:42:12' AT TIME ZONE 'Asia/Tokyo';
psql:z.sql:7: ERROR:  time zone "asia/tokyo" not recognized
SELECT '2004-08-22 18:42:12' AT TIME ZONE 'Asia/Tokyo-9';
psql:z.sql:8: ERROR:  time zone "asia/tokyo-9" not recognized
SELECT '2004-08-22 18:42:12' AT TIME ZONE 'JST';
      timezone
---------------------
 2004-08-22 18:42:12
(1 row)
Anybody have any idea what's going on here? The only patch pkgsrc makes
is related to shared library version numbers for a couple of operating
systems, so I don't think that this is package-specific. It may be
NetBSD-specific, however.
cjs
--
Curt Sampson  [EMAIL PROTECTED]   +81 90 7737 2974   http://www.NetBSD.org
 Make up enjoying your city life...produced by BIC CAMERA


Re: [HACKERS] Time Zone Names Problem

2005-02-20 Thread Curt Sampson
On Sun, 20 Feb 2005, Tom Lane wrote:
Hmm, all of those should work (and do work here).  Are the corresponding
files present in the appropriate installation directory?  Look under
.../share/postgresql/timezone/
They are not; it's evidently a problem with the way the NetBSD package
installs things. I'll look into that. Thanks for the tip.
cjs
--
Curt Sampson  [EMAIL PROTECTED]   +81 90 7737 2974   http://www.NetBSD.org
 Make up enjoying your city life...produced by BIC CAMERA


Re: [PATCHES] [HACKERS] ARC Memory Usage analysis

2004-10-26 Thread Curt Sampson
On Tue, 26 Oct 2004, Greg Stark wrote:

 I see mmap or O_DIRECT being the only viable long-term stable states. My
 natural inclination was the former but after the latest thread on the subject
 I suspect it'll be forever out of reach. That makes O_DIRECT and a
 Postgres-managed cache the only real choice. Having both caches is just
 a waste of memory and a waste of cpu cycles.

I don't see why mmap is any more out of reach than O_DIRECT; it's not
all that much harder to implement, and mmap (and madvise!) is more
widely available.

But if using two caches is only costing us 1% in performance, there's
not really much point ...

cjs
-- 
Curt Sampson  [EMAIL PROTECTED]   +81 90 7737 2974   http://www.NetBSD.org
 Make up enjoying your city life...produced by BIC CAMERA



Re: [PATCHES] [HACKERS] ARC Memory Usage analysis

2004-10-26 Thread Curt Sampson
On Wed, 26 Oct 2004, Greg Stark wrote:

  I don't see why mmap is any more out of reach than O_DIRECT; it's not
  all that much harder to implement, and mmap (and madvise!) is more
  widely available.

 Because there's no way to prevent a write-out from occurring and no way to be
 notified by mmap before a write-out occurs, and Postgres wants to do its WAL
 logging at that time if it hasn't already happened.

I already described a solution to that problem in a post earlier in this
thread (a write queue on the block). I may even have described it on
this list a couple of years ago, that being about the time I thought
it up. (The mmap idea just won't die, but at least I wasn't the one to
bring it up this time. :-))

 Well firstly it depends on the work profile. It can probably get much higher
 than we saw in that profile

True, but 1% is much, much lower than I'd expected. That tells me
that my intuitive idea of the performance model is wrong, which means,
for me at least, it's time to shut up or put up some benchmarks.

 Secondly it also reduces the amount of cache available. If you have 256M of
 ram with about 200M free, and 40Mb of ram set aside for Postgres's buffer
 cache then you really only get 160Mb. It's costing you 20% of your cache, and
 reducing the cache hit rate accordingly.

Yeah, no question about that.

 Thirdly the kernel doesn't know as much as Postgres about the load. Postgres
 could optimize its use of cache based on whether it knows the data is being
 loaded by a vacuum or sequential scan rather than an index lookup. In practice
 Postgres has gone with ARC which I suppose a kernel could implement anyways,
 but afaik neither linux nor BSD choose to do anything like it.

madvise().

cjs
-- 
Curt Sampson  [EMAIL PROTECTED]   +81 90 7737 2974   http://www.NetBSD.org
 Make up enjoying your city life...produced by BIC CAMERA



Re: [HACKERS] Check Constraints and pg_dump

2004-03-02 Thread Curt Sampson
On Tue, 2 Mar 2004, Robert Treat wrote:

 Wouldn't a FK on both tables be the appropriate schema?  With the FK on
 contract being deferred?

Unfortunately, it appears that an FK must reference a unique column. So this:

ALTER TABLE contract ADD CONSTRAINT contract_must_have_a_plan
FOREIGN KEY ( contract_id ) REFERENCES plan ( contract_id )
INITIALLY DEFERRED;

produces the error message:

UNIQUE constraint matching given keys for referenced table "plan" not found

since a plan may have more than one contract.

cjs
-- 
Curt Sampson  [EMAIL PROTECTED]   +81 90 7737 2974   http://www.NetBSD.org
Don't you know, in this new Dark Age, we're all light.  --XTC



Re: [HACKERS] Check Constraints and pg_dump

2004-03-01 Thread Curt Sampson
On Thu, 26 Feb 2004, Tom Lane wrote:

 Jonathan Scott [EMAIL PROTECTED] writes:
  The functions and tables create just fine, but when it gets to the
  COPY part of the sql script, it tries to load tables in what really is
  the wrong order. The check constraint is making sure there is a plan
  before there is a contract, yet pg_dump is trying to load the
  contract table before there is anything in the plan table.

 Shouldn't you be using a foreign key for that?

Can you explain how to do this? There is no reference to a plan in the
contract table; the constraint just checks to see that, if a contract
exists, there is at least one plan referencing that contract.

There is of course a foreign key constraint used in the plan table to
make sure that the contract exists.

 I don't see any reasonable way that pg_dump can be expected to
 understand that a check constraint expresses a relationship between two
 tables.  The semantics of check constraints aren't right for it anyway.

What other constraint could one use for a situation like this?

At any rate, I am not sure why pg_dump has to know or care what check
constraints do; if it simply treated them as it does all the other
constraints, and applied them after all the data are loaded, wouldn't
the problem just go away?

cjs
-- 
Curt Sampson  [EMAIL PROTECTED]   +81 90 7737 2974   http://www.NetBSD.org
Don't you know, in this new Dark Age, we're all light.  --XTC



Re: [HACKERS] Check Constraints and pg_dump

2004-03-01 Thread Curt Sampson
On Mon, 1 Mar 2004, Tom Lane wrote:

 Curt Sampson [EMAIL PROTECTED] writes:
  Can you explain how to do this? There is no reference to a plan in the
  contract table; the constraint just checks to see that, if a contract
  exists, there is at least one plan referencing that contract.
  There is of course a foreign key constraint used in the plan table to
  make sure that the contract exists.

 If so, how would it be possible to create a plan before creating the
 contract?  I don't think the OP's requirements are clearly thought out.

You create the contract and its first plan at the same time, inserting
the plan first. It works fine.

  At any rate, I am not sure why pg_dump has to know or care what check
  constraints do; if it simply treated them as it does all the other
  constraints, and applied them after all the data are loaded, wouldn't
  the problem just go away?

 If we did that we'd be slowing bulk loads (since each added check
 constraint would incur an additional scan over the table)

Certainly, but do we not already pay that price for all non-check
constraints? And it slows loads a lot more when you have to edit your
dumps because they are otherwise unloadable. At any rate, this being a
database, I'm inclined more towards correctness than speed.

 and decreasing legibility of the dumped schema (surely you will agree
 that it's more readable to keep the constraint in the CREATE TABLE
 command).

I agree that it's more readable, yes. But again, do we not already pay
that price for all non-check constraints?

 There is code in CVS tip pg_dump to split out a check constraint from
 the parent table when this is the only way to break a circular
 dependency.  But I'm disinclined to apply that transformation all the
 time, especially when the only reason to do so is to support a misuse
 of check constraints.

I'm having great difficulty understanding why this is a misuse. What
is the proper way to check that a contract cannot exist without at least
one plan?

 Check constraints are not intended to handle
 cross-table checks, and I'm unwilling to buy into any suggestion that
 we should consider that a supported use.

So how do we handle it?

 We have talked in the past about supporting SQL's CREATE ASSERTION
 command, which *is* intended to describe cross-table conditions.
 I don't recall that anyone had good ideas about a reasonably efficient
 implementation though.

I would happily settle for an inefficient implementation; that would give
me the choice of correctness versus efficiency, rather than having no
choice at all.

 In the meantime, if what's wanted is a one-time check at row insertion,
 the right way to express that behavior is with an ON INSERT trigger.

That's not an adequate check; it would allow you later to delete the
plan without deleting the contract.
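For what it's worth, an inefficient version of the assertion can
apparently be faked with the (not-really-documented) CREATE CONSTRAINT
TRIGGER. An untested sketch, which rescans contract on every relevant
change and so is every bit as inefficient as promised:

    CREATE OR REPLACE FUNCTION check_contract_has_plan() RETURNS trigger AS $$
    BEGIN
        -- brute force: look for any contract with no plan at all
        IF EXISTS (SELECT 1 FROM contract c
                   WHERE NOT EXISTS (SELECT 1 FROM plan p
                                     WHERE p.contract_id = c.contract_id))
        THEN
            RAISE EXCEPTION 'contract exists without any plan';
        END IF;
        RETURN NULL;
    END;
    $$ LANGUAGE plpgsql;

    CREATE CONSTRAINT TRIGGER contract_needs_plan
        AFTER INSERT OR UPDATE ON contract
        DEFERRABLE INITIALLY DEFERRED
        FOR EACH ROW EXECUTE PROCEDURE check_contract_has_plan();

    CREATE CONSTRAINT TRIGGER plan_still_covers_contract
        AFTER DELETE OR UPDATE ON plan
        DEFERRABLE INITIALLY DEFERRED
        FOR EACH ROW EXECUTE PROCEDURE check_contract_has_plan();

Deferring the triggers to commit would let the contract and its first
plan be inserted in either order within a transaction.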

cjs
-- 
Curt Sampson  [EMAIL PROTECTED]   +81 90 7737 2974   http://www.NetBSD.org
Don't you know, in this new Dark Age, we're all light.  --XTC



[HACKERS] pg_dump and CHECK constraints

2004-01-26 Thread Curt Sampson

I notice that pg_dump is still dumping CHECK constraints with the table,
rather than at the very end, as it does with all the other constraints.
As discussed in bug report #787, at

http://archives.postgresql.org/pgsql-bugs/2002-09/msg00278.php

this breaks your restore if your CHECK constraint uses a user-defined
function.

1. Does anybody have any plans to fix this in the very near future?

2. If not, is there something that makes it particularly hard to
fix? I notice, Tom, that in your reply to that bug report you
intimated that this wasn't an easy fix, but I don't see why CHECK
constraints couldn't be added at the end of the dump, just as all
the other constraints are. Presumably, your message being late 2002,
this was before pg_dump was modified to re-order stuff?

3. If we created a patch for this at my work, would it be accepted?

I dunno...this looks really easy to me

cjs
-- 
Curt Sampson  [EMAIL PROTECTED]   +81 90 7737 2974   http://www.NetBSD.org
Don't you know, in this new Dark Age, we're all light.  --XTC



Re: [HACKERS] Broken Constraint Checking in Functions

2003-10-23 Thread Curt Sampson
On Fri, 24 Oct 2003, Bruce Momjian wrote:

 I am not sure we should add something to the SET CONSTRAINT page on
 this.  Our current behavior is clearly a bug, and for that reason
 belongs more on the TODO list, where it already is:

Had it been on the SET CONSTRAINT page, it would have saved me several
hours of debugging. I found the entry only after tracking down the
problem and creating a simple test case to demonstrate it.

We document other bugs on this page, e.g.:

Currently, only foreign key constraints are affected by this
setting. Check and unique constraints are always effectively
initially immediate not deferrable.

So why not document this one as well?

   * Have AFTER triggers execute after the appropriate SQL statement in a
 function, not at the end of the function

 The big question is whether this entry is clear enough for people to
 understand it could bite them.

My big question is, should we expect that anybody reading the
documentation also has to go through the TODO list to see if there are
bugs on the list not mentioned in the manual?

cjs
-- 
Curt Sampson  [EMAIL PROTECTED]   +81 90 7737 2974   http://www.NetBSD.org
Don't you know, in this new Dark Age, we're all light.  --XTC



[HACKERS] 7.4beta4 compile failure on NetBSD

2003-10-10 Thread Curt Sampson
I just happened to try to compile 7.4beta4 on a two-month-old NetBSD-current
system, and I get the following:

In file included from timestamp.c:14:
../../../../src/interfaces/ecpg/include/datetime.h:7: conflicting types for `dtime_t'
/usr/include/sys/types.h:186: previous declaration of `dtime_t'

Basically, the "typedef timestamp dtime_t;" in ecpg's datetime.h
(timestamp being a double) conflicts with

typedef int32_t dtime_t;/* on-disk time_t */

in sys/types.h. These appear to me to be completely unrelated typedefs
that just happen to have the same name.

This is when compiling src/interfaces/ecpg/pgtypeslib/timestamp.c, BTW.

Any thoughts? I'm prepared to pull down a CVS checkout and recompile
to test fixes, if that will help.

cjs
-- 
Curt Sampson  [EMAIL PROTECTED]   +81 90 7737 2974   http://www.NetBSD.org
Don't you know, in this new Dark Age, we're all light.  --XTC



[HACKERS] Broken Constraint Checking in Functions

2003-10-10 Thread Curt Sampson

So it seems I got bitten today by this to-do list item:

Have AFTER triggers execute after the appropriate SQL statement in a
function, not at the end of the function

Under normal circumstances, delaying this stuff until the end of the
function doesn't bother me; in fact I've even used it to get around the
fact that SET CONSTRAINTS won't let you delay non-referential constraint
checks.

However, it seems that cascading deletes are also delayed, which leads to
a pretty serious problem. The following code:

INSERT INTO master (master_id) VALUES (400);
INSERT INTO dependent (master_id) VALUES (400);
DELETE FROM master WHERE master_id = 400;

works just fine outside a function, but inside a function it fails with

ERROR: $1 referential integrity violation - key referenced from
"dependent" not found in "master"

It seems that the integrity check for the dependent is happening before the
cascaded delete, but the check is operating not on the data at the time of
the statement, but the data as it stands after the statement following the
one that triggered the check. Ouch!

Having spent the better part of a day tracking down this problem
(because of course, as always, it only decides to appear in one's own
code after it's gotten quite complex), I think for a start it would
be a really, really good idea to put something about this in the
documentation for the 7.4 release. Probably the SET CONSTRAINTS page
would be a good place to have it, or at least a pointer to it.

In the long run, of course, I'd like to see a fix, but preferably after
we fix the system to allow delay of non-referential constraints as well,
since I am now using this bug in production code to delay constraint
checking for non-referential constraints. (You might even document that
workaround in the SET CONSTRAINTS manual page, with an appropriate
warning, if one seems necessary.)

I've attached a short shell script that will demonstrate the problem.

cjs
-- 
Curt Sampson  [EMAIL PROTECTED]   +81 90 7737 2974   http://www.NetBSD.org
Don't you know, in this new Dark Age, we're all light.  --XTC

#!/bin/sh
schema=pfcb_test
cat <<EOF | psql -q

SET client_min_messages TO WARNING;
CREATE SCHEMA $schema;
SET search_path TO $schema;
\qecho

CREATE TABLE junk (junk int);   -- just to suppress some select results
CREATE TABLE master (master_id int PRIMARY KEY);
CREATE TABLE dependent (dependent_id serial PRIMARY KEY,
master_id  int REFERENCES master ON DELETE CASCADE);

\qecho Outside a function, inserting into the dependent before the master
\qecho correctly produces a referential integrity error:
BEGIN;
SET CONSTRAINTS ALL IMMEDIATE;
INSERT INTO dependent (master_id) VALUES (200);
--INSERT INTO master (master_id) VALUES (200);
COMMIT;
\qecho

CREATE OR REPLACE FUNCTION reverse_order_insert() RETURNS int AS '
SET CONSTRAINTS ALL IMMEDIATE;
INSERT INTO dependent (master_id) VALUES (300);
INSERT INTO master (master_id) VALUES (300);
SELECT 0;
' LANGUAGE 'SQL';

\qecho The reverse order insert, when in a function, somehow passes
\qecho the integrity constraint, even in immediate mode:
INSERT INTO junk SELECT reverse_order_insert();
SELECT count(*) = 1 AS reverse_order_insert_in_function_succeeded
FROM master NATURAL JOIN dependent
WHERE master_id = 300;


\qecho Outside a function, a cascaded delete works:
INSERT INTO master (master_id) VALUES (100);
INSERT INTO dependent (master_id) VALUES (100);
DELETE FROM master WHERE master_id = 100;
SELECT count(*) = 0 AS inserted_row_not_present
FROM master NATURAL JOIN dependent
WHERE master_id = 100;

CREATE OR REPLACE FUNCTION delete_using_cascade() RETURNS int AS '
INSERT INTO master (master_id) VALUES (400);
INSERT INTO dependent (master_id) VALUES (400);
DELETE FROM master WHERE master_id = 400;
SELECT 0;
' LANGUAGE 'SQL';

\qecho A properly ordered insert and cascaded delete, inside a function,
\qecho does not work:
INSERT INTO junk SELECT delete_using_cascade();
\qecho

DROP SCHEMA $schema CASCADE;
EOF



Re: [HACKERS] Bumping block size to 16K on FreeBSD...

2003-08-29 Thread Curt Sampson
On Thu, 28 Aug 2003, David Schultz wrote:

 (2) Make BLCKSZ a runtime constant, stored as part of the database.

Now this I really like. It would make benchmarking 8K vs. 16K blocksizes
much easier, as well as of course avoiding the "initdb required after
rebuilding" problem.

BTW, pretty much every BSD system is going to be using 16K block sizes
on large partitions; the cylinder group size and filesystem overhead is
way, way too small when using 8K blocks.

cjs
-- 
Curt Sampson  [EMAIL PROTECTED]   +81 90 7737 2974   http://www.NetBSD.org
Don't you know, in this new Dark Age, we're all light.  --XTC



Re: [HACKERS] NOTICE vs WARNING

2003-08-28 Thread Curt Sampson
On Wed, 27 Aug 2003, Bruno Wolff III wrote:

 Implicit froms aren't deprecated yet.

It would be really nice, to my mind, if they were killed stone dead.
I've been bitten several times by having an implicit FROM added to a
query that destroyed it.

cjs
-- 
Curt Sampson  [EMAIL PROTECTED]   +81 90 7737 2974   http://www.NetBSD.org
Don't you know, in this new Dark Age, we're all light.  --XTC



Re: [HACKERS] Decent VACUUM (was: Buglist)

2003-08-26 Thread Curt Sampson
On Thu, 21 Aug 2003, Tom Lane wrote:

 We have had some people looking at improved buffer management
 algorithms; LRU-2 or something smarter would help.  I dunno whether
 we can dissuade the kernel from flushing its cache though.

Using open/read/write system calls, you can't. You can, however, use an
OS like Solaris that should detect the sequential read and avoid blowing
out the buffer cache.

Most operating systems support the madvise system call, which does let
you do things like say, "I'm accessing this sequentially" and "I don't
need this to be buffered any more", though how much attention most
operating systems pay to this advice varies with the OS. However, it
turns out to be non-trivial to get postgres to use mmap for data blocks,
since you can't actually write any data to an mmap'd block until you've
confirmed that the log entry has been written, because once you've
touched data in an mmap'd block you have no way of stopping it from
being written to the disk right away.

cjs
-- 
Curt Sampson  [EMAIL PROTECTED]   +81 90 7737 2974   http://www.NetBSD.org
Don't you know, in this new Dark Age, we're all light.  --XTC



Re: [HACKERS] O_DIRECT in freebsd

2003-06-17 Thread Curt Sampson
On Tue, 17 Jun 2003, Christopher Kings-Lynne wrote:

 A new DIRECTIO kernel option enables support for read operations that
 bypass the buffer cache and put data directly into a userland buffer

 Will PostgreSQL pick this up automatically, or do we need to add extra
 checks?

You don't want it to. It's more efficient just to use mmap, because then
all the paging and caching issues are taken care of for you.

cjs
-- 
Curt Sampson  [EMAIL PROTECTED]   +81 90 7737 2974   http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light.  --XTC



Re: [HACKERS] WAL replay logic (was Re: [PERFORM] Mount options f

2003-02-18 Thread Curt Sampson
On Tue, 18 Feb 2003, Mikheev, Vadim wrote:

   So if you do this, do you still need to store that information in
   pg_control at all?

 Yes: to speed up the recovery process.

If it's going to slow down the performance of my database when not doing
recovery (because I have to write two files for every transaction,
rather than one), I couldn't care less about speeding up the recovery
process.

As far as Bruce's question goes, what kind of corruption can happen to
the log files? We write a full block at a time, I guess, so it might
make sense to checksum it to verify that the block was not partially
written.

cjs
-- 
Curt Sampson  [EMAIL PROTECTED]   +81 90 7737 2974   http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light.  --XTC



Re: [HACKERS] Questions about indexes?

2003-02-17 Thread Curt Sampson
On Mon, 16 Feb 2003, Ryan Bradetich wrote:

 I am not sure why all the data is duplicated in the index ...

Well, you have to have the full key in the index, or how would you know,
when you look at a particular index item, if it actually matches what
you're searching for?

MS SQL server does have an interesting option that would help you a lot
in this case: clustered indexes. A table may have a single clustered
index, and each leaf node of the index stores not just the key but
actually the entire row. Thus, in a case like yours, you'd store the row
only once, not twice.

Without thinking too hard about it (my usual mode of operation on this
list :-)) this could probably be implemented in postgresql. But I don't
think it would be entirely trivial, and your case is unusual enough
that I very much doubt whether it would be worth implementing to fix
that alone. It would also offer the advantage that any lookup using the
clustered index would save fetching the heap page after that as well,
but it's hard to say if the savings would be worth the work.
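A toy illustration of the difference (all names hypothetical; this is not
how either MS SQL Server or postgres actually lays out pages): a secondary
index maps key to row location and so needs a second fetch, while a
clustered index keeps the whole row in the leaf.

```python
# Toy illustration; names and structures are hypothetical.
heap = {}       # tuple id -> full row (the table's heap)
secondary = {}  # key -> tuple id: the key is stored here AND in the heap row
clustered = {}  # key -> full row: the leaf node holds the row itself

def insert(key, row, tid):
    heap[tid] = row
    secondary[key] = tid
    clustered[key] = row

insert("host1|2003-02-16", {"host": "host1", "metric": 42}, (0, 1))

# A secondary-index lookup needs a second step to fetch the heap row:
via_secondary = heap[secondary["host1|2003-02-16"]]
# A clustered-index lookup finds the row right in the leaf:
via_clustered = clustered["host1|2003-02-16"]
```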

 Since my only requirement is that the rows be unique, I have developed a
 custom MD5 function in C, and created an index on the MD5 hash of the
 concatenation of all the fields.

Well, that won't guarantee uniqueness, since it's perfectly possible
to have two different rows hash to the same value. (If that weren't
possible, your hash would have to contain as much information as the row
itself, and your space savings wouldn't be nearly so dramatic.)
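A sketch of why the digest alone cannot prove uniqueness, and one way to
stay safe (helper names are hypothetical). Note also that naive
concatenation without a separator makes ("ab", "c") and ("a", "bc") hash
identically, independent of any MD5 collision.

```python
import hashlib

def row_digest(fields) -> str:
    # Join with a separator so ("ab", "c") and ("a", "bc") don't produce
    # the same concatenation before hashing.
    return hashlib.md5("\x00".join(map(str, fields)).encode()).hexdigest()

seen = {}  # digest -> full row, kept so a digest match can be double-checked

def insert_unique(row: tuple) -> bool:
    d = row_digest(row)
    if d in seen:
        if seen[d] != row:
            # The pigeonhole case: same 128-bit digest, different rows.
            raise ValueError("collision: a digest match is not a row match")
        return False  # genuine duplicate
    seen[d] = row
    return True
```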

cjs
-- 
Curt Sampson  [EMAIL PROTECTED]   +81 90 7737 2974   http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light.  --XTC

---(end of broadcast)---
TIP 6: Have you searched our list archives?

http://archives.postgresql.org



Re: [HACKERS] Detecting corrupted pages earlier

2003-02-17 Thread Curt Sampson
On Mon, 17 Feb 2003, Tom Lane wrote:

 Postgres has a bad habit of becoming very confused if the page header of
 a page on disk has become corrupted.

What typically causes this corruption?

If it's any kind of a serious problem, maybe it would be worth keeping
a CRC of the header at the end of the page somewhere.

cjs
-- 
Curt Sampson  [EMAIL PROTECTED]   +81 90 7737 2974   http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light.  --XTC

---(end of broadcast)---
TIP 6: Have you searched our list archives?

http://archives.postgresql.org



Re: [HACKERS] Detecting corrupted pages earlier

2003-02-17 Thread Curt Sampson
On Mon, 17 Feb 2003, Tom Lane wrote:

 Curt Sampson [EMAIL PROTECTED] writes:

  If it's any kind of a serious problem, maybe it would be worth keeping
  a CRC of the header at the end of the page somewhere.

 See past discussions about keeping CRCs of page contents.  Ultimately
 I think it's a significant expenditure of CPU for very marginal returns
 --- the layers underneath us are supposed to keep their own CRCs or
 other cross-checks, and a very substantial chunk of the problem seems
 to be bad RAM, against which occasional software CRC checks aren't
 especially useful.

Well, I wasn't proposing the whole page, just the header. That would be
significantly cheaper (in fact, there's no real need even for a CRC;
probably just xoring all of the words in the header into one word would
be fine) and would tell you if the page was torn during the write, which
was what I was imagining the problem might be.
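A sketch of that cheap check (assuming 32-bit words, little-endian). XOR is
weaker than a CRC — two flips in the same bit position of different words
cancel out — but it does catch a header torn mid-write.

```python
import struct

def xor_fold(header: bytes) -> int:
    """XOR all 32-bit words of the header into a single check word."""
    check = 0
    for (word,) in struct.iter_unpack("<I", header):
        check ^= word
    return check

def seal(header: bytes) -> bytes:
    """Append the check word to the header before writing."""
    return header + struct.pack("<I", xor_fold(header))

def intact(sealed: bytes) -> bool:
    """Recompute the fold and compare against the stored check word."""
    return xor_fold(sealed[:-4]) == struct.unpack("<I", sealed[-4:])[0]
```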

But bad memory, well, not much you can do about that beyond saying, buy
ECC, dude.

cjs
-- 
Curt Sampson  [EMAIL PROTECTED]   +81 90 7737 2974   http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light.  --XTC

---(end of broadcast)---
TIP 6: Have you searched our list archives?

http://archives.postgresql.org



Re: [HACKERS] Detecting corrupted pages earlier

2003-02-17 Thread Curt Sampson
On Tue, 18 Feb 2003, Tom Lane wrote:

 The header is only a dozen or two bytes long, so torn-page syndrome
 won't result in header corruption.

No. But the checksum would detect both header corruption and torn pages:
two for the price of one. But I don't think it's worth changing the page
layout for, either. Maybe, if anybody still cares the next time the page
layout is changed, it could go in with whatever else is being changed.

cjs
-- 
Curt Sampson  [EMAIL PROTECTED]   +81 90 7737 2974   http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light.  --XTC

---(end of broadcast)---
TIP 6: Have you searched our list archives?

http://archives.postgresql.org



Re: [HACKERS] Incremental backup

2003-02-15 Thread Curt Sampson
On Fri, 14 Feb 2003, Kevin Brown wrote:

 Oracle has something they call rollback segments which I assume are
 separate bits of data that have enough information to reverse changes
 that were made to the database during a transaction, and I figured
 PITR would (or could) apply particular saved rollback segments to the
 current state in order to roll back a table, tablespace, or database
 to the state it was in at a particular point in time.

You're right about what rollback segments are; they hold the information
about previous versions of a row, so that if a transaction is rolled
back, the previous data can be restored. (Postgres doesn't need this,
since it doesn't update the rows in place: the old copy of the row stays
as and where it is, and a new copy is made with the new data. [The old
copy can later be removed, after no transactions reference it any more,
with a VACUUM.])

Oracle9i has a new feature called a flashback query, which uses the
information in the rollback segements to let you query the database
in a previous state. (I.e., select such and such from this table as
of two hours ago.) Postgres could do this using the older copies of
rows as well, though the performance often wouldn't be pretty, since
your indexes become useless, I believe. (Don't they point to only the
latest copy of a row?) Still, it would be cool and life-saving in some
situations.

But yeah, PITR takes a snapshot and goes in a forward direction, not a
backwards one. This is just what Oracle does, too, using the redo logs.

 For PITR, I assume we'd need an archivelog function that would
 copy the WAL files as they're checkpointed to some other location
 (with destination names that reflect their order in time), just for
 starters.

Well, I'm not sure that you'd really need to have any special archiving
facilities: you just need to let the old files sit there, and keep
creating new ones. You do lose a bit of performance in that you can't
recycle log segments, but you could always just fork off a (well niced)
process to create a new, empty log segment at the time you start in on
the last pre-created one, so that you will have another pre-created one
ready when you finish the current one.

BTW, why exactly do we pre-create log segments, anyway? I see this comment
in backend/access/transam/xlog.c:

/*
 * Zero-fill the file.  We have to do this the hard way to ensure that
 * all the file space has really been allocated --- on platforms that
 * allow holes in files, just seeking to the end doesn't allocate
 * intermediate space.  This way, we know that we have all the space
 * and (after the fsync below) that all the indirect blocks are down
 * on disk.  Therefore, fdatasync(2) or O_DSYNC will be sufficient to
 * sync future writes to the log file.
 */

This seems to imply to me that fdatasync will, when synchronizing the
data blocks of a file, not necessarily synchronize the indirect
blocks, which seems a little...odd. It's not like there's much point
in writing the data to the disk if you can't get to it. I'd understood
fdatasync as just a way to avoid updating the inode's last-changed date.
Are there OSes that implement fdatasync in a way that you could still
lose data?
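The zero-fill approach the quoted comment describes, sketched in Python
(the segment size is an assumption; postgres does this in C in xlog.c):

```python
import os

SEGMENT_SIZE = 16 * 1024 * 1024  # assumed log segment size

def precreate_segment(path: str) -> None:
    """Zero-fill the whole file so every data block -- and every indirect
    block pointing at it -- is allocated on disk before the segment is
    used.  Merely seeking to the end would leave a hole on filesystems
    that support sparse files."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    try:
        zeros = b"\x00" * 8192
        for _ in range(SEGMENT_SIZE // len(zeros)):
            os.write(fd, zeros)
        os.fsync(fd)  # after this, syncing just the data should suffice
    finally:
        os.close(fd)
```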

 It'd be *awfully* nice if you could issue a command to roll a table
 (or, perhaps, a tablespace, if you've got a bunch of foreign keys and
 such) back to a particular point in time, from the command line, with
 no significant advance preparation (so long as the required files are
 still around, and if they're not then abort the operation with the
 appropriate error message).  But it doesn't sound like that's what
 we're talking about when we talk about PITR...

I don't think most people are thinking of that when they think of PITR;
I think they're thinking of applying changes from a log to a previous
version of a database.

And you can't do such a rollback at all, except on an entire database,
because of the potential integrity violations. The best you could do
would be to generate SQL for the changes you'd need to get back to the
previous point, and see if you can execute these changes.


cjs
-- 
Curt Sampson  [EMAIL PROTECTED]   +81 90 7737 2974   http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light.  --XTC

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])



Re: [HACKERS] Incremental backup

2003-02-15 Thread Curt Sampson
On Sat, 15 Feb 2003, Tom Lane wrote:

 Curt Sampson [EMAIL PROTECTED] writes:
  ... But there's really no need for all fifty of those,
  if you don't mind not being able to restore to any time before the
  current time.

 Which, of course, is exactly the point of PITR designs.

 When you know that your assistant trainee DBA deleted most of your
 database with a mistyped command last Tuesday evening around 8pm,
 it is cold comfort to know that your database has faithfully preserved
 his committed changes.  You want to get back to where you were Tuesday
 afternoon, or preferably Tuesday evening 7:59pm.  This is what PITR
 setups can do for you.

 If you don't feel you need PITR capability, fine ... but don't tell
 the people who want it that they have no need for it.

Hey, I never said you *have* to do this compression!

I envisioned it as an option. I'd like, for example, to be able to aim a
program at the last eight weeks' worth of log files and say: compress
the first seven weeks of this, but leave the last week fully intact.
Then I can save some space (quite a lot, if my updates have certain
characteristics), and yet still get back to Tuesday evening at 7:59 p.m.

cjs
-- 
Curt Sampson  [EMAIL PROTECTED]   +81 90 7737 2974   http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light.  --XTC

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/users-lounge/docs/faq.html



Re: [HACKERS] location of the configuration files

2003-02-14 Thread Curt Sampson
On Fri, 14 Feb 2003, scott.marlowe wrote:

 Asking for everything in a directory with the name local in it to be
 shared is kind of counter intuitive to me.

Not really. If you install a particular program that doesn't come with
the OS on one machine on your site, why would you not want to install it
separately on all of the others?

Typically, I want my favourite non-OS utilities on all machines, not
just one. (Even if I don't use them on all machines.) Thus /usr/local is
for site-local stuff.

cjs
-- 
Curt Sampson  [EMAIL PROTECTED]   +81 90 7737 2974   http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light.  --XTC

---(end of broadcast)---
TIP 6: Have you searched our list archives?

http://archives.postgresql.org



Re: [pgsql-advocacy] [HACKERS] Changing the default configuration

2003-02-14 Thread Curt Sampson
On Thu, 13 Feb 2003, Bruce Momjian wrote:

 To auto-tune, you would need to monitor swap usage and other stuff that
 may vary too much based on load from other systems.  Only the admin
 knows how to answer some of those questions.

No, to auto-tune many parameters that currently require manual
configuration, you merely need to change things so that you work with
the OS, rather than beside it.

There are a lot of things we could mmap that would remove the need for
tuning certain things altogether. The buffer cache is the biggest one;
mmaping that would let the OS take care of it all, and a few hundred KB
of shared memory would be all postgres would need. (Or none at all, if
you used mmap'd memory regions for that part of your IPC as well.)

You could probably also make sort_mem need a lot less tuning if you
sorted in mmap'd memory and let the OS deal with paging only when and
if it needed it (as well as asking the OS not to page memory you're
randomly accessing, since it really destroys your performance when you do
that).

I'm not sure if you could get rid of semaphores or not, but if you
could somehow do that, that would be another limited resource that you
wouldn't have to deal with.
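A rough sketch of the mmap'd-sort-memory idea using Python's mmap module
(madvise and MADV_RANDOM are not available on every platform, hence the
guard; this is an illustration, not how postgres allocates sort_mem):

```python
import mmap

# Sketch: use an anonymous mmap'd arena as sort memory, so the OS decides
# if and when any of it gets paged, instead of a fixed private allocation.
arena = mmap.mmap(-1, 1 << 20)  # 1 MB anonymous mapping

try:
    arena.madvise(mmap.MADV_RANDOM)  # hint: access will not be sequential
except (AttributeError, ValueError, OSError):
    pass  # madvise/MADV_RANDOM missing here; the hint is purely advisory

rows = [b"delta", b"alpha", b"charlie", b"bravo"]
run = b"\n".join(sorted(rows))  # build a sorted run...
arena[: len(run)] = run         # ...inside the mmap'd arena
```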

cjs
-- 
Curt Sampson  [EMAIL PROTECTED]   +81 90 7737 2974   http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light.  --XTC

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]



Re: [HACKERS] Incremental backup

2003-02-14 Thread Curt Sampson
On Fri, 14 Feb 2003, Bruce Momjian wrote:

 OK, once we have PITR, will anyone want incremental backups?

Well, I'm not entirely clear on how PITR will work, so I may be off-base
here, but it seems to me that offering incremental backups that back
up only changed pages might not be all that big a win, given how
postgres writes its pages. On DBMSs that don't use MVCC, if you change a
particular item in a row ten times, one page is changed. If you do it in
postgres, you could well be changing ten pages, as the system writes the
two copies of the entire row wherever it can find space. So in databases
where a lot of rows are changed, where an incremental backup would
normally be a win because it would be much smaller than the logs over a
given period, it isn't going to be with postgres.

But you know, if we could get rid of redundant changes in the logs we're
using for backup, that could save a lot of space in a situation like
the one I described above. If a particular row and column is changed
fifty times over the course of a month, it's going to be recorded fifty
times in the log. But there's really no need for all fifty of those,
if you don't mind not being able to restore to any time before the
current time. You can reduce the size of the logs you need to store
for backup by throwing away the first forty-nine of those changes, and
keeping only the most recent version. There shouldn't be any worries
about referential integrity, because when you do a restore, you start
with a full backup that is ok, and once you've successfully applied
all the transactions in the log, you know it will be ok again, so any
intermediate states during the restore where integrity is not maintained
are not a problem.
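A sketch of that compaction, assuming each log entry can be reduced to a
(row, column, value) triple, which is a simplification of real WAL records:

```python
def compact_log(entries):
    """Keep only the last change per (row, column).  Survivors are emitted
    in the order of their final occurrence, so replaying them ends in the
    same state as replaying the full log."""
    last = {}
    for seq, (row, col, value) in enumerate(entries):
        last[(row, col)] = (seq, value)
    return [(row, col, value)
            for (row, col), (seq, value)
            in sorted(last.items(), key=lambda item: item[1][0])]

log = [
    (1, "balance", 10),
    (1, "balance", 20),
    (2, "name", "bob"),
    (1, "balance", 50),   # only this change to row 1 needs to survive
]
compacted = compact_log(log)
```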

cjs
-- 
Curt Sampson  [EMAIL PROTECTED]   +81 90 7737 2974   http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light.  --XTC

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])



Re: [HACKERS] PostgreSQL Tuning Results

2003-02-13 Thread Curt Sampson
On Wed, 12 Feb 2003, Hans-Jürgen Schönig wrote:

 Be careful with sort_mem - this might lead to VERY unexpected results. I
 did some testing on my good old Athlon 500 with a brand new IBM 120 Gigs
 HDD. Reducing the sort_mem gave me significantly faster results when
 sorting/indexing 20.000.000 random rows.

Actually, the results are completely expected once you know what
exactly is going on. I found it weird that my sorts were also slowing
down with more sort memory until Tom or Bruce or someone pointed out to
me that my stats said my sorts were swapping.

If I'm understanding this correctly, this basically meant that my sort
results would start hitting disk because they were being paged out to
swap space, but then once the block was sorted, it would be read in
again from disk, and then written out to disk again (in a different
place), creating a lot more I/O than was really necessary.

This strikes me, too, as another area where mmap might allow the system
to do a better job with less tuning. Basically, the sort is getting
split into a bunch of smaller chunks, each of which is individually
sorted, and then you merge at the end, right? So if all those individual
chunks were mmap'd, the system could deal with paging them out if and
when necessary, and for the sorts you do before the merge, you could
mlock() the area that you're currently sorting to make sure that it
doesn't thrash.

If the VM system accepts hints, you might also get some further
optimizations because you can tell it (using madvise()) when you're
doing random versus sequential access on a chunk of memory.
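The chunk-sort-and-merge shape described above, in miniature (in a real
external sort each run would be written out, or mmap'd, rather than kept
in a plain list):

```python
import heapq

def external_sort(items, chunk_size):
    """Sort fixed-size chunks independently, then merge the sorted runs."""
    runs = [sorted(items[i:i + chunk_size])
            for i in range(0, len(items), chunk_size)]
    return list(heapq.merge(*runs))
```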

cjs
-- 
Curt Sampson  [EMAIL PROTECTED]   +81 90 7737 2974   http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light.  --XTC

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster



Re: [HACKERS] location of the configuration files

2003-02-13 Thread Curt Sampson
On Wed, 12 Feb 2003, Peter Bierman wrote:

 What do you gain by having the postmaster config and the database
 data live in different locations?

You can then standardize a location for the configuration files.

Everybody has room in /etc for another 10K of data. Where you have
room for something that might potentially be half a terabyte of
data, and is not infrequently several gigabytes or more, is pretty
system-dependent.

cjs
-- 
Curt Sampson  [EMAIL PROTECTED]   +81 90 7737 2974   http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light.  --XTC

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])



Re: [HACKERS] location of the configuration files

2003-02-13 Thread Curt Sampson
On Thu, 13 Feb 2003, Christopher Browne wrote:

 1.  It assumes that there is a location for the configuration files
 for /the single database instance./

No; it assumes that there's a location for the default instance. If
you have more than one, you could have one default and one elsewhere, or
just do what I often do, which is put in an empty config file except for
a comment saying we have several instances of xxx on this machine; look
in yyy for them.

 2.  It assumes I have write access to /etc

 If I'm a Plain Old User, as opposed to root, I may only have
 read-only access to /etc.

Right. It's dependent on the sysadmin to create /etc/postgres/ and make
it writeable, or set up proper symlinks, or whatever.

Fortunately, the files in /etc are only the defaults, to be used if
they're not overridden on the command line. If you're in a situation
like #2, you're basically stuck where we are now all the time: you have
to just put it somewhere and hope that, if someone else needs to find
it, they can.

cjs
-- 
Curt Sampson  [EMAIL PROTECTED]   +81 90 7737 2974   http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light.  --XTC

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]



Re: [HACKERS] Changing the default configuration (was Re:

2003-02-13 Thread Curt Sampson
On Wed, 12 Feb 2003, Bruce Momjian wrote:

 Christopher Kings-Lynne wrote:

  You cannot change SHMMAX on the fly on FreeBSD.

 And part of the reason is because some/most BSD's map the page tables
 into physical RAM (kernel space) rather than use some shared page table
 mechanism. This is good because it prevents the shared memory from
 being swapped out (performance disaster).

Not at all! In all the BSDs, as far as I'm aware, SysV shared memory is
just normal mmap'd memory.

FreeBSD offers a sysctl that lets you mlock() that memory, and that is
helpful only because postgres insists on taking data blocks that are
already in memory, fully sharable amongst all back ends and ready to be
used, and making a copy of that data to be shared amongst all back ends.

 It doesn't actually allocate RAM unless someone needs it, but it does
 lock the shared memory into a specific fixed location for all processes.

I don't believe the shared memory is locked to a specific VM
address for every process. There's certainly no reason it needs to be.

cjs
-- 
Curt Sampson  [EMAIL PROTECTED]   +81 90 7737 2974   http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light.  --XTC

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/users-lounge/docs/faq.html



Re: [HACKERS] PGP signing release

2003-02-11 Thread Curt Sampson
On Wed, 11 Feb 2003, Greg Copeland wrote:

 On Wed, 2003-02-05 at 18:53, Curt Sampson wrote:

 [Re: everybody sharing a single key]

 This issue doesn't change regardless of the mechanism you pick.  Anyone
 that is signing a key must take reasonable measures to ensure the
 protection of their key.

Right. Which is why you really want to use separate keys: you can determine
who compromised a key if it is compromised, and you can revoke one without
having to revoke all of them.

Which pretty much inevitably leads you to just having the developers use
their own personal keys to sign the release.

 Basically, you are saying:
   You trust a core developer
   You trust they can protect their keys
   You trust they can properly distribute their trust
   You don't trust a core developer with a key

Not at all. I trust core developers with keys, but I see no reason to
weaken the entire system by sharing keys when it's not necessary. Having
each developer sign the release with his own personal key solves every
problem you've brought up.

cjs
-- 
Curt Sampson  [EMAIL PROTECTED]   +81 90 7737 2974   http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light.  --XTC

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]



Re: [HACKERS] Changing the default configuration (was Re: [pgsql-advocacy]

2003-02-11 Thread Curt Sampson
On Tue, 11 Feb 2003, Tom Lane wrote:

 It's a lot too conservative.  I've been thinking for awhile that we
 should adjust the defaults.

Some of these issues could be made to Just Go Away with some code
changes. For example, using mmap rather than SysV shared memory
would automatically optimize your memory usage, and get rid of the
double-buffering problem as well. If we could find a way to avoid using
semaphores proportional to the number of connections we have, then you
wouldn't have to worry about that configuration parameter, either.

In fact, some of this stuff might well improve our portability, too.
For example, mmap is a POSIX standard, whereas shmget is only an X/Open
standard. That makes me suspect that mmap is more widely available on
non-Unix platforms. (But I could be wrong.)

cjs
-- 
Curt Sampson  [EMAIL PROTECTED]   +81 90 7737 2974   http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light.  --XTC

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])



Re: [HACKERS] location of the configuration files

2003-02-11 Thread Curt Sampson

I, personally, also think it makes more sense to pass to the postmaster
a configuration file that contains all the rest of the information about
the database system, including the disk locations of the various data
directories and whatnot.

cjs
-- 
Curt Sampson  [EMAIL PROTECTED]   +81 90 7737 2974   http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light.  --XTC

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/users-lounge/docs/faq.html



Re: [HACKERS] Windows SHMMAX (was: Default configuration)

2003-02-11 Thread Curt Sampson
On Tue, 11 Feb 2003, Merlin Moncure wrote:

 Another way of looking at it is memory mapped files.  This probably most
 closely resembles unix shared memory and is the de facto standard way
 for interprocess memory block sharing.  Sadly, performance will suffer
 because you have to rely on the virtual memory system (think: writing to
 files) to do a lot of stupid stuff you don't necessarily want or need.

To the contrary, for the majority of the shared memory usage of
postgres, which is cached file data, the virtual memory system is doing
exactly what you want it to: managing the movement of data between
memory and disk, and caching the more frequently accessed data to reduce
the chances you will actually need to access the disk for it.

For shared memory used only for IPC, typically a VM system treats it no
differently from any other non-shared memory, so if it's doing something
you don't want or need (a proposition I quite heartily disagree with),
it's going to be doing that for every piece of memory your application
allocates and uses, shared or not.

 The OS has to guarantee that the memory can be swapped out to file at
 any time and therefore mirrors the pagefile to the allocated memory
 blocks.

The OS does not need to write the pagefile. On modern Unix systems that
are not allowing overcommit, the space will be allocated but never
written unless there's a need to free up some physical memory, and the
pages in question are used infrequently enough that the system decides
that they are good candidates to be paged out. I would imagine that
Windows does the same.

cjs
-- 
Curt Sampson  [EMAIL PROTECTED]   +81 90 7737 2974   http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light.  --XTC

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your
message can get through to the mailing list cleanly



Re: [HACKERS] PGP signing releases

2003-02-05 Thread Curt Sampson
On Wed, 4 Feb 2003, Greg Copeland wrote:

 If three people are required to sign a package prior to release,
 what happens when one of them is unavailable for signing (vacation,
 hospital, etc). This is one of the reasons why having a single project
 key which the core developers sign may appear to be easier.

I don't see that it makes that much difference. So the release is signed
by, say, only three people instead of four. It's still signed.

  One hopes that situations like last week's ousting of one of the
  core FreeBSD developers are rare but if such a situation were to
  arise, a shared project key would be Very Bad (tm).

 If a revocation key has been properly generated (as it should have been),
 this is not a problem at all.

Actually, it is still a problem. Revocations are not reliable in PGP,
and there's really no way to make them perfectly reliable in any system,
because you've got no way to force the user to check that his cached
data (i.e., the key he holds in his keyring) is still valid. This is why
we generally expire signing keys and certificates and stuff like that on
a regular basis.

This one element alone makes me think that individual signing is a
better thing. (With individual signing you'd have to compromise several
keys before you have to start relying on revocation certificates.)

   Who will actually hold the key? Where will it be physically kept?

 Good question but can usually be addressed.

It can be addressed, but how well? This is another big issue for which
I haven't seen any plan I'm comfortable with.

   How many people will know the passphrase?

 As few as possible.  Ideally only two, maybe three core developers.

Um...I'm not sure that this is a relevant question at all. The
passphrase is not part of the key; it's just used to encrypt the key for
storage. If you know the passphrase, you can make unlimited copies of
the key, and these copies can be protected with any passphrases you like,
or no passphrase, for that matter.

 One could also only allow a single person to hold the passphrase and
 divide it into parts between two or more. This is commonly done in
 financial circles.

Hm. Splitting the key into parts is a very interesting idea, but I'd
be interested to know how you might implement it without requiring
everybody to be physically present at signing.
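One simple scheme that avoids everyone being in one room is an n-of-n XOR
split: each holder gets a uniformly random-looking share, and only the XOR
of all shares recovers the secret. (Threshold k-of-n schemes such as
Shamir's secret sharing are the more general tool; this sketch is just the
simplest case, and not a claim about what financial institutions do.)

```python
import os

def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def split_secret(secret: bytes, holders: int):
    """n-of-n split: any single share alone reveals nothing; XORing all
    of the shares together reproduces the secret."""
    shares = [os.urandom(len(secret)) for _ in range(holders - 1)]
    last = secret
    for s in shares:
        last = xor_bytes(last, s)
    return shares + [last]

def combine(shares):
    out = bytes(len(shares[0]))
    for s in shares:
        out = xor_bytes(out, s)
    return out
```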

cjs
-- 
Curt Sampson  [EMAIL PROTECTED]   +81 90 7737 2974   http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light.  --XTC

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/users-lounge/docs/faq.html



Re: [HACKERS] PGP signing releases

2003-02-05 Thread Curt Sampson
On Thu, 5 Feb 2003, Greg Copeland wrote:

 Who will actually hold the key? Where will it be physically kept?
  
   Good question but can usually be addressed.
 
  It can be addressed, but how well? This is another big issue that I
  don't see any plan for that I'm comfortable with..

 The reason I was vague is because it depends on the key route.
 Obviously, if each person signs, each person must protect their own
 key.  If there is a central project key, it's simply a matter of
 determining which box is used for signing, etc...while important, it's
 certainly not difficult to address.

It seems to me extremely difficult to address. Unless you are physically
monitoring someone, how do you prevent them from copying the key off
that machine? At that point, anybody with the passphrase can use it
for anything.

 How many people will know the passphrase?
  
   As few as possible.  Ideally only two, maybe three core developers.
 
  Um...I'm not sure that this is a relevant question at all. The
  passphrase is not part of the key; it's just used to encrypt the key for
  storage. If you know the passphrase, you can make unlimited copies of
  the key, and these copies can be protected with any passphrases you like,
  or no passphrase, for that matter.

 If you're concerned about this to that extent, clearly those people
 should not part of the web of trust nor should they be receiving the
 passphrase nor a copy of the private key.  Remember, trust is a key (pun
 intended) part of a reliable PKI.

In that case, I would trust only one person with the key. Making copies of
the key for others gives no additional protection (since it takes only one
person out of the group to sign the release) while it increases the chance
of key compromise (since there are now more copies of the key kicking around,
and more people who know the passphrase).

Keys cannot be transferred from one person to another since, being digital
data, there's no way to ascertain that the original holder does not still
(on purpose or inadvertently) have copies of the key. So in the case where
we want to transfer trust from one person to another, we must also generate
a new key and revoke the old one.

This is now exactly equivalent to having each developer sign postgres
with a signing key (signed by his main key) for which the other
developers (or appropriate authority) have a revocation certificate.

And back to the passphrase issue, once again, can't you see that it's
completely irrelevant? At some point, someone who knows the passphrase is
going to have to be in a position to use that to decrypt the key. At that
point he has the key, period. Changing the passphrase does no good, because
you can't change the passphrase on the copy of the key he may have made.

A passphrase is like a lock on your barn door. After you've given
someone the key and he's gone in and taken the cow, changing the lock
gives you no protection at all.

cjs
-- 
Curt Sampson  [EMAIL PROTECTED]   +81 90 7737 2974   http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light.  --XTC

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster



Re: [HACKERS] PGP signing releases

2003-02-04 Thread Curt Sampson
On Tue, 4 Feb 2003, Kurt Roeckx wrote:

 I know how it works, it's just very unlikely I'll ever meet
 someone so it gives me a good chain.

One postgresql conference is all it takes.

 Anyway, I think pgp is a good thing to do, just don't assume that
 it's always better than just md5.

I think it is. Even if you can't personally trust the signature properly,
it offers much more opportunity to discover a forgery because if you grab
the signing key when it's first published, the acquisition of the key and
the potentially forged binary are separated in time, making substitution
of both much more difficult.

Someone can easily change an MD5 signature file that's sitting right next
to a binary on an FTP server. Someone can not easily change a PGP key that's
already sitting in your keyring on your computer.

cjs
-- 
Curt Sampson  [EMAIL PROTECTED]   +81 90 7737 2974   http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light.  --XTC

---(end of broadcast)---
TIP 6: Have you searched our list archives?

http://archives.postgresql.org



Re: [HACKERS] PGP signing releases

2003-02-04 Thread Curt Sampson
On Tue, 4 Feb 2003, Kurt Roeckx wrote:

  There really isn't any comparison here.

 I didn't say you could compare the security offered by both of
 them.  All I said was that md5 also makes sense from a security
 point of view.

MD5, or any other unsigned check, makes sense from a security point of
view only if it is stored independently from the thing you are checking.
So NetBSD's package-system MD5 hashes help a little with security.
(You'd have to modify both the hashes on the NetBSD FTP server or anoncvs or
whatever *and* the binary on an independently run FTP server.)

If the security token is stored with the item to be secured (i.e.,
on the same FTP server) and is unsigned, it is just as subject to
modification as the item itself, and provides no extra security.

cjs
-- 
Curt Sampson  [EMAIL PROTECTED]   +81 90 7737 2974   http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light.  --XTC

---(end of broadcast)---
TIP 6: Have you searched our list archives?

http://archives.postgresql.org



Re: [HACKERS] PGP signing releases

2003-02-04 Thread Curt Sampson
On Tue, 2003-02-04 at 16:13, Kurt Roeckx wrote:
 On Tue, Feb 04, 2003 at 02:04:01PM -0600, Greg Copeland wrote:
 
  Even improperly used, digital signatures should never be worse than
  simple checksums.  Having said that, anyone that is trusting checksums
  as a form of authenticity validation is begging for trouble.

 Should I point out that a fingerprint is nothing more than a
 hash?

Since someone already mentioned MD5 checksums of tar files versus PGP
key fingerprints, perhaps things will become a bit clearer here if I
point out that the important point is not that these are both hashes of
some data, but that the time and means of acquisition of that hash are
entirely different between the two.

cjs
-- 
Curt Sampson  [EMAIL PROTECTED]   +81 90 7737 2974   http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light.  --XTC

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]



Re: [HACKERS] PGP signing releases

2003-02-03 Thread Curt Sampson
On Mon, 3 Feb 2003, Kurt Roeckx wrote:

 I'm not saying md5 is as secure as pgp, not at all, but you can't
 trust those pgp keys to be the real one either.

Sure you can. Just verify that they've been signed by someone you trust.

For example, next time I happen to run into Bruce Momjian, I hope he'll
have his PGP key fingerprint with him. I can a) verify that he's the
same guy who, under the name Bruce Momjian, was giving the seminar I
went to last weekend, and b) check his passport to see that the U.S.
government believes that someone who looks like him is indeed Bruce Momjian
and a U.S. citizen. That, for me, is enough to trust that he is who he
says he is when he gives me the fingerprint.

I take that fingerprint back to my computer and verify that the key I
downloaded from the MIT keyserver has the same fingerprint. Then I sign
that key with my own signature, assigning it an appropriate level of trust.

Next time I download a postgres release, I then grab a copy of the
postgres release-signing public key, and verify that its private key was
used to sign the postgres release, and that it is signed by Bruce's key.

Now I have a direct chain of trust that I can evaluate:

1. Do I believe that the person I met was indeed Bruce Momjian?

2. Do I trust him to take care of his own key and be careful signing
other keys?

3. Do I trust his opinion that the postgres release-signing key that
he signed is indeed valid?

4. Do I trust the holder of the postgres release-signing key to have
taken care of the key and have been careful about signing releases
with it?

Even if you extend this chain by a couple of people, that's trust in a
lot fewer people than you're going to need if you want to trust an MD5
signature.

cjs
-- 
Curt Sampson  [EMAIL PROTECTED]   +81 90 7737 2974   http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light.  --XTC

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster



Re: [HACKERS] PGP signing releases

2003-02-03 Thread Curt Sampson
On Tue, 3 Feb 2003, Greg Copeland wrote:

 Surely there are a couple of key developers whom would be willing to
 sign each other's keys and have previously met before.  Surely this
 would be the basis for phone validation.  Then, of course, there is 'ol
 snail-mail route too.  Of course, nothing beats meeting in person having
 valid ID and fingerprints in hand.  ;)

I should mention, I'm not always so paranoid that I check ID and all of
that. It really depends on how well I know the person. I've met Bruce only
once, so I wouldn't do it over the phone at all, since we don't share much
non-public background and I'm not dead certain that I could tell his voice
from a similar one. The same is not true when it comes to doing this with
some of my close friends.

cjs
-- 
Curt Sampson  [EMAIL PROTECTED]   +81 90 7737 2974   http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light.  --XTC

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your
message can get through to the mailing list cleanly



Re: [HACKERS] Linux.conf.au 2003 Report

2003-02-02 Thread Curt Sampson
On Sun, 2 Feb 2003, Kurt Roeckx wrote:

 On Sat, Feb 01, 2003 at 02:35:15PM +0900, Curt Sampson wrote:
 
  Sure. But you still want to be able to say (and can say, in some [many?]
  socket API implementations) that you want to accept only IPv4 or only IPv6
  connections. I also want to be able to say the same thing in my database.

 You just create either an ipv4 or ipv6 socket.

Um...I'm talking about inserting data into a column in postgres. How do
I declare that a column can accept IPv4 addresses only?

And why will *nobody* address the question of whether this type should
include ISO/OSI addresses or not, and why?

cjs
-- 
Curt Sampson  [EMAIL PROTECTED]   +81 90 7737 2974   http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light.  --XTC

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])



Re: [HACKERS] Linux.conf.au 2003 Report

2003-02-02 Thread Curt Sampson
On Sun, 2 Feb 2003, Tom Lane wrote:

 I think I was the one who talked us into assuming that ipv4 and ipv6
 should be treated as a single protocol.  But some people have since made
 pretty good cases that it's better to regard them as separate protocols.

From a security standpoint, I think it's definitely better to regard
them as separate protocols. They are certainly separately filtered on
firewalls, and they are often routed differently, too.

That said, I see no reason not to have some sort of easy way of saying,
listen on all the interfaces you can find using all the protocols you
know. So long as you have the ability to distinguish where you listen
by both protocol and address, it's easy to be as secure as you need to be.

cjs
-- 
Curt Sampson  [EMAIL PROTECTED]   +81 90 7737 2974   http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light.  --XTC

---(end of broadcast)---
TIP 6: Have you searched our list archives?

http://archives.postgresql.org



Re: [HACKERS] On file locking

2003-02-02 Thread Curt Sampson
On Sun, 2 Feb 2003, Tom Lane wrote:

 This all doesn't look good for using file locks in the way I had in
 mind :-( ... but considering that all these man pages seem pretty vague,
 maybe some direct experimentation is called for.

Definitely. I wonder about the NetBSD manpage quotes in the post you
followed up to, given that last time I checked flock() was implemented,
in the kernel, using fcntl(). Either that's changed, or the manpages
are unclear or lying.

This has been my experience in the past; locking semantics are subtle
and unclear enough that you really need to test for exactly what you
want at build time on every system, and you've got to do this testing
on the filesystem you intend to put the locks on. (So you don't, e.g.,
test a local filesystem but end up with data on an NFS filesystem with
different locking semantics.) That's what procmail does.

Given this, I'm not even sure the whole idea is worth pursuing. (Though
I guess I should find out what NetBSD is really doing, and fix the
manual pages to correspond to reality.)

cjs
-- 
Curt Sampson  [EMAIL PROTECTED]   +81 90 7737 2974   http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light.  --XTC

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your
message can get through to the mailing list cleanly



Re: [HACKERS] pg_hba.conf hostmask.

2003-02-02 Thread Curt Sampson
On Sun, 2 Feb 2003, Bruce Momjian wrote:

 Yes, some have asked about this.  My understanding was that CIDR
 (host/len) was mostly for networks, while hostname/mask was for hosts.
 Now, you can specify hosts using /32, but is is unusual?  Maybe not.

Typically, if you have something like an access list where you're
specifying hosts or networks, you default the netmask to /32 if it's not
supplied.

However, if we're going to maintain backward compatibility with the old
format (i.e., using a separately specified netmask in the next column if
no slash is present in the address column) we can't do that.

Personally, I'm all for breaking backwards compatibility (as I usually
am :-)) but could quite easily live with specifying most hosts as
n.n.n.n/32 forever into the future, too.
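
The defaulting rule described above is easy to sketch (illustrative only,
not the actual pg_hba.conf parser; Python's ipaddress module stands in for
the real address handling):

```python
import ipaddress

def parse_acl_entry(addr):
    """Parse an access-list address, defaulting the netmask to /32.

    "10.1.2.3"    -> 10.1.2.3/32  (a single host)
    "10.1.0.0/16" -> 10.1.0.0/16  (a network)
    """
    if "/" not in addr:
        addr += "/32"  # no prefix length given: assume a single host
    return ipaddress.ip_network(addr, strict=False)
```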

cjs
-- 
Curt Sampson  [EMAIL PROTECTED]   +81 90 7737 2974   http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light.  --XTC

---(end of broadcast)---
TIP 6: Have you searched our list archives?

http://archives.postgresql.org



Re: [HACKERS] Linux.conf.au 2003 Report

2003-02-01 Thread Curt Sampson
On Fri, 31 Jan 2003, Kurt Roeckx wrote:

 It's a good things that the socket interface can actually work
 with all protocol!  It doesn't only work with AF_INET, but also
 AF_UNIX, and probably others.  It's a good things that things
 like socket(), bind(), connect() don't need to be replaced by
 other things.

Sure. But you still want to be able to say (and can say, in some [many?]
socket API implementations) that you want to accept only IPv4 or only IPv6
connections. I also want to be able to say the same thing in my database.

So probably, this means setting up separate types for IPv4 and IPv6
addresses, and a third type to hold both IPv4 and IPv6 addresses. That
third type could also be extended to hold OSI, NS, and whatever other
type of addresses people feel would be useful.

I suppose another way of implementing this would be to set up some easy
way to put a constraint on a column such that you could constrain it to
hold only IPv4 or IPv6 addresses.
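
Whichever way it is implemented, such a constraint boils down to a per-value
family check, something like this (illustrative Python, not postgres
internals):

```python
import ipaddress

def address_version(text):
    """Return 4 or 6 -- the check a family-restricted column would apply."""
    return ipaddress.ip_address(text).version

# A column constrained to IPv4 would accept the first and reject the second:
assert address_version("192.0.2.1") == 4
assert address_version("2001:db8::1") == 6
```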

And I would be interested to hear the opinions of those who want to put
IPv4 and IPv6 addresses in the same type as to whether you do or do not
support also putting ISO/OSI and Novell addresses into that type as well,
and why or why not.

cjs
-- 
Curt Sampson  [EMAIL PROTECTED]   +81 90 7737 2974   http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light.  --XTC

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster



Re: [HACKERS] On file locking

2003-02-01 Thread Curt Sampson
On Fri, 31 Jan 2003, Tom Lane wrote:

 Antti Haapala [EMAIL PROTECTED] writes:
  And from SunOS 5.8 flock
   Locks are on files, not file  descriptors.   That  is,  file
   descriptors  duplicated  through  dup(2)  or  fork(2) do not
   result in multiple instances of a lock, but rather  multiple
   references to a single lock.  If a process holding a lock on
   a file forks and the child explicitly unlocks the file,  the
   parent  will  lose  its  lock.  Locks are not inherited by a
   child process.

 That seems self-contradictory.

Yes. I note that in NetBSD, that paragraph of the manual page is
identical except that the last sentence has been removed.

At any rate, it seems to me highly unlikely that, since the child has
the *same* descriptor as the parent had, that the lock would disappear.

The other option would be that the lock belongs to the process, in which
case one would think that a child doing an unlock should not affect the
parent, because it's a different process.
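
The behaviour described in the quote, where the lock follows the open file
description shared across fork() so that a child's explicit unlock takes the
parent's lock away, is easy to check empirically. A small experiment
(Python's fcntl.flock wraps the flock() system call; POSIX systems only):

```python
import fcntl
import os
import tempfile

def child_unlock_releases_lock():
    """True if a forked child's explicit unlock releases the parent's flock."""
    tmp = tempfile.NamedTemporaryFile(delete=False)
    fd = os.open(tmp.name, os.O_RDWR)
    fcntl.flock(fd, fcntl.LOCK_EX)        # parent takes an exclusive lock

    if os.fork() == 0:                    # child shares the same
        fcntl.flock(fd, fcntl.LOCK_UN)    # open file description
        os._exit(0)                       # and explicitly unlocks
    os.wait()

    fd2 = os.open(tmp.name, os.O_RDWR)    # an independent descriptor
    try:
        # Succeeds (no exception) only if the lock really was released.
        fcntl.flock(fd2, fcntl.LOCK_EX | fcntl.LOCK_NB)
        return True
    except OSError:
        return False
    finally:
        os.close(fd2)
        os.close(fd)
        os.unlink(tmp.name)
```

On systems with flock()-style semantics (the BSDs, Linux) this returns True,
matching the SunOS wording: the child's unlock released the lock the parent
took.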

cjs
-- 
Curt Sampson  [EMAIL PROTECTED]   +81 90 7737 2974   http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light.  --XTC

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])



Re: [HACKERS] PostgreSQL, NetBSD and NFS

2003-02-01 Thread Curt Sampson
On Fri, 31 Jan 2003, mlw wrote:

 . There are always issues with file locking across various
 platforms. I recall reading about mmap issues across NFS a while ago...

Postgres uses neither of these, IIRC, so that should be fine. (Actually,
postgres does effectively use mmap for shared memory on NetBSD, but
that's not mapping data on the NFS filesystem, so it's not an issue.)

 The NFS client may also have isses with locking, fsync, and mmap.

Any fsync problems would affect data integrity during a crash, but
nothing otherwise.

(Of course, I'm happy to be corrected on any of these issues, if someone
can point out particular parts of postgres that would fail over NFS.)

cjs
-- 
Curt Sampson  [EMAIL PROTECTED]   +81 90 7737 2974   http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light.  --XTC

---(end of broadcast)---
TIP 6: Have you searched our list archives?

http://archives.postgresql.org



Re: [mail] Re: [HACKERS] Windows Build System

2003-02-01 Thread Curt Sampson
On Sat, 1 Feb 2003, Peter Eisentraut wrote:

 Curtis Faith writes:

 a) Running as a service is important as this the way NT/2000
  administrators manage server tasks. The fact that PostgreSQL's Cygwin
  emulation doesn't do this is very indicative of inferior Windows
  support.

 No, it is indicative of the inability to read the documentation.
 PostgreSQL on Cygwin runs as a service if and only if you ask it to.

I would say that not supporting those who have an inability to read
documentation would count as inferior Windows support. :-)

What I'm hearing here is that all we really need to do to compete with
MySQL on Windows is to make the UI a bit slicker. So what's the problem
with someone building, for each release, a set of appropriate binaries, and
someone making a slick install program that will install postgres,
install parts of cygwin if necessary, and set up postgres as a service?

cjs
-- 
Curt Sampson  [EMAIL PROTECTED]   +81 90 7737 2974   http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light.  --XTC

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster



Re: [HACKERS] On file locking

2003-01-31 Thread Curt Sampson
On Fri, 31 Jan 2003, Shridhar Daithankar[EMAIL PROTECTED] wrote:

 Besides file locking is implemented using setgid  bit on most unices. And
 everybody is free to do what he/she thinks right with it.

I don't believe it's implemented with the setgid bit on most Unices. As
I recall, it's certainly not on Xenix, SCO Unix, any of the BSDs, Linux,
SunOS, Solaris, and Tru64 Unix.

(I'm talking about the flock system call, here.)

cjs
-- 
Curt Sampson  [EMAIL PROTECTED]   +81 90 7737 2974   http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light.  --XTC

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster



Re: [HACKERS] Linux.conf.au 2003 Report

2003-01-31 Thread Curt Sampson
On Fri, 31 Jan 2003, Andrew Sullivan wrote:

 But the pain of making them
 interoperate is part of the cause of resistance.  The compatibility
 addresses are going to _have_ to work if people are really going to
 move...

There is no pain in this respect; you get your compatibility by simply
running both IPv4 and IPv6 on the hosts that interoperate.

cjs
-- 
Curt Sampson  [EMAIL PROTECTED]   +81 90 7737 2974   http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light.  --XTC

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your
message can get through to the mailing list cleanly



Re: [HACKERS] Linux.conf.au 2003 Report

2003-01-30 Thread Curt Sampson
On Thu, 30 Jan 2003, Andrew Sullivan wrote:

 Given that IPv6 is supposed to allow co-operation with IPv4, it seems
 it'd be pretty hard to force such a view on every application using
 IP addresses.  DNS, for instance.

Hm? DNS completely separates IPv4 and IPv6 addresses; they're different
record types (A versus AAAA) in the DNS database.

And the interoperation of IPv4 and IPv6 is pretty much not happening,
if you're talking about the compatibility addresses. I won't get into
all the reasons why.

All that said, I'm not advocating separating (or not separating)
IPv4 and IPv6 addresses. I'm still undecided on the issue. I can see
situations where I might want to store both together, but then again, I
can see situations where I certainly wouldn't.

Perhaps we should think about another example to try to illuminate this:
what about storing ISO/OSI addresses in the same type as well? Isn't
that just the same thing as storing IPv4 and IPv6 addresses together?

cjs
-- 
Curt Sampson  [EMAIL PROTECTED]   +81 90 7737 2974   http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light.  --XTC

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])



Re: [HACKERS] Linux.conf.au 2003 Report

2003-01-30 Thread Curt Sampson
On Thu, 30 Jan 2003, Tom Lane wrote:

 I don't know anything
 about v6, but I'd imagine v4 addresses form a defined subset of the v6
 address space ...

No, they do not. The address spaces are completely independent. (There
is a compatibility space for IPv4 addresses, but it turned out to be
impractical, and thus is generally not used. Certainly you cannot route
to arbitrary v4 hosts using a v6 address.)
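
For the record, the compatibility scheme embeds an IPv4 address in the low
32 bits of an IPv6 address; Python's ipaddress module (used here purely for
illustration) can pull the embedded address back out of the common
"IPv4-mapped" form:

```python
import ipaddress

# ::ffff:a.b.c.d is the IPv4-mapped form; the deprecated "compatible"
# form ::a.b.c.d likewise embeds the v4 address in the low 32 bits.
mapped = ipaddress.IPv6Address("::ffff:192.0.2.1")
assert mapped.ipv4_mapped == ipaddress.IPv4Address("192.0.2.1")

# But an arbitrary v4 host has no routable v6 address: the two address
# spaces are administratively independent, which is why the mapping
# went largely unused in practice.
```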

cjs
-- 
Curt Sampson  [EMAIL PROTECTED]   +81 90 7737 2974   http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light.  --XTC

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])



Re: [HACKERS] PostgreSQL, NetBSD and NFS

2003-01-30 Thread Curt Sampson
On Thu, 30 Jan 2003, D'Arcy J.M. Cain wrote:

 Does the shared memory stuff use disk at all? Perhaps that's the
 difference between PostgreSQL and other applications.

Shared memory in NetBSD is just an interface to mmap'd pages, so it can
be swapped to disk. But I assume your swap is not on NFS...

A ktrace would be helpful. Also, it would be helpful if you tried doing
an initdb to a directory on the filer to see if you can even create a
database cluster, and tried doing that or rsyncing and accessing your
data over NFS with a NetBSD system as the NFS server.

cjs
-- 
Curt Sampson  [EMAIL PROTECTED]   +81 90 7737 2974   http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light.  --XTC

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your
message can get through to the mailing list cleanly



Re: [HACKERS] postgres installation - best configuration?

2003-01-30 Thread Curt Sampson
On Fri, 31 Jan 2003, Bruce Momjian wrote:

 Also, I have heard a lot of people reocommending RAID 0+1 or 1+0 as a
 good mix of reliability and performance.

Right. Striping across mirrored drives will definitely be better, but
you can't do that with only three drives.

cjs
-- 
Curt Sampson  [EMAIL PROTECTED]   +81 90 7737 2974   http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light.  --XTC

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/users-lounge/docs/faq.html



Re: [HACKERS] Specifying Rowtypes

2003-01-29 Thread Curt Sampson
On Tue, 28 Jan 2003, Stephan Szabo wrote:

 You can also return records at which point you have to give a definition
 at select time.

 create function aa1() returns record as 'select 1,2;' language 'sql';
 select * from aa1() as aa1(a int, b int);

Yeah, I tried that approach too, but it got ugly quickly. Changing that
line in all my unit tests every time I changed the signature of the return
value was a fair amount of extra effort.

 Also, for defined types like that, you probably want to use
 CREATE TYPE ... AS rather than CREATE TABLE.

That's much better! Thanks!

 I believe only the column names and types are considered for purposes of
 this.  Check constraints and the like defined on the column aren't applied
 either.  I can see arguments for both ways since things like foreign keys
 or the not yet supported check constraints with subselects would seem to
 have not terribly meaningful results.

Well, it might make sense to declare that you can't return anything that
couldn't, in the current transaction, be inserted into that table.

But easier, perhaps, would just be to provide the ability to add limited
constraints to CREATE TYPE, and only honour the constraints that can be
applied in a CREATE TYPE statement.

 Although if you make the column on a domain and the domain has a
 constraint it does seem to be applied.

Hmmm. Interesting. This would be basically what I described above, then,
wouldn't it, except it doesn't work for me (with types or tables):

CREATE DOMAIN nonnull_int AS
int
DEFAULT 0
CONSTRAINT nonnull_int_not_null NOT NULL;

CREATE TYPE t2_retval AS (
value1 nonnull_int,
value2 nonnull_int,
value3 nonnull_int
);

CREATE FUNCTION t2()
RETURNS SETOF t2_retval
AS '
DECLARE
retval t2_retval%ROWTYPE;
BEGIN
SELECT INTO retval 1;
RETURN NEXT retval;
SELECT INTO retval 1, 2, 3;
RETURN NEXT retval;
SELECT INTO retval null, null, null;
RETURN NEXT retval;
RETURN;
END
' LANGUAGE 'plpgsql';

SELECT * FROM t2();

...produces rows with nulls in them.

cjs
-- 
Curt Sampson  [EMAIL PROTECTED]   +81 90 7737 2974   http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light.  --XTC

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your
message can get through to the mailing list cleanly



[HACKERS] Specifying Rowtypes

2003-01-28 Thread Curt Sampson

So currently the only way to specify a row type is by using a table,
right? E.g.:

CREATE TABLE t2_retval (
value1 int NOT NULL DEFAULT -1,
value2 int NOT NULL,
value3 int
);

Are there plans to add another way of declaring this sort of thing so
that I don't have to add a new table to my schema for every function
that returns a rowtype?

Second, it seems that there's still a problem with NULLs here:

CREATE FUNCTION t2()
RETURNS t2_retval
AS '
DECLARE
retval t2_retval%ROWTYPE;
BEGIN
SELECT INTO retval null, null, null;
RETURN retval;
END
' LANGUAGE 'plpgsql';

This is returning a row that (to my mind) doesn't match the type of the
table above, because it's returning null for non-nullable columns:

cjs= select coalesce(value1, -999), coalesce(value2, -999),
cjs- coalesce(value3, -999) from t2();
 case | case | case
--+--+--
 -999 | -999 | -999
(1 row)

(You get the same result if you delete the SELECT INTO line above.)

Am I misunderstanding something here, or is this a bug?

cjs
-- 
Curt Sampson  [EMAIL PROTECTED]   +81 90 7737 2974   http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light.  --XTC

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster



Re: [HACKERS] Call for objections: put back OIDs in CREATE TABLE

2003-01-27 Thread Curt Sampson
On Mon, 27 Jan 2003, Antti Haapala wrote:

For an INSERT command, the tag is INSERT oid rows, where rows
   is the number of rows inserted, and oid is the object ID of the
   inserted row if rows is 1, otherwise oid is 0.

 Wouldn't it be nice to add here

   If table doesn't contain row oids, in place of oid is the
   primary key of the newly inserted record (if any) as column =
   'value' [ and column = 'value [ and ... ]]

Well, as was pointed out, that's a lot of data to send back if your
primary key is a huge text column, and you've still got a problem if
you have a multi-column primary key.

Since this sort of functionality is not all that frequently needed, I'd
still say it would probably be cleaner to make a new query that selects
the most recently inserted primary key. That means that a) you don't
have to send back a potentially large amount of data unless the user
asks for it, and b) multi-column primary keys work just fine.

cjs
-- 
Curt Sampson  [EMAIL PROTECTED]   +81 90 7737 2974   http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light.  --XTC

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster



Re: [HACKERS] Call for objections: put back OIDs in CREATE TABLE

2003-01-26 Thread Curt Sampson
On Sun, 26 Jan 2003, Bruce Momjian wrote:

 Ross, you make some powerful arguments here.  Probably the most
 significant was the idea that you need a unique identifier for every
 row, and it should be of a consistent type, which primary key is not.

I don't see why you need a unique identifier per row, nor do I see why,
if you are going to have one, it needs to be the same type across all
tables.

Having this may be very desirable, and even necessary, for many or
all object-to-relational mapping frameworks, but that is certainly not
the only thing that postgres is used for. And I still maintain that
if something does need something like OIDs, it should be declared
explicitly in the database schema (as you have to do in other DBMSes)
and not use a hidden feature.

cjs
-- 
Curt Sampson  [EMAIL PROTECTED]   +81 90 7737 2974   http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light.  --XTC

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]



Re: [HACKERS] WAL replay logic (was Re: [PERFORM] Mount options for Ext3?)

2003-01-25 Thread Curt Sampson
On Sat, 25 Jan 2003, Tom Lane wrote:

 We'd have to take it on faith that we should replay the visible files
 in their name order.

Couldn't you just put timestamp information at the beginning of
each file (or perhaps use that of the first transaction), and read the
beginning of each file to find out what order to run them in? Perhaps
you could even check the last transaction in each file as well to see if
there are holes between the available logs.
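
That ordering-plus-gap check is straightforward; a sketch, under the
assumption (hypothetical, not the real WAL format) that each segment header
records the IDs of its first and last transactions:

```python
def replay_order(segments):
    """Order WAL segments for replay and verify continuity.

    segments: list of (filename, first_id, last_id) tuples, where the IDs
    are assumed to come from a header at the start of each file.
    Raises ValueError if a segment is missing from the sequence.
    """
    ordered = sorted(segments, key=lambda s: s[1])
    for (name_a, _fa, last_a), (name_b, first_b, _lb) in zip(ordered, ordered[1:]):
        if first_b != last_a + 1:
            raise ValueError("hole in the log between %s and %s" % (name_a, name_b))
    return [name for name, _, _ in ordered]
```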

 This might mean we'd have to abandon the current
 hack of recycling xlog segments by renaming them --- which would be a
 nontrivial performance hit.

Rename and write a "this is an empty logfile" record at the beginning?
Though I don't see how you could do this in an atomic manner... Maybe if
you included the filename in the WAL file header, you'd see that if the name
doesn't match the header, it's a recycled file...

(This response sent only to hackers.)

cjs
-- 
Curt Sampson  [EMAIL PROTECTED]   +81 90 7737 2974   http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light.  --XTC

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your
message can get through to the mailing list cleanly



Re: [HACKERS] Oracle rant

2003-01-23 Thread Curt Sampson
On Thu, 16 Jan 2003, Fred Zellinger wrote:

 With Oracle, you can screw around with files and tablespaces and
 extents and segments and partition striping and local and global
 indexing and block sizes and 400+ other tuning parameters to your
 heart's content. ... I am a control freak and I think a lot of
 other people are too. Oracle is tremendously complex and has a
 steep learning curve, but it gives me control. With PG, a lot
 of effort has been made to simplify. ... If PostgreSQL were to
 open up all the internals of storage and become as complex as
 Oracle, there probably would be a lot of high profile crashes and PG
 would get a bad reputation. However, I think that having a mode of
 operation(controlled at compile time) where all the dirty details of
 storage was made accessible in the data dictionary, would be something
 good to pacify the control freaks.

The reason you don't have all of these storage parameters available to
you, and they can't be opened up is that they simply aren't there.
Oracle uses raw devices and does all of its own management of disk space
at the block level. Postgres relies on the filesystem and the operating
system to deal with block allocation and a lot of the I/O scheduling.

Adding an efficient layer to replace this is a major project that would
probably provide few gains, since operating systems have gotten a lot
smarter about block allocation and I/O scheduling over the years. There
has been some discussion about this before, if you look back in the
archives.

There are various other places in postgres that the coders could work
on that are likely to provide more performance gain for less effort.
The optimizer comes to mind. At the low level, dropping shared memory
and moving to mmap might (but it's not certain) provide some noticeable
improvement for not too much implementation effort.

cjs
-- 
Curt Sampson  [EMAIL PROTECTED]   +81 90 7737 2974   http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light.  --XTC

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster



Re: [HACKERS] Survey results from the PostgreSQL portal page

2003-01-23 Thread Curt Sampson
On Sun, 19 Jan 2003, Hans-Jürgen Schönig wrote:

 + people measure postgresql by the speed of bulk imports

 This is a good point. I can complete agree. What we might need is
 something called "SQL Loader" or so. This may sound funny and it doesn't
 make technical sense but it is an OBVIOUS way of importing data. People
 often forget to use transactions or don't know about COPY.

Even "doing it right," postgres 7.2 was significantly slower than MySQL
for bulk data imports, at least for tables with relatively narrow rows.
I was going to put this down to higher row overhead, except that it was
nowhere near raw file I/O speed, either.

So this could use improvement, if it's not been improved already.

There's room for performance increases in a lot of other areas, too, but
in the end, a lot of people just don't design their databases for good
performance. And I've killed enough non-postgres database servers in my
life to know that if you don't really know what you're doing, you can
easily make the performance of any DBMS suck. :-)

Personally, I think there's still a fair amount of room in the features
area, too. I'm always running into something that I'd like to have.
Today it was being able to defer a UNIQUE constraint.

cjs
-- 
Curt Sampson  [EMAIL PROTECTED]   +81 90 7737 2974   http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light.  --XTC

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/users-lounge/docs/faq.html



Re: [HACKERS] Call for objections: put back OIDs in CREATE TABLE

2003-01-23 Thread Curt Sampson
On Tue, 21 Jan 2003, Tom Lane wrote:

 We've gotten a couple of complaints now about the fact that 7.3 doesn't
 include an OID column in a table created via CREATE TABLE AS or SELECT
 INTO.  Unless I hear objections, I'm going to revert it to including an
 OID, and back-patch the fix for 7.3.2 as well.

I object. I personally think we should be moving towards not using OIDs
as the default behaviour, inasmuch as we can, for several reasons:

1. It's not a relational concept.

2. The OID wraparound problem can get you.

3. Other SQL databases don't do this.

4. It's hidden, rather than exposed, and hidden things are generally a
bad idea.

5. We should default to what gives us better performance, rather than
worse.

 See discussion a couple days ago on pgsql-general, starting at
 http://archives.postgresql.org/pgsql-general/2003-01/msg00669.php

There didn't seem to be many people clamouring to have it back.

The ideal situation for me would be to have WITHOUT OIDS be the default
for all table creations, but of course allow WITH OIDS for backward
compatibility. But yeah, I know that this can introduce problems with
old dumps, and may not be entirely easy to implement.

cjs
-- 
Curt Sampson  [EMAIL PROTECTED]   +81 90 7737 2974   http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light.  --XTC

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])



Re: [HACKERS] Threads

2003-01-23 Thread Curt Sampson
On Sat, 4 Jan 2003, Christopher Kings-Lynne wrote:

 Also remember that in even well developed OS's like FreeBSD, all a
 process's threads will execute only on one CPU.

I would say that that's not terribly well developed. Solaris will split
a single process's threads over multiple CPUs, and I expect most other
major vendors' Unixes will as well. In the world of free software, the
next release of NetBSD will do the same. (The scheduler activations
system, which supports an M userland to N kernel thread mapping, was
recently merged from its branch into NetBSD-current.)

From my experience, threaded sorts would be a big win. I managed to
shave index generation time for a large table from about 12 hours to
about 8 hours by generating two indices in parallel after I'd added a
primary key to the table. It would have been much more of a win to be
able to generate the primary key followed by other indexes with parallel
sorts rather than having to generate the primary key on one CPU (while
the other remains idle), wait while that completes, generate two more
indices, and then generate the last one.

cjs
-- 
Curt Sampson  [EMAIL PROTECTED]   +81 90 7737 2974   http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light.  --XTC

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/users-lounge/docs/faq.html



Re: [HACKERS] Options for growth

2003-01-23 Thread Curt Sampson
On Wed, 22 Jan 2003, Sean Chittenden wrote:

   By the way, I too wonder which supported OS platform would support
   over 4GB of memory on a PC..
 
  Linux? I don't think there's any problem handling more than 4G
  memory in the system. On 32bit architectures, there's of course the
  3G (I think) per process limit, but as postgres uses multiprocess
  and not multithreading, this issue doesn't hit so soon. Of course,
  if the per process memory is the problem, you'd have to go to 64bit.

 Heh, don't kid yourself.  x86 can only handle 4GB of memory
 addressing.  The hack that Linux uses is to swap out 2GB sections of
 RAM to a 4GB+ memory range, then copy the memory range it needs down
 into usable memory space.  Can we say large page tables?  :)

 You need an actual 64bit CPU to access more than 4GB of RAM without
 paying for it through the nose.  -sc

No, you do not. If you need to access more than two to three GB
(depending on the OS) of RAM on a 32-bit machine *within a single
process* (as mentioned above), you have a problem. But this problem does
not necessarily involve copying; you could use, e.g., mmap to remap
chunks of your address space.

If you have multiple processes, and your OS is sensibly written, no
memory copying is necessary on the process side. All you do is change
the page tables, and the appropriate physical memory, no matter where in
the physical address space it resides, will be mapped into the 32-bit
virtual memory address space.

That's not to say that there might not be other issues with I/O on, say,
32-bit PCI buses. IIRC, typically PCI bus controllers use physical,
not virtual addresses on the bus for DMA, so you're going to have to
use bounce buffers if you wish a 32-bit PCI card to do I/O outside the
bottom 4 GB of memory. But on the other hand, if you're spending the
money on a motherboard that can take more than 4 GB of RAM, you're
almost certainly getting a few 64-bit PCI slots, and probably you'd also
be spending the money to buy 64-bit PCI disk controllers.

This is not to say you shouldn't go for a 64-bit system, especially
given that the AMD ones are probably going to get awfully cheap fairly
soon. But postgres itself is today not equipped to take any more
advantage of one than it is of a 32-bit system with a greater than
32-bit physical address space. (And there's been doubt about whether
the techniques that would take advantage of this would provide all that
much of a performance improvement, anyway.) Still, it seems to me that
it would be pretty cool, when you're doing I/O on a table, just to say,
with one system call, "mmap this entire file containing the table into
my address space," and not have to worry about running out of address
space when you do this on multiple large tables. (And yes, I know this
would actually be "map this 1 GB chunk of this large table" in the
current postgres implementation.)

cjs
-- 
Curt Sampson  [EMAIL PROTECTED]   +81 90 7737 2974   http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light.  --XTC

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster



Re: [HACKERS] Call for objections: put back OIDs in CREATE TABLE

2003-01-23 Thread Curt Sampson
On Thu, 23 Jan 2003, Ross J. Reedstrom wrote:

 So in the longer term, we need to provide a replacement. Arguably, the
 primary key for a table is the right replacement, but we don't _require_
 a pkey, so what to do in cases where this isn't one?

You're stuck. SQL breaks with relational theory in this way; tables
need not have candidate keys, and thus you can have duplicate rows in
a table. (I.e., mathematically, a table is not a set, it's a bag.) The
implications of this start to go on and on, but let's not get into that
here.

 Also, the pkey can be _any_ column(s), of _any_ type, which could be
 inconvenient for returning as the result of an insert, for example
 (imagine a text field as pkey, with a _huge_ block of text just
 written into it ...)

Well, this could be worked around to some extent, with some hackery.
But in the end I reckon it's much easier just to have the object system
force you to declare a specific object-ID column, if that's
what it takes. So long as you've got a candidate key, even if it's not
the primary key, you're fine.

cjs
-- 
Curt Sampson  [EMAIL PROTECTED]   +81 90 7737 2974   http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light.  --XTC

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster



Re: [HACKERS] Options for growth

2003-01-23 Thread Curt Sampson
On Fri, 23 Jan 2003, Hannu Krosing wrote:

 Curt Sampson wrote on Thu, 23.01.2003 at 17:42:
  If the OS can handle the scheduling (which, last I checked, Linux couldn't,

 When did you do your checking ?
 (just curious, not to start a flame war ;)

This was perhaps a year or so ago. IBM had some patches to fix a lot of
the scheduler problems. I wouldn't be surprised if things are in a much
better state now.

Anyway, there are lots of work-arounds. Find the appropriate patches if
the kernel still doesn't have them, run Solaris, whatever

cjs
-- 
Curt Sampson  [EMAIL PROTECTED]   +81 90 7737 2974   http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light.  --XTC

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster



Re: [HACKERS] Call for objections: put back OIDs in CREATE TABLE

2003-01-23 Thread Curt Sampson
On Fri, 23 Jan 2003, Hannu Krosing wrote:

  1. [OIDs are] not a relational concept.
 so are other system tuples (cid, tid, tableoid, ...).

But there's a key difference here; nobody's advertising these others as
any sort of row identifier: i.e., a candidate key. And besides, I wouldn't
object at all to getting rid of these, except that they store essential
system information and I can't figure out how to get rid of them. :-)

 It is an OO concept.

Well, it's not, because we have an OID wrap-around problem, so an OID is
actually not an OID at all, but simply an arbitrary number tacked on to
a row. Other rows, in the same or other tables, can have the same OID.

  2. The OID wraparound problem can get you.
 put an unique index on OID column.

That still doesn't make it a real OID, because you can't guarantee that
two rows in different tables won't have the same OID.
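In other words, the quoted suggestion amounts to something like the
following (the table name here is hypothetical), and it only constrains
one table:

```sql
-- Enforces uniqueness of oid within table t alone; a row in some
-- other table can still carry the very same oid value.
CREATE UNIQUE INDEX t_oid_idx ON t (oid);
```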

  3. Other SQL databases don't do this.
 Ask Date, he'll tell you that SQL is evil, i.e. not relational ;)

I did, he said that, and I agreed with him. :-) So now we have something
that's evil because it's not relational and also evil because it's not
SQL. Double-yuck!

  5. We should default to what gives us better performance, rather than
  worse.
 Not if it breaks anything ;)

I disagree. We have to weigh the cost of the breakage versus the
benefits in each individual circumstance. We've broken plenty of things
before because we felt it was better to do so than maintain backward
compatibility.

Because of its history as a research tool, there's a lot of experimental
stuff in postgres that, in hindsight, we can say didn't work so well.
When we find something that's not working so well, we should at least
consider making some sort of move toward the right thing, rather than
continuing to do the wrong thing forever just for the sake of backwards
compatibility.

Summary: I don't want to hear absolutes like "we should never break
backwards compatibility." I want to hear arguments that the cost of
breaking backwards compatibility is X, and the benefit of the new way of
doing things is Y, and here is why you think X < Y.

cjs
-- 
Curt Sampson  [EMAIL PROTECTED]   +81 90 7737 2974   http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light.  --XTC

---(end of broadcast)---
TIP 6: Have you searched our list archives?

http://archives.postgresql.org



Re: [HACKERS] Options for growth

2003-01-23 Thread Curt Sampson
On Thu, 16 Jan 2003, D'Arcy J.M. Cain wrote:

 Due to the fact that we are growing out of our current system
 (PostgreSQL on PCs) we are looking for ways to expand and one of the
 suggestions has been to toss PostgreSQL in favour of Oracle with
 Remote Access Cluster (RAC) software. The theory is that you can just
 plug machines into the cluster if the database appears to be straining
 and they automagically take over some of the load.
 ...
 My idea is to create a new middleware layer that allows me to split
 things up based on various criteria without changing my application.

It's a basic principle of clustering that doing it in an application-
aware way will always be more efficient than trying to hide it from the
application.

If you've not read it already, I strongly suggest reading _In Search of
Clusters_ by Gregory F. Pfister.

 And finally, if you had your dream machine to run on, what would it
 be? We are also looking outside of PC hardware but we are concerned
 about not having access to that nice, cheap, generic hardware for when
 we need to grow again or for redundant backup.

If you can manage to stick with PC hardware, you are going to save a
*lot* of money. If you're considering buying a reasonably well loaded
Sun E6000 or similar, it's well worth spending twenty or thirty thousand
dollars on a big PC system and spending some time to see if that will do
the trick before you shell out a couple hundred thousand for the Sun.

As for how well postgres uses multiple CPUs: so long as you've got lots
of connections with the load distributed among them, it's dependent on
the OS, not postgres. If the OS can handle the scheduling (which, last I
checked, Linux couldn't, at least not without patches), eight or sixteen
CPUs will be fine.

cjs
-- 
Curt Sampson  [EMAIL PROTECTED]   +81 90 7737 2974   http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light.  --XTC

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]



Re: [PERFORM] [HACKERS] Terrible performance on wide selects

2003-01-23 Thread Curt Sampson
On Thu, 23 Jan 2003, Daniel Kalchev wrote:

 Does this mean, that constructing tables where fixed length fields are
 'before' variable length fields and 'possibly null' fields might increase
 performance?

This, I believe, is why DB2 always puts (in physical storage) all of the
fixed-length fields before the variable-length fields.

cjs
-- 
Curt Sampson  [EMAIL PROTECTED]   +81 90 7737 2974   http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light.  --XTC

---(end of broadcast)---
TIP 6: Have you searched our list archives?

http://archives.postgresql.org



Re: [HACKERS] Improving speed of copy

2002-10-06 Thread Curt Sampson

On Fri, 20 Sep 2002, Shridhar Daithankar wrote:

 On 20 Sep 2002 at 21:22, Shridhar Daithankar wrote:

  Mysql takes 221 sec. v/s 1121 sec. for postgres. For postgresql,
  that is around 11.5K rows per second. Each tuple has 23 fields with
  fixed length of around 100 bytes

Yes, postgres is much slower than MySQL for doing bulk loading of data.
There's not much, short of hacking on the code, that can be done about
this.

 One more issue is time taken for composite index creation. It's 4341
 sec as opposed to 436 sec for mysql. These are three non-unique
 character fields where the combination itself is not unique as well.

Setting sort_mem appropriately makes a big difference here. I generally
bump it up to 2-8 MB for everyone, and when I'm building a big index, I
set it to 32 MB or so just for that session.

But make sure you don't set it so high you drive your system into
swapping, or it will kill your performance. Remember also that in
7.2.x, postgres will actually use almost three times the value you give
sort_mem (i.e., sort_mem of 32 MB will actually allocate close to 96 MB
of memory for the sort).
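For instance, a per-session bump for a big index build might look like
this (a sketch for 7.2-era settings, where sort_mem is measured in
kilobytes; the table and index names are made up):

```sql
SET sort_mem = 32768;                        -- 32 MB, this session only
CREATE INDEX logs_abc_idx ON logs (a, b, c); -- the big sort happens here
RESET sort_mem;                              -- back to the server default
```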

cjs
-- 
Curt Sampson  [EMAIL PROTECTED]   +81 90 7737 2974   http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light.  --XTC


---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/users-lounge/docs/faq.html



Re: [HACKERS] [GENERAL] Performance while loading data and indexing

2002-10-06 Thread Curt Sampson

On 26 Sep 2002, Neil Conway wrote:

 The fact that ext2 defaults to asynchronous mode and UFS (at least on
 the BSDs) defaults to synchronous mode seems like a total non-issue to
 me. Is there any more to the alleged difference in reliability?

It was sort of pointed out here, but perhaps not made completely
clear, that Berkeley FFS defaults to synchronous meta-data updates,
but asynchronous data updates. You can also specify entirely
synchronous or entirely asynchronous updates. Linux ext2fs supports
only these last two modes, which is the problem.

cjs
-- 
Curt Sampson  [EMAIL PROTECTED]   +81 90 7737 2974   http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light.  --XTC


---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/users-lounge/docs/faq.html



Re: [HACKERS] [GENERAL] Large databases, performance

2002-10-06 Thread Curt Sampson

On Thu, 3 Oct 2002, Shridhar Daithankar wrote:

 Well, we were comparing ext3 v/s reiserfs. I don't remember the journalling
 mode of ext3 but we did a 10 GB write test. Besides converting the RAID
 to RAID-0 from RAID-5 might have something to do about it.

That will have a massive, massive effect on performance. Depending on
your RAID subsystem, you can expect RAID-0 to be between two and twenty
times as fast for writes as RAID-5.

If you compared one filesystem on RAID-5 and another on RAID-0,
your results are likely not at all indicative of file system
performance.

Note that I've redirected followups to the pgsql-performance list.
Avoiding cross-posting would be nice, since I am getting lots of
duplicate messages these days.

cjs
-- 
Curt Sampson  [EMAIL PROTECTED]   +81 90 7737 2974   http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light.  --XTC


---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/users-lounge/docs/faq.html



Re: [HACKERS] [GENERAL] Large databases, performance

2002-10-06 Thread Curt Sampson

On Thu, 3 Oct 2002, Shridhar Daithankar wrote:

 Our major concern remains load time as data is generated in real time and is
 expected in the database within a specified time period.

If your time period is long enough, you can do what I do, which is
to use partial indexes so that the portion of the data being loaded
is not indexed. That will speed your loads quite a lot. Afterwards
you can either generate another partial index for the range you
loaded, or generate a new index over both old and new data, and
then drop the old index.

The one trick is that the optimizer is not very smart about combining
multiple indexes, so you often need to split your queries across
the two partitions of the table that have separate indexes.
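A sketch of that load pattern, with hypothetical table and column names,
assuming loads arrive as ranges of a timestamp column:

```sql
-- Existing data is covered by a partial index; the range about to be
-- loaded is deliberately left out, so COPY maintains no index entries.
CREATE INDEX hits_old_idx ON hits (stamp) WHERE stamp < '2002-10-01';

-- ... bulk-load rows with stamp >= '2002-10-01' via COPY ...

-- Afterwards, either index just the newly loaded range:
CREATE INDEX hits_new_idx ON hits (stamp) WHERE stamp >= '2002-10-01';
-- or build one full index over everything and drop the old partial one.
```

Queries then generally need a WHERE clause matching a partial index's
predicate before the planner will use that index.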

 Shall I subscribe to performance?

Yes, you really ought to. The list is [EMAIL PROTECTED]

cjs
-- 
Curt Sampson  [EMAIL PROTECTED]   +81 90 7737 2974   http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light.  --XTC


---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your
message can get through to the mailing list cleanly



Re: [HACKERS] PGXLOG variable worthwhile?

2002-09-26 Thread Curt Sampson

On Thu, 26 Sep 2002, Jan Wieck wrote:

  But still, why set up a situation where your database might not
  start? Why not set it up so that if you get just *one* environment
  or command-line variable right, you can't set another inconsistently
  and screw up your start anyway? Why store configuration information
  outside of the database data directory in a form that's not easily
  backed up, and not easily found by other utilities?

 Apply that argumentation to all of our commandline switches and config
 options and we end up with something that behaves like Microsoft
 products ... they know everything better, you cannot tune them, they
 work ... and you needed a bigger machine anyway.

Talk about a straw man! I have repeatedly said:

I WANT THE FEATURE THAT LETS YOU TUNE THE LOCATION OF THE LOG FILE!

Read it again, and again, until you understand that we both want
that feature.

Then realize, I just want it implemented in a way that makes it
less likely that people will find themselves in a situation where
the server doesn't start.

 I am absolutely not in favour of the PGXLOG environment variable. But if
 someone else wants it, it doesn't bother me because I wouldn't use it
 and it cannot hurt me.

Responsible programmers, when confronted with a more accident-prone
and less accident-prone way of doing something, chose the less
accident-prone way of doing things. That way people who are naive,
or tired, or just having a bad day are less likely to come to harm.

Using the config file is not only safer, it's actually more
convenient.  And since we're going to have the config file option
anyway, removing the environment variable option means that others
have less documentation to read, and will spend less time wondering
why there's two different ways to do the same thing. And naive
people won't chose the wrong way because they don't know any better.

cjs
-- 
Curt Sampson  [EMAIL PROTECTED]   +81 90 7737 2974   http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light.  --XTC


---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your
message can get through to the mailing list cleanly



Re: [HACKERS] PGXLOG variable worthwhile?

2002-09-25 Thread Curt Sampson

On Wed, 25 Sep 2002, Jan Wieck wrote:

 With the number of screws our product has, there are so many
 possible combinations that don't work, why worry about one more
 or less?

That's just silly, so I won't even bother replying.

 Seriously, if you move around files, make symlinks or adjust
 config variable to reflect that, there's always the possibility
 that you fatfinger it and cannot startup.

True. But once your symlink is in place, it is stored on disk in the
postgres data directory. An environment variable is a transient setting
in memory, which means that you have to have a program set it, and you
have to make sure that program gets run before any startup, be it an
automated startup from /etc/rc on boot or a manual startup.

 I want to have it it the config file.

Well, then we're agreed.

cjs
-- 
Curt Sampson  [EMAIL PROTECTED]   +81 90 7737 2974   http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light.  --XTC


---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster



Re: [HACKERS] PGXLOG variable worthwhile?

2002-09-24 Thread Curt Sampson

On Tue, 24 Sep 2002, Jan Wieck wrote:

 And AFAICS it is scary only because screwing that up will simply corrupt
 your database. Thus, a simple random number (okay, and a timestamp of
 initdb) in two files, one in $PGDATA and one in $PGXLOG would be a
 totally sufficient safety mechanism to prevent starting with the wrong
 XLOG directory.

But still, why set up a situation where your database might not
start? Why not set it up so that if you get just *one* environment
or command-line variable right, you can't set another inconsistently
and screw up your start anyway? Why store configuration information
outside of the database data directory in a form that's not easily
backed up, and not easily found by other utilities?

It's almost like people *don't* want to put this in the config file
or something

cjs
-- 
Curt Sampson  [EMAIL PROTECTED]   +81 90 7737 2974   http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light.  --XTC


---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your
message can get through to the mailing list cleanly



Re: [HACKERS] PGXLOG variable worthwhile?

2002-09-23 Thread Curt Sampson

On Sun, 22 Sep 2002, Marc G. Fournier wrote:

 Thomas implemented an option that he felt was useful, and that doesn't
 break anything inside of the code ... he provided 2 methods of being able
 to move the xlog's to another location (through command line and
 environment variable, both of which are standard methods for doing such in
 server software) ... but, because a small number of ppl voted that it
 should go away, it went away ...

The option as he implemented it did make the system more fragile.
You can't back up an environment variable, it's separated from other
configuration information, and it's more easily changed without
realizing it. We should be building systems that are as resilient to
human failure as possible, not opening up more possibilities of failure.

We already have a place for configuration information: the configuration
file. If I created a patch to move a variable out of the configuration
file and make it an environment variable instead, everybody would
(rightly) think I was nuts, and the patch certainly would not be
accepted. So why should the situation be different for new configuration
information?

cjs
-- 
Curt Sampson  [EMAIL PROTECTED]   +81 90 7737 2974   http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light.  --XTC


---(end of broadcast)---
TIP 6: Have you searched our list archives?

http://archives.postgresql.org



Re: [HACKERS] PGXLOG variable worthwhile?

2002-09-21 Thread Curt Sampson

On Fri, 20 Sep 2002, Thomas Lockhart wrote:

 Well, what I was hoping for, but no longer expect, is that features
 (store xlog in another area) can be implemented and applied without
 rejection by the new gatekeepers.

It can be, and very simply. So long as you do it in the way which
is not error-prone, rather than the way which is.

 I have no fundamental objection to extending and replacing
 implementation features as positive contributions to development. I do
 have trouble with folks rejecting features without understanding the
 issues, and sorry, there was a strong thread of why would anyone want
 to put storage on another device to the discussion.

I doubt it. There was perhaps a strong thread of "windows users
are losers," but certainly Unix folks put storage on another device
all the time, using symlinks. This was mentioned many, many times.

 There has been a fundamental shift in the quality and civility of
 discussions over issues over the last couple of years, and I was naively
 hoping that we could work through that on this topic. Not happening, and
 not likely too.

Well, when you're going to bring in Windows in a pretty heavily
open-source-oriented group, no, it's not likely you're going to bring
everyone together. (This is not a value judgement; it's just a "Hello,
this is the usenet" (or something similar) observation.)
That said, again, I don't think anybody was objecting to what you
wanted to do. It was simply a bad implementation that I, and probably
all the others, were objecting to. So please don't go on like we didn't
like the concept.

cjs
-- 
Curt Sampson  [EMAIL PROTECTED]   +81 90 7737 2974   http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light.  --XTC


---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])



Re: [HACKERS] PGXLOG variable worthwhile?

2002-09-20 Thread Curt Sampson

On Thu, 19 Sep 2002, Thomas Lockhart wrote:

 Actually, a core member did implement this just a few weeks ago. The
 same crew arguing this time rejected the changes and removed them from
 the 7.3 feature set.

The change to make a PG_XLOG environment variable was rejected. Is that
really the change you were talking about?

 So some folks have their heels dug in, and the vocal ones are not really
 interested in understanding the issues which this feature is addressing.

I was one of the vocal objectors, and I certainly understand the
issues very well. Perhaps we should be saying the vocal supporters
of the environment variable don't understand the issues.

None of the objectors I saw have any problem with enabling Windows NT to
have the log file somewhere else. In fact, I'm very strongly in support
of this. But I object to doing it in a way that makes the system more
fragile and susceptible to not starting properly, or even damage, when
there's a simple and obvious way of doing it right: put this in the
database configuration file rather than in an environment variable.

Why you object to that, and insist it must be an environment variable
instead (if that is indeed what you're doing), I'm not sure

cjs
-- 
Curt Sampson  [EMAIL PROTECTED]   +81 90 7737 2974   http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light.  --XTC


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]



Re: [HACKERS] PGXLOG variable worthwhile?

2002-09-12 Thread Curt Sampson

On Thu, 12 Sep 2002, Justin Clift wrote:

 Am just wondering if we've ever considered adding a PGXLOG environment
 variable that would point to the pg_xlog directory?

IMHO, a much better way to support this is to put this information into
the config file. That way it can't easily change when you happen to, say,
start postgres in the wrong window.

cjs
-- 
Curt Sampson  [EMAIL PROTECTED]   +81 90 7737 2974   http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light.  --XTC


---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster



Re: [HACKERS] Script to compute random page cost

2002-09-11 Thread Curt Sampson

On Wed, 11 Sep 2002, Mark Kirkwood wrote:

 Yes...and at the risk of being accused of marketing ;-) , that is
 exactly what the 3 programs in my archive do (see previous post for url) :

Hm, it appears we've both been working on something similar. However,
I've just released version 0.2 of randread, which has the following
features:

Written in C, uses read(2) and write(2), pretty much like postgres.

Reads or writes random blocks from a specified list of files,
treated as a contiguous range of blocks, again like postgres. This
allows you to do random reads from the actual postgres data files
for a table, if you like.

You can specify the block size to use, and the number of reads to do.

Allows you to specify how many blocks you want to read before you
start reading again at a new random location. (The default is 1.)
This allows you to model various sequential and random read mixes.

If you want to do writes, I suggest you create your own set of files to
write, rather than destroying postgresql data. This can easily be done
with something like this Bourne shell script:

for i in 1 2 3 4; do
dd if=/dev/zero of=file.$i bs=1m count=1024
done

However, it doesn't calculate the random vs. sequential ratio for you;
you've got to do that for yourself. E.g.,:

$ ./randread -l 512 -c 256 /u/cjs/z?
256 reads of 512 x 8.00 KB blocks (4096.00 KB)
  totalling 131072 blocks (1024.00 MB)
  from 524288 blocks (4092.00 MB) in 4 files.
256 reads in 36.101119 sec. (141019 usec/read, 7 reads/sec, 29045.53 KB/sec)

$ ./randread -c 4096 /u/cjs/z?
4096 reads of 1 x 8.00 KB blocks (8.00 KB)
  totalling 4096 blocks (32.00 MB)
  from 524288 blocks (4095.99 MB) in 4 files.
4096 reads in 34.274582 sec. (8367 usec/read, 120 reads/sec, 956.04 KB/sec)

In this case, across 4 GB in 4 files on my 512 MB, 1.5 GHz Athlon
with an IBM 7200 RPM IDE drive, I read about 30 times faster doing
a full sequential read of the files than I do reading 32 MB randomly
from it.  But because of the size of this, there's basically no
buffer cache involved. If I do this on a single 512 MB file:

$ ./randread -c 4096 /u/cjs/z1:0-65536
4096 reads of 1 x 8.00 KB blocks (8.00 KB)
  totalling 4096 blocks (32.00 MB)
  from 65536 blocks (511.99 MB) in 1 files.
4096 reads in 28.064573 sec. (6851 usec/read, 146 reads/sec, 1167.59 KB/sec)

$ ./randread -l 65535 -c 1 /u/cjs/z1:0-65536
1 reads of 65535 x 8.00 KB blocks (524280.00 KB)
  totalling 65535 blocks (511.99 MB)
  from 65536 blocks (0.01 MB) in 1 files.
1 reads in 17.107867 sec. (17107867 usec/read, 0 reads/sec, 30645.55 KB/sec)

$ ./randread -c 4096 /u/cjs/z1:0-65536
4096 reads of 1 x 8.00 KB blocks (8.00 KB)
  totalling 4096 blocks (32.00 MB)
  from 65536 blocks (511.99 MB) in 1 files.
4096 reads in 19.413738 sec. (4739 usec/read, 215 reads/sec, 1687.88 KB/sec)

Well, there you see some of the buffer cache effect from starting
with about half the file in memory. If you want to see serious buffer
cache action, just use the first 128 MB of my first test file:

$ ./randread -c 4096 /u/cjs/z1:0-16536
4096 reads of 1 x 8.00 KB blocks (8.00 KB)
  totalling 4096 blocks (32.00 MB)
  from 16536 blocks (129.18 MB) in 1 files.
4096 reads in 20.220791 sec. (4936 usec/read, 204 reads/sec, 1620.51 KB/sec)

$ ./randread -l 16535 -c 1 /u/cjs/z1:0-16536
1 reads of 16535 x 8.00 KB blocks (132280.00 KB)
  totalling 16535 blocks (129.18 MB)
  from 16536 blocks (0.01 MB) in 1 files.
1 reads in 3.469231 sec. (3469231 usec/read, 0 reads/sec, 38129.49 KB/sec)

$  ./randread -l 16535 -c 64 /u/cjs/z1:0-16536
64 reads of 16535 x 8.00 KB blocks (132280.00 KB)
  totalling 1058240 blocks (8267.50 MB)
  from 16536 blocks (0.01 MB) in 1 files.
64 reads in 23.643026 sec. (369422 usec/read, 2 reads/sec, 358072.59 KB/sec)

For those last three, we're basically limited completely by the
CPU, as there's not much disk I/O going on at all. The many-block
one is going to be slower because it's got to generate a lot more
random numbers and do a lot more lseek operations.

Anyway, looking at the real difference between truly sequential
and truly random reads on a large data file (30:1 or so),
it looks to me that people getting much less than that are getting
good work out of their buffer cache. You've got to wonder if there's
some way to auto-tune for this sort of thing.

Anyway, feel free to download and play. If you want to work on the
program, I'm happy to give developer access on sourceforge.

http://sourceforge.net/project/showfiles.php?group_id=55994

cjs
-- 
Curt Sampson  [EMAIL PROTECTED]   +81 90 7737 2974   http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light.  --XTC





Re: [HACKERS] Script to compute random page cost

2002-09-10 Thread Curt Sampson

On Tue, 10 Sep 2002, Bruce Momjian wrote:

 Interesting that random time is increasing, while the others were
 stable.  I think this may have to do with other system activity at the
 time of the test.

Actually, the random versus sequential times may also differ
depending on how many processes are competing for disk access.
If the OS isn't maintaining readahead for whatever reason,
sequential access could, in theory, degrade to the same speed
as random access. It might be interesting to test this, too.
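A quick way to try that (sketch only; the helper and approach are mine): time a sequential pass by itself, then time it again while a random reader competes for the same disk:

```shell
# Sketch: compare a sequential read of FILE alone against the same
# read while a competing random reader runs.  seq_vs_contended is a
# made-up helper; on a cold cache the second number should be worse
# if contention defeats readahead.
seq_vs_contended() {
    file=$1
    blocks=$(( $(wc -c < "$file") / 8192 ))
    t0=$(date +%s)
    dd if="$file" bs=8192 of=/dev/null 2>/dev/null
    echo "sequential alone: $(( $(date +%s) - t0 )) sec"
    # competing random reader, in the background
    while :; do
        r=$(od -An -N2 -tu2 /dev/urandom | tr -d ' ')
        dd if="$file" bs=8192 skip=$(( r % blocks )) count=1 \
           of=/dev/null 2>/dev/null
    done >/dev/null 2>&1 &
    pid=$!
    t0=$(date +%s)
    dd if="$file" bs=8192 of=/dev/null 2>/dev/null
    echo "sequential with random competitor: $(( $(date +%s) - t0 )) sec"
    kill "$pid" 2>/dev/null
}
```

Note that with a file small enough to fit in the buffer cache, both passes will come back instantly; you need a file much larger than RAM to see anything real.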

cjs
-- 
Curt Sampson  [EMAIL PROTECTED]   +81 90 7737 2974   http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light.  --XTC





Re: [HACKERS] Optimization levels when compiling PostgreSQL...

2002-09-10 Thread Curt Sampson

On Wed, 11 Sep 2002, Peter Eisentraut wrote:

 I disagree.  Choosing the compiler options is exactly the job of the
 installer, packager, or distributor.

If there is one, yes.

 I don't think we're doing anyone a service if we spread wild speculations
 about how risky certain compiler options are.  If your compiler creates
 broken code, don't use it.  Packagers are expected to know about their
 compiler.  If they create broken packages and behave irresponsibly about
 it they won't be making packages much longer.

However, many users are not as knowledgeable as packagers, but may
still be compiling from source. For those people, I don't think it's
unreasonable to say, "Use -O2 unless you know what you are doing."

(I'm not sure we're actually disagreeing here, but I just wanted to make
this point clear.)

cjs
-- 
Curt Sampson  [EMAIL PROTECTED]   +81 90 7737 2974   http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light.  --XTC





Re: [HACKERS] Script to compute random page cost

2002-09-10 Thread Curt Sampson

On Tue, 10 Sep 2002, Tom Lane wrote:

 Curt Sampson [EMAIL PROTECTED] writes:
  Well, for the sequential reads, the readahead should be triggered
  even when reading from a raw device.

 That strikes me as an unportable assumption.

Not only unportable, but false. :-) NetBSD, at least, does read-ahead
only through the buffer cache. Thinking about it, you *can't* do
read-ahead on a raw device, because you're not buffering. Doh!

 Perhaps it's time to remind people that what we want to measure
 is the performance seen by a C program issuing write() and read()
 commands, transferring 8K at a time, on a regular Unix filesystem.

Right. Which is what randread does, if you give it a file rather
than a raw device. I'm actually just now working on some modifications
for it that will let you work against a bunch of files, rather than
just one, so it will very accurately emulate a postgres random read
of blocks from a table.

There are two other tricky things related to the behaviour, however:

1. The buffer cache. You really need to be working against your
entire database, not just a few gigabytes of its data, or sample
data.

2. Multiple users. You really want a mix of simultaneous accesses
going on, with as many processes as you normally have users querying
the database.

These can probably both be taken care of with shell scripts, though.
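For the multiple-users half, a minimal sketch of such a shell-script wrapper (the helper name is mine; substitute randread or whatever workload you like):

```shell
# Sketch: run N copies of a command concurrently and wait for all of
# them, e.g.
#   parallel_runs 8 ./randread -c 4096 /u/cjs/z1:0-65536
# parallel_runs is a made-up helper name.
parallel_runs() {
    n=$1; shift
    i=0
    while [ "$i" -lt "$n" ]; do
        "$@" &
        i=$(( i + 1 ))
    done
    wait
    echo "finished $n concurrent runs of: $*"
}
```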

cjs
-- 
Curt Sampson  [EMAIL PROTECTED]   +81 90 7737 2974   http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light.  --XTC





Re: [JDBC] [HACKERS] problem with new autocommit config parameter

2002-09-10 Thread Curt Sampson

On Tue, 10 Sep 2002, Barry Lind wrote:

 I am waiting for this thread to conclude before deciding exactly what to
 do for the jdbc driver for 7.3.  While using the 'set autocommit true'
 syntax is nice when talking to a 7.3 server, the jdbc driver also needs
 to be backwardly compatible with 7.2 and 7.1 servers.

Can you not check the server's version on connect?

It would be ideal if the JDBC driver, without modification, ran
all tests properly against 7.3, 7.2 and 7.1.

cjs
-- 
Curt Sampson  [EMAIL PROTECTED]   +81 90 7737 2974   http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light.  --XTC





Re: [JDBC] [HACKERS] problem with new autocommit config parameter

2002-09-10 Thread Curt Sampson

On Tue, 10 Sep 2002, Bruce Momjian wrote:

 Do we want to say "With autocommit off, SET will be in its own
 transaction if it appears before any non-SET command, and SETs are
 rolled back except if autocommit is off and they appear before any
 non-SET"?

Not really, I don't think.

But I'm starting to wonder if we should re-think all SET commands being
rolled back if a transaction fails. Some don't seem to make sense, such
as having SET AUTOCOMMIT or SET SESSION AUTHORIZATION roll back.

cjs
-- 
Curt Sampson  [EMAIL PROTECTED]   +81 90 7737 2974   http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light.  --XTC





Re: [HACKERS] Optimization levels when compiling PostgreSQL...

2002-09-10 Thread Curt Sampson

On Tue, 10 Sep 2002, Lamar Owen wrote:

 I still remember when the Alpha port _required_ -O0.  And it was documented
 that way, IIRC.

Good. It would also be very nice if, in situations like this, the
configure script could detect this and use -O0 when compiling on
the Alpha.

 Compiling from source implies certain knowledge.

No it doesn't. All it means is that someone's using a system for
which they don't have a package handy.

cjs
-- 
Curt Sampson  [EMAIL PROTECTED]   +81 90 7737 2974   http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light.  --XTC





Re: [JDBC] [HACKERS] problem with new autocommit config parameter

2002-09-10 Thread Curt Sampson

On Tue, 10 Sep 2002, Barry Lind wrote:

 Yes I can check the server version on connect.  In fact that is what the
   driver already does.  However I can't check the version and then based
 on the version call set autocommit true in one round trip to the server.
   Since many people don't use connection pools, I am reluctant to add
 the overhead of an extra roundtrip to the database to set a variable
 that for most people will already be set to true.  It would be ideal if
 I could in one hit to the database determine the server version and
 conditionally call set autocommit based on the version at the same time.

Hmm. I don't think that there's any real way to avoid a second round
trip now, but one thing we might do with 7.3 would be to add a standard
stored procedure that will deal with setting appropriate variables and
suchlike, and returning the version number and any other information
that the JDBC driver needs. (Maybe it can return a key/value table.)
That way, once we desupport 7.2 in the far future, we can reduce this to
one round trip.

Or perhaps we could try to execute that stored procedure and, if it
fails, create it. (Or, if creating it fails, do things the hard way.) That
way the first connection you make where the SP is not there has the
overhead of adding it, but all connections after that can use it. (I assume
you'd grant all rights on it to the general public.) And it could return
its own version so that newer drivers could upgrade it if necessary. Or
maybe just have a differently-named one for each version of the driver.
This is a bit kludgy, but also sort of elegant, if you think about it.
On the other hand, perhaps we should just live with two round trips. So
long as we've got command batching at some point, we can get the version,
and then send all the setup commands we need as a single batch after that.

cjs
-- 
Curt Sampson  [EMAIL PROTECTED]   +81 90 7737 2974   http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light.  --XTC





Re: [HACKERS] Script to compute random page cost

2002-09-09 Thread Curt Sampson

On Mon, 9 Sep 2002, Bruce Momjian wrote:

 What do other people get for this value?

With your new script, with a 1.5 GHz Athlon, 512 MB RAM, and a nice fast
IBM 7200 RPM IDE disk, I get random_page_cost = 0.93.

 One flaw in this test is that it randomly reads blocks from different
 files rather than randomly reading from the same file.  Do people have a
 suggestion on how to correct this?  Does it matter?

From my quick glance, it also does a lot of work to read each
block, including forking off several other programs. This would tend to
push up the cost of a random read. You might want to look at modifying
the randread program (http://randread.sourceforge.net) to do what you
want.

cjs
-- 
Curt Sampson  [EMAIL PROTECTED]   +81 90 7737 2974   http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light.  --XTC





Re: [HACKERS] problem with new autocommit config parameter and jdbc

2002-09-09 Thread Curt Sampson

On Mon, 9 Sep 2002, Tom Lane wrote:

 snpe [EMAIL PROTECTED] writes:

  snpe select * from org_ba;
  ERROR: relation org_ba does not exist
  snpe select * from org_ban;
  ERROR: current transaction is aborted, queries ignored until end of
  transaction block

 Um, what's wrong with that?

 It seems to me that an application that is using autocommit=off will
 expect the first SELECT to start a transaction block.

Yup. In fact, the standard (at least, insofar as I have information
relating to it) specifies that the first SELECT statement above
*must* start a transaction.

From Date's _A Guide to the SQL Standard_ (Fourth Edition):

An SQL-transaction is initiated when the relevant SQL-agent executes
a transaction-initiating SQL Statement (see below) and the
SQL-agent does not already have an SQL-transaction in progress.
...
The following SQL statements are _not_ transaction-initiating:

CONNECT
SET CONNECTION
DISCONNECT
SET SESSION AUTHORIZATION
SET CATALOG
SET SCHEMA
SET NAMES
SET TIME ZONE
SET TRANSACTION
SET CONSTRAINTS
COMMIT
ROLLBACK
GET DIAGNOSTICS

Nor, of course, are the nonexecutable statements DECLARE CURSOR,
DECLARE LOCAL TEMPORARY TABLE, BEGIN DECLARE SECTION, END DECLARE
SECTION, and WHENEVER.

So SELECT ought always to initiate a transaction, if one is not already
in progress. If auto-commit is enabled, of course, that statement may
be committed immediately after execution, if it doesn't fail.

As far as the JDBC driver goes, I'm not too sure of the issues here, but
it should certainly be ensuring that autocommit is enabled, as per the
JDBC specification, when a new connection is created. I see no reason
this couldn't be done with a "SET AUTOCOMMIT TO ON" or whatever, if
that's necessary to override a possible configuration file setting.

cjs
-- 
Curt Sampson  [EMAIL PROTECTED]   +81 90 7737 2974   http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light.  --XTC





Re: [HACKERS] Script to compute random page cost

2002-09-09 Thread Curt Sampson

On Mon, 9 Sep 2002, Tom Lane wrote:

 Finally, I wouldn't believe the results for a moment if they were taken
 against databases that are not several times the size of physical RAM
 on the test machine, with a total I/O volume also much more than
 physical RAM.  We are trying to measure the behavior when kernel
 caching is not helpful; if the database fits in RAM then you are just
 naturally going to get random_page_cost close to 1, because the kernel
 will avoid doing any I/O at all.

Um...yeah; another reason to use randread against a raw disk device.
(A little hard to use on linux systems, I bet, but works fine on
BSD systems.)

cjs
-- 
Curt Sampson  [EMAIL PROTECTED]   +81 90 7737 2974   http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light.  --XTC





Re: [HACKERS] Optimization levels when compiling PostgreSQL...

2002-09-09 Thread Curt Sampson

On Mon, 9 Sep 2002, Sean Chittenden wrote:

 I'm thinking about changing this from a beta port to a -devel port
 that I'll periodically update with snapshots.  I'll turn on -O6 for
 the -devel port and -O2 for production for now.  If I don't hear of
 any random bogons in the code I'll see if I can't increase it further
 to -O3 and beyond at a slow/incremental rate.

Keep in mind that, while gcc is pretty stable for i386, the higher
optimization levels (above -O2) do tend to have bogons on other
processors, varying with which version of gcc you're running.

cjs
-- 
Curt Sampson  [EMAIL PROTECTED]   +81 90 7737 2974   http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light.  --XTC





Re: [JDBC] [HACKERS] problem with new autocommit config parameter

2002-09-09 Thread Curt Sampson

On Mon, 9 Sep 2002, Tom Lane wrote:

 If autocommit=off really seriously breaks JDBC then I don't think a
 simple SET command at the start of a session is going to do that much
 to improve robustness.  What if the user issues another SET to turn it
 on?

You mean, to turn it off again? The driver should catch this, in theory.

In practice we could probably live with saying, "Don't use SET
AUTOCOMMIT; use the methods on the Connection class instead."

Probably the driver should be changed for 7.3 just to use the server's
SET AUTOCOMMIT functionality.

cjs
-- 
Curt Sampson  [EMAIL PROTECTED]   +81 90 7737 2974   http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light.  --XTC





Re: [HACKERS] Script to compute random page cost

2002-09-09 Thread Curt Sampson

On Mon, 9 Sep 2002, Tom Lane wrote:

 Curt Sampson [EMAIL PROTECTED] writes:
  On Mon, 9 Sep 2002, Tom Lane wrote:
  ...  We are trying to measure the behavior when kernel
  caching is not helpful; if the database fits in RAM then you are just
  naturally going to get random_page_cost close to 1, because the kernel
  will avoid doing any I/O at all.

  Um...yeah; another reason to use randread against a raw disk device.
  (A little hard to use on linux systems, I bet, but works fine on
  BSD systems.)

 Umm... not really; surely randread wouldn't know anything about
 read-ahead logic?

Randread doesn't know anything about read-ahead logic, but I don't
see how that matters one way or the other. The chances of it reading
blocks sequentially are pretty much infinitesimal if you're reading
across a reasonably large area of disk (I recommend at least 4 GB;
at 8 KB per block that's over 500,000 blocks, so the odds of any read
landing right after the previous one are about one in 500,000), so
readahead will never be triggered.

 The reason this is a difficult topic is that we are trying to measure
 certain kernel behaviors --- namely readahead for sequential reads ---
 and not others --- namely caching, because we have other parameters
 of the cost models that purport to deal with that.

Well, for the sequential reads, the readahead should be triggered
even when reading from a raw device. So just use dd to measure
that. If you want to model postgres' behaviour slightly more
accurately, you probably want to pick a random area of the disk,
read a gigabyte, switch areas, read another gigabyte, and so on.
This will model the way tables are split into 1 GB files.
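A rough sketch of that suggestion (the helper name is mine; the chunk size is a parameter so you can scale it down for testing, and note that wc -c won't report a size for a raw device, so point this at a large file):

```shell
# Sketch: read CHUNKS chunks of CHUNK_MB megabytes each, each chunk
# starting at a random 8 KB block of FILE.  Use CHUNK_MB=1024 to
# model postgres' 1 GB segment files.  chunked_seq_read is made up.
chunked_seq_read() {
    file=$1 chunk_mb=$2 chunks=$3
    blocks=$(( $(wc -c < "$file") / 8192 ))
    n=0
    while [ "$n" -lt "$chunks" ]; do
        r=$(od -An -N4 -tu4 /dev/urandom | tr -d ' ')
        # 128 8 KB blocks per megabyte; dd just stops early if the
        # random start lands too close to the end of the file
        dd if="$file" bs=8192 skip=$(( r % blocks )) \
           count=$(( chunk_mb * 128 )) of=/dev/null 2>/dev/null
        n=$(( n + 1 ))
    done
    echo "read $chunks chunk(s) of $chunk_mb MB from $file"
}

# e.g.: chunked_seq_read /u/cjs/z1 1024 8
```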

cjs
-- 
Curt Sampson  [EMAIL PROTECTED]   +81 90 7737 2974   http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light.  --XTC





Re: [HACKERS] Inheritance

2002-09-06 Thread Curt Sampson

On 6 Sep 2002, Hannu Krosing wrote:

  In most object-oriented languages (Eiffel being a notable exception, IIRC),
  you can't specify constraints on objects. But in a relational database,
  you can specify constraints on tables, and it should *never* *ever* be
  possible to violate those constraints, or the constraints are pointless.

 That's not how real world (which data is supposed to model) operates ;)

Sure it is. Please don't blame the language for being wrong when you
incorrectly model things for your purposes. To choose a much simpler
and more obvious example: if you stored birthdate as a date only, and
someone complained that you're not born all day, but at a particular
time on that day, you don't blame the language for having the date type
not store the time of day. You fix your problem to use both a date and a
time to store that value.

If the language specifies that constraints on tables are not to be
violated, then don't use those constraints when you don't want them.

 To elaborate on Greg's example: if you have a table GOODS and under it a
 table CAMPAIGN_GOODS, then you may place a general overridable constraint
 valid_prices on GOODS which checks that you don't sell cheaper than you
 bought, but you still want to sell CAMPAIGN_GOODS under the acquiring
 price, so you override the constraint for CAMPAIGN_GOODS.

This looks like a classic case of incorrect modelling to me. Does the
good itself change when it becomes a campaign_good? No. The price
changes, but that's obviously not an integral part of the good itself.
So separate your price information from your good information, and then
you can do things like have campaign prices, multiple prices per good
(since you probably want to keep the original price information as
well), and so on.

I'm really getting the feeling a lot of these applications that
want table inheritance want it just to be different, not because
it provides anything useful.

I am completely committed to object-oriented programming, and use
inheritance heavily, so it's not that I don't understand or like the
concepts. But just because a concept works well in one type of use does
not mean it will do any good, or even not do harm, when brought into a
completely different world.

 SQL standard constraints should be non-overridable. I still think that
 Constraint triggers should be overridable/dynamic.

I still don't like it. Eiffel had good reasons for making the
constraints non-overridable. Other OO languages don't have constraints,
or they would probably do the same.

That said, I could live with dynamic dispatch, if the default were
to make it non-dynamic, and you had to add a special flag to make it
dynamic. That way it would be obvious to the casual user or a DBA
familiar with other databases but not postgres that something unusual is
going on.

cjs
-- 
Curt Sampson  [EMAIL PROTECTED]   +81 90 7737 2974   http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light.  --XTC




