Re: [HACKERS] Inconsistent behavior on Array Is Null?
Greg Stark [EMAIL PROTECTED] writes:
> array_lower() and array_upper() are returning NULL for a non-null input, the empty array, even though lower and upper bounds are known just as well as they are for any other sized array. They are behaving as if there's something unknown about the empty array that makes it hard to provide a lower bound or upper bound.

Hm. My idea of the index lower bound is the smallest index for which there is an array member --- so I agree with Joe that it's not very well defined for an empty array. We could pick some arbitrary value, such as zero, for the LB and UB to be reported for an empty array, but I can't see any way that you could justify them except as arbitrary choices.

I think that maybe we ought to question these two properties:
* empty array is different from NULL ... really? Why?
* storing a value into an element of a NULL array yields a NULL array instead of a singleton array.

IIRC the latter is presently true because we couldn't figure out just what dimensionality to assign, but it might be easier to agree on that than to resolve these other arguments...

regards, tom lane

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] pre-loading a user table.
Why do you think it useful to preload something during InitPostgres, anyway? Any heavily used table will certainly be present in shared buffers already, and even more surely present in kernel buffers. And if you really want it preloaded you can issue dummy selects with a client right after startup. I really think locking a table into memory is a worthless feature if a good buffer manager is at work.

Andreas
Re: [HACKERS] Inconsistent behavior on Array Is Null?
Tom Lane [EMAIL PROTECTED] writes:
> Hm. My idea of the index lower bound is the smallest index for which there is an array member --- so I agree with Joe that it's not very well defined for an empty array.

Hm. The problem is that they have other implications. Like the upper bound is one less than the index an element added to the upper end will get, and similarly the lower bound is one more than the index that would be assigned to an element added on the low end. Currently there is a lower bound and upper bound in the implementation even for empty arrays. I have empty arrays in my table that have a lower bound of 0, and they behave slightly differently than arrays with lower bounds of 1.

> I think that maybe we ought to question these two properties:
> * empty array is different from NULL ... really? Why?
> * storing a value into an element of a NULL array yields a NULL array instead of a singleton array.

Well that breaks other things. Then lots of functions have to become non-strict to work properly because they should have valid output when passed null values. Ick.

I'm leaning towards suggesting that postgres should follow sql-99 here and normalize all array indexes to have a lower bound of 1. Then array_lower and array_upper become entirely unnecessary. Instead we just have array_length which is exactly equivalent to my idea of array_upper.

-- greg
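[Editorial aside] The bound semantics Greg describes can be restated as a toy Python model. This is purely illustrative; the class and method names are invented, not PostgreSQL code:

```python
# Toy model of the semantics above: appending assigns index upper+1,
# prepending assigns index lower-1, and length == upper - lower + 1.

class PgArray:
    def __init__(self, elems, lower=1):
        self.elems = list(elems)
        self._lower = lower

    def lower(self):
        return self._lower

    def upper(self):
        return self._lower + len(self.elems) - 1

    def append(self, x):
        # the new element lands at index upper() + 1
        self.elems.append(x)

    def prepend(self, x):
        # the new element lands at index lower() - 1
        self.elems.insert(0, x)
        self._lower -= 1

a = PgArray([10, 20], lower=1)   # indexes 1..2
a.append(30)                     # now indexes 1..3
a.prepend(5)                     # now indexes 0..3, like Greg's lower-bound-0 arrays
```

Note that in this model an empty array still reports bounds (upper comes out one less than lower), which is exactly the irregularity being debated: PostgreSQL instead returns NULL for both.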
[HACKERS] GiST future
Hi everyone. I've played a bit with contrib/intarray and very much liked being able to index a whole array. Now I wonder why 7.4 array queries using SOME/ANY/ALL don't make use of this kind of indexes. Is there any plan to integrate Oleg's work into the core as GiST indexes were? It would be nice to generalize things like SELECT col1 from table where col2 contains (select col3 from table2 where ...) and use those indexes. Or did I miss something in the docs? What does the core team think?

Best regards
-- 
Olivier PRENANT Tel: +33-5-61-50-97-00 (Work) 6, Chemin d'Harraud Turrou +33-5-61-50-97-01 (Fax) 31190 AUTERIVE +33-6-07-63-80-64 (GSM) FRANCE Email: [EMAIL PROTECTED]
-- Make your life a dream, make your dream a reality. (St Exupery)
Re: [HACKERS] Inconsistent behavior on Array Is Null?
Greg Stark [EMAIL PROTECTED] writes:
> Well that breaks other things. Then lots of functions have to become non-strict to work properly because they should have valid output when passed null values. Ick. ... I'm leaning towards suggesting that postgres should follow sql-99 here and normalize all array indexes to have a lower bound of 1.

That would break even more things, no? On the other hand, it'd get rid of the problem that we presently face with dump/restore of arrays that don't have lower bound 1. Because pg_dump doesn't do anything to mark such values, they'll end up with lower bound 1 after reload anyway. The fact that we haven't heard lots of squawks about that suggests to me that not many people are using such arrays at present ...

regards, tom lane
Re: [HACKERS] Inconsistent behavior on Array Is Null?
Greg Stark wrote:
> array_lower() and array_upper() are returning NULL for a non-null input, the empty array, even though lower and upper bounds are known just as well as they are for any other sized array. They are behaving as if there's something unknown about the empty array that makes it hard to provide a lower bound or upper bound.

Sorry, but I still disagree. There *is* something unknown about the lower and upper bound of an empty array because there are no bounds.

> I know it's possible to work around this special case, but I'm saying it's odd to have an irregularity in the interface. What justification is there for breaking the invariant length = upper-lower+1 ?

I don't see the spec-defined CARDINALITY as a workaround. It defines length as the number of elements in the array. When the array is empty, that value is clearly 0. Nothing strange about it.

> Yes I read the examples you gave, but you a) had to work around the nit with a special case in your function and b) still have corner cases where one of the invariants I named fails, namely:
>
> test=# select array_upper(a||b, 1), array_upper(a,1)+array_length(b)
>        from (select '{}'::int[] as a, array[1,2] as b) as x;
>  array_upper | ?column?
> -------------+----------
>            2 |
> (1 row)

OK, you got me with this corner case. But using what you described as the result int_aggregate would give you in this case (-1), you would get an even stranger answer (-1 + 2 = 1) that would still need to be worked around.

Joe
Re: [HACKERS] Inconsistent behavior on Array Is Null?
Tom Lane wrote:
> I think that maybe we ought to question these two properties:
> * empty array is different from NULL ... really? Why?

I think this makes sense, similar to the difference between '' and NULL.

> * storing a value into an element of a NULL array yields a NULL array instead of a singleton array.

Same argument. If you think of text as an array of characters, you get this analogy (sort of):

regression=# create table s1(f1 int, f2 text);
CREATE TABLE
regression=# insert into s1 values(1, null);
INSERT 164679 1
regression=# select f1, substr(f2, 1, 1) is null from s1;
 f1 | ?column?
----+----------
  1 | t
(1 row)

regression=# update s1 set f2 = 'a' || substr(f2, 2);
UPDATE 1
regression=# select f1, substr(f2, 1, 1) is null from s1;
 f1 | ?column?
----+----------
  1 | t
(1 row)

Joe
Re: [HACKERS] Problems Vacuum'ing
I had written:
> [snip] The problem is that attempts to vacuum these tables resulted in "NNN dead row versions cannot be removed yet." Went through a lot of analysis (e.g.: Any hanging txns?) and trying different things with folks on the #PostgreSQL IRC channel, all to no avail. [snip]

Okay, the mystery is *partially* solved. In IRC, when this was brought up again this morning due to my post to -hackers, two things happened: I did a vacuumdb on one of the tables and, much to my surprise, what wouldn't vacuum before I left work last nite *did* work this morning. And... JanniCash discovered this in utils/time/tqual.c (I believe):

 * OldestXmin is a cutoff XID (obtained from GetOldestXmin()). Tuples
 * deleted by XIDs >= OldestXmin are deemed "recently dead"; they might
 * still be visible to some open transaction, so we can't remove them,
 * even if we see that the deleting transaction has committed.

So the next thing I did was run a bunch of updates, quit the script, then ran a while loop from the (Unix) command-line, trying to vacuum the one table every 30 seconds. To summarize...

Fri Apr 2 08:54:54 EST 2004
INFO: ethers: found 0 removable, 1834 nonremovable row versions in 93 pages
DETAIL: 1466 dead row versions cannot be removed yet.
...
Fri Apr 2 08:58:56 EST 2004
INFO: ethers: found 1466 removable, 368 nonremovable row versions in 93 pages
DETAIL: 0 dead row versions cannot be removed yet.

Which is all well-and-good (tho, my ignorance readily conceded, four minutes seems a mite... long), *except*: If I shut down the WebObjects application which, again, never accesses the db in question, much less any of its tables, this time-out doesn't seem to apply. (I tried it.) Any explanation for this behaviour?
Thanks, Jim
-- 
Jim Seymour | PGP Public Key available at: [EMAIL PROTECTED] | http://www.uk.pgp.net/pgpnet/pks-commands.html http://jimsun.LinxNet.com
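[Editorial aside] The tqual.c rule quoted in Jim's message can be restated as a small Python predicate. Hypothetical names only, not the PostgreSQL source:

```python
# Toy restatement of the rule: a deleted tuple is reclaimable only if
# its deleting transaction committed AND its XID is older than
# OldestXmin (the oldest XID any open transaction might still see).

def removable(deleter_xid, deleter_committed, oldest_xmin):
    """Return True if VACUUM may reclaim this dead tuple."""
    if deleter_xid is None or not deleter_committed:
        return False            # never deleted, or deleter aborted/running
    # XIDs >= OldestXmin are "recently dead": some open snapshot
    # might still see the tuple, so it must be kept.
    return deleter_xid < oldest_xmin

# A long-lived open transaction pins OldestXmin at its own XID, which
# is why rows deleted after it started stay nonremovable until that
# backend commits or rolls back.
```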
Re: [HACKERS] Inconsistent behavior on Array Is Null?
Tom Lane wrote:
> Greg Stark [EMAIL PROTECTED] writes:
>> I'm leaning towards suggesting that postgres should follow sql-99 here and normalize all array indexes to have a lower bound of 1.
>
> That would break even more things, no? On the other hand, it'd get rid of the problem that we presently face with dump/restore of arrays that don't have lower bound 1. Because pg_dump doesn't do anything to mark such values, they'll end up with lower bound 1 after reload anyway. The fact that we haven't heard lots of squawks about that suggests to me that not many people are using such arrays at present ...

The more I think about it, the more I like it. Does everyone else agree that a lower bound of 1 complies with the spec?

Joe
Re: [HACKERS] Inconsistent behavior on Array Is Null?
Greg Stark wrote:
> I'm leaning towards suggesting that postgres should follow sql-99 here and normalize all array indexes to have a lower bound of 1. Then array_lower and array_upper become entirely unnecessary. Instead we just have array_length which is exactly equivalent to my idea of array_upper.

Now we finally have something to agree on ;-) I do think this is the way to go, but it is a significant hit to backward compatibility. Same is true for supporting NULL elements of arrays -- maybe we should bite the bullet and make both changes at the same time?

Joe
Re: [HACKERS] PITR for replication?
Andrew,

> I may be completely missing the point here, but it looks to me as though the PITR archival mechanism is also most of a native replication facility. Is there any reason this couldn't be extended to replication, and if so, is anyone planning on using it as such?

I believe that Command Prompt's Mammoth Replicator works on something like log shipping. There are both advantages and disadvantages to the approach, meaning that we will still need other replication strategies, like Slony and DBMirror. Also, the jump from PITR -> Log Shipping is not a trivial one.

-- Josh Berkus Aglio Database Solutions San Francisco
[HACKERS] Prepared select
How can I use a prepared select statement as mentioned in the documentation on SQL PREPARE? Preparing the statement is easy; the problem is using the plan to get a cursor. My assumption is that the SQL OPEN command is not documented, or there is some other libpq API to make this happen. Thanks
Re: [HACKERS] Better support for whole-row operations and composite
Tom Lane wrote:
> We have a number of issues revolving around the fact that composite types (row types) aren't first-class objects. I think it's past time to fix that. Here are some notes about doing it. I am not sure all these ideas are fully-baked ... comments appreciated.

[Sorry for the delay in responding] Nice work, and in general it makes sense to me. A few comments below.

> We will be able to make generic I/O routines for composite types, comparable to those used now for arrays. Not sure what a convenient external format would look like. (Possibly use the same conventions as for a 1-D array?)

So you mean like an array, but with possibly mixed datatypes? '{1 , abc def, 2.3}' Seems to make sense. Another option might be to use the ROW keyword, something like: ROW[1 , 'abc', 2.3]

> We could also think about allowing functions that are declared as accepting RECORD (ie, polymorphic-across-row-types functions). They would use the same methods already used by polymorphic functions to find out the true types of their inputs. (Might be best to invent a separate pseudotype, say ANYRECORD, rather than overloading RECORD for this purpose.)

Check. I really like this idea.

> TupleDescGetSlot: no-op, returns NULL
> TupleGetDatum: ignore slot, return tuple t_data pointer as datum
>
> This will work because heap_formtuple and BuildTupleFromCStrings can return a HeapTuple whose t_data part is already a valid row Datum, simply by setting the appropriate length and type fields in it. (If the tuple is ever stored to disk as a regular table row, these fields will be overwritten with xmin/cmin info at that time.)

Is this the way you did things in your recent commit?

> To convert a row Datum into something that can be passed to heap_getattr, one could use a local variable of type HeapTupleData and set its t_data field to the datum's pointer value. t_len is copied from the datum contents, while the other fields of HeapTupleData can just be set to zeroes.
I think I understand this, but an example would help.

> * We have to be able to re-use an already-existing cache entry if it matches a requested TupleDesc.

For anonymous record types, how will that lookup be done efficiently? Can the hash key be an array of attribute oids?

> If an ALTER TABLE command does something that requires examining or changing every row of a table, it would presumably have to do the same to all entries in any composite-type column of the table's rowtype. To avoid surprises and interesting debates about who has permissions to do this, it might be wise to restrict on-disk composite columns to be only of standalone composite types (ie, those made with CREATE TYPE AS). This restriction would also avoid debates about whether table constraints apply to composite-type columns.

I agree. As an aside, it would be quite useful to have support for arrays of tuples. Any idea on how to do that without needing to define an explicit array type for each tuple type?

Joe
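[Editorial aside] The lookup Joe asks about (caching anonymous record types keyed on their attribute type OIDs) can be sketched in Python. All names here are invented for illustration; the real typcache must also verify full TupleDesc equality (typmods and so on) rather than trust the key alone:

```python
# Sketch: cache anonymous record types keyed on the ordered tuple of
# attribute type OIDs, handing out a fresh "anonymous typmod" for each
# never-before-seen shape.

record_type_cache = {}

def lookup_record_type(attr_type_oids):
    key = tuple(attr_type_oids)            # hashable and order-sensitive
    entry = record_type_cache.get(key)
    if entry is None:
        entry = {"typmod": len(record_type_cache), "atttypids": key}
        record_type_cache[key] = entry
    return entry

t1 = lookup_record_type([23, 25, 701])     # e.g. (int4, text, float8)
t2 = lookup_record_type([23, 25, 701])     # same shape reuses the entry
t3 = lookup_record_type([25, 23, 701])     # different order -> new entry
```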
Re: [HACKERS] PITR for replication?
On Fri, Apr 02, 2004 at 08:47:09AM -0800, Josh Berkus wrote:
> I believe that Command Prompt's Mammoth Replicator works on something like log shipping. There are both advantages and disadvantages to the approach, meaning that we will still need other replication strategies, like Slony and DBMirror.

I wonder what would you use DBMirror for once Slony-I is complete?

-- Alvaro Herrera (alvherre[a]dcc.uchile.cl) "El destino baraja y nosotros jugamos" (A. Schopenhauer)
Re: [HACKERS] PITR for replication?
Greg Stark wrote:
>> J. Andrew Rogers [EMAIL PROTECTED] writes: I may be completely missing the point here, but it looks to me as though the PITR archival mechanism is also most of a native replication facility. Is there any reason this couldn't be extended to replication, and if so, is anyone planning on using it as such?
>
> My memory is fuzzy on this point, but I seem to recall that this is (was?) how replication is more-or-less done for many of the big commercial RDBMS.

You're right... it is the basis of a log shipping replication facility. I'm keen not to get too far ahead of ourselves. Let's eat the mammoth one bite at a time... or one patch at a time.

> Well, replication is one of those things that means different things to different people. IMHO, this is one of the simpler, more reliable, mechanisms and would be nice to have. And you're right that it shouldn't require much more work, in fact it's probably easier than a lot of other cases that PITR requires.

I agree. PITR is intended initially as a recovery mechanism. Replication has other purposes as well, such as locating data close to where it is required (in master-multi-slave replication scenarios), which is certainly not an objective or even a likely end point of the PITR work. The PostgreSQL community is lucky enough to have some very competent people working on those other approaches and I would recommend everybody checks out that work.

> For a long time Oracle has supported this mechanism for running warm standby databases. Basically you maintain a second database and every time an archived log is finished you ship it over and immediately restore it on the standby machine. Whenever you have a failure you can quickly fail over to the standby database which is all ready to go and up-to-date within minutes of your failure.

This facility is one of the intended features, if all goes well. But it is an advanced feature, not the bread and butter.
> Since 8i Oracle has also supported a more advanced version where you can open the standby database in read-only mode.

This mode requires more thought, but is certainly possible, in time.

Best Regards
Simon Riggs
Re: [HACKERS] GiST future
[EMAIL PROTECTED] wrote:
> Now I wonder why 7.4 array queries using SOME/ANY/ALL don't make use of this kind of indexes. Is there any plan to integrate Oleg's work into the core as GiST indexes were?

I hope to get to it someday, but have higher priorities if I can find some time between now and the 7.5 release. If you want to pitch in and implement it, by all means, go for it.

Joe
Re: [HACKERS] Problems Vacuum'ing
On Fri, Apr 02, 2004 at 12:02:22PM -0500, Jim Seymour wrote:
> [...] Which is all well-and-good (tho, my ignorance readily conceded, four minutes seems a mite... long), *except*: If I shut down the WebObjects application which, again, never accesses the db in question, much less any of its tables, this time-out doesn't seem to apply. (I tried it.)

Ok, so the WebObjects app keeps an idle open transaction? (i.e. it issues a BEGIN as soon as the previous transaction is finished.) I'm not sure I read the code correctly -- ISTM it would only matter when you try to vacuum a shared table, which this is not ...

-- Alvaro Herrera (alvherre[a]dcc.uchile.cl) "El hombre nunca sabe de lo que es capaz hasta que lo intenta" (C. Dickens)
Re: [HACKERS] GiST future
Joe Conway [EMAIL PROTECTED] writes:
> [EMAIL PROTECTED] wrote:
>> Now I wonder why 7.4 array queries using SOME/ANY/ALL don't make use of this kind of indexes. Is there any plan to integrate Oleg's work into the core as GiST indexes were?
>
> I hope to get to it someday, but have higher priorities if I can find some time between now and the 7.5 release. If you want to pitch in and implement it, by all means, go for it.

GiST indexes aren't really ready for prime time (no WAL logging and poor concurrent behavior being the main gripes). I'm hesitant to make any of the core SQL features depend on them until those issues are fixed. Oleg and Teodor have made noises about doing something about those problems, but I'm not sure it's at the top of their to-do lists ...

regards, tom lane
Re: [HACKERS] GiST future
On Fri, 2 Apr 2004, Joe Conway wrote:
> [EMAIL PROTECTED] wrote:
>> Now I wonder why 7.4 array queries using SOME/ANY/ALL don't make use of this kind of indexes. Is there any plan to integrate Oleg's work into the core as GiST indexes were?
>
> I hope to get to it someday, but have higher priorities if I can find some time between now and the 7.5 release. If you want to pitch in and implement it, by all means, go for it.

Current GiST lacks concurrency and recovery support. It's doable and there are some plans (me and Teodor) to work on it, but we still have no time. That means it's too early to integrate GiST-based contribs into core. Also, the current GiST interface should be enhanced to be able to support some very interesting data types like digital trees, similarity trees, etc.

Regards, Oleg
_
Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83
Re: [HACKERS] [GENERAL] Large DB
Manfred Koizar [EMAIL PROTECTED] writes:
> What I have in mind is a kind of "Double Vitter" algorithm. Whatever we do to get our sample of rows, in the end the sampled rows come from no more than sample_size different blocks. So my idea is to first create a random sample of sample_size block numbers, and then to sample the rows out of this pool of blocks.

That assumption is faulty, though --- consider wholly-empty pages. A bigger problem is that this makes the sampling quite nonuniform, because rows that are on relatively low-density pages would be more likely to become part of the final sample than rows that are on pages with lots of tuples. Thus for example your sample would tend to favor rows with wide values of variable-width columns and exclude narrower values. (I am not certain that the existing algorithm completely avoids this trap, but at least it tries.)

regards, tom lane
Re: [HACKERS] Problems Vacuum'ing
> On Fri, Apr 02, 2004 at 12:02:22PM -0500, Jim Seymour wrote:
> > [...] Which is all well-and-good (tho, my ignorance readily conceded, four minutes seems a mite... long), *except*: If I shut down the WebObjects application which, again, never accesses the db in question, much less any of its tables, this time-out doesn't seem to apply. (I tried it.)
>
> Ok, so the WebObjects app keeps an idle open transaction? (i.e. it issues a BEGIN as soon as the previous transaction is finished.) I'm not sure I read the code correctly --

I really couldn't say. I don't know what the WebObjects app is doing. I know, or *believe* I know, it's only doing queries. (It may be doing temp tables internally, or some-such.) Its interface to pgsql is via the JDBC that comes with pgsql. I don't know what the Java code it generated looks like.

> ISTM it would only matter when you try to vacuum a shared table, which this is not ...

That's what I would've thought. Thanks for the follow-up. I was beginning to wonder if anybody'd noticed ;).

Jim
Re: [HACKERS] Inconsistent behavior on Array Is Null?
Tom Lane [EMAIL PROTECTED] writes:
> That would break even more things, no? On the other hand, it'd get rid of the problem that we presently face with dump/restore of arrays that don't have lower bound 1. Because pg_dump doesn't do anything to mark such values, they'll end up with lower bound 1 after reload anyway. The fact that we haven't heard lots of squawks about that suggests to me that not many people are using such arrays at present ...

You have to be using not only arrays, but the new 7.4 functions provided to manipulate them. In fact I think you have to be using array_prepend specifically. But even there, since it's not a mutator, it's really not that surprising that the elements of the brand new array it's returning should have new indexes.

In fact I suspect there are more people with hidden bugs where they depend on arrays starting at 1. This type of bug is insidious since it's hard to test for; your application might never generate an array with a lower bound other than 1 until someone adds some new code using array_prepend somewhere, and all of a sudden you get strange behaviours from unrelated code.

I can have the honour of being the first squawker like you describe, but my problem was only evidence that having such non-normalized arrays at all was surprising. I was using int_aggregate.c, which generates non-standard arrays with lower bounds of 0. My code assumed array_upper()+1 == length. After I dumped and restored, all my counts were off by one.

-- greg
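[Editorial aside] Greg's off-by-one can be modeled in a few lines of Python (hypothetical helpers, not PostgreSQL code): code that equates length with array_upper()+1 is correct only while the lower bound is 0, and silently overcounts once pg_dump restores the array with lower bound 1.

```python
# Model of the dump/restore off-by-one described above.

def array_upper(elems, lower):
    # upper bound = lower bound + number of elements - 1
    return lower + len(elems) - 1

def assumed_length(elems, lower):
    # Greg's hidden assumption: length == array_upper() + 1,
    # which is only true when the lower bound is 0.
    return array_upper(elems, lower) + 1

before_dump   = ([10, 20, 30], 0)  # int_aggregate array: lower bound 0
after_restore = ([10, 20, 30], 1)  # pg_dump reloads with lower bound 1
```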
Re: [HACKERS] Better support for whole-row operations and composite types
Joe Conway [EMAIL PROTECTED] writes:
> Tom Lane wrote:
>> We will be able to make generic I/O routines for composite types, comparable to those used now for arrays. Not sure what a convenient external format would look like. (Possibly use the same conventions as for a 1-D array?)
>
> So you mean like an array, but with possibly mixed datatypes? '{1 , abc def, 2.3}' Seems to make sense.

The unresolved question in my mind is how to represent NULL elements. However, we have to solve that sooner or later for arrays too. Any thoughts?

> Another option might be to use the ROW keyword, something like: ROW[1 , 'abc', 2.3]

This is a separate issue, just as the ARRAY[] constructor has different uses from the array I/O representation. I do want some kind of runtime constructor, but ROW[...] doesn't get the job done because it doesn't provide any place to specify the rowtype name. Maybe we could combine ROW[...] with some sort of cast notation?

ROW[1 , 'abc', 2.3] :: composite_type_name
CAST(ROW[1 , 'abc', 2.3] AS composite_type_name)

Does SQL99 provide any guidance here?

>> TupleDescGetSlot: no-op, returns NULL
>> TupleGetDatum: ignore slot, return tuple t_data pointer as datum
>>
>> This will work because heap_formtuple and BuildTupleFromCStrings can return a HeapTuple whose t_data part is already a valid row Datum, simply by setting the appropriate length and type fields in it. (If the tuple is ever stored to disk as a regular table row, these fields will be overwritten with xmin/cmin info at that time.)
>
> Is this the way you did things in your recent commit?

Almost. I ended up keeping TupleDescGetSlot as a live function, but its true purpose is only to ensure that the tupledesc gets registered with the type cache (see BlessTupleDesc() in CVS tip). The slot per se never gets used. I believe that CVS tip is source-code-compatible with existing SRFs, even though I adjusted all the ones in the distribution to stop using the TupleTableSlot stuff.
The main point though is that row Datums now contain sufficient info embedded in them to allow runtime type lookup the same as we do for arrays.

>> To convert a row Datum into something that can be passed to heap_getattr, one could use a local variable of type HeapTupleData and set its t_data field to the datum's pointer value. t_len is copied from the datum contents, while the other fields of HeapTupleData can just be set to zeroes.
>
> I think I understand this, but an example would help.

There are several in the PL sources now; for instance plpgsql does this with an incoming rowtype argument:

    if (!fcinfo->argnull[i])
    {
        HeapTupleHeader td;
        Oid             tupType;
        int32           tupTypmod;
        TupleDesc       tupdesc;
        HeapTupleData   tmptup;

        td = DatumGetHeapTupleHeader(fcinfo->arg[i]);
        /* Extract rowtype info and find a tupdesc */
        tupType = HeapTupleHeaderGetTypeId(td);
        tupTypmod = HeapTupleHeaderGetTypMod(td);
        tupdesc = lookup_rowtype_tupdesc(tupType, tupTypmod);
        /* Build a temporary HeapTuple control structure */
        tmptup.t_len = HeapTupleHeaderGetDatumLength(td);
        ItemPointerSetInvalid(&(tmptup.t_self));
        tmptup.t_tableOid = InvalidOid;
        tmptup.t_data = td;

        exec_move_row(estate, NULL, row, &tmptup, tupdesc);
    }

This is okay because the HeapTupleData is not needed after the call to exec_move_row.

>> * We have to be able to re-use an already-existing cache entry if it matches a requested TupleDesc.
>
> For anonymous record types, how will that lookup be done efficiently? Can the hash key be an array of attribute oids?

Right, that's the way I did it. See src/backend/utils/cache/typcache.c

> As an aside, it would be quite useful to have support for arrays of tuples. Any idea on how to do that without needing to define an explicit array type for each tuple type?

Hmm, messy ... I wonder now whether we still really need a separate pg_type entry for every array type. The original motivation for doing that has been at least partly subsumed by storing element type OIDs inside the arrays themselves.
I wonder if we could go over to a scheme where, say, atttypid is the base type ID and attndims being nonzero is what you check to find out it's really an array of atttypid. Not sure how we could map that idea into function and expression args/results, though. Plan B would be to go ahead and create array types. Not sure I would want to do this for table rowtypes, but if we did it only for CREATE TYPE AS then it doesn't sound like an unreasonable amount of overhead.

regards, tom lane
Re: [HACKERS] Problems Vacuum'ing
On Fri, Apr 02, 2004 at 02:51:30PM -0500, Jim Seymour wrote:
> > Ok, so the WebObjects app keeps an idle open transaction? (i.e. it issues a BEGIN as soon as the previous transaction is finished.) I'm not sure I read the code correctly --
>
> I really couldn't say. I don't know what the WebObjects app is doing. I know, or *believe* I know, it's only doing queries. (It may be doing temp tables internally, or some-such.) Its interface to pgsql is via the JDBC that comes with pgsql. I don't know what the Java code it generated looks like.

Turn on query logging and see if the BEGIN is issued right after the COMMIT/ROLLBACK, or whether it waits and issues it right before SELECT/CREATE TEMP TABLE. It doesn't matter if it's only doing queries; if it does them inside a transaction, it would be enough to keep VACUUM from working properly.

-- Alvaro Herrera (alvherre[a]dcc.uchile.cl) "La gente vulgar solo piensa en pasar el tiempo; el que tiene talento, en aprovecharlo"
Re: [HACKERS] [GENERAL] Large DB
On Fri, 02 Apr 2004 14:48:13 -0500, Tom Lane [EMAIL PROTECTED] wrote: Manfred Koizar [EMAIL PROTECTED] writes: What I have in mind is a kind of Double Vitter algorithm. [...] random sample of sample_size block numbers, and then to sample the rows out of this pool of blocks. That assumption is faulty, though --- consider wholly-empty pages. A bigger problem is that this makes the sampling quite nonuniform, because rows that are on relatively low-density pages would be more likely to become part of the final sample than rows that are on pages with lots of tuples. This sounds like you are assuming that I want to take exactly one tuple out of each block of the block sample. This is not the case. In the second round I plan to apply the same (or a better) Vitter method as it is done now. The main difference is that blocks will be addressed indirectly through the array of block numbers obtained in the first round. Thus for example your sample would tend to favor rows with wide values of variable-width columns and exclude narrower values. (I am not certain that the existing algorithm completely avoids this trap, but at least it tries.) I'm reading 7.4 source code and I fail to see how it does this. If the relation starts with an atypical distribution of wide/narrow or dead/alive tuples, a wrong value for tuplesperpage is used for the rest of the sampling. Tuples immediately following one or more dead tuples have a better chance of being selected. This may be called as random as anything else and not favouring a special property. OTOH after long runs of dead tuples consecutive tuples are likely to be selected. Your comment about nonuniformity above exactly describes the current algorithm: Once the initial sample is fetched and tuplesperpage is determined, targpos is computed without any further feedback. 
If targpos points to a sparsely populated area (with wide tuples or with many dead tuples) tuples in this area are more likely to get into the sample than tuples in densely populated areas (with many small active tuples). I think that cutting down the number of blocks to be looked at does not affect these problems. Servus Manfred ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
[HACKERS] Function to kill backend
Hi! When debugging on win32, I've created myself a little function that I feel should be added to the backend proper. While it adds a lot of value on win32, I think it adds quite a bit of value on non-win32 platforms as well... The function is pg_kill_backend(pid,signal). superuser-only, of course. Which simply sends a signal to the specified backend - querycancel, terminate, etc. The advantage over using the kill command from a shell account is, well, you don't need shell access to the db server. On win32, that's going to be more common than on Unix - plus, if you want to signal a specific backend, you need a special tool (can't do from task manager/service manager etc - service manager can only do the postmaster, and task manager can only do kill -9). I also think a function like this could be good to have for e.g. pgadmin, to implement some more management functionality. For example, in MSSQL I can go into a view called current activity, pick a bad user, right-click and cancel query or terminate session. To do this remote, a function like this is required. pg_stat_activity can be used to get a list of sessions and their pids. The function should probably be complemented with a pg_get_postmasterpid or something along that way, to be able to send signals to the postmaster itself. So, would such a function be accepted into the backend code? And if so, any preferences on where you want it put? //Magnus ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
Re: [HACKERS] Function to kill backend
Magnus Hagander wrote: Hi! When debugging on win32, I've created myself a little function that I feel should be added to the backend proper. While it adds a lot of value on win32, I think it adds quite a bit of value on non-win32 platforms as well... The function is pg_kill_backend(pid,signal). superuser-only, of course. Which simply sends a signal to the specified backend - querycancel, terminate, etc. If we're going to have this shouldn't it be a proper command? And maybe an internal shutdown command to go with it? cheers andrew ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] Function to kill backend
Andrew Dunstan [EMAIL PROTECTED] writes: Magnus Hagander wrote: The function is pg_kill_backend(pid,signal). superuser-only, of course. Which simply sends a signal to the specified backend - querycancel, terminate, etc. If we're going to have this shouldn't it be a proper command? And maybe an internal shutdown command to go with it? I don't like the idea at all, but if we were to have something it would definitely need to be a lot more constrained than send-any-signal-to-any-postgres-process ... even for a superuser, that's a mighty fat-gauge foot-gun. regards, tom lane ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] Function to kill backend
Hi! When debugging on win32, I've created myself a little function that I feel should be added to the backend proper. While it adds a lot of value on win32, I think it adds quite a bit of value on non-win32 platforms as well... The function is pg_kill_backend(pid,signal). superuser-only, of course. Which simply sends a signal to the specified backend - querycancel, terminate, etc. If we're going to have this shouldn't it be a proper command? And maybe an internal shutdown command to go with it? I guess it could be. I guess: 1) I don't know how to do one of those ;-) Which is why I didn't even think of it. 2) Won't that clutter up the namespace more, by introducing more keywords that you can't use for other things? 3) Will it still be possible to do the kind of things Rod mentioned, e.g. SELECT pg_kill_backend(procpid, 'TERM') FROM pg_stat_activity WHERE current_query LIKE '<IDLE>%'; ? //Magnus ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [HACKERS] Function to kill backend
-Original Message- From: Magnus Hagander [mailto:[EMAIL PROTECTED] Sent: Friday, April 02, 2004 2:34 PM To: Andrew Dunstan; [EMAIL PROTECTED] Subject: Re: [HACKERS] Function to kill backend Hi! When debugging on win32, I've created myself a little function that I feel should be added to the backend proper. While it adds a lot of value on win32, I think it adds quite a bit of value on non-win32 platforms as well... The function is pg_kill_backend(pid,signal). superuser-only, of course. Which simply sends a signal to the specified backend - querycancel, terminate, etc. If we're going to have this shouldn't it be a proper command? And maybe an internal shutdown command to go with it? I guess it could be. I guess: 1) I don't know how to do one of those ;-) Which is why I didn't even think of it. 2) Won't that clutter up the namespace more, by introducing more keywords that you can't use for other things? 3) Will it still be possible to do the kind of things Rod mentioned, e.g. SELECT pg_kill_backend(procpid, 'TERM') FROM pg_stat_activity WHERE current_query LIKE '<IDLE>%'; ? Sybase had something like that: Syb_kill pid to kill a dangling process. And the undocumented: Syb_terminate pid to absolutely, positively kill it (Syb_kill only worked sometimes). In general, I think this approach is a bit worrisome. It reminds one of the famous tagline: Tip: Don't kill -9 the postmaster. ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [HACKERS] [GENERAL] Large DB
Manfred Koizar [EMAIL PROTECTED] writes: On Fri, 02 Apr 2004 14:48:13 -0500, Tom Lane [EMAIL PROTECTED] wrote: A bigger problem is that this makes the sampling quite nonuniform, because rows that are on relatively low-density pages would be more likely to become part of the final sample than rows that are on pages with lots of tuples. This sounds like you are assuming that I want to take exactly one tuple out of each block of the block sample. This is not the case. No, I understood that you wanted to resample, but [ ... thinks for awhile ... ] hmm, now I can't construct a failure case either. I must have done the math wrong before. There's still a risk of not being able to collect N rows out of N blocks, if you are unfortunate enough to select a lot of wholly-empty pages. But that seems like a low-probability scenario; besides such a table would be so desperately in need of VACUUM FULL that the possible low quality of the stats hardly matters. You should not need to use the Vitter algorithm for the block-level selection, since you can know the number of blocks in the table in advance. You can just use the traditional method of choosing each block or not with probability (k/K), where k = number of sample blocks still needed, K = number of blocks from here to the end. You'd run the Vitter algorithm separately to decide whether to keep or discard each live row you find in the blocks you read. I do like this, since it eliminates the current method's bias towards estimating the number of live rows from the density found near the start of the table only. At the end you'd know the total number of live rows on all the pages you read, and it's reasonable to extrapolate that total to the full table size. Question: if the table size is less than N blocks, are you going to read every block or try to reduce the number of blocks sampled? If you don't adjust the sample size then I think this would perform worse for intermediate-size tables than the current method does ... 
perhaps not so much at sample size = 3000, but at larger sizes it would hurt. A lot of people are setting the stats target to 100 which means a sample size of 30000 --- how do the page-access counts look in that case? regards, tom lane ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] Inconsistent behavior on Array Is Null?
Joe Conway [EMAIL PROTECTED] writes: Greg Stark wrote: I'm leaning towards suggesting that postgres should follow sql-99 here and normalize all array indexes to have a lower bound of 1. Then array_lower and array_upper become entirely unnecessary. Instead we just have array_length which is exactly equivalent to my idea of array_upper. Now we finally have something to agree on ;-) I do think this is the way to go, but it is a significant hit to backward compatibility. Same is true for supporting NULL elements of arrays -- maybe we should bite the bullet and make both changes at the same time? In fact on further thought I think they *have* to be done together. I forgot that your code did something else cool allowing updates to extend arrays by directly updating elements outside the current bounds. Ie:

slo=> update test set a = '{}';
UPDATE 1
slo=> update test set a[1] = 1;
UPDATE 1
slo=> select * from test;
  a
-----
 {1}
(1 row)

But if we normalize array indexes to start at 1 then this makes it hard to fill in the array starting at higher values. For example:

slo=> update test set a = '{}';
UPDATE 1
slo=> update test set a[5] = 5;
UPDATE 1
slo=> select a[5] from test;
 a
---
 5
(1 row)

This really ought to work, it obviously shouldn't allow you to set a[5] and then surreptitiously move it to a[1]. But nor should it generate an error, since I may well have a specific meaning for a[5] and may be planning to fill in a[1]..a[4] later. The logical thing to do, I think, is pre-fill the entries a[1]..a[4] with null. This could be implemented by actually storing the NULLs or else storing some notation that's used to adjust the base of the index to save space. 
One thing that can't be made to work like it does now is extending the array on the low end indefinitely:

slo=> update test set a[1] = 1;
UPDATE 1
slo=> update test set a[0] = 0;
UPDATE 1
slo=> update test set a[-1] = -1;
UPDATE 1
slo=> select * from test;
    a
----------
 {-1,0,1}
(1 row)

If this all looks familiar it's because Perl, and other languages, also behave this way:

bash-2.05b$ perl -e '@a = (); $a[10]=10; print join(",",@a),"\n"'
,,,,,,,,,,10
bash-2.05b$ perl -e '@a = (); $a[-1]=-1; print join(",",@a),"\n"'
Modification of non-creatable array value attempted, subscript -1 at -e line 1.

-- greg ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Inconsistent behavior on Array Is Null?
Joe Conway [EMAIL PROTECTED] writes: Greg Stark wrote: array_lower() and array_upper() are returning NULL for a non-null input, the empty array, even though lower and upper bounds are known just as well as they are for any other sized array. They are behaving as if there's something unknown about the empty array that makes it hard to provide a lower bound or upper bound. Sorry, but I still disagree. There *is* something unknown about the lower and upper bound of an empty array because there are no bounds. So I understand your point of view now. But I think the intuitive meaning here for lower/upper bound as the lowest/highest index where an element is present is only a part of the picture. lower and upper bound are also related to other properties like where array_prepend and array_append place things. And of course the array length. So to give a practical example, say I was implementing a stack using an array. I push things on by extending the array by storing in arr[array_upper(arr)+1]. (Incidentally I don't think we actually have any way to shorten an array, do we?) As soon as I pop off the last element I lose my stack pointer. I need a special case in my code to handle pushing elements in when the array is empty. In reality array_append() would work fine. It's only array_upper() that's going out of its way to make things weird. There's still an upper bound, array_append() knows it, array_upper() just hides this value from the user. I don't see the spec defined CARDINALITY as a workaround. It defines length as the number of elements in the array. When the array is empty, that value is clearly 0. Nothing strange about it. The workaround is having to have that case handled with a special case if statement. If array_lower(), array_upper(), array_length()/CARDINALITY are all defined in a consistent way it doesn't seem like there ought to be any special cases in the implementations. There should be a simple rigid mathematical relationship between them. 
namely upper-lower+1 = length

test=# select array_upper(a||b, 1), array_upper(a,1)+array_length(b)
       from (select '{}'::int[] as a, array[1,2] as b) as x;
 array_upper | ?column?
-------------+----------
           2 |
(1 row)

OK, you got me with this corner case. But using what you described as the result int_aggregate would give you in this case (-1), you would get an even stranger answer (-1 + 2 = 1) that would still need to be worked around. No actually, 1 would be the correct answer, the original array would have indexes ranging from [0,-1] and the new array would have indexes ranging from [0,1], ie, two elements. The only strangeness is the unusual lower bound which isn't the default for postgres arrays constructed from string literals. Personally I prefer the zero-based indexes but as neither SQL-foo nor backwards compatibility agree with me here I'll give that up as a lost cause :) -- greg ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: [HACKERS] Problems Vacuum'ing
Alvaro Herrera [EMAIL PROTECTED] wrote: [snip] Turn on query logging and see if the BEGIN is issued right after the COMMIT/ROLLBACK, or whether it waits and issues it right before SELECT/CREATE TEMP TABLE. It doesn't matter if it's only doing queries; if it does them inside a transaction, it would be enough to keep VACUUM from working properly. Will the following do as well? (Thanks to Jochem for the pointer for how to determine open transactions and the pg_stat_activity hint.) Logged into work. WebObjects application is running. Database I'm working with partly populated from earlier work.

postgres=# select * from pg_locks where transaction is not null;
 relation | database | transaction |  pid  |     mode      | granted
----------+----------+-------------+-------+---------------+---------
          |          |     1245358 | 18020 | ExclusiveLock | t
          |          |     1245364 |   267 | ExclusiveLock | t
(2 rows)

postgres=# select * from pg_stat_activity;
 datid | datname  | procpid | usesysid |  usename   | current_query | query_start
-------+----------+---------+----------+------------+---------------+-------------
 17142 | postgres |     267 |        1 | postgres   |               |
 17144 | qantel   |   18020 |      103 | webobjects |               |
(2 rows)

sysagent=> delete from ethers;
DELETE 368
sysagent=> delete from host_mac_hist;
DELETE 169
sysagent=> vacuum full analyze verbose ethers;
INFO:  vacuuming "public.ethers"
INFO:  "ethers": found 0 removable, 368 nonremovable row versions in 4 pages
DETAIL:  368 dead row versions cannot be removed yet.
Nonremovable row versions range from 64 to 88 bytes long.
There were 55 unused item pointers.
Total free space (including removable row versions) is 3724 bytes.
0 pages are or will become empty, including 0 at the end of the table.
1 pages containing 3628 free bytes are potential move destinations.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  index "ethers_hostname_key" now contains 368 row versions in 275 pages
DETAIL:  0 index row versions were removed.
223 index pages have been deleted, 223 are currently reusable.
CPU 0.01s/0.03u sec elapsed 0.23 sec.
INFO:  "ethers": moved 0 row versions, truncated 4 to 4 pages
DETAIL:  CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  analyzing "public.ethers"
INFO:  "ethers": 4 pages, 0 rows sampled, 0 estimated total rows
VACUUM
sysagent=> vacuum full analyze verbose host_mac_hist;
INFO:  vacuuming "public.host_mac_hist"
INFO:  "host_mac_hist": found 0 removable, 169 nonremovable row versions in 2 pages
DETAIL:  169 dead row versions cannot be removed yet.
Nonremovable row versions range from 64 to 80 bytes long.
There were 0 unused item pointers.
Total free space (including removable row versions) is 3556 bytes.
0 pages are or will become empty, including 0 at the end of the table.
1 pages containing 3532 free bytes are potential move destinations.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  "host_mac_hist": moved 0 row versions, truncated 2 to 2 pages
DETAIL:  CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  analyzing "public.host_mac_hist"
INFO:  "host_mac_hist": 2 pages, 0 rows sampled, 0 estimated total rows
VACUUM

Shut down WebObjects. Clear both tables. Do full vacuum. Re-populate tables. Then...

postgres=# select * from pg_locks where transaction is not null;
 relation | database | transaction | pid  |     mode      | granted
----------+----------+-------------+------+---------------+---------
          |          |     1245558 | 3110 | ExclusiveLock | t
(1 row)

postgres=# select * from pg_stat_activity;
 datid | datname  | procpid | usesysid | usename  | current_query | query_start
-------+----------+---------+----------+----------+---------------+-------------
 17142 | postgres |    3110 |        1 | postgres |               |
(1 row)

sysagent=> delete from ethers;
DELETE 368
sysagent=> delete from host_mac_hist;
DELETE 169
sysagent=> vacuum full analyze verbose ethers;
INFO:  vacuuming "public.ethers"
INFO:  "ethers": found 10030 removable, 0 nonremovable row versions in 98 pages
DETAIL:  0 dead row versions cannot be removed yet.
Nonremovable row versions range from 0 to 0 bytes long.
There were 0 unused item pointers.
Total free space (including removable row versions) is 760736 bytes.
98 pages are or will become empty, including 98 at the end of the table.
0 pages containing 0 free bytes are potential move destinations.
CPU 0.01s/0.01u sec elapsed 0.01 sec.
INFO:  index "ethers_hostname_key" now contains 0 row versions in 275 pages
DETAIL:  10030 index row versions were removed.
271 index pages have been deleted, 271 are currently reusable.
CPU 0.00s/0.08u sec elapsed 0.08 sec.
INFO:  "ethers": truncated 98 to 0 pages
INFO:  analyzing "public.ethers"
INFO:  "ethers": 0 pages, 0 rows sampled, 0 estimated total rows
VACUUM
sysagent=> vacuum full analyze verbose host_mac_hist;
INFO:  vacuuming "public.host_mac_hist"
INFO:  "host_mac_hist": found 169 removable, 0 nonremovable row versions in 2 pages
DETAIL:  0
Re: [HACKERS] Function to kill backend
Tom Lane wrote: Andrew Dunstan [EMAIL PROTECTED] writes: Magnus Hagander wrote: The function is pg_kill_backend(pid,signal). superuser-only, of course. Which simply sends a signal to the specified backend - querycancel, terminate, etc. If' we're going to have this shouldn't it be a proper command? And maybe an internal shutdown command to go with it? I don't like the idea at all, but if we were to have something it would definitely need to be a lot more constrained than send-any-signal-to-any-postgres-process ... even for a superuser, that's a mighty fat-gauge foot-gun. What sort of constraints do you have in mind? cheers andrew ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] Problems Vacuum'ing
[EMAIL PROTECTED] (Jim Seymour) writes: Again the difference: With WebObjects running, deleting rows and trying to vacuum immediately, even full, fails. Shut-down WebObjects and I can. WebObjects is evidently holding an open transaction. Ergo, anything deleted after the start of that transaction isn't vacuumable. You need to do something about the client-side logic that is holding an open transaction without doing anything ... regards, tom lane ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [HACKERS] [GENERAL] Large DB
On Fri, 02 Apr 2004 18:06:12 -0500, Tom Lane [EMAIL PROTECTED] wrote: You should not need to use the Vitter algorithm for the block-level selection, since you can know the number of blocks in the table in advance. You can just use the traditional method of choosing each block or not with probability (k/K), where k = number of sample blocks still needed, K = number of blocks from here to the end. Sounds reasonable. I have to play around a bit more to get a feeling where the Vitter method gets more efficient. You'd run the Vitter algorithm separately to decide whether to keep or discard each live row you find in the blocks you read. You mean once a block is sampled we inspect it in any case? This was not the way I had planned to do it, but I'll keep this idea in mind. Question: if the table size is less than N blocks, are you going to read every block or try to reduce the number of blocks sampled? Don't know yet. people are setting the stats target to 100 which means a sample size of 30000 --- how do the page-access counts look in that case?

 rel  | page
 size | reads
------+-------
  300 |   300
 3000 |  3000
 5000 |  4999
  10K |  9.9K
  30K | 25.8K
 300K |   85K
   1M |  120K
  10M |  190K
 100M |  260K
   1G |  330K

This is exactly the table I posted before (for sample size 3000) with every entry multiplied by 10. Well, not quite exactly, but the differences are far behind the decimal point. So for our purposes, for a given relation size the number of pages accessed is proportional to the sample size. Servus Manfred ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Problems Vacuum'ing
Tom Lane [EMAIL PROTECTED] wrote: [EMAIL PROTECTED] (Jim Seymour) writes: Again the difference: With WebObjects running, deleting rows and trying to vacuum immediately, even full, fails. Shut-down WebObjects and I can. WebObjects is evidently holding an open transaction. Ergo, anything deleted after the start of that transaction isn't vacuumable. You need to do something about the client-side logic that is holding an open transaction without doing anything ... It certainly isn't holding open a transaction in the database I'm working with. It's unclear to me it's holding any transaction open, anywhere. This is all that showed up:

postgres=# select * from pg_locks where transaction is not null;
 relation | database | transaction |  pid  |     mode      | granted
----------+----------+-------------+-------+---------------+---------
          |          |     1245358 | 18020 | ExclusiveLock | t
          |          |     1245364 |   267 | ExclusiveLock | t
(2 rows)

postgres=# select * from pg_stat_activity;
 datid | datname  | procpid | usesysid |  usename   | current_query | query_start
-------+----------+---------+----------+------------+---------------+-------------
 17142 | postgres |     267 |        1 | postgres   |               |
 17144 | qantel   |   18020 |      103 | webobjects |               |
(2 rows)

I don't know what those are, but they list no database or relation. I get this just by running psql (this time at home):

jseymour=> select * from pg_locks where transaction is not null;
 relation | database | transaction | pid |     mode      | granted
----------+----------+-------------+-----+---------------+---------
          |          |        8938 | 307 | ExclusiveLock | t
(1 row)

jseymour=> select * from pg_stat_activity;
 datid | datname  | procpid | usesysid | usename  | current_query | query_start
-------+----------+---------+----------+----------+---------------+-------------
 17144 | jseymour |     307 |      101 | jseymour |               |
(1 row)

Without having touched a thing. Jim ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [HACKERS] [GENERAL] Large DB
Manfred Koizar [EMAIL PROTECTED] writes: You'd run the Vitter algorithm separately to decide whether to keep or discard each live row you find in the blocks you read. You mean once a block is sampled we inspect it in any case? This was not the way I had planned to do it, but I'll keep this idea in mind. Well, once we've gone to the trouble of reading in a block we definitely want to count the tuples in it, for the purposes of extrapolating the total number of tuples in the relation. Given that, I think the most painless route is simply to use the Vitter algorithm with the number-of-tuples-scanned as the count variable. You could dump the logic in acquire_sample_rows that tries to estimate where to read the N'th tuple from. If you like I can send you the Vitter paper off-list (I have a PDF of it). The comments in the code are not really intended to teach someone what it's good for ... regards, tom lane ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [HACKERS] Function to kill backend
Andrew Dunstan [EMAIL PROTECTED] writes: Tom Lane wrote: it would definitely need to be a lot more constrained than send-any-signal-to-any-postgres-process ... even for a superuser, that's a mighty fat-gauge foot-gun. What sort of constraints do you have in mind? I'd limit it to SIGINT (query cancel) and SIGTERM (fast shutdown), and I'm not even real sure about SIGTERM. That facility is designed to work in the case of shutting down all backends together --- I'm not sure I want to promise that it behaves pleasantly to SIGTERM one backend and leave the rest going. Nor do I see a real good use-case for it. Also, no killing processes that aren't regular backends (eg, the bgwriter, the stats processes, and most especially the postmaster). Another point is that killing by PID is not necessarily what you want to do --- kill by transaction ID might be a better API, especially for query-cancel cases. regards, tom lane ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [HACKERS] [GENERAL] Large DB
On Fri, 02 Apr 2004 19:57:47 -0500, Tom Lane [EMAIL PROTECTED] wrote: If you like I can send you the Vitter paper off-list (I have a PDF of it). The comments in the code are not really intended to teach someone what it's good for ... Yes, please. [Would have sent this off-list. But I'm blacklisted.] Servus Manfred ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [HACKERS] Problems Vacuum'ing
[EMAIL PROTECTED] (Jim Seymour) writes: Tom Lane [EMAIL PROTECTED] wrote: WebObjects is evidently holding an open transaction. It certainly isn't holding open a transaction in the database I'm working with. Which database the transaction is in isn't real relevant... the logic is done globally so that it will be correct when vacuuming shared tables. It's unclear to me it's holding any transaction open, anywhere. Sure it is, assuming that PID 18020 is the session we're talking about.

postgres=# select * from pg_locks where transaction is not null;
 relation | database | transaction |  pid  |     mode      | granted
----------+----------+-------------+-------+---------------+---------
          |          |     1245358 | 18020 | ExclusiveLock | t
                            ^^^

This process has an open transaction number 1245358. That's what an exclusive lock on a transaction means.

 17142 | postgres |     267 |        1 | postgres   |               |
 17144 | qantel   |   18020 |      103 | webobjects |               |

These entries didn't make a lot of sense to me since the other examples you mentioned did not seem to be getting executed in the 'postgres' database --- but I assume PID 18020 is the one you are referring to as webobjects. regards, tom lane ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] Better support for whole-row operations and composite
Tom Lane wrote: Joe Conway [EMAIL PROTECTED] writes: So you mean like an array, but with possibly mixed datatypes? '{1 , "abc def", 2.3}' Seems to make sense. The unresolved question in my mind is how to represent NULL elements. However, we have to solve that sooner or later for arrays too. Any thoughts? Good point. What's really ugly is that the external representation of string types differs depending on whether quotes are needed or not. If strings were *always* surrounded by quotes, we could just use the word NULL, without the quotes. Another option might be to use the ROW keyword, something like: ROW[1 , 'abc', 2.3] This is a separate issue, just as the ARRAY[] constructor has different uses from the array I/O representation. I do want some kind of runtime constructor, but ROW[...] doesn't get the job done because it doesn't provide any place to specify the rowtype name. Maybe we could combine ROW[...] with some sort of cast notation? ROW[1 , 'abc', 2.3] :: composite_type_name CAST(ROW[1 , 'abc', 2.3] AS composite_type_name) Does SQL99 provide any guidance here? The latter seems to agree with 6.12 (cast specification) of SQL2003. I'd think we'd want the former supported anyway as an extension to the standard. Almost. I ended up keeping TupleDescGetSlot as a live function, but its true purpose is only to ensure that the tupledesc gets registered with the type cache (see BlessTupleDesc() in CVS tip). The slot per se never gets used. I believe that CVS tip is source-code-compatible with existing SRFs, even though I adjusted all the ones in the distribution to stop using the TupleTableSlot stuff. Almost compatible. I found that, to my surprise, PL/R compiles with no changes after your commit. However it now segfaults (as I expected) on composite type arguments. Should be easy to fix though (I think, really haven't looked at it hard yet). 
The main point though is that row Datums now contain sufficient info embedded in them to allow runtime type lookup the same as we do for arrays. Sounds good to me. There are several in the PL sources now, for instance plpgsql does this with an incoming rowtype argument: Perfect -- thanks. As an aside, it would be quite useful to have support for arrays of tuples. Any idea on how to do that without needing to define an explicit array type for each tuple type? Hmm, messy ... I wonder now whether we still really need a separate pg_type entry for every array type. The original motivation for doing that has been at least partly subsumed by storing element type OIDs inside the arrays themselves. I wonder if we could go over to a scheme where, say, atttypid is the base type ID and attndims being nonzero is what you check to find out it's really an array of atttypid. Not sure how we could map that idea into function and expression args/results, though. Hmmm. I had thought maybe we could use a single datatype (anyarray?) with in/out functions that would need to do the right thing based on the element type. This would also allow, for example, arrays-of-arrays, which is the way that SQL99/2003 seem to allow for multidimensional arrays. Plan B would be to go ahead and create array types. Not sure I would want to do this for table rowtypes, but if we did it only for CREATE TYPE AS then it doesn't sound like an unreasonable amount of overhead. I was hoping we wouldn't need to do that. Joe ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] Problems Vacuum'ing
On Fri, Apr 02, 2004 at 07:35:20PM -0500, Tom Lane wrote: [EMAIL PROTECTED] (Jim Seymour) writes: Again the difference: With WebObjects running, deleting rows and trying to vacuum immediately, even full, fails. Shut-down WebObjects and I can.

WebObjects is evidently holding an open transaction. Ergo, anything deleted after the start of that transaction isn't vacuumable. You need to do something about the client-side logic that is holding an open transaction without doing anything ...

But, if I read the code correctly, the oldest xmin vacuum cares about for a non-shared relation should be local to the database, shouldn't it? If this is so, why does it matter that he has an open transaction on a different database?

-- Alvaro Herrera (alvherre[a]dcc.uchile.cl) www.google.com: command-line interface for the web. ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Function to kill backend
Tom Lane wrote: Andrew Dunstan [EMAIL PROTECTED] writes: Tom Lane wrote: it would definitely need to be a lot more constrained than send-any-signal-to-any-postgres-process ... even for a superuser, that's a mighty fat-gauge foot-gun. What sort of constraints do you have in mind? I'd limit it to SIGINT (query cancel) and SIGTERM (fast shutdown), and I'm not even real sure about SIGTERM. That facility is designed to work in the case of shutting down all backends together --- I'm not sure I want to promise that it behaves pleasantly to SIGTERM one backend and leave the rest going. Nor do I see a real good use-case for it. Also, no killing processes that aren't regular backends (eg, the bgwriter, the stats processes, and most especially the postmaster). Another point is that killing by PID is not necessarily what you want to do --- kill by transaction ID might be a better API, especially for query-cancel cases. Seems like useful functionality. Right now, how does an administrator kill another backend from psql? They can't. -- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Inconsistent behavior on Array Is Null?
Greg Stark wrote: This really ought to work; it obviously shouldn't allow you to set a[5] and then surreptitiously move it to a[1]. But nor should it generate an error, since I may well have a specific meaning for a[5] and may be planning to fill in a[1]..a[4] later. The logical thing to do, I think, is to pre-fill the entries a[1]..a[4] with null. This could be implemented by actually storing the NULLs, or else by storing some notation that's used to adjust the base of the index to save space.

I agree. I had always envisioned something exactly like that once we supported NULL elements. As far as the implementation goes, I think it would be very similar to tuples -- a null bitmask that would exist if any elements are NULL.

A related question is how to deal with non-existent array elements. Until now, you could do:

    regression=# select f[0] from (select array[1,2]) as t(f);
     f
    ---

    (1 row)

Even though index 0 does not exist, you get a NULL value returned instead of an ERROR. I'd think if we hardwire a lower bound of 1, this should produce an ERROR. Similarly:

    regression=# select f[3] from (select array[1,2]) as t(f);
     f
    ---

    (1 row)

Should this produce an ERROR instead of returning NULL once existing array elements can be NULL?

One thing that can't be made to work like it does now is extending the array on the low end indefinitely:

    slo=> update test set a[1] = 1;
    UPDATE 1
    slo=> update test set a[0] = 0;
    UPDATE 1
    slo=> update test set a[-1] = -1;
    UPDATE 1

Right. In the new world order we're describing, the latter two examples would have to produce errors. Joe ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
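The semantics Greg and Joe converge on (lower bound hardwired at 1, NULL pre-fill when assigning past the current end, and an ERROR for out-of-range subscripts) can be sketched as a toy Python model. This is an illustration of the proposed behavior, not the server's array code:

```python
class OneBasedArray:
    """Toy model of the proposed array semantics: fixed lower bound of 1,
    NULL (None) pre-fill on assignment past the end, ERROR on bad reads."""

    def __init__(self):
        self.elems = []                      # elems[0] holds logical index 1

    def __setitem__(self, idx, value):
        if idx < 1:
            raise IndexError("lower bound is fixed at 1")   # a[0], a[-1] error out
        while len(self.elems) < idx:
            self.elems.append(None)          # pre-fill a[1]..a[idx-1] with NULL
        self.elems[idx - 1] = value

    def __getitem__(self, idx):
        if idx < 1 or idx > len(self.elems):
            raise IndexError("subscript out of range")      # instead of silent NULL
        return self.elems[idx - 1]

a = OneBasedArray()
a[5] = "x"                                   # a[1]..a[4] become NULL, a[5] = 'x'
assert a.elems == [None, None, None, None, "x"]
```

A real implementation would presumably store the NULLs via a null bitmask, as Joe suggests for tuples, rather than as explicit elements.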
Re: [HACKERS] Problems Vacuum'ing
On Fri, 2 Apr 2004, Alvaro Herrera wrote: On Fri, Apr 02, 2004 at 07:35:20PM -0500, Tom Lane wrote: [EMAIL PROTECTED] (Jim Seymour) writes: Again the difference: With WebObjects running, deleting rows and trying to vacuum immediately, even full, fails. Shut-down WebObjects and I can. WebObjects is evidently holding an open transaction. Ergo, anything deleted after the start of that transaction isn't vacuumable. You need to do something about the client-side logic that is holding an open transaction without doing anything ... But, if I read the code correctly, the oldest xmin vacuum cares about for a non-shared relation should be local to the database, shouldn't it? AFAICS it's the oldest transaction at the start of any of the transactions in this database, not the oldest transaction of any transaction in this database. ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [HACKERS] Problems Vacuum'ing
Tom Lane [EMAIL PROTECTED] wrote: [EMAIL PROTECTED] (Jim Seymour) writes: Tom Lane [EMAIL PROTECTED] wrote: WebObjects is evidently holding an open transaction. It certainly isn't holding open a transaction in the database I'm working with.

Which database the transaction is in isn't real relevant... the logic is done globally so that it will be correct when vacuuming shared tables.

It had occurred to me, early on, that if anything had an open transaction, that would perhaps cause what I was seeing. So I killed-off WebObjects. Ran my tests. Psql'd as yet another user, to another database, and did something like

    begin; insert into foo (bar) values ('Hello');

And then ran my tests. Vacuum'ing worked completely. It's unclear to me it's holding any transaction open, anywhere.

Sure it is, assuming that PID 18020 is the session we're talking about.

    postgres=# select * from pg_locks where transaction is not null;
     relation | database | transaction |  pid  |     mode      | granted
    ----------+----------+-------------+-------+---------------+---------
              |          |     1245358 | 18020 | ExclusiveLock | t
                                          ^^^^^

But I see entries like that if I just *start* *up* psql, without doing anything:

    Script started on Fri 02 Apr 2004 09:42:58 PM EST
    $ psql
    Password:
    Welcome to psql 7.4.2, the PostgreSQL interactive terminal.

    Type: \copyright for distribution terms
          \h for help with SQL commands
          \? for help on internal slash commands
          \g or terminate with semicolon to execute query
          \q to quit

    jseymour=> select * from pg_locks where transaction is not null;
     relation | database | transaction | pid  |     mode      | granted
    ----------+----------+-------------+------+---------------+---------
              |          |        8941 | 1480 | ExclusiveLock | t
    (1 row)

    jseymour=> select * from pg_stat_activity;
     datid | datname  | procpid | usesysid | usename  | current_query | query_start
    -------+----------+---------+----------+----------+---------------+-------------
     17144 | jseymour |    1480 |      101 | jseymour |               |
    (1 row)

    jseymour=> \q
    $ exit
    script done on Fri 02 Apr 2004 09:43:27 PM EST

What does that entry for pid 1480, transaction 8941 mean? This process has an open transaction number 1245358. 
That's what an exclusive lock on a transaction means.

     17142 | postgres |     267 |        1 | postgres   |               |
     17144 | qantel   |   18020 |      103 | webobjects |               |

These entries didn't make a lot of sense to me, since the other examples you mentioned did not seem to be getting executed in the 'postgres' database --- but I assume PID 18020 is the one you are referring to as webobjects.

I ran the pg_locks and pg_stat_activity selects as user postgres. The postgres db has nothing to do with either the WebObjects application or the script that's been populating the db I've been experimenting with. The point there was to show that the WebObjects application had nothing open other than whatever it is seems to be there when anything connects to a database (?) with psql (?).

Regards, Jim ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] Better support for whole-row operations and composite types
Joe Conway [EMAIL PROTECTED] writes: Tom Lane wrote: ... I believe that CVS tip is source-code-compatible with existing SRFs, even though I adjusted all the ones in the distribution to stop using the TupleTableSlot stuff. Almost compatible. I found that, to my surprise, PL/R compiles with no changes after your commit. However it now segfaults (as I expected) on composite type arguments. Should be easy to fix though (I think, really haven't looked at it hard yet).

Let me know what you find out --- if I missed a trick on compatibility, there's still plenty of time to fix it.

... I wonder if we could go over to a scheme where, say, atttypid is the base type ID and attndims being nonzero is what you check to find out it's really an array of atttypid. Not sure how we could map that idea into function and expression args/results, though. Hmmm. I had thought maybe we could use a single datatype (anyarray?) with in/out functions that would need to do the right thing based on the element type.

If we have just one datatype, how will the parser determine the type of a foo[subscript] expression? After thinking a bit, I don't see how to do that except by adding an out-of-line decoration to the underlying type, somewhat like we do for setof or atttypmod. This is doable as far as the backend itself is concerned, but the compatibility implications for clients and user-written extensions seem daunting :-(

regards, tom lane ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] Problems Vacuum'ing
Alvaro Herrera [EMAIL PROTECTED] writes: But, if I read the code correctly, the oldest xmin vacuum cares about for a non-shared relation should be local to the database, shouldn't it? It's the oldest xmin of any transaction that's local to your database, but those xmin values themselves were computed globally --- so what matters is the oldest transaction that was running when any local transaction started. In this case I expect it's the VACUUM's own transaction that's seeing the other guy as determining its xmin. We could fix this by making every transaction compute, and advertise in the PGPROC array, both local and global xmin values. In previous iterations of this discussion we concluded that the extra cycles (which would be spent in *every* transaction start) could not be justified by making VACUUM better able to reclaim space in the face of misbehaving clients. That conclusion might be wrong, but it's not instantly obvious that it is... regards, tom lane ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
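The distinction Tom draws (the oldest xmin is taken over transactions local to your database, but each advertised xmin was itself computed globally) can be illustrated with a toy Python model. Field names are invented, and this deliberately glosses over the global-xmin-computation subtlety; it only shows why a separately tracked database-local xmin would change the outcome:

```python
def oldest_xmin(procs, my_db, local_only=False):
    """Toy model of GetOldestXmin over an invented PGPROC-like list.

    Each running transaction advertises an xmin and its database. Today's
    behavior considers every local backend's (globally computed) xmin; the
    proposed variant would also advertise a database-local xmin, letting
    vacuum ignore old transactions running against other databases."""
    candidates = [p["xmin"] for p in procs
                  if not local_only or p["db"] == my_db]
    return min(candidates) if candidates else None

procs = [
    {"xid": 100, "xmin": 90,  "db": "qantel"},    # idle-in-transaction session
    {"xid": 200, "xmin": 200, "db": "jseymour"},
]

# Today: the old transaction in another database holds back vacuum.
assert oldest_xmin(procs, "jseymour") == 90
# With a database-local xmin, vacuum in 'jseymour' could ignore it.
assert oldest_xmin(procs, "jseymour", local_only=True) == 200
```

As Tom notes below, the real cost question is computing and advertising both values at every transaction start.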
Re: [HACKERS] Function to kill backend
Bruce Momjian [EMAIL PROTECTED] writes: Seems like useful functionality. Right now, how does an administrator kill another backend from psql? They can't. The question to ask is should they be able to? I think any such facility is inherently a security risk, since it means that a remote attacker who's managed to break into your superuser account can randomly zap other backends. Now admittedly there's plenty of other mischief he can do with superuser privs, but that doesn't mean we should hand him a pre-loaded, pre-sighted cannon. Having to log into the database server locally to execute such operations doesn't seem that bad to me. regards, tom lane ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] Problems Vacuum'ing
[EMAIL PROTECTED] (Jim Seymour) writes: But I see entries like that if I just *start* *up* psql, without doing anything: Sure. You are doing something when you execute select from pg_locks ... that command executes inside a transaction, just like any other Postgres operation. The problem you're facing is that WebObjects is creating a transaction that persists for a long period of time. regards, tom lane ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] Function to kill backend
Tom Lane wrote: Bruce Momjian [EMAIL PROTECTED] writes: Seems like useful functionality. Right now, how does an administrator kill another backend from psql? They can't. The question to ask is should they be able to? I think any such facility is inherently a security risk, since it means that a remote attacker who's managed to break into your superuser account can randomly zap other backends. Now admittedly there's plenty of other mischief he can do with superuser privs, but that doesn't mean we should hand him a pre-loaded, pre-sighted cannon. Having to log into the database server locally to execute such operations doesn't seem that bad to me. If they can read/write your data (as superuser), killing backends is the least worry. I can see it as useful as part of pg_stat_activity output. -- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
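For illustration, the constrained facility Tom sketches earlier in the thread (allow query cancel via SIGINT only, and only against regular backends, never the postmaster or bgwriter) might look like the following. This is a hypothetical Python sketch of the policy, not a real PostgreSQL API; the function name and checks are invented:

```python
import os
import signal

ALLOWED_SIGNALS = {signal.SIGINT}    # query cancel only, per Tom's constraint

def kill_backend(pid, regular_backend_pids, sig=signal.SIGINT):
    """Hypothetical constrained kill function (invented, not PostgreSQL's).

    Refuses anything except SIGINT, and refuses PIDs that are not known
    regular backends, so the postmaster, bgwriter, and stats processes
    cannot be targeted even by a superuser."""
    if sig not in ALLOWED_SIGNALS:
        raise PermissionError("only query cancel (SIGINT) is permitted")
    if pid not in regular_backend_pids:
        raise PermissionError("target is not a regular backend")
    os.kill(pid, sig)

# A disallowed signal is rejected before anything is sent:
try:
    kill_backend(12345, {12345}, signal.SIGKILL)
except PermissionError:
    pass                             # raises: only SIGINT is permitted
```

Kill-by-transaction-ID, as Tom suggests, would add a lookup from transaction ID to backend PID in front of the same checks.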
Re: [HACKERS] Better support for whole-row operations and composite
Tom Lane wrote: Joe Conway [EMAIL PROTECTED] writes: Almost compatible. I found that, to my surprise, PL/R compiles with no changes after your commit. However it now segfaults (as I expected) on composite type arguments. Should be easy to fix though (I think, really haven't looked at it hard yet). Let me know what you find out --- if I missed a trick on compatibility, there's still plenty of time to fix it.

I still haven't had time to look closely, and well may have been doing something non-standard all along, but in any case this is the current failing code:

    else if (function->arg_is_rel[i])
    {
        /* for tuple args, convert to a one row data.frame */
        TupleTableSlot *slot = (TupleTableSlot *) arg[i];
        HeapTuple       tuples = slot->val;
        TupleDesc       tupdesc = slot->ttc_tupleDescriptor;

        PROTECT(el = pg_tuple_get_r_frame(1, tuples, tupdesc));
    }

The problem was (I think -- I'll check a little later) that slot->ttc_tupleDescriptor is now '\0'.

Hmmm. I had thought maybe we could use a single datatype (anyarray?) with in/out functions that would need to do the right thing based on the element type. If we have just one datatype, how will the parser determine the type of a foo[subscript] expression? After thinking a bit, I don't see how to do that except by adding an out-of-line decoration to the underlying type, somewhat like we do for setof or atttypmod. This is doable as far as the backend itself is concerned, but the compatibility implications for clients and user-written extensions seem daunting :-(

I'll think-about/play-with this some more, hopefully this weekend. Thanks, Joe ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
Re: [HACKERS] Better support for whole-row operations and composite types
Joe Conway [EMAIL PROTECTED] writes: I still haven't had time to look closely, and well may have been doing something non-standard all along, but in any case this is the current failing code:

    /* for tuple args, convert to a one row data.frame */
    TupleTableSlot *slot = (TupleTableSlot *) arg[i];
    HeapTuple       tuples = slot->val;
    TupleDesc       tupdesc = slot->ttc_tupleDescriptor;

Um. Well, the arg is not a TupleTableSlot * anymore, so this is guaranteed to fail. This isn't part of what I thought the documented SRF API was though. If you take the arg[i] value and pass it to GetAttributeByName or GetAttributeByNum it will work (with some compiler warnings) and AFAICS we never documented more than that. regards, tom lane ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] Better support for whole-row operations and composite
Tom Lane wrote: Joe Conway [EMAIL PROTECTED] writes:

    /* for tuple args, convert to a one row data.frame */
    TupleTableSlot *slot = (TupleTableSlot *) arg[i];
    HeapTuple       tuples = slot->val;
    TupleDesc       tupdesc = slot->ttc_tupleDescriptor;

Um. Well, the arg is not a TupleTableSlot * anymore, so this is guaranteed to fail. This isn't part of what I thought the documented SRF API was though.

I'm sure you're correct. The SRF API was for user defined functions, not procedural languages anyway. I'll look at how the other procedural languages handle tuple arguments.

If you take the arg[i] value and pass it to GetAttributeByName or GetAttributeByNum it will work (with some compiler warnings) and AFAICS we never documented more than that.

OK, thanks, Joe ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] Better support for whole-row operations and composite types
Joe Conway [EMAIL PROTECTED] writes: ... The SRF API was for user defined functions, not procedural languages anyway. I'll look at how the other procedural languages handle tuple arguments. It was a dozen-or-so-lines change in each of the PL languages AFAIR. You will probably also want to look at what you do to return tuple results. regards, tom lane ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
Re: [HACKERS] Better support for whole-row operations and composite
Tom Lane wrote: Joe Conway [EMAIL PROTECTED] writes: ... The SRF API was for user defined functions, not procedural languages anyway. I'll look at how the other procedural languages handle tuple arguments. It was a dozen-or-so-lines change in each of the PL languages AFAIR. You will probably also want to look at what you do to return tuple results. OK, thanks. Just for reference, what is arg[i] if it isn't a (TupleTableSlot *) anymore -- is it just a HeapTuple? Joe ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] Better support for whole-row operations and composite types
Joe Conway [EMAIL PROTECTED] writes: Just for reference, what is arg[i] if it isn't a (TupleTableSlot *) anymore -- is it just a HeapTuple? No, it's a HeapTupleHeader pointer. You need to reconstruct a HeapTuple on top of that to work with heap_getattr and most other core backend routines. Also don't forget to ensure that you detoast the datum; this is not useful at the moment but will be important Real Soon Now. I added standard argument-fetch macros to fmgr.h to help with the detoasting bit. regards, tom lane ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [HACKERS] [GENERAL] Large DB
[time to move this to -hackers] On Fri, 02 Apr 2004 11:16:21 -0500, Tom Lane [EMAIL PROTECTED] wrote: Manfred Koizar [EMAIL PROTECTED] writes: The first step, however, (acquire_sample_rows() in analyze.c) has to read more rows than finally end up in the sample. It visits less than O(nblocks) pages but certainly more than O(1). A vague feeling tries to tell me that the number of page reads is somehow related to the harmonic numbers 1 + 1/2 + 1/3 + ... + 1/n, which grow like O(ln(n)). Good guess. Vitter's paper says the expected time to sample n rows from a table of size N is O(n * (1 + log(N/n))).

Well, for what I tried to find out, my wild guess seems to be wrong. I don't doubt that Vitter's formula is correct, but it assumes that access to any tuple has the same cost. This does not apply to our problem, however. With 100 tuples per page, we access the first sample_size tuples at a cost of 0.01 sequential page reads per tuple. Later we use fewer and fewer tuples per page, which results in higher per-tuple cost. Near the end of a large relation we can expect to access only one tuple per page, and more and more pages are skipped, so that prefetching doesn't help any more.

Playing around with some real numbers (for 100 tuples/page and a sample size of 3000) I got:

     rel size | page reads
    ----------+-----------
           30 |     30
          300 |    300      (expectation is something like 299.9995)
          500 |    499
           1K |    990
           3K |    2.6K
          30K |      8K
         100K |     12K
           1M |     19K
          10M |     26K
         100M |     33K

This growth rate is steeper than O(log(nblocks)).

I have an idea how this could be done with O(1) page reads. What I have in mind is a kind of Double Vitter algorithm. Whatever we do to get our sample of rows, in the end the sampled rows come from no more than sample_size different blocks. So my idea is to first create a random sample of sample_size block numbers, and then to sample the rows out of this pool of blocks. I have to think harder though about what to do about those 400 pages that are not accessed when the sample size is 3000 ... 
The hard part is getting a genuinely random sample when we don't know N in advance. We do however know the table size in blocks, so if you're willing to make assumptions about constant tuple density you could do something different. (But the tuple density assumption is exactly the weak spot of what we've got, so I'm unconvinced that would be a big step forward.) Starting the scan at some random blocks should help against the common case of unusual distribution of dead tuples near the start of the relation. And I plan to factor information about dead tuple hits into an increasingly better estimation of dead/live tuple ratio. Servus Manfred ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
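Manfred's "Double Vitter" idea, bounding page reads by first sampling block numbers and then sampling rows from only those blocks, can be sketched in Python. This is a toy sketch under the constant-tuple-density assumption Tom flags as the weak spot; all names are invented, and it uses simple sampling without replacement rather than Vitter's reservoir algorithm:

```python
import random

def two_stage_sample(nblocks, rows_per_block, sample_size, rng=None):
    """Two-stage sampling sketch: pick sample_size block numbers first,
    then sample the target rows from just those blocks. Page reads are
    therefore bounded by sample_size, i.e. O(1) per sampled row, but the
    result is only uniform if every block holds rows_per_block live rows
    (the constant-density assumption)."""
    rng = rng or random.Random(0)               # seeded for reproducibility
    blocks = rng.sample(range(nblocks), min(sample_size, nblocks))
    # Pool of (block, row) slots in the chosen blocks only.
    pool = [(b, r) for b in blocks for r in range(rows_per_block)]
    return rng.sample(pool, sample_size)

rows = two_stage_sample(nblocks=30000, rows_per_block=100, sample_size=3000)
assert len(rows) == 3000
assert len({b for b, _ in rows}) <= 3000        # at most sample_size pages read
```

Compare the table above: for a 30K-page relation, one-stage row sampling reads about 8K pages, while the two-stage scheme touches at most 3000. The open question in the post, what to do about the roughly 400 chosen blocks that contribute no sampled row, is exactly the gap this sketch ignores.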