Re: [HACKERS] Fixing insecure security definer functions

2007-02-16 Thread Merlin Moncure

On 2/15/07, Tom Lane [EMAIL PROTECTED] wrote:

Merlin Moncure [EMAIL PROTECTED] writes:
 yikes!

 If you guys go through with forcing functions to attach to objects
 when they are created, it will break almost every project I've ever
 worked on :(.  The schema/function combo fits into all kinds of de
 facto partitioning strategies and organization methods.

If you read a bit further, I did suggest providing an option to retain
the current behavior.  I don't think it should be the default though.


Yeah, I saw that, but the issue really runs deeper: functions that
create functions, and so on.  Changing the default behavior affects how
functions work in a really fundamental way... all pl/pgsql code that
can float over schemas would have to be checked.  In the worst case,
this could mean converting large libraries to dynamic SQL or creating
thousands of additional functions... ugh.

Maybe there could be a GUC setting (function default schema path =
current path / null path)?  It would seem only appropriate to have
security definer raise a warning/error for a null path, though.  Then we
could debate how that should be set by default, but nobody really
loses that argument.

merlin



Re: [HACKERS] Chatter on DROP SOMETHING IF EXISTS

2007-02-16 Thread Tom Lane
Jim C. Nasby [EMAIL PROTECTED] writes:
 On Thu, Feb 08, 2007 at 01:54:13PM -0500, Tom Lane wrote:
 I would be satisfied if the returned command tag were something else,
 maybe "NO OPERATION".

 "TABLE blah DID NOT EXIST" might be less confusing...

You're confusing a command tag with a notice.  In the first place,
we shouldn't assume that applications are ready to deal with
indefinitely long command tags (the backend itself doesn't think they
can be longer than 64 bytes); in the second place, they should be
constant strings for the most part so that simple strcmp()s suffice
to see what happened.  Command tags are meant for programs to deal
with, more than humans.
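
For illustration, a libpq client could branch on the tag with a plain
strcmp() like this (a sketch; the NO OPERATION tag is the proposal
above, not an existing tag, and error handling is omitted):

    #include <string.h>
    #include <stdio.h>
    #include <libpq-fe.h>

    /* Sketch: act on the constant command tag of DROP TABLE IF EXISTS. */
    static void
    handle_drop_result(PGresult *res)
    {
        const char *tag = PQcmdStatus(res);    /* e.g. "DROP TABLE" */

        if (strcmp(tag, "DROP TABLE") == 0)
            printf("table dropped\n");
        else if (strcmp(tag, "NO OPERATION") == 0)   /* proposed tag */
            printf("nothing to drop\n");
    }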

regards, tom lane



[HACKERS] buildfarm failure in XML code

2007-02-16 Thread Alvaro Herrera
UX:acomp: ERROR: xml.c, line 2188: undefined symbol: INT64_MAX
UX:acomp: ERROR: xml.c, line 2188: undefined symbol: INT64_MIN

http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=warthog&dt=2007-02-16%2009:06:01


-- 
Alvaro Herrera        http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.



[HACKERS] Mail getting through?

2007-02-16 Thread Gregory Stark

I sent two emails last night, one to -patches and one to -hackers. Neither
seems to have come through. Has anyone else seen them?

On -patches I posted an updated patch that was functionally a no-op but changed
the macro API to SET_VARSIZE(). It also replaced the VARATT_SIZE and
VARATT_DATA macros with VARSIZE and VARDATA, changed inet to call detoast, and
changed arrays, inet, geometric data types and a handful of others to use
SET_VARSIZE instead of accessing a struct member directly.
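
For instance, after the change a varlena constructor looks roughly like
this (a sketch of the new idiom; the function itself is hypothetical):

    #include "postgres.h"

    /* Build a text datum using SET_VARSIZE() rather than assigning to
     * the length word through the struct, as the old VARATT_SIZEP
     * idiom did. */
    static text *
    make_text_sketch(const char *s, int len)
    {
        text *result = (text *) palloc(VARHDRSZ + len);

        SET_VARSIZE(result, VARHDRSZ + len);
        memcpy(VARDATA(result), s, len);
        return result;
    }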

On -hackers I reposted Tom's email where he proposed two sets of bitpatterns
with tradeoffs and also included an additional one that was the second of his
with the 2-byte cases removed.

In it I said that removing the 2-byte cases had no advantages, but since
then I've thought of one. It makes the toaster code simpler, since it can
just set a bit in the four-byte header just as it does now; it doesn't
have to worry about converting to a 2-byte header.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com



Re: [HACKERS] HOT WIP Patch - version 1

2007-02-16 Thread Hannu Krosing
On a fine day, Thu, 2007-02-15 at 10:49, Heikki Linnakangas wrote:

 We already log tuple removals by normal vacuums. We can't use that wal 
 entry as it is: if a dead tuple is in the middle of an update chain, it 
 needs to be unlinked from the chain. But I don't see any particular 
 problem with that, it just needs to be wal logged like every other data 
 changing operation.
 
 Do we actually ever want to remove dead tuples from the middle of the 
 chain? If a tuple in the middle of the chain is dead, surely every tuple 
 before it in the chain is dead as well, and we want to remove them as 
 well. I'm thinking, removing tuples from the middle of the chain can be 
 problematic, because we'd need to fiddle with the xmin/xmax of the other 
 tuples to make them match. Or change the tuple-following logic to not do 
 the xmin=xmax check, but it's a nice robustness feature.

What kind of robustness does it provide?  In other words, what failure
scenario does this guard against?

I can't see a case where the xmin=xmax check can fail, at least
not for same-page tuples.

-- 

Hannu Krosing
Database Architect
Skype Technologies OÜ
Akadeemia tee 21 F, Tallinn, 12618, Estonia

Skype me:  callto:hkrosing
Get Skype for free:  http://www.skype.com





Re: [HACKERS] [COMMITTERS] pgsql: Functions for mapping table data and table schemas to XML (a.k.a.

2007-02-16 Thread Peter Eisentraut
On Friday, 16 February 2007 08:46, Peter Eisentraut wrote:
 Log Message:
 ---
 Functions for mapping table data and table schemas to XML (a.k.a. XML
 export)

Breaks various platforms.  I'm on it.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/



Re: [HACKERS] Mail getting through?

2007-02-16 Thread Alvaro Herrera
Gregory Stark wrote:
 
 I sent two emails last night, one to -patches and one to -hackers. Neither
 seems to have come through. Has anyone else seen them?

Not yet, but I got bounces from the server saying

451 - Server configuration problem

or something to that effect.  Since my emails were eventually delivered,
I'd expect yours to be too.


-- 
Alvaro Herrera        http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [HACKERS] pg_restore fails with a custom backup file

2007-02-16 Thread Magnus Hagander
On Fri, Feb 16, 2007 at 02:09:41PM +0900, Yoshiyuki Asaba wrote:

   Does not compile on my MinGW - errors in the system headers (unistd.h,
   io.h) due to changing the argument format for chsize(). The change of
   off_t propagated into parts of the system headers, thus chaos was
   ensured.
   
   I still think we need to use a pgoff_t. Will look at combining these two
   approaches.
  
  Here's a patch that tries this.
  *needs more testing*. But built with this patch, I can dump and
  restore a table at the end of a 10gb database without errors.
 
 I tried the attached patch. But I got the following error.
 
 pg_backup_archiver.o(.text+0x1fa4): In function `allocAH':
 C:/msys/1.0/home/y-asaba/postgresql-8.2.3-patch/src/bin/pg_dump/pg_backup_archiver.c:1580:
  undefined reference to `fseeko64'
 ...
 make[3]: *** [pg_dump] Error 1
 
   $ uname -sr
   MINGW32_NT-5.1 1.0.10(0.46/3/2)
 
 Is the MinGW version too old?

I think so. It seems this was added in version 1.24 of stdio.h in mingw
(http://cygwin.com/cgi-bin/cvsweb.cgi/src/winsup/mingw/include/stdio.h?cvsroot=src).
Could you try upgrading mingw and see if that helps? Or possibly
installing a different version side-by-side (if they even allow that)?

//Magnus



Re: [HACKERS] HOT for PostgreSQL 8.3

2007-02-16 Thread Hannu Krosing
On a fine day, Wed, 2007-02-14 at 10:41, Tom Lane wrote:
 Hannu Krosing [EMAIL PROTECTED] writes:
  OTOH, for same page HOT tuples, we have the command and trx ids stored
  twice: first as cmax,xmax of the old tuple and as cmin,xmin of the
  updated tuple. One of these could probably be used for the in-page HOT
  tuple pointer.
 
 This proposal seems awfully fragile, because the existing
 tuple-chain-following logic *depends for correctness* on comparing each
 tuple's xmin to prior xmax.  

What kind of correctness guarantees does this give for same-page tuples?

The comparing of each tuple's xmin to prior xmax should stay for
inter-page ctid links.
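
For reference, the cross-check in question, sketched as a backend-style
fragment (the variable names are assumed from a surrounding
chain-following loop):

    /* Follow the ctid link only if the next tuple really is our
     * successor: its xmin must match the xmax we came from. */
    if (!TransactionIdEquals(HeapTupleHeaderGetXmin(nexttup->t_data),
                             priorXmax))
        break;                  /* chain broken or slot reused */
    priorXmax = HeapTupleHeaderGetXmax(nexttup->t_data);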

Mostly you can think of the same-page HOT chain as one extended tuple
when looking at it from outside of that page.

 I don't think you can just wave your hands and say we don't need that 
 cross-check.  

 Furthermore it seems to me you
 haven't fixed the problem, which is that you can't remove the chain
 member that is being pointed at by off-page links (either index entries
 or a previous generation of the same tuple).  

You can't remove any tuples before they are invisible to all
transactions (i.e. dead). And being dead implies that all previous
versions are dead as well. So if I can remove a tuple, I can also remove
all its previous versions. Or are you trying to say that VACUUM
follows the ctid links of dead tuples for some purpose?

The problem I am trying to fix is reusing in-page space without needing
to touch indexes.

 As described, you've made
 that problem worse because you're trying to say we don't know which of
 the chain entries is pointed at.

There should be a flag, say HOT_CHAIN_ENTRY, for the tuple the index(es)
point at. And this should be the preferred CTID for inserting new
versions once the old one is dead.

-- 

Hannu Krosing
Database Architect
Skype Technologies OÜ
Akadeemia tee 21 F, Tallinn, 12618, Estonia

Skype me:  callto:hkrosing
Get Skype for free:  http://www.skype.com





Re: [HACKERS] buildfarm failure in XML code

2007-02-16 Thread Peter Eisentraut
On Friday, 16 February 2007 14:59, Alvaro Herrera wrote:
 UX:acomp: ERROR: xml.c, line 2188: undefined symbol: INT64_MAX
 UX:acomp: ERROR: xml.c, line 2188: undefined symbol: INT64_MIN

http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=warthog&dt=2007-02-16%2009:06:01

It needs a rebuild after the fix.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/



[HACKERS] Intermittent buildfarm failures due to timestamptz test

2007-02-16 Thread Seneca Cunningham
Multiple systems are occasionally failing the timestamptz test at the
check stage.  From what I can tell, this is due to the newly added ISO
week checks referring to TIMESTAMP_TBL, used by the timestamp test,
instead of TIMESTAMPTZ_TBL.  Both checks run in the same parallel group,
so the test fails whenever the timing between the timestamp and
timestamptz checks doesn't line up.

http://pgbuildfarm.org/cgi-bin/show_log.pl?nm=jackal&dt=2007-02-16%2012:21:01

-- 
Seneca
[EMAIL PROTECTED]



Re: [HACKERS] buildfarm failure in XML code

2007-02-16 Thread Tom Lane
Peter Eisentraut [EMAIL PROTECTED] writes:
 On Friday, 16 February 2007 14:59, Alvaro Herrera wrote:
 UX:acomp: ERROR: xml.c, line 2188: undefined symbol: INT64_MAX
 UX:acomp: ERROR: xml.c, line 2188: undefined symbol: INT64_MIN
 
 http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=warthog&dt=2007-02-16%2009:06:01

 It needs a rebuild after the fix.

This fix doesn't fix anything.  We have always in the past managed to
avoid assuming that int64 actually is available; I don't intend to give
the xml code a free pass to break that, especially for such an utterly
marginal purpose as this code has.  I'm also wondering why xml.c is the
only place anywhere in the code that uses stdint.h.

regards, tom lane



[HACKERS] Mail getting through? Short varlena headers

2007-02-16 Thread Gregory Stark

I sent two emails last night, one to -patches and one to -hackers. Neither
seems to have come through. Has anyone else seen them? I sent this email once
already too and it hasn't come through. I'm resending it from a different
account now.

On -patches I posted an updated patch that was functionally a no-op but changed
the macro API to SET_VARSIZE(). It also replaced the VARATT_SIZE and
VARATT_DATA macros with VARSIZE and VARDATA, changed inet to call detoast, and
changed arrays, inet, geometric data types and a handful of others to use
SET_VARSIZE instead of accessing a struct member directly.

On -hackers I reposted Tom's email where he proposed two sets of bitpatterns
with tradeoffs and also included an additional one that was the second of his
with the 2-byte cases removed.

In it I said that removing the 2-byte cases had no advantages, but since
then I've thought of one. It makes the toaster code simpler, since it can
just set a bit in the four-byte header just as it does now; it doesn't
have to worry about converting to a 2-byte header.

So I'm thinking of doing it, for now at least. I still think paying 2
bytes on virtually every datum is silly; even if mathematically it's only
a 2% space savings, that's still a 2% performance penalty on sequential
scans, and it gains us nothing except a few lines of code saved in
tuptoaster.c.

Comments?

-- 
greg




Re: [HACKERS] buildfarm failure in XML code

2007-02-16 Thread Peter Eisentraut
On Friday, 16 February 2007 16:09, Tom Lane wrote:
 This fix doesn't fix anything.

Sure, it fixes the compilation failures.

 We have always in the past managed to 
 avoid assuming that int64 actually is available; I don't intend to give
 the xml code a free pass to break that,

I don't intend that either, but a proper fix will be more elaborate.  I will 
work on that.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/



Re: [HACKERS] HOT for PostgreSQL 8.3

2007-02-16 Thread Pavan Deolasee

On 2/16/07, Hannu Krosing [EMAIL PROTECTED] wrote:


On a fine day, Wed, 2007-02-14 at 10:41, Tom Lane wrote:
 Hannu Krosing [EMAIL PROTECTED] writes:
  OTOH, for same page HOT tuples, we have the command and trx ids stored
  twice first as cmax,xmax of the old tuple and as cmin,xmin of the
  updated tuple. One of these could probably be used for in-page HOT
tuple
  pointer.

 This proposal seems awfully fragile, because the existing
 tuple-chain-following logic *depends for correctness* on comparing each
 tuple's xmin to prior xmax.

What kind of correctness guarantees does this give for same-page tuples?



I agree with Tom that the xmin/xmax check does help to guarantee
correctness; I myself have used it often during HOT development to
find/fix bugs. But ISTM that we don't need it, at least for the in-page
tuple chain, if we are careful. So if removing this buys us something
important, I am all for it.

The comparing of each tuple's xmin to prior xmax should stay for
inter-page ctid links.

Agree.

Mostly you can think of the same-page HOT chain as one extended tuple
when looking at it from outside of that page.

 I don't think you can just wave your hands and say we don't need that
 cross-check.

 Furthermore it seems to me you
 haven't fixed the problem, which is that you can't remove the chain
 member that is being pointed at by off-page links (either index entries
 or a previous generation of the same tuple).

You can't remove any tuples before they are invisible to all
transactions (i.e. dead). And being dead implies that all previous
versions are dead as well. So if I can remove a tuple, I can also remove
all its previous versions. Or are you trying to say that VACUUM
follows the ctid links of dead tuples for some purpose?



The only exception to this would be the case of aborted updates. In that
case a tuple is dead, but the one pointing to it is still live. But I
don't see any reason somebody would want to follow a chain past a live
tuple. Not sure about the VACUUM FULL code path though; that's the only
place other than EvalPlanQual where we follow the ctid chain.


The problem I am trying to fix is reusing in-page space without needing
to touch indexes.


Can we do some kind of indirection from the root line pointer?  I haven't
completely thought it through yet, but the basic idea is to release the
actual space consumed by the root tuple once it becomes dead, but store
the offnum of the new root in the line pointer of the original root
tuple. We may need to flag the line pointer for that, but if I am not
wrong, LP_DELETE is not used for heap tuples.

We would waste the 4 bytes of the line pointer until the tuple is COLD
updated and the entire chain and the associated index entry are removed.
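
A very rough sketch of that indirection at the page level (entirely
hypothetical; the flag/field usage here is illustrative, not existing
code):

    #include "postgres.h"
    #include "storage/bufpage.h"

    /* Hypothetical: turn the dead root's line pointer into a redirect,
     * stashing the offnum of the new chain root so that index entries
     * stay valid.  LP_DELETE is free for heap tuples, as noted above. */
    static void
    redirect_root_sketch(Page page, OffsetNumber rootoff, OffsetNumber newoff)
    {
        ItemId  lp = PageGetItemId(page, rootoff);

        lp->lp_flags = LP_DELETE;   /* reinterpreted: "this is a redirect" */
        lp->lp_off = newoff;        /* reused to hold an offset number */
        lp->lp_len = 0;             /* no tuple storage consumed */
    }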

Thanks,
Pavan

--

EnterpriseDB http://www.enterprisedb.com


Re: [HACKERS] HOT for PostgreSQL 8.3

2007-02-16 Thread Zeugswetter Andreas ADI SD

  As described, you've made that problem worse because you're trying to
  say we don't know which of the chain entries is pointed at.

 There should be a flag, say HOT_CHAIN_ENTRY, for the tuple the

it's called HEAP_UPDATE_ROOT

 index(es) point at. And this should be the preferred CTID for 
 inserting new versions once the old one is dead.

This is not possible (maybe unless the whole HOT chain is dead); see my
reply to Bruce.  (That would need a back pointer, so that readers
arriving at the root find the visible tuple.)

Andreas



Re: [HACKERS] HOT for PostgreSQL 8.3

2007-02-16 Thread Pavan Deolasee

On 2/16/07, Zeugswetter Andreas ADI SD [EMAIL PROTECTED] wrote:



  As described, you've made that problem worse because you're trying to
  say we don't know which of the chain entries is pointed at.

 There should be a flag, say HOT_CHAIN_ENTRY, for the tuple the

it's called HEAP_UPDATE_ROOT



Just to avoid any confusion with the patch I sent out this week, we are
setting HEAP_UPDATE_ROOT on all tuples which are HOT-updated.

We set HEAP_ONLY_TUPLE for all tuples which do not have an index
reference. So maybe the combination (HEAP_UPDATE_ROOT & ~HEAP_ONLY_TUPLE)
can be used to identify the index-referenced tuple in a HOT-update chain.
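
In code form the test might look like this (a sketch; the flag bits are
from the WIP patch under discussion, not committed code):

    /* The tuple an index points at is the one that was HOT-updated
     * but is not itself heap-only. */
    #define HeapTupleIsIndexReferenced(tuple) \
        ((((tuple)->t_data->t_infomask & HEAP_UPDATE_ROOT) != 0) && \
         (((tuple)->t_data->t_infomask & HEAP_ONLY_TUPLE) == 0))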

Thanks,
Pavan

--

EnterpriseDB http://www.enterprisedb.com


Re: [HACKERS] Intermittent buildfarm failures due to timestamptz test

2007-02-16 Thread Bruce Momjian
Seneca Cunningham wrote:
 Multiple systems are occasionally failing the timestamptz test at the
 check stage.  From what I can tell, this is due to the newly added ISO
 week checks referring to TIMESTAMP_TBL, used by the timestamp test,
 instead of TIMESTAMPTZ_TBL.  Both checks run in the same parallel group,
 so the test fails whenever the timing between the timestamp and
 timestamptz checks doesn't line up.
 
 http://pgbuildfarm.org/cgi-bin/show_log.pl?nm=jackal&dt=2007-02-16%2012:21:01

Alvaro has committed a fix for this.

-- 
  Bruce Momjian  [EMAIL PROTECTED]  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [HACKERS] HOT for PostgreSQL 8.3

2007-02-16 Thread Zeugswetter Andreas ADI SD

 Just to avoid any confusion with the patch I sent out this 
 week, we are setting HEAP_UPDATE_ROOT on all tuples which are 
 HOT-updated.
 
 We set HEAP_ONLY_TUPLE for all tuples which do not have an index
 reference. So maybe the combination (HEAP_UPDATE_ROOT & ~HEAP_ONLY_TUPLE)
 can be used to identify the index-referenced tuple in a HOT-update chain.

Oh sorry, thanks for the clarification. IMHO HEAP_UPDATE_ROOT should be
renamed to match this meaning then (or what does ROOT mean here?).
Maybe HEAP_UPDATE_CHAIN?

Andreas



Re: [HACKERS] Mail getting through? Short varlena headers

2007-02-16 Thread Tom Lane
Gregory Stark [EMAIL PROTECTED] writes:
 In it I said that removing the 2-byte cases had no advantages, but since
 then I've thought of one. It makes the toaster code simpler, since it can
 just set a bit in the four-byte header just as it does now; it doesn't
 have to worry about converting to a 2-byte header.

Run that by me again?  A toast pointer datum ought to have a 1-byte
header, since it's only twenty-something bytes long.

regards, tom lane



Re: [HACKERS] buildfarm failure in XML code

2007-02-16 Thread Tom Lane
Peter Eisentraut [EMAIL PROTECTED] writes:
 On Friday, 16 February 2007 16:09, Tom Lane wrote:
 This fix doesn't fix anything.

 Sure, it fixes the compilation failures.

Not here:

gcc -O1 -Wall -Wmissing-prototypes -Wpointer-arith -Winline 
-fno-strict-aliasing -g -I../../../../src/include -D_XOPEN_SOURCE_EXTENDED   -c 
-o xml.o xml.c
xml.c: In function `map_sql_type_to_xmlschema_type':
xml.c:2192: `INT64_MAX' undeclared (first use in this function)
xml.c:2192: (Each undeclared identifier is reported only once
xml.c:2192: for each function it appears in.)
xml.c:2192: `INT64_MIN' undeclared (first use in this function)
make[4]: *** [xml.o] Error 1

And this is in a build WITHOUT xml enabled --- why is this code being
compiled at all?  Kindly fix.

regards, tom lane



Re: [HACKERS] Mail getting through? Short varlena headers

2007-02-16 Thread Gregory Stark

Tom Lane [EMAIL PROTECTED] writes:

 Gregory Stark [EMAIL PROTECTED] writes:
 In it I said that removing the 2-byte cases had no advantages, but since
 then I've thought of one. It makes the toaster code simpler, since it can
 just set a bit in the four-byte header just as it does now; it doesn't
 have to worry about converting to a 2-byte header.

 Run that by me again?  A toast pointer datum ought to have a 1-byte
 header, since its only twenty-something bytes long.

I was referring to compressed inline data.

Not having 2-byte headers for inline compressed data eliminates a lot of
corner-case checks and reduces the changes in tuptoaster.c, since it
means pg_lzcompress can return a normal 4-byte header and nobody has to
convert it to a 2-byte header.

So I am doing that for now. I suspect we'll never get around to reintroducing
2-byte headers, but we could if we wanted to. It would be a small change
everywhere else but an annoying bunch of fiddly changes in tuptoaster.c.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com



Re: [HACKERS] HOT for PostgreSQL 8.3

2007-02-16 Thread Pavan Deolasee

On 2/16/07, Zeugswetter Andreas ADI SD [EMAIL PROTECTED] wrote:



Oh sorry, thanks for the clarification. IMHO HEAP_UPDATE_ROOT should be
renamed to match this meaning then (or what does ROOT mean here?).
Maybe HEAP_UPDATE_CHAIN?



Yes, you are right. There is some disconnect between what Simon had
originally posted and the patch I sent out. Hopefully as we discuss HOT
more here, everything will converge.

Thanks,
Pavan

--

EnterpriseDB http://www.enterprisedb.com


Re: [HACKERS] anyelement2 pseudotype

2007-02-16 Thread Tom Dunstan

Tom Lane wrote:

So it seems neither can_coerce_type() nor find_coercion_pathway() are
really particularly well thought out in terms of what they test or don't
test.  I'm not very sure what a good refactoring would look like,
but I am sure that I don't want all their call sites having to
individually account for ANYfoo types.  Any thoughts?


Yeah, I remember thinking at the time that some of it was a bit
backwards, but it's been almost 6 months since I did the original enum
patch, so I'll need to refresh my memory. I'll have a look over the
weekend and see if I can come up with something that'll work for these
various cases. To begin with I'll need to do a survey of the call sites
to see what they really need, since perhaps it isn't what the coerce
functions are currently offering. :) I completely agree that anything
requiring call sites to understand specifics about ANY* types is a bad
idea; the most we would want is a generic IsGeneric(typoid) macro, and it
would be nice to hide that inside a coerce function as well. We'll see.


Cheers

Tom



Re: [HACKERS] buildfarm failure in XML code

2007-02-16 Thread Peter Eisentraut
Tom Lane wrote:
 Not here:

Fixed.

 And this is in a build WITHOUT xml enabled --- why is this code being
 compiled at all?

The enablement pertains to libxml, which this code doesn't use.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/



Re: [HACKERS] buildfarm failure in XML code

2007-02-16 Thread Tom Lane
Peter Eisentraut [EMAIL PROTECTED] writes:
 Tom Lane wrote:
 Not here:

 Fixed.

Looks good, thanks.

regards, tom lane



Re: [HACKERS] anyelement2 pseudotype

2007-02-16 Thread Tom Lane
Tom Dunstan [EMAIL PROTECTED] writes:
 Tom Lane wrote:
 So it seems neither can_coerce_type() nor find_coercion_pathway() are
 really particularly well thought out in terms of what they test or don't
 test.  I'm not very sure what a good refactoring would look like,
 but I am sure that I don't want all their call sites having to
 individually account for ANYfoo types.  Any thoughts?

 To begin with I'll need to do a survey of the call sites 
 to see what they really need, since perhaps it isn't what the coerce 
 functions are currently offering. :)

I realized that I can probably fix ATAddForeignKeyConstraint to do the
right thing by having it pass the two actual column types to
can_coerce_type, thus allowing check_generic_type_consistency to kick
in and detect the problem.  I haven't got round to trying that (up to my
rear in planner bugs ATM :-() but I think the immediate problem can be
dealt with without refactoring.  Still, if you have any ideas for making
this code cleaner, I'm all ears.

regards, tom lane



[HACKERS] Confusing message on startup after a crash while recovering

2007-02-16 Thread Florian G. Pflug

Hi

When postgres crashes during recovery, and is then restarted, it
says:
database system was interrupted while in recovery at ...
This probably means that some data is corrupted and
you will have to use the last backup for recovery.

When I first read that message, I assumed that there are cases where
postgres can't recover from a crash that happened during recovery.
I guessed that some operations done during WAL restore are not
idempotent, and lead to corrupt data if performed twice.

Only after actually reading the source code of xlog.c, and seeing that
a similar (but better-worded) warning is output after a crash during
archive-log replay, did I realize that this warning probably just means
that corrupt data could be the _cause_ of the crash during recovery, not
something _caused by_ a crash during recovery.

I'd suggest that the text be changed to something along the lines of:
database system was interrupted while in recovery at ...
If this has occurred more than once, some data may be corrupted and
you may need to restore from the last backup.

This would also match the message for an interruption during archive
log replay more closely.

greetings, Florian Pflug



Re: [HACKERS] RFC: Temporal Extensions for PostgreSQL

2007-02-16 Thread Jim C. Nasby
My suggestion would be to focus on a period data type first and
foremost, as that's something that could be readily used by a lot of
folks. Of particular note, it's difficult to query tables that have
start_time and end_time fields to define a period; it's easy to screw up
the boundary conditions, and it's also hard to make those queries
perform well without going to extra lengths (such as defining a 'bogus'
GiST index on something like box(point(start,start),point(end,end))). And
it's not possible to do that in a way that avoids floating points and
their errors.

On Sat, Feb 10, 2007 at 12:20:28AM -0700, Warren Turkal wrote:
 Temporal Extensions for PostgreSQL
 by: Warren Turkal
 
 I would like to see a comprehensive solution to time-varying tables (or
 temporal tables) in PostgreSQL. I specifically want to see support for
 valid-time and transaction-time and bitemporal (valid-time and
 transaction-time) tables. I will be deferring the descriptions of much of
 the functionality to Dr. Richard T. Snodgrass's _Developing Time-Oriented
 Database Applications in SQL_ at [1]. The mangled pages 30-31 are at [2].
 
 
 a) Functionality
 
 Dr. Richard T. Snodgrass has worked on defining the semantics of temporal
 databases very completely in several writings. He was also involved in an
 unsuccessful effort to standardize temporal extensions to SQL. I believe
 his book does a good job of presenting the semantics of temporal
 databases and describing extensions to SQL that make the data much more
 natural to work with.
 
 
 b) How current solutions fall flat
 
 Current solutions fall flat due to the extreme complexity of implementing
 valid-time and transaction-time semantics on tables by adding columns to
 track all of the data. Please see chapter 11 of [1] for a more complete
 description of this complexity. Chapter 12 of [1] goes on to lay out new
 syntax for SQL that will make dealing with data of this nature much more
 natural.
 
 
 c) Examples
 
 --create normal table
 CREATE TABLE products
( id SERIAL PRIMARY KEY
, description TEXT
);
 
 -- Add valid-time support to the table with granularity of timestamp.
   ALTER TABLE products
 ADD VALIDTIME PERIOD(TIMESTAMP WITH TIMEZONE);
 
 -- Insert row valid from 2006-01-01 to just before 2007-01-01
 VALIDTIME PERIOD '[2006-01-01 - 2007-01-01)'
  INSERT INTO products
( description
)
   VALUES
( 'red ball'
);
 
 -- Insert row valid from 2007-01-01 to just before 2008-01-01
 -- Should be smart enough to realize the id=777 does not conflict in this time
 --  of validity.
 VALIDTIME PERIOD '[2007-01-01 - 2008-01-01)'
  INSERT INTO products
( id
, description
)
   VALUES
( 777
, 'blue ball'
);
 
 -- Select history of products with id=777
 VALIDTIME
SELECT *
  FROM product
 WHERE id=777;
 
   id | description | valid_period
  -----+-------------+---------------------------
   777 | red ball    | [2006-01-01 - 2007-01-01)
   777 | blue ball   | [2007-01-01 - 2008-01-01)
 
 -- Select current products with id=777
 -- The date when query was run was 2007-02-10.
 SELECT *
   FROM products
  WHERE id=777;
 
   id | description
  -----+-------------
   777 | blue ball
 
 There are many more details in chapter 12 of [1].
 
 
 d) New stuff (dependencies, indices, syntax, libraries)
 
 One of the base-level additions is the PERIOD datatype. I think that
 implementing temporal support is reliant on developing such a type. The
 description of this datatype is laid out in chapter 4 of [1]. The SQL
 syntax is present in chapter 12 of [1]. I see this as the first piece
 that needs to be implemented in order to take steps toward a DBMS that
 supports full temporal capabilities. I think that PERIOD can largely
 reuse the datetime functionality for parsing of literals and for
 comparisons. The RTREE seems to nicely incorporate the needed indexing of
 the PERIOD type. The syntax of the parser will have to be extended to
 handle the PERIOD literals and constructor. I don't believe any
 additional libraries will be required.
 
 There are also extensions to the syntax of table creation, table altering,
 querying, inserting, and updating on temporal tables. These are all discussed 
 in
 some detail in chapter 12 of [1]. I don't think that any of these changes will
 require new libraries.
 
 The semantics of temporal tables and querying them could have a dramatic
 effect on how things like primary keys and unique constraints work. I
 would like to get some comments about this from the community.
 
 
 e) See Also
 
 Additional resources can be found at Dr. Richard T. Snodgrass's website
 at [3], including the SQL valid-time table support spec at [4] and the
 SQL transaction-time table support spec at [5].
 
 Thoughts? Questions? Comments?
 
 [1] http://www.cs.arizona.edu/~rts/tdbbook.pdf
 

Re: [HACKERS] RFC: Temporal Extensions for PostgreSQL

2007-02-16 Thread Alvaro Herrera
Jim C. Nasby wrote:
 My suggestion would be to focus on a period data type first and
 foremost, as that's something that could be readily used by a lot of
 folks. Of particular note, it's difficult to query tables that have
 start_time and end_time fields to define a period; it's easy to screw up
 the boundary conditions, and it's also hard to make those queries
 perform well without going to extra lengths (such as defining a 'bogus'
 GiST index on something like box(point(start,start),point(end,end)). And
 it's not possible to do that in a way that avoids floating points and
 their errors.

FWIW there's already a type called tinterval that stores (start, end).  I
don't think it's documented much; maybe it can be extended or used
as a base for a new, more complete and robust type, indexable in a more
natural way, etc etc.

-- 
Alvaro Herrera        http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.



[HACKERS] autovacuum next steps

2007-02-16 Thread Alvaro Herrera
After staring at my previous notes for autovac scheduling, it has become
clear that the basics of it are not really going to work as specified.
So here is a more realistic plan:

First, we introduce an autovacuum_max_workers parameter, to limit the
total number of workers that can be running at any time.  Use this
number to create extra PGPROC entries, etc, similar to the way we handle
the prepared-xacts stuff.  The default should be low, say 3 or 4.

The launcher sends a worker into a database just like it does currently.
This worker determines what tables need vacuuming per the pg_autovacuum
settings and pgstat data.  If it's more than one table, it puts the
number of tables in shared memory and sends a signal to the launcher.

The launcher then starts
min(autovacuum_max_workers - currently running workers, tables to vacuum - 1)
more workers to process that database.  Maybe we could have a
max-workers parameter per-database in pg_database to use as a limit here
as well.
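
In pseudo-C, the launch rule above amounts to this (a sketch; the names
are hypothetical):

    /* Extra workers to start for a database, beyond the initial one. */
    static int
    extra_workers_to_start(int max_workers, int running, int tables_due)
    {
        int by_slots  = max_workers - running;  /* free worker slots */
        int by_tables = tables_due - 1;         /* one worker already there */
        int n = Min(by_slots, by_tables);

        return Max(n, 0);
    }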

Each worker, including the initial one, starts vacuuming tables
according to pgstat data.  They recheck the pgstat data after finishing
each table, so that a table vacuumed by another worker is not processed
twice (maybe problematic: a table with a high update rate may be vacuumed
more than once.  Maybe this is a feature, not a bug).


Once autovacuum_naptime has passed, if the workers have not finished
yet, the launcher wants to vacuum another database.  At this point, the
launcher wants some of the workers processing the first database to exit
early as soon as they finish one table, so that they can help vacuum
the other database.  It can do this by setting a flag in shmem that the
workers check when finished with a table; if the flag is set, they
exit instead of continuing with another table.  The launcher then starts
a worker in the second database.  The launcher does this until the
number of workers is even across both databases.  This can be done until
there is one worker per database; so at most autovacuum_max_workers
databases can be under automatic vacuuming at any time, one worker each.
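
The per-table check could be as simple as this fragment (a sketch; the
shared-memory field name is hypothetical):

    /* After each table, honor the launcher's redeployment request. */
    if (AutoVacuumShmem->av_want_worker_exit)
        proc_exit(0);       /* frees this slot for another database */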

When there are autovacuum_max_workers databases under vacuum, the
launcher doesn't have anything else to do until some worker exits on its
own.

When there is a single worker processing a database, it does not recheck
pgstat data after each table.  This is to prevent a high-update-rate
table from starving the vacuuming of other databases.


How does this sound?

-- 
Alvaro Herrera        http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [HACKERS] Confusing message on startup after a crash while recovering

2007-02-16 Thread Tom Lane
Florian G. Pflug [EMAIL PROTECTED] writes:
 I'd suggest that the text is changed to something along the line of:
 database system was interrupted while in recovery at ...
 If this has occurred more than once some data may be corrupted and
 you may need to restore from the last backup.

It seems the real problem is that it's not specifying *which* data is
probably corrupted.  Maybe:

HINT: If recovery fails repeatedly, it probably means that the recovery log
data is corrupted; you may have to restore from your last full backup.

Also, do we want to suggest use of pg_resetxlog in the message?

regards, tom lane



Re: [HACKERS] RFC: Temporal Extensions for PostgreSQL

2007-02-16 Thread Warren Turkal
On Fri, Feb 16, 2007 at 05:39:24PM -0300, Alvaro Herrera wrote:
 FWIW there's already a type called tinterval that stores (start,end).  I
 don't think it's very much documented; maybe it can be extended or used
 as base for a new, more complete and robust type, indexable in a more
 natural way, etc etc.

The book I cited has a very complete description of the period data type
including details on what extensions to SQL are needed. I am very
interested in starting a robust implementation of the period datatype.

I think the datetime infrastructure will already do most of the needed
parsing and packing of the hard parts of the period datatype (namely the
date and time formats). I will investigate tinterval to see if it
meets the needs of the PERIOD datatype.

I agree with focusing on the PERIOD datatype. I think that is a major
part of the foundation for temporal extensions and would have to be
implemented first. Therefore, I present the following plan for getting
there.

1) Focus first on PERIOD(DATE) to keep things as simple as possible.
2) Implement a first cut of the period datatype that only handles
   storing two dates. (Maybe tinterval will get us here for free?)
3) Add information to the datatype for open or closed intervals at the
   beginning and ending sides of the period (see the sketch below).
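
As a concrete starting point for steps 2 and 3 (a sketch; the struct and
field names are hypothetical):

    #include "postgres.h"
    #include "utils/date.h"

    /* Hypothetical first-cut representation for PERIOD(DATE). */
    typedef struct PeriodDate
    {
        DateADT lower;         /* starting date */
        DateADT upper;         /* ending date */
        bool    lower_closed;  /* '[' inclusive vs '(' exclusive */
        bool    upper_closed;  /* ']' inclusive vs ')' exclusive */
    } PeriodDate;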

I could probably have this done in time for the freeze with some
mentoring. I could probably even start implementing some indexes
and operator functions for the type. This functionality is what I expect
to have a shot at making an appearance in 8.3. It will be minimally
functional at this point.

The next project will be altering the parser to be able to construct and
operate on PERIOD types with the syntax extensions to SQL in Dr.
Snodgrass's book.

Once all of the syntax is implemented for PERIOD(DATE), the next project
will be to extend it to support PERIOD(TIMESTAMP WITH TIMEZONE). Again, I
think the datetime infrastructure will be very useful here.

wt



Re: [HACKERS] autovacuum next steps

2007-02-16 Thread Matthew T. O'Connor

Alvaro Herrera wrote:

After staring at my previous notes for autovac scheduling, it has become
clear that this basics of it is not really going to work as specified.
So here is a more realistic plan:


[Snip Detailed Description]


How does this sound?


At first blush, I'm not sure I like this, as it doesn't directly attack
the table starvation problem, and I think it could be a net loss of speed.


VACUUM is I/O bound; as such, just sending multiple vacuum commands at a
DB isn't going to make things faster, since you are now going to have
multiple processes reading from multiple tables at the same time.  I
think in general this is a bad thing (unless we someday account for I/O
made available from multiple tablespaces).  In general the only time it's
a good idea to have multiple vacuums running at the same time is when a
big table is starving a small hot table and causing bloat.


I think we can extend the current autovacuum stats to add one more
column that specifies "is hot" or something to that effect.  Then when
the AV launcher sends a worker to a DB, it will first look for tables
marked as hot and work on them.  While a worker is on hot tables, the
launcher need not send any additional workers to this database; if the
launcher notices that a worker is working on regular tables, it can send
another worker, which will look for hot tables to work on. If that worker
doesn't find any hot tables that need work, it exits, leaving the
original worker to continue plodding along.


Thoughts?





Re: [HACKERS] autovacuum next steps

2007-02-16 Thread Gregory Stark

In an ideal world I think you want precisely one vacuum process running per
tablespace on the assumption that each tablespace represents a distinct
physical device.

The cases where we currently find ourselves wanting more are where small
tables are due for vacuuming more frequently than the time it takes for a
large table to receive a single full pass.

If we could have autovacuum interrupt a vacuum in mid-sweep, perform a
cycle of vacuums on smaller tables, then resume, that problem would go
away. That sounds too difficult though; perhaps we could do something
nearly as good.

One option that I've heard before is to have vacuum after a single iteration
(ie, after it fills maintenance_work_mem and does the index cleanup and the
second heap pass), remember where it was and pick up from that point next
time.
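
A sketch of the bookkeeping that option implies (the struct is
hypothetical; where it would persist, pg_class or pgstat, is left open):

    #include "postgres.h"
    #include "storage/block.h"

    /* Remember where the last partial vacuum of a table stopped. */
    typedef struct VacResumePoint
    {
        Oid         relid;       /* table being vacuumed */
        BlockNumber next_block;  /* first heap block not yet scanned */
    } VacResumePoint;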

If instead autovacuum could tell vacuum exactly how long to run for (or
calculated how many pages that represented based on cost_delay) then it could
calculate when it will next need to schedule another table in the same
tablespace and try to arrange for the vacuum of the large table to be done by
then.

Once there are no smaller, more frequently vacuumed tables due to be
scheduled, it would start vacuum for the large table again, and it would
resume from where the first one left off.

This only works if the large tables really don't need to be vacuumed so often
that autovacuum can't keep up. Our current situation is that there is a size
at which this happens. But arranging to have only one vacuum process per
tablespace will only make that less likely to happen rather than more.

I think the changes to vacuum itself are pretty small to get it to remember
where it left off last time and start from mid-table. I'm not sure how easy it
would be to get autovacuum to juggle all these variables though.

Of course users may not create separate tablespaces for physical devices, or
they may set cost_delay so high you really do need more vacuum processes, etc.
So you probably still need a num_vacuum_daemons but the recommended setting
would be the same as the number of physical devices and autovacuum could try
to divide them equally between tablespaces which would amount to the same
thing.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com



Re: [HACKERS] autovacuum next steps

2007-02-16 Thread Alvaro Herrera
Matthew T. O'Connor wrote:
 Alvaro Herrera wrote:
 After staring at my previous notes for autovac scheduling, it has become
 clear that this basics of it is not really going to work as specified.
 So here is a more realistic plan:
 
 [Snip Detailed Description]
 
 How does this sound?
 
 On first blush, I'm not sure I like this as it doesn't directly attack 
 the table starvation problem, and I think it could be a net loss of speed.
 
 VACUUM is I/O bound; as such, just sending multiple vacuum commands at a
 DB isn't going to make things faster, since you are now going to have
 multiple processes reading from multiple tables at the same time.  I
 think in general this is a bad thing (unless we someday account for I/O
 made available from multiple tablespaces).

Yeah, I understand that.  However, I think that can be remedied by using
a reasonable autovacuum_vacuum_cost_delay setting, so that each worker
uses less than the total I/O available.  The main point of the proposal
is to allow multiple workers on a DB while also allowing multiple
databases to be processed in parallel.

 I think we can extend the current autovacuum stats to add one more
 column that specifies "is hot" or something to that effect.  Then when
 the AV launcher sends a worker to a DB, it will first look for tables
 marked as hot and work on them.  While a worker is on hot tables, the
 launcher need not send any additional workers to this database; if the
 launcher notices that a worker is working on regular tables, it can send
 another worker, which will look for hot tables to work on. If that worker
 doesn't find any hot tables that need work, it exits, leaving the
 original worker to continue plodding along.

How would you define what's a hot table?

-- 
Alvaro Herrera        http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.



Re: [HACKERS] autovacuum next steps

2007-02-16 Thread Matthew T. O'Connor

Alvaro Herrera wrote:

Matthew T. O'Connor wrote:
At first blush, I'm not sure I like this, as it doesn't directly attack
the table starvation problem, and I think it could be a net loss of speed.


VACUUM is I/O bound; as such, just sending multiple vacuum commands at a
DB isn't going to make things faster, since you are now going to have
multiple processes reading from multiple tables at the same time.  I
think in general this is a bad thing (unless we someday account for I/O
made available from multiple tablespaces).


Yeah, I understand that.  However, I think that can be remedied by using
a reasonable autovacuum_vacuum_cost_delay setting, so that each worker
uses less than the total I/O available.  The main point of the proposal
is to allow multiple workers on a DB while also allowing multiple
databases to be processed in parallel.


So you are telling people to choose an autovacuum_delay so high that
they need to run multiple autovacuums at once to keep up?  I'm probably
being too dramatic, but it seems inconsistent.


I think we can extend the current autovacuum stats to add one more
column that specifies "is hot" or something to that effect.  Then when
the AV launcher sends a worker to a DB, it will first look for tables
marked as hot and work on them.  While a worker is on hot tables, the
launcher need not send any additional workers to this database; if the
launcher notices that a worker is working on regular tables, it can send
another worker, which will look for hot tables to work on. If that worker
doesn't find any hot tables that need work, it exits, leaving the
original worker to continue plodding along.


How would you define what's a hot table?


I wasn't clear: I would have the admin specify it, and we can store it
as an additional column in the pg_autovacuum settings table.  Or perhaps
mark a table hot if it is below some size threshold and autovacuum sees
that it needs to be vacuumed every time it checks, 10 times in a row, or
something like that.




Re: [HACKERS] autovacuum next steps

2007-02-16 Thread Chris Browne
[EMAIL PROTECTED] (Alvaro Herrera) writes:
 When there is a single worker processing a database, it does not recheck
 pgstat data after each table.  This is to prevent a high-update-rate
 table from starving the vacuuming of other databases.

This case is important; I don't think that having multiple workers
fully alleviates the problem condition.

Pointedly, you need to have a way of picking up tables often enough to
avoid the XID rollover problem.  That may simply require that on some
periodic basis, a query is run to queue up tables that are getting
close to having an XID problem.
-- 
(reverse (concatenate 'string ofni.secnanifxunil @ enworbbc))
http://linuxfinances.info/info/finances.html
Rules of  the Evil Overlord #189. I  will never tell the  hero Yes I
was the one who  did it, but you'll never be able  to prove it to that
incompetent  old fool.  Chances  are, that  incompetent  old fool  is
standing behind the curtain.  http://www.eviloverlord.com/



Re: [HACKERS] autovacuum next steps

2007-02-16 Thread Tom Lane
Alvaro Herrera [EMAIL PROTECTED] writes:
 Each worker, including the initial one, starts vacuuming tables
 according to pgstat data.  They recheck the pgstat data after finishing
 each table, so that a table vacuumed by another worker is not processed
 twice (maybe problematic: a table with a high update rate may be vacuumed
 more than once.  Maybe this is a feature, not a bug).

How are you going to make that work without race conditions?  ISTM
practically guaranteed that all the workers will try to vacuum the same
table.

 Once autovacuum_naptime has passed, if the workers have not finished
 yet, the launcher wants to vacuum another database.

This seems a rather strange design, as it will encourage concentrations
of workers in a single database.  Wouldn't it be better to spread them
out among multiple databases by default?

regards, tom lane



[HACKERS] n-gram search function

2007-02-16 Thread Tatsuo Ishii
Hi,

Is anybody working on implementing n-gram search functionality for
text type data? tsearch2 is great for long text, but it's not
appropriate for short (10-100 byte) text data. What I want to achieve
is a fast partial-match search using indexes, i.e. foo ~ 'bar' or foo
LIKE '%bar%' type matching.
--
Tatsuo Ishii
SRA OSS, Inc. Japan



[HACKERS] 8.3 patches hold queue empty

2007-02-16 Thread Bruce Momjian
I have completed processing of the 8.3 patches hold queue.  There are
some emails where I am waiting on a reply from the authors, but I will
now just handle them as part of the normal patch process.

I will now return to processing patches as they come in, and deal with
the patches that are now waiting.

-- 
  Bruce Momjian  [EMAIL PROTECTED]  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [pgsql-patches] [PATCHES] [HACKERS] [Fwd: Index Advisor]

2007-02-16 Thread Bruce Momjian

I need someone to review this patch to make sure the API used is
logical.  You can do that by reading the README file in the patch.  To
me, the procedure seems overly complicated, and too restrictive.

The patch is in the patches queue.

---

Gurjeet Singh wrote:
 On 1/20/07, Bruce Momjian [EMAIL PROTECTED] wrote:
 
 
  I can't read a 7z file on my end.  Please email me the file and I will
  put it at a URL.
 
 
  ---
 
  Gurjeet Singh wrote:
   Please find attached the patches ported to HEAD as of now. The patch
   to the contrib modules is the same as before; the version number has
   been kept but the branch designator has been changed.
  
   1) pg_post_planner_plugin-HEAD_20070116-v2.patch.gz
   2) pg_index_adviser-HEAD_20070116-v26.7z
  
 
 
 I am attaching the .gz versions of both the patches, and CC'ing to -patches
 also. If it doesn't turn up on -patches even this time, then please do the
 needful.
 
 Thanks and best regards,
 
 -- 
 [EMAIL PROTECTED]
 [EMAIL PROTECTED] gmail | hotmail | yahoo }.com

[ Attachment, skipping... ]

[ Attachment, skipping... ]

 

-- 
  Bruce Momjian  [EMAIL PROTECTED]  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [HACKERS] integer datetimes

2007-02-16 Thread Bruce Momjian

OK, mention removed.  We can always re-add it if we find we need to warn
people away from integer timestamps again.

---

Magnus Hagander wrote:
 On Wed, Feb 14, 2007 at 12:38:12PM -0500, Andrew Dunstan wrote:
  Tom Lane wrote:
  Magnus Hagander [EMAIL PROTECTED] writes:

  Our docs for the integer datetime option says:
  Note also that the integer datetimes
  code is newer than the floating-point code, and we still find bugs in it
  from time to time.
  
  

  Is the last sentence about bugs really true anymore? At least the
  buildfarm seems to have a lot *more* machines with it enabled than
  without.
  
  
  Buildfarm proves only that the regression tests don't expose any bugs,
  not that there aren't any.
  

  (I'm thinking about making it the default for the VC++ build, which is
  why I came across that)
  
  
  FWIW, there are several Linux distros that build their RPMs that way,
  so it's not like people aren't using it.  But it seems like we find bugs
  in the datetime/interval stuff all the time, as people trip over
  different weird edge cases.
  
 

  
  I think it's disappointing, to say the least, that we treat this code as
  a sort of second-class citizen. BTW, the buildfarm has a majority of
  machines using it by design - it's in the default set of options in the
  distributed config file. If we think there are bugs we haven't found,
  then we need to engage in some sort of analytical effort to isolate
  them. I don't see any reason in principle why this code should be any
  more buggy than the float-based datetimes, and I see plenty of reason in
  principle why we should make sure it's right.
 
 That was exactly what I thought, which is why I was kinda surprised to
 see that note in the configure stuff.
 
 If we go with that, then we can say that *any* new feature is less
 tested, no? ;-)
 
 //Magnus
 
 ---(end of broadcast)---
 TIP 7: You can help support the PostgreSQL project by donating at
 
 http://www.postgresql.org/about/donate

-- 
  Bruce Momjian  [EMAIL PROTECTED]  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +
Index: doc/src/sgml/installation.sgml
===
RCS file: /cvsroot/pgsql/doc/src/sgml/installation.sgml,v
retrieving revision 1.282
diff -c -c -r1.282 installation.sgml
*** doc/src/sgml/installation.sgml	3 Feb 2007 23:01:06 -	1.282
--- doc/src/sgml/installation.sgml	17 Feb 2007 01:24:57 -
***
*** 965,973 
   the full range (see
   ![%standalone-include[the documentation about datetime datatypes]]
   ![%standalone-ignore[xref linkend=datatype-datetime]]
!  for more information).  Note also that the integer datetimes code is
!  newer than the floating-point code, and we still find bugs in it from
!  time to time.
  /para
 /listitem
/varlistentry
--- 965,971 
   the full range (see
   ![%standalone-include[the documentation about datetime datatypes]]
   ![%standalone-ignore[xref linkend=datatype-datetime]]
!  for more information).
  /para
 /listitem
/varlistentry

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] NULL and plpgsql rows

2007-02-16 Thread Bruce Momjian

Added to TODO under PL/pgSQL:

o Allow row and record variables to be set to NULL constants,
  and allow NULL tests on such variables

  Because a row is not scalar, do not allow assignment
  from NULL-valued scalars.


---

Jim C. Nasby wrote:
 On Tue, Feb 13, 2007 at 05:55:11PM -0500, Bruce Momjian wrote:
  
  Is there a TODO here?
  
  ---
  
  Jim Nasby wrote:
   On Oct 2, 2006, at 6:28 PM, Tom Lane wrote:
Jim C. Nasby [EMAIL PROTECTED] writes:
However, the test right above that means that we'll fail if the user
tries something like row_variable := NULL;:
   
The patch you seem to have in mind would allow
row_variable := int_variable;
to succeed if the int_variable chanced to contain NULL, which is  
surely
not very desirable.
 
 Well, that's Tom's objection, though I'm not sure if by 'int_variable'
 he means 'internal' or 'integer'.
 
 Personally, I think it would be useful to just allow setting a row or
 record variable to NULL as I showed above, i.e. with no variables
 involved.  This is something you might want to do to invalidate a
 row/record variable after taking some action (perhaps deleting a row).
 
 You'd also think that you should be able to detect if a record variable
 is null, as you can with row.
 
 So, I suggest:
 
 * Allow row and record variables in plpgsql to be set to NULL
 
 It's not clear whether it's wise to allow this assignment from a
 variable. It may be better to only allow setting them explicitly,
 i.e.:
 
 row_variable := NULL;
 
 * Allow testing a record variable to see if it's NULL
 
 Currently works for row variables, but not record variables
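
 A minimal sketch of the proposed behaviour (this is what the TODO asks
 for, not what current plpgsql accepts):

     CREATE FUNCTION null_row_demo() RETURNS void AS $$
     DECLARE
         r record;
     BEGIN
         r := NULL;           -- proposed: explicit NULL constant is allowed
         IF r IS NULL THEN    -- proposed: IS NULL works for record, as for row
             RAISE NOTICE 'r has been invalidated';
         END IF;
     END;
     $$ LANGUAGE plpgsql;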
 -- 
 Jim Nasby[EMAIL PROTECTED]
 EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)

-- 
  Bruce Momjian  [EMAIL PROTECTED]  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] autovacuum next steps

2007-02-16 Thread Ron Mayer
Alvaro Herrera wrote:
 
 Once autovacuum_naptime... autovacuum_max_workers...
 How does this sound?

The knobs exposed on autovacuum feel kinda tangential to
what I think I'd really want to control.

IMHO vacuum_mbytes_per_second would be quite a bit more
intuitive than cost_delay, naptime, etc.


ISTM I can relatively easily estimate and/or spec out how
much extra I/O bandwidth I have per device for vacuum, and
would pretty much want vacuum to be constantly running on
whichever table needs it the most, so long as it can stay
under that bandwidth limit.

Could vacuum have a tunable that says X MBytes/second
(perhaps per device) and have it measure how much I/O
it's actually doing and try to stay under that limit?

For more fine-grained control a cron job could go
around setting different MBytes/second limits during
peak times vs idle times.
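
To sketch what that might look like (vacuum_mbytes_per_second is
purely hypothetical here -- it is not an existing GUC):

    # postgresql.conf -- hypothetical knob, not an existing setting:
    vacuum_mbytes_per_second = 10   # I/O budget for vacuum, in MB/s per device

    # a cron job could then raise the limit off-peak and lower it again
    # during peak hours, editing postgresql.conf and running
    # "pg_ctl reload" each time.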


If people are concerned about CPU-intensive vacuums
instead of I/O-intensive ones (does anyone actually
experience that?), another tunable vacuum_percent_of_cpu
would be more straightforward than delay_cost, cost_page_hit,
etc.  But I'd be a bit surprised if CPU-intensive
vacuums are common.

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] add to ToDo, please

2007-02-16 Thread Bruce Momjian
Pavel Stehule wrote:
 Hello
 
 please add to ToDo: Holdable cursor support in SPI

Added:

* Allow holdable cursors in SPI
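
For context, this is the SQL-level behaviour the TODO item asks to
expose through SPI -- a cursor declared WITH HOLD outlives the
transaction that created it:

    BEGIN;
    DECLARE c CURSOR WITH HOLD FOR SELECT * FROM generate_series(1, 10);
    COMMIT;              -- c stays open after commit
    FETCH 3 FROM c;      -- works outside the creating transaction
    CLOSE c;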

-- 
  Bruce Momjian  [EMAIL PROTECTED]  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


[HACKERS] Invalid to_date patterns (was: [PATCHES] [GENERAL] ISO week dates)

2007-02-16 Thread Brendan Jurd

On 2/17/07, Alvaro Herrera [EMAIL PROTECTED] wrote:

Bruce Momjian wrote:

 Maybe now would be an appropriate time to discuss the open questions in
 the submitting email:

  Brendan Jurd wrote:
   I'd also like to raise the topic of how conversion from text to ISO
   week dates should be handled, where the user has specified a bogus
   mixture of fields.  Existing code basically ignores these issues; for
   example, if a user were to call to_date('1998-01-01 2454050',
   'YYYY-MM-DD J') the function returns 2006-01-01, a result of setting
   the year field from YYYY, then overwriting year, month and day with
   the values from the Julian date in J, then setting the month and day
   normally from MM and DD.
  
   2006-01-01 is not a valid representation of either of the values the
   user specified.  Now you might say "ask a silly question, get a silly
   answer"; the user shouldn't send nonsense arguments to to_date and
   expect a sensible result.  But perhaps the right way to respond to a
   broken timestamp definition is to throw an error, rather than behave
   as though everything has gone to plan, and return something which is
   not correct.
  
   The same situation can arise if the user mixes ISO and Gregorian data;
   how should Postgres deal with something like to_date('2006-250',
   'IYYY-DDD')?  The current behaviour in my patch is actually to assume
   that the user meant to say 'IYYY-IDDD', since the 250th Gregorian day
   of the ISO year 2006 is total gibberish.  But perhaps it should be
   throwing an error message.

My thinking is that erroneous patterns should throw an error, and not
try to second-guess the user.  (IIRC this was being discussed in some
other thread not long ago).


It seems to me there are basically two different responses to the
problem of invalid patterns.  One is to reject all patterns which
potentially under- or over-constrain the date value, and the other is
to only reject those patterns which, when applied to the given date
string, actually cause a conflict.

For example, on the surface the pattern 'YYYY-MM-DD J' would appear to
be invalid, because it specifies the date using both the Gregorian and
Julian conventions.  You could argue that the whole idea of using a
pattern like this is bogus, and reject the pattern as soon as it is
parsed.

On the other hand, if a user called to_date('2007-02-17 2454149',
'YYYY-MM-DD J'), and you attempted to resolve the pattern, you would
find that the Julian date and the Gregorian date agree perfectly with
each other, and there is no reason to reject the conversion.
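
To make that concrete (taking the arithmetic above at face value,
with 2454149 as the Julian day for 2007-02-17):

    -- redundant but consistent: under the "reject only actual conflicts"
    -- approach this succeeds
    SELECT to_date('2007-02-17 2454149', 'YYYY-MM-DD J');

    -- redundant and contradictory: either approach would raise an error
    -- here, instead of silently returning 2006-01-01 as the code does now
    SELECT to_date('1998-01-01 2454050', 'YYYY-MM-DD J');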

My gut reaction at first was to go with the former approach.  It's
programmatically more simple, and it's easier to explain in
documentation/error messages.  But then it occurred to me that one of
the use cases for to_date is slurping date information out of textual
reports which may contain redundant date information.  If a user
wanted to parse something like "2007-02-17 Q1", he would probably try
'YYYY-MM-DD "Q"Q', even though this pattern is logically
over-constraining.  Would it be fair to throw an error in such a case?

Please let me know what you think.

BJ

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] RFC: Temporal Extensions for PostgreSQL

2007-02-16 Thread Oleg Bartunov

On Fri, 16 Feb 2007, Alvaro Herrera wrote:


Jim C. Nasby wrote:

My suggestion would be to focus on a period data type first and
foremost, as that's something that could be readily used by a lot of
folks. Of particular note, it's difficult to query tables that have
start_time and end_time fields to define a period; it's easy to screw up
the boundary conditions, and it's also hard to make those queries
perform well without going to extra lengths (such as defining a 'bogus'
GiST index on something like box(point(start,start),point(end,end))).
And it's not possible to do that in a way that avoids floating-point
values and their errors.
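
(For the record, a sketch of the workaround described above -- note
the round trip through float8 epoch values, which is exactly where the
floating-point errors creep in:)

    CREATE TABLE events (start_time timestamp, end_time timestamp);

    -- the 'bogus' index: each period becomes a degenerate box on the diagonal
    CREATE INDEX events_period_idx ON events USING gist
        (box(point(date_part('epoch', start_time), date_part('epoch', start_time)),
             point(date_part('epoch', end_time),   date_part('epoch', end_time))));

    -- overlap test: two periods intersect iff their boxes overlap (&&)
    SELECT *
      FROM events
     WHERE box(point(date_part('epoch', start_time), date_part('epoch', start_time)),
               point(date_part('epoch', end_time),   date_part('epoch', end_time)))
        && box(point(date_part('epoch', timestamp '2007-02-01'),
                     date_part('epoch', timestamp '2007-02-01')),
               point(date_part('epoch', timestamp '2007-03-01'),
                     date_part('epoch', timestamp '2007-03-01')));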


FWIW there's already a type called tinterval that stores (start, end).  I
don't think it's much documented; maybe it can be extended or used
as the base for a new, more complete and robust type, indexable in a more
natural way, etc.


The RI-Tree (Relational Interval Tree)
http://www.dbs.informatik.uni-muenchen.de/Forschung/CAD/presentations/RI-Tree.pdf
looks promising for those purposes.

Regards,
Oleg
_
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [HACKERS] n-gram search function

2007-02-16 Thread Oleg Bartunov

3-grams are implemented in contrib/pg_trgm. It currently uses a GiST
index, but could be enhanced to use GIN.
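
A minimal example of what's already there (note that the % operator
does trigram similarity matching, so it only partially covers the
LIKE '%bar%' case Tatsuo asks about):

    CREATE TABLE words (w text);
    CREATE INDEX words_trgm_idx ON words USING gist (w gist_trgm_ops);

    -- index-assisted similarity search
    SELECT w, similarity(w, 'bar') AS sml
      FROM words
     WHERE w % 'bar'
     ORDER BY sml DESC;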

Oleg

On Sat, 17 Feb 2007, Tatsuo Ishii wrote:


Hi,

Is anybody working on implementing n-gram search functionality for
text type data? tsearch2 is great for long text, but it's not
appropriate for short (10-100 byte) text data. What I want to achieve
is fast partial-match search using indexes, i.e. foo ~ 'bar' or foo
LIKE '%bar%' style matching.
--
Tatsuo Ishii
SRA OSS, Inc. Japan

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster



Regards,
Oleg
_
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings