Re: [HACKERS] HOT for PostgreSQL 8.3
On Tue, 2007-02-13 at 09:38, Tom Lane wrote: Heikki Linnakangas [EMAIL PROTECTED] writes: Hannu Krosing wrote: Are we actually doing that? I.e., are null bitmaps really allocated in 1-byte steps nowadays? Yes. Not really; we still have to MAXALIGN at the end of the bitmap. The point is that you can get 8 bits in there before paying the first additional MAXALIGN increment. It's all moot anyway since 8 bits isn't enough for a pointer ... With 8k pages and MAXALIGN=8 we just barely can, as with the current page structure (tuple headers together with data) the minimal tuple size for that case is 24 bytes for the header + 8 for data = 32 bytes, which means 256 tuples per page (minus the page header), so we can store a tuple number in 8 bits. OTOH, for same-page HOT tuples, we have the command and transaction ids stored twice: first as cmax,xmax of the old tuple and then as cmin,xmin of the updated tuple. One of these could probably be used for an in-page HOT tuple pointer. As we can't rely on get-xmax-from-next-in-chain behaviour for off-page tuples, we need to move the other way, that is, store a pointer to the previous version and get xmin from there. That means keeping a chain of back pointers in the xmin fields for all but the oldest tuple in the HOT chain/cycle. This requires a little shuffling around of xmin/xmax info when removing dead HOT chain members, but it would remove the need to do any index changes during HOT updates. Avoiding all index updates is probably good, as it means we don't need to do any index page locking. So the proposed structure for a HOT chain is like this:
* The oldest tuple in the chain has the real xmin/xmax. This needs a flag to be recognisable, or we may just start the transaction counter at 64K so that any xmin that fits in 2 bytes is immediately recognized as a HOT back pointer. Or start cmin at 1 and use cmin=0 as the pointer flag.
* All newer tuples have the real xmax, and an in-page pointer to the previous version in place of xmin; the real xmin is the xmax of the previous tuple.
When the previous tuple is removed, the real xmin is stored in the new oldest tuple.
* The index can point to any tuple in the HOT chain.

Finding the tuple by index:
* To find a tuple, one arrives at (possibly) the middle of the HOT chain, and the tuple may be found either by following the in-page pointers stored in the xmin fields or the ctid pointers.
* If the tuple's ctid points off-page, it must be the newest one in the HOT chain and thus the end of it. The chain also ends when the tuple was inserted by a rolled-back transaction.
* There is a special case when the index pointer points to a rolled-back tuple. In this case the tuple can't be removed, but it should be reused on the next update. But this is the case for any HOT rollbacks.

The proposed structure should enable cyclic reuse of dead tuples as they fall out of the visibility window and avoid updating index pointers. -- Hannu Krosing Database Architect Skype Technologies OÜ Akadeemia tee 21 F, Tallinn, 12618, Estonia Skype me: callto:hkrosing Get Skype for free: http://www.skype.com ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Fixing insecure security definer functions
Regarding the advisory on possibly insecure security definer functions that I just sent out (by overriding the search path you can make the function do whatever you want with the privileges of the function owner), the favored solution after some initial discussion in the core team was to save the search path at creation time with each function. Have you considered hardcoding the schema for each object where it was found at creation time? This seems more intuitive to me. Also, using a search path leaves open the possibility of injecting an object into an earlier schema in the path. Andreas
[HACKERS] HOT WIP Patch - version 1
This is a WIP patch based on the recent posting by Simon and the discussions thereafter. We are trying to do one piece at a time, and the intention is to post the work ASAP so that we can get early and continuous feedback from the community. We can then incorporate those suggestions in the next WIP patch. To start with, this patch implements HOT-update for the simple case where there is enough free space in the same block to accommodate the new version of the tuple. A necessary condition for doing a HOT-update is that none of the index columns is changed. The old version is marked HEAP_UPDATE_ROOT and the new version is marked HEAP_ONLY_TUPLE. If a tuple is HOT-updated, no new index entry is added. When fetching a tuple using an index, if the root tuple is not visible to the given snapshot, the ctid chain is followed until a visible tuple is found or the end of the HOT-update chain is reached. The prior_xmax/next_xmin chain is validated while following the ctid chain. This patch is generated against the current CVS head. It passes all the regression tests, but I haven't measured any performance impact since that's not the goal of posting this early version. There are several things that are not yet implemented, and there are a few unresolved issues for which I am looking for community help and feedback.

Open Issues:
- CREATE INDEX needs more work in the HOT context. The existing HOT tuples may require chilling for CREATE INDEX to work correctly. There are concerns about the crash-safety of the chilling operation. A few suggestions were posted in this regard. We need to conclude that discussion and post a working design/patch.
- We need to find a way to handle DEAD root tuples, either converting them into stubs or overwriting them with a new version. We could also perform pointer swinging from the index. Again there are concerns about crash-safety and concurrent index-scans working properly. We don't have a community consensus on any of the suggestions in this regard.
But hopefully we will converge on some design soon.
- Retail VACUUM. We need to implement block-level vacuum for UPDATEs to find enough free space in the block to do a HOT-update. Though we are still discussing how to handle the dead root tuples, we should be able to remove any intermediate dead tuples in the HOT-update chain safely. If we do so without fixing the root tuple, the prior_xmax/next_xmin chain would be broken. A similar problem exists with freezing HOT tuples.

What's Next:
In the current implementation, a HOT-updated tuple cannot be vacuumed because it might be in the middle of the access path to the heap-only visible tuple. This can cause the table to grow rapidly even if autovacuum is turned on. The HOT-update chain also keeps growing as long as there is enough free space in the block. I am thinking of implementing some sort of HOT-update chain squeezing logic so that intermediate dead tuples can be retired and vacuumed away. This would also help us keep the HOT-update chain small enough that following the chain does not become unduly costly. I am thinking of squeezing the HOT-update chain while following it in the index fetch. If the root tuple is dead, we follow the chain until the first LIVE or RECENTLY_DEAD tuple is found. The ctid pointer in the root tuple is then made to point to that first LIVE or RECENTLY_DEAD tuple. All the intermediate DEAD tuples are marked ~HEAP_UPDATE_ROOT so that they can be vacuumed in the next cycle. We hold an exclusive lock on the page while doing so; that should avoid any race conditions. This infrastructure should also help us retail-vacuum the block later. Please let me know your comments. Thanks, Pavan -- EnterpriseDB http://www.enterprisedb.com NewHOT-v1.1-pgsql-head.patch.gz Description: GNU Zip compressed data
Re: [HACKERS] Fixing insecure security definer functions
On Wednesday, 14 February 2007 at 10:21, Zeugswetter Andreas ADI SD wrote: Have you considered hardcoding the schema for each object where it was found at creation time? Unless anyone has solved the halting problem lately, I don't think it's possible to determine at creation time which objects will be accessed. At least not for all languages. -- Peter Eisentraut http://developer.postgresql.org/~petere/
[HACKERS] integer datetimes
Our docs for the integer datetime option say: Use 64-bit integer storage for datetimes and intervals, rather than the default floating-point storage. This reduces the range of representable values but guarantees microsecond precision across the full range (see Section 8.5 for more information). Note also that the integer datetimes code is newer than the floating-point code, and we still find bugs in it from time to time. Is the last sentence about bugs really true anymore? At least the buildfarm seems to have a lot *more* machines with it enabled than without. (I'm thinking about making it the default for the vc++ build, which is why I came across that) //Magnus
Re: [HACKERS] anyelement2 pseudotype
Tom Lane wrote: As for actually adding it, grep for all references to ANYELEMENT and add code accordingly; shouldn't be that hard. Note you'd need to add an anyarray2 at the same time for things to keep working sanely. The enum patch [1] does exactly this with an ANYENUM pseudo-type. It should provide a pretty good overview of what will be required. Cheers Tom [1] http://archives.postgresql.org/pgsql-patches/2007-02/msg00239.php
Re: [HACKERS] 8.1 stats issues on Win32
Magnus Hagander wrote: I've been looking at backporting the stats fix committed to head and 8.2 into 8.1, but realised that it's just not going to work. 8.1 still uses the dual stats process stuff, which means that the simplification just is not possible. The most obvious result is that autovacuum is very likely to fail on 8.1 if your system load is high enough. (All of stats fails, of course, but autovacuum is a very common user of it.) Should we note this somewhere? Probably. Oh, and if we were looking for reasons to deprecate 8.1, this sounds like a pretty good one to me. I still think we should keep patching it, but it is a very good reason to encourage our users to switch to 8.2. Now, we could try to fix it there, but we've seen a lot of issues since day one coming from inheriting the socket in two steps, so even if we get this one fixed, there could be more lurking in the dual-process model. I personally don't think it's worth investing the required time into fixing that on 8.1. I'm inclined to agree - time would be better spent improving current and future versions than chasing stats issues which, as we know, can be a real pita to find. Regards, Dave
Re: [HACKERS] Writing triggers in C++
Tom Lane wrote: Jacob Rief [EMAIL PROTECTED] writes: I tried to write a trigger using C++. That is most likely not going to work anyway, because the backend operating environment is C not C++. If you dumb it down enough --- no exceptions, no RTTI, no use of the C++ library --- then it might work. I can confirm that it does work this way. Regards, Andreas
Re: [HACKERS] Writing triggers in C++
Andreas Pflug wrote: Tom Lane wrote: Jacob Rief [EMAIL PROTECTED] writes: I tried to write a trigger using C++. That is most likely not going to work anyway, because the backend operating environment is C not C++. If you dumb it down enough --- no exceptions, no RTTI, no use of the C++ library --- then it might work. I can confirm that it does work this way. I've written an aggregate function that uses C++ STL hashes, and it seems to work pretty well. I'd think that using exceptions should be fine, as long as you make sure to _always_ catch any exception that might be thrown inside your own C++ code, and don't let it propagate into backend code. STL allows you to specify custom allocator classes as template parameters to hash, vector and the like. You can use that to let STL allocate memory from the correct memory context. greetings, Florian Pflug
Re: [HACKERS] HOT for PostgreSQL 8.3
On 2/14/07, Hannu Krosing [EMAIL PROTECTED] wrote: OTOH, for same-page HOT tuples, we have the command and transaction ids stored twice: first as cmax,xmax of the old tuple and then as cmin,xmin of the updated tuple. One of these could probably be used for an in-page HOT tuple pointer. I think we recently merged cmin/cmax into a single combo cid, so I don't think we can use cmin for storing the back pointer. The idea of using xmin seems feasible, though. As we can't rely on get-xmax-from-next-in-chain behaviour for off-page tuples, we need to move the other way, that is, store a pointer to the previous version and get xmin from there. That means keeping a chain of back pointers in the xmin fields for all but the oldest tuple in the HOT chain/cycle. Why can't we store the real xmin/xmax at the end of the chain and store forward pointers in the xmax? I find the idea of having back pointers more compelling, though. This requires a little shuffling around of xmin/xmax info when removing dead HOT chain members, but it would remove the need to do any index changes during HOT updates. Avoiding all index updates is probably good, as it means we don't need to do any index page locking. I agree. If we can design a solution that does not require any index updates, that would save us a lot. One problem with such an approach, though, is that we might have to live with a dead tuple/stub until another update occurs and the tuple/stub is reused. So the proposed structure for a HOT chain is like this:
* The oldest tuple in the chain has the real xmin/xmax. This needs a flag to be recognisable, or we may just start the transaction counter at 64K so that any xmin that fits in 2 bytes is immediately recognized as a HOT back pointer. Or start cmin at 1 and use cmin=0 as the pointer flag.
* All newer tuples have the real xmax, and an in-page pointer to the previous version in place of xmin; the real xmin is the xmax of the previous tuple. When the previous tuple is removed, the real xmin is stored in the new oldest tuple.
* The index can point to any tuple in the HOT chain.
Finding the tuple by index:
* To find a tuple, one arrives at (possibly) the middle of the HOT chain, and the tuple may be found either by following the in-page pointers stored in the xmin fields or the ctid pointers. Can we quickly figure out, based on the given snapshot and the root tuple's xmin/xmax, whether to follow the forward or the backward pointer to arrive at a visible tuple?
* If the tuple's ctid points off-page, it must be the newest one in the HOT chain and thus the end of it. The chain also ends when the tuple was inserted by a rolled-back transaction.
* There is a special case when the index pointer points to a rolled-back tuple. In this case the tuple can't be removed, but it should be reused on the next update. But this is the case for any HOT rollbacks.

One problem is with aborted HOT-updates. According to this scheme, the xmin of the aborted version is stored in the previous version. What happens when that gets updated again and committed? If we follow the back pointer from the aborted tuple to fetch the xmin, we would actually be fetching the txid of the committed transaction that later updated the tuple. This would fool us into incorrectly believing that the aborted tuple is LIVE. Maybe we can set XMIN_INVALID for the next tuple in the chain when a tuple being updated has t_ctid pointing to a tuple in the same page other than itself. I am sure there are interesting interactions with vacuuming that need to be considered as well. Thanks, Pavan -- EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] Writing triggers in C++
Florian G. Pflug wrote: Andreas Pflug wrote: Tom Lane wrote: Jacob Rief [EMAIL PROTECTED] writes: I tried to write a trigger using C++. That is most likely not going to work anyway, because the backend operating environment is C not C++. If you dumb it down enough --- no exceptions, no RTTI, no use of the C++ library --- then it might work. I can confirm that it does work this way. I've written an aggregate function that uses C++ STL hashes, and it seems to work pretty well. I'd think that using exceptions should be fine, as long as you make sure to _always_ catch any exception that might be thrown inside your own C++ code, and don't let it propagate into backend code. STL allows you to specify custom allocator classes as template parameters to hash, vector and the like. You can use that to let STL allocate memory from the correct memory context. What happens if Postgres raises an elog(ERROR) in the code you're catching exceptions in? Is it propagated outwards? -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
[HACKERS] Plan for compressed varlena headers
So to implement the agreed-upon plan I see the following items.
1) Replace the VARATT_SIZEP macro with SET_VARLENA_LEN. I intend to keep this patch separate, as it will bitrot quickly and would be best applied as soon as possible, even before the main patch is committed. I just sent a patch to do this, with some notes, to pgsql-patches.
2) Replace the VARATT* macros that store and retrieve the toast bits in a manner that will work for variable-length headers. This either means storing the bits at the least-significant position or using network byte order. If we want to allow storing 1-byte headers unaligned, which I think would be good, then I still think we have to read them using bytewise lookups -- i.e. by casting to (char*). That means network byte order and using the low-order bits are equally efficient.
3) Have VARSIZE and VARATT_SIZE recognize short headers and produce accurate values.
4) Change heap_deform*tuple, heap_getattr and any other functions and macros in heapam.c that step through tuples to recognize the new headers. Actually, these should mostly just work because att_addlength uses VARSIZE, but there may be additional changes. Other places that use att_addlength and need to be checked are heaptuple.c, indextuple.c, arrayfuncs.c, datum.c, varlena.c, execQual.c, and flatfiles.c.
5) Change pg_detoast_datum to recognize the new header types and decompress them.
6) Change heap_form_tuple to compress headers where possible.
7) Fix the toaster to generate new-style toasted data.
Did I miss anything? -- Gregory Stark EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] HOT for PostgreSQL 8.3
Hannu Krosing [EMAIL PROTECTED] writes: On Tue, 2007-02-13 at 09:38, Tom Lane wrote: It's all moot anyway since 8 bits isn't enough for a pointer ... With 8k pages and MAXALIGN=8 we just barely can, as with the current page structure (tuple headers together with data) the minimal tuple size for that case is 24 bytes for the header + 8 for data = 32 bytes, which means 256 tuples per page (minus the page header), so we can store a tuple number in 8 bits. But neither of those assumptions is acceptable --- I would think in fact that people would become *more* interested in having large pages with HOT, because it would give more room for updates-on-the-same-page. Besides, you forgot the case of an empty tuple (<= 8 columns, all NULL). OTOH, for same-page HOT tuples, we have the command and transaction ids stored twice: first as cmax,xmax of the old tuple and then as cmin,xmin of the updated tuple. One of these could probably be used for an in-page HOT tuple pointer. This proposal seems awfully fragile, because the existing tuple-chain-following logic *depends for correctness* on comparing each tuple's xmin to the prior xmax. I don't think you can just wave your hands and say we don't need that cross-check. Furthermore, it seems to me you haven't fixed the problem, which is that you can't remove the chain member that is being pointed at by off-page links (either index entries or a previous generation of the same tuple). As described, you've made that problem worse, because you're trying to say we don't know which of the chain entries is pointed at. regards, tom lane
Re: [HACKERS] anyelement2 pseudotype
Tom Dunstan [EMAIL PROTECTED] writes: The enum patch [1] does exactly this with an ANYENUM pseudo-type. It should provide a pretty good overview of what will be required. ANYENUM? What's the use-case for that? These special cases in the type system are enough of a pain in the neck for the code that I'm disinclined to add one without a very solid argument for it. regards, tom lane
Re: [HACKERS] anyelement2 pseudotype
Tom Lane wrote: Tom Dunstan [EMAIL PROTECTED] writes: The enum patch [1] does exactly this with an ANYENUM pseudo-type. It should provide a pretty good overview of what will be required. ANYENUM? What's the use-case for that? These special cases in the type system are enough of a pain in the neck for the code that I'm disinclined to add one without a very solid argument for it. Well ... *somebody* suggested it here ... http://archives.postgresql.org/pgsql-hackers/2005-11/msg00457.php ;-) cheers andrew
Re: [HACKERS] Writing triggers in C++
Alvaro Herrera wrote: Florian G. Pflug wrote: Andreas Pflug wrote: Tom Lane wrote: Jacob Rief [EMAIL PROTECTED] writes: I tried to write a trigger using C++. That is most likely not going to work anyway, because the backend operating environment is C not C++. If you dumb it down enough --- no exceptions, no RTTI, no use of the C++ library --- then it might work. I can confirm that it does work this way. I've written an aggregate function that uses C++ STL hashes, and it seems to work pretty well. I'd think that using exceptions should be fine, as long as you make sure to _always_ catch any exception that might be thrown inside your own C++ code, and don't let it propagate into backend code. STL allows you to specify custom allocator classes as template parameters to hash, vector and the like. You can use that to let STL allocate memory from the correct memory context. What happens if Postgres raises an elog(ERROR) in the code you're catching exceptions in? Is it propagated outwards? In my case, the only possible source of an elog(ERROR) would be palloc(), when the machine is out of memory (does it even throw elog(ERROR), or does it return NULL just as malloc() does?). Since this is rather unlikely, and would probably lead to a postgres shutdown anyway, I didn't really care about that case. You're right of course that this is different for triggers - they're much more likely to call SPI functions or otherwise interact with the backend than my rather self-contained aggregate function. Still, I'd think that an elog(ERROR) would propagate outwards - but any C++ destructors of local (stack-allocated) objects wouldn't be called. So, to be safe, I guess one would need to surround any call that could raise elog(ERROR) with an appropriate handler that translates the elog(ERROR) into a C++ exception. This C++ exception would have to be translated back into an elog(ERROR) at the outermost level of C++ code.
Maybe we should create some wiki page or pgfoundry project that collects all the glue code, tips and tricks that people have invented to glue C++ into the postgres backend. greetings, Florian Pflug
Re: [HACKERS] anyelement2 pseudotype
Andrew Dunstan [EMAIL PROTECTED] writes: Tom Lane wrote: ANYENUM? What's the use-case for that? Well ... *somebody* suggested it here ... http://archives.postgresql.org/pgsql-hackers/2005-11/msg00457.php Well, in that usage (ie, for enum I/O functions) it's not actually necessary that the type system as a whole understand ANYENUM as something that any enum type can be cast to, because you're going to hot-wire the pg_type entries during CREATE ENUM anyway. What I'm wondering is if there's a use-case for it during ordinary user operations with enums. regards, tom lane
Re: [HACKERS] HOT WIP Patch - version 1
Pavan Deolasee wrote: - We need to find a way to handle DEAD root tuples, either converting them into stubs or overwriting them with a new version. We could also perform pointer swinging from the index. Again there are concerns about crash-safety and concurrent index-scans working properly. We don't have a community consensus on any of the suggestions in this regard. But hopefully we will converge on some design soon. This seems to be the most fundamental problem we have at the moment. If we stick to the basic rule we have now, that a live tuple's ctid doesn't change, the only way to get rid of a dead tuple at the root of the update chain is by changing the index pointer. The backward pointers Hannu suggested, or the "scan the whole page to find the previous tuple" approach, would allow reuse of those dead tuples for new tuples in the chain, but even those methods wouldn't completely eliminate the problem. We could say that in some scenarios we just leave behind some dead tuples/stubs that can never be reclaimed. What do you guys think - if we can bring it down to just an extra line pointer, would that be acceptable? We could also do that for now and implement the pointer-swinging later if it turns out to be a problem in practice. What's the verdict on relaxing the "live tuple's ctid doesn't change" rule? If we did allow that within a page, what would we need to change? Inside the backend we'd have to make sure that whenever a ctid is used, the page is kept pinned. How likely is it that this would break any external projects, and how difficult would it be to detect the broken usage pattern? It would mean that the ctid system column could change within a transaction, unless we change it so that it returns the ctid of the root tuple + xmin + cmin, but it'd be a user-visible change anyhow. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] Writing triggers in C++
Florian G. Pflug wrote: Alvaro Herrera wrote: Florian G. Pflug wrote: Andreas Pflug wrote: Tom Lane wrote: Jacob Rief [EMAIL PROTECTED] writes: I tried to write a trigger using C++. That is most likely not going to work anyway, because the backend operating environment is C not C++. If you dumb it down enough --- no exceptions, no RTTI, no use of C++ library --- then it might work, I can confirm that it does work this way. I've written an aggregate function that uses c++ stl hashes, and it seems to work pretty well. I'd think that using exceptions should be fine, as long as you make sure to _always_ catch any exception that might be thrown inside your own c++ code, and don't let it propagate into backend code. STL allows you to specify custom allocator classes as template parameters to hash, vector and the like. You can use that to let STL allocate memory from the correct memory context. What happens if Postgres raises an elog(ERROR) in the code you're catching exceptions in? Is it propagated outwards? In my case, the only possible source of an elog(ERROR) would palloc(), when the machine is out of memory (Does it even throw elog(ERROR), or does it return NULL just as malloc() ?). Since this is rather unlikely, and would probably lead to a postgres shutdown anyway, I didn't really care about that case. No, an out-of-memory leads to elog(ERROR), which rolls back the current transaction. This releases some memory so the system can continue working. In fact we periodically see out-of-memory reports, and they certainly _don't_ cause a general shutdown. You're right of course that this is different for triggers - they're much more likely to call SPI functions or otherwise interact with the backend than my rather self-contained aggregate function. Still, I'd think that an elog(ERROR) would propagate outwards - but any C++ destructors of local (stack-allocated) objects wouldn't be called. 
Probably stack allocation doesn't matter much, as I think that would be unwound by the longjmp call. I don't know a lot about C++, but if there are allocations in the data area then those would probably not be freed. But it makes me wonder -- is longjmp very compatible with C++ exceptions at all? I know that it causes problems with POSIX thread cancel_push() and cancel_pop(), for example (meaning, they can't be used). So, to be safe, I guess one would need to surround any call that could raise elog(ERROR) with an appropriate handler that translates the elog(ERROR) into a C++ exception. This C++ exception would have to be translated back into an elog(ERROR) at the outermost level of C++ code. Sort of a PG_RE_THROW() in the exception handler, I guess. Maybe we should create some wiki page or pgfoundry project that collects all the glue code, tips and tricks that people have invented to glue C++ into the postgres backend. If it can be made to work, sure; in techdocs. -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [HACKERS] integer datetimes
Magnus Hagander [EMAIL PROTECTED] writes: Our docs for the integer datetime option say: Note also that the integer datetimes code is newer than the floating-point code, and we still find bugs in it from time to time. Is the last sentence about bugs really true anymore? At least the buildfarm seems to have a lot *more* machines with it enabled than without. Buildfarm proves only that the regression tests don't expose any bugs, not that there aren't any. (I'm thinking about making it the default for the vc++ build, which is why I came across that) FWIW, there are several Linux distros that build their RPMs that way, so it's not like people aren't using it. But it seems like we find bugs in the datetime/interval stuff all the time, as people trip over different weird edge cases. regards, tom lane
Re: [HACKERS] Writing triggers in C++
On Wed, 2007-02-14 at 13:19 -0300, Alvaro Herrera wrote: Probably stack allocation doesn't matter much, as I think that would be unwound by the longjmp call. I don't know a lot about C++, but if there are allocations in the data area then those would probably not be freed. But it makes me wonder -- is longjmp very compatible with C++ exceptions at all? C-style stack unwinding (using setjmp and longjmp from csetjmp) is incompatible with exception-handling and is best avoided. (Stroustrup, p. 433). Which presumably means that in practice, the interaction between these features is implementation-defined. -Neil
Re: [HACKERS] anyelement2 pseudotype
Tom Lane wrote: Andrew Dunstan [EMAIL PROTECTED] writes: Tom Lane wrote: ANYENUM? What's the use-case for that? Well ... *somebody* suggested it here ... http://archives.postgresql.org/pgsql-hackers/2005-11/msg00457.php Well, in that usage (ie, for enum I/O functions) it's not actually necessary that the type system as a whole understand ANYENUM as something that any enum type can be cast to, because you're going to hot-wire the pg_type entries during CREATE ENUM anyway. What I'm wondering is if there's a use-case for it during ordinary user operations with enums. If you look further down in the thread you'll see that I suggested hiding it, because I didn't think there was much use case for it in user code, but you didn't seem to think much of that idea. cheers andrew
Re: [HACKERS] NULL and plpgsql rows
On Tue, Feb 13, 2007 at 05:55:11PM -0500, Bruce Momjian wrote: Is there a TODO here? --- Jim Nasby wrote: On Oct 2, 2006, at 6:28 PM, Tom Lane wrote: Jim C. Nasby [EMAIL PROTECTED] writes: However, the test right above that means that we'll fail if the user tries something like row_variable := NULL;: The patch you seem to have in mind would allow row_variable := int_variable; to succeed if the int_variable chanced to contain NULL, which is surely not very desirable. Well, that's Tom's objection, though I'm not sure if by 'int_variable' he means 'internal' or 'integer'. Personally, I think it would be useful to just allow setting a row or record variable to NULL as I showed it above; ie: no variables involved. This is something you might want to do to invalidate a row/record variable after taking some action (perhaps deleting a row). You'd also think that you should be able to detect if a record variable is null, as you can with row. So, I suggest: * Allow row and record variables in plpgsql to be set to NULL It's not clear if it's a wise idea to allow this assignment from a variable. It may be better to only allow explicitly setting them, ie: row_variable := NULL; * Allow testing a record variable to see if it's NULL Currently works for row variables, but not record variables -- Jim Nasby[EMAIL PROTECTED] EnterpriseDB http://enterprisedb.com 512.569.9461 (cell) ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] anyelement2 pseudotype
adding an anyelement2 pseudotype ... The context was a compatibility SQL function to support Oracle's DECODE function. The reason it's not in there already is we didn't seem to have quite enough use-case to justify it. Do you have more? No. Even this case, for me, is more an expedient than a necessity. I could just rewrite my Oracle code to use CASE, but I've a lot of code to convert, and the transformation is a bit error-prone. I'm also looking at a scripted code edit to rewrite the Oracle stuff, and comparing this to the cost of a PG compatibility function. ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Plan for compressed varlena headers
Gregory Stark [EMAIL PROTECTED] writes: 1) Replace the VARATT_SIZEP macro with SET_VARLENA_LEN. If we're going to do this then it's time to play the name game; that is, pick names we actually like and that work well together. The original TOAST patch made things a bit messy in this area. We should try to improve the situation now, not make it worse. We have the following macro names that are commonly used for varlena struct access:

    VARHDRSZ
    VARATT_SIZEP(x)
    VARATT_SIZE(x)
    VARATT_DATA(x)
    VARSIZE(x)          equivalent to VARATT_SIZE
    VARDATA(x)          equivalent to VARATT_DATA

My recollection is that VARHDRSZ, VARSIZE, VARDATA were Berkeley-era and the other three were added by the TOAST patch. The lack of consistency is a bit glaring, and having two ways to do the exact same thing doesn't seem like an improvement. A first-cut proposal:

    VARHDRSZ                same as now, ie, size of 4-byte header
    VARSIZE(x)              for *reading* a 4-byte-header length word
    VARDATA(x)              same as now, ie, ptr + 4 bytes
    SET_VARSIZE(x, len)     for *writing* a 4-byte-header length word

We have to remove VARATT_SIZEP anyway to catch unmodified code, and I'd propose removing VARATT_SIZE and VARATT_DATA too, as they never caught on. We'll also need names for the macros that can read the length and find the data of a datum in either-1-or-4-byte-header format. These should probably be named as variants of VARSIZE and VARDATA, but I'm not sure what exactly; any thoughts? regards, tom lane ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] integer datetimes
On Wed, Feb 14, 2007 at 11:27:31AM -0500, Tom Lane wrote: Magnus Hagander [EMAIL PROTECTED] writes: Our docs for the integer datetime option says: Note also that the integer datetimes code is newer than the floating-point code, and we still find bugs in it from time to time. Is the last sentence about bugs really true anymore? At least the buildfarm seems to have a lot *more* machines with it enabled than without. Buildfarm proves only that the regression tests don't expose any bugs, not that there aren't any. (I'm thinking about making it the default for the vc++ build, which is why I came across that) FWIW, there are several Linux distros that build their RPMs that way, so it's not like people aren't using it. But it seems like we find bugs in the datetime/interval stuff all the time, as people trip over different weird edge cases. Certainly, but is it more likely to trip on these in the integer datetime case, really? //Magnus ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] HOT WIP Patch - version 1
Heikki Linnakangas [EMAIL PROTECTED] writes: What's the verdict on relaxing the live tuple's ctid doesn't change rule? I think that's unacceptable; it is known that that will break the ODBC and JDBC drivers, as well as any other programs that make use of the ctid for re-finding a tuple they read earlier in the same transaction. We have not only never deprecated client-side use of ctid for this, but actively encouraged it, for instance by going out of our way to support fast access for queries WHERE ctid = 'constant'. What's more, your proposal would break plain old UPDATE and DELETE, as well as SELECT FOR UPDATE, none of which promise to hold a pin continuously on every page containing a tuple they might decide to revisit (by ctid) later. Are you prepared to disallow hash join and sort/merge join in all such queries? regards, tom lane ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] Writing triggers in C++
Alvaro Herrera wrote: Florian G. Pflug wrote: Alvaro Herrera wrote: Florian G. Pflug wrote: Andreas Pflug wrote: Tom Lane wrote: Jacob Rief [EMAIL PROTECTED] writes: I tried to write a trigger using C++. That is most likely not going to work anyway, because the backend operating environment is C not C++. If you dumb it down enough --- no exceptions, no RTTI, no use of C++ library --- then it might work, I can confirm that it does work this way. I've written an aggregate function that uses c++ stl hashes, and it seems to work pretty well. I'd think that using exceptions should be fine, as long as you make sure to _always_ catch any exception that might be thrown inside your own c++ code, and don't let it propagate into backend code. STL allows you to specify custom allocator classes as template parameters to hash, vector and the like. You can use that to let STL allocate memory from the correct memory context. What happens if Postgres raises an elog(ERROR) in the code you're catching exceptions in? Is it propagated outwards? In my case, the only possible source of an elog(ERROR) would be palloc(), when the machine is out of memory (Does it even throw elog(ERROR), or does it return NULL just as malloc() ?). Since this is rather unlikely, and would probably lead to a postgres shutdown anyway, I didn't really care about that case. No, an out-of-memory condition leads to elog(ERROR), which rolls back the current transaction. This releases some memory so the system can continue working. In fact we periodically see out-of-memory reports, and they certainly _don't_ cause a general shutdown. Sorry, I explained my point badly. What I actually meant is that in my specific use-case (lots of small transactions, none of which use much memory), the only reason for out-of-memory conditions I've ever seen was some application gone wild that ate up all available memory. 
In that case, postgres dies sooner or later, because any memory freed during rollback is immediately used by that other application. In general, of course, you're right. You're right of course that this is different for triggers - they're much more likely to call SPI functions or otherwise interact with the backend than my rather self-contained aggregate function. Still, I'd think that an elog(ERROR) would propagate outwards - but any C++ destructors of local (stack-allocated) objects wouldn't be called. Probably stack allocation doesn't matter much, as I think that would be unwound by the longjmp call. I don't know a lot about C++, but if there are allocations in the data area then those would probably not be freed. But it makes me wonder -- is longjmp very compatible with C++ exceptions at all? I know that it causes problems with POSIX thread cancel_push() and cancel_pop() for example (meaning, they can't be used). Yeah, the memory taken by stack-allocated objects is freed (basically by just resetting the stack pointer). But normally, C++ would call the destructors of stack-allocated objects _before_ resetting the stack pointer. Since setjmp/longjmp don't know anything about C++, they will omit this step. Whether this causes problems or not depends on the objects that you allocated on the stack... So, to be safe, I guess one would need to surround any call that could call elog(ERROR) with an appropriate handler that translates the elog(ERROR) into a C++ exception. This C++ exception would have to be translated back into an elog(ERROR) at the outermost level of C++ code. Sort of a PG_RE_THROW() in the exception handler, I guess. Maybe we should create some wiki page or pgfoundry project that collects all glue code, tips and tricks that people invented to glue C++ into the postgres backend. If it can be made to work, sure; in techdocs. 
I was thinking that two pairs of macros, PG_BEGIN_CPP, PG_END_CPP and PG_CPP_BEGIN_BACKEND, PG_CPP_END_BACKEND should be able to take care of the exception handling issues. You'd need to wrap any code-block that calls postgres functions that might do an elog(ERROR) inside PG_CPP_BEGIN_BACKEND, PG_CPP_END_BACKEND. Vice versa, any block of c++ code that is called from the backend would need to start with PG_BEGIN_CPP, and end with PG_END_CPP. Do you see any other possible problems, aside from memory management issues? greetings, Florian Pflug ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] Writing triggers in C++
Neil Conway wrote: On Wed, 2007-02-14 at 13:19 -0300, Alvaro Herrera wrote: Probably stack allocation doesn't matter much, as I think that would be unwound by the longjmp call. I don't know a lot about C++, but if there are allocations in the data area then those would probably not be freed. But it makes me wonder -- is longjmp very compatible with C++ exceptions at all? C-style stack unwinding (using setjmp and longjmp from csetjmp) is incompatible with exception-handling and is best avoided. (Stroustrup, p. 433). Which presumably means that in practice, the interaction between these features is implementation-defined. Well, as long as you don't longjmp past a C++ catch block, and don't throw a C++ exception past a setjmp handler, there should be no problem I think. Or at least I can't imagine how a problem could arise... greetings, Florian Pflug ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] integer datetimes
Tom Lane wrote: Magnus Hagander [EMAIL PROTECTED] writes: Our docs for the integer datetime option says: Note also that the integer datetimes code is newer than the floating-point code, and we still find bugs in it from time to time. Is the last sentence about bugs really true anymore? At least the buildfarm seems to have a lot *more* machines with it enabled than without. Buildfarm proves only that the regression tests don't expose any bugs, not that there aren't any. (I'm thinking about making it the defautl for the vc++ build, which is why I came across that) FWIW, there are several Linux distros that build their RPMs that way, so it's not like people aren't using it. But it seems like we find bugs in the datetime/interval stuff all the time, as people trip over different weird edge cases. I think it's disappointing, to say the least, that we treat this code as a sort of second class citizen. BTW, the buildfarm has a majority of machines using it by design - it's in the default set of options in the distributed config file. If we think there are bugs we haven't found, then we need to engage in some sort of analytical effort to isolate them. I don't see any reason in principle why this code should be any more buggy than the float based datetimes, and I see plenty of reason in principle why we should make sure it's right. cheers andrew ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] HOT WIP Patch - version 1
Tom Lane wrote: Heikki Linnakangas [EMAIL PROTECTED] writes: What's the verdict on relaxing the live tuple's ctid doesn't change rule? I think that's unacceptable; it is known that that will break the ODBC and JDBC drivers, as well as any other programs that make use of the ctid for re-finding a tuple they read earlier in the same transaction. AFAIK the JDBC driver doesn't use ctid. But ODBC and other programs do. We have not only never deprecated client-side use of ctid for this, but actively encouraged it, for instance by going out of our way to support fast access for queries WHERE ctid = 'constant'. The idea I had was to change what the ctid system column returns to root ctid + xmin + cmin. As long as programs treat the ctid as an opaque string, it should work. Tid scan would use that to locate the original tuple. What's more, your proposal would break plain old UPDATE and DELETE, as well as SELECT FOR UPDATE, none of which promise to hold a pin continuously on every page containing a tuple they might decide to revisit (by ctid) later. Are you prepared to disallow hash join and sort/merge join in all such queries? No, of course not. We'd have to do the same thing here; use root tid + xmin + cmin instead of just ctid. But now that I think of it, how do we get the root tid of a tuple? I suppose we'd be back to having backpointers or scanning the whole page... I guess pointer-swinging it is, then. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] HOT WIP Patch - version 1
On 2/14/07, Tom Lane [EMAIL PROTECTED] wrote: Heikki Linnakangas [EMAIL PROTECTED] writes: What's the verdict on relaxing the live tuple's ctid doesn't change rule? I think that's unacceptable; it is known that that will break the ODBC and JDBC drivers, as well as any other programs that make use of the ctid for re-finding a tuple they read earlier in the same transaction. We have not only never deprecated client-side use of ctid for this, but actively encouraged it, for instance by going out of our way to support fast access for queries WHERE ctid = 'constant'. What's more, your proposal would break plain old UPDATE and DELETE, as well as SELECT FOR UPDATE, none of which promise to hold a pin continuously on every page containing a tuple they might decide to revisit (by ctid) later. Are you prepared to disallow hash join and sort/merge join in all such queries? Not that I am suggesting we do this, but I believe we had some solution to this problem in the earlier version of HOT. When the live tuple is copied back to the root tuple, it is marked with a HEAP_COPIED_BACK flag. HeapTupleSatisfiesUpdate() checks for this flag and if set returns a new return code, HeapTupleCopiedBack. heap_update() returns the same to ExecUpdate along with the ctid of the root tuple. The UPDATE/DELETE operation is then retried on the root tuple, very similar to a read-committed update/delete. The xmax of the copied-back tuple is set so that it's not vacuumed away until all the current transactions are completed. Though I have tested this patch several times and it seems to work fine, I probably don't have as much insight into the code as others on this list have. So if someone wants to take a look and see if it would work fine, I would be more than happy to post the latest HOT patch (older design). Thanks, Pavan -- EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] HOT WIP Patch - version 1
What's the verdict on relaxing the live tuple's ctid doesn't change rule? If we did allow that within a page, what would we need to change? I already said this, but why would this need to be visible from the outside ? A few assumptions: no back pointers indexes only point at slots marked as roots (and non hot tuples) During vacuum, you swap the tuples and keep a stub at the slot that the user's ctid might be pointing at. You mark the stub to detect this situation. When a select/update by ctid comes along it needs to do one step to the root and use that tuple instead. It needs a second vacuum (or a per page vacuum during update) to remove the extra stub when it is dead and not recently dead. I fail to see the hole. Andreas ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] Writing triggers in C++
Florian G. Pflug writes: Maybe we should create some wiki page or pgfoundry project that collects all glue code, tips and tricks that people invented to glue C++ into the postgres backend. If it can be made to work, sure; in techdocs. I was thinking that two pairs of macros, PG_BEGIN_CPP, PG_END_CPP and PG_CPP_BEGIN_BACKEND, PG_CPP_END_BACKEND should be able to take care of the exception handling issues. You'd need to wrap any code-block that calls postgres functions that might do an elog(ERROR) inside PG_CPP_BEGIN_BACKEND, PG_CPP_END_BACKEND. Vice versa, any block of c++ code that is called from the backend would need to start with PG_BEGIN_CPP, and end with PG_END_CPP. I've had positive experiences with such a setup, although I've been able to skip PG_BEGIN_CPP/PG_END_CPP by doing the exception conversion in a C++ language handler that instantiates functors using the portable class loading technique described in this paper: http://www.s11n.net/papers/classloading_cpp.html I'd be glad to help out on a pgfoundry project to make C++ a better citizen for extending postgres. regards, andreas ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] HOT WIP Patch - version 1
But now that I think of it, how do we get the root tid of a tuple? I suppose we'd be back to having backpointers or scanning the whole page... I guess pointer-swinging it is, then. During vacuum you see a root [stub] that is not recently dead. You follow the chain to detect if you find a live tuple that can replace the root. You replace the root. You replace the original with a stub that points at the root and mark it recently dead (and HEAP_COPIED_BACK, as Pavan described). ... (see prev post) No need for anyone but vacuum to find a root. Andreas ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] anyelement2 pseudotype
Tom Lane wrote: Andrew Dunstan [EMAIL PROTECTED] writes: Tom Lane wrote: ANYENUM? What's the use-case for that? Well ... *somebody* suggested it here ... http://archives.postgresql.org/pgsql-hackers/2005-11/msg00457.php Well, in that usage (ie, for enum I/O functions) it's not actually necessary that the type system as a whole understand ANYENUM as something that any enum type can be cast to, because you're going to hot-wire the pg_type entries during CREATE ENUM anyway. Well, it's not just I/O functions in pg_type, it's functions, operators, aggregates, index methods etc. There are 34 OIDs used up by the enum patch, and most of those catalog entries would have to be duplicated per enum type by CREATE TYPE in the absence of ANYENUM; since you'd given the hand-wavy suggestion anyway, it seemed better not to spam the catalogs. Regarding the type system understanding ANYENUM, most of the type system treats ANYENUM identically to ANYELEMENT, the only parts that really need to understand it are the bits that try to tie down concrete types. For those, non-enum types are rejected if the generic type is ANYENUM. That's it, basically. What I'm wondering is if there's a use-case for it during ordinary user operations with enums. Not really. I allowed it to occur in plpgsql, mostly for completeness, but I didn't bother for the other P/Ls as there didn't seem to be much of a use case. Cheers Tom ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] anyelement2 pseudotype
Tom Dunstan [EMAIL PROTECTED] writes: Regarding the type system understanding ANYENUM, most of the type system treats ANYENUM identically to ANYELEMENT, the only parts that really need to understand it are the bits that try to tie down concrete types. The reason I'm feeling annoyed with ANYfoo stuff today is that yesterday I had to put a special hack for ANYARRAY into the ri_triggers code, which you'd think would have no concern with it. But perhaps this is just an indication that we need to refactor the code in parse_coerce.c. (The problem in ri_triggers is that it uses find_coercion_pathway() which does not concern itself with ANYfoo types.) Anyway, objection withdrawn --- I just thought it seemed a good idea to question whether we were adding a frammish we didn't really need. regards, tom lane ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] HOT WIP Patch - version 1
Just to summarize:

o Every tuple gets a heap ctid
o Only the root tuple gets an index entry
o We can easily remove dead tuples that aren't the root because, by definition, nothing points to them, including backends and indexes

The problem is that a dead root tuple has to stay around because while no backends can see it, the index does. We could move a live tuple into the root ctid slot, but if we do that, the live tuple changes its ctid while it is visible. Could we insert index tuples for the live tuple and then remove the root tuple, perhaps later? So basically we break the chain at that time. The problem there is that we basically have nothing better than what we have now --- we are just delaying the index insert, and I don't see what that buys us. Could a _new_ tuple take over the root tuple slot? It is new, so it doesn't have a ctid yet to change. I think that means the index walking could go forward or backward, but on the same page. To illustrate, with ctid slot numbers:

    [1] root INSERT         [2]             [3]
    [1] root INSERT         [2] UPDATE      [3]
    [1] root INSERT (dead)  [2] UPDATE 1    [3]
    [1] root UPDATE 2       [2] UPDATE 1    [3]

--- Heikki Linnakangas wrote: Pavan Deolasee wrote: - We need to find a way to handle DEAD root tuples, either convert them into stubs or overwrite them with a new version. We can also perform pointer swinging from the index. Again there are concerns about crash-safety and concurrent index-scans working properly. We don't have a community consensus on any of the suggestions in this regard. But hopefully we would converge on some design soon. This seems to be the most fundamental problem we have at the moment. If we stick to the basic rule we have now that a live tuple's ctid doesn't change, the only way to get rid of a dead tuple at the root of the update chain is by changing the index pointer. 
The backward-pointers Hannu suggested, or scanning the whole page to find the previous tuple, would allow reuse of those dead tuples for new tuples in the chain, but even those methods wouldn't completely eliminate the problem. We could say that in some scenarios we just leave behind some dead tuples/stubs that can never be reclaimed. What do you guys think; if we can bring it down to just an extra line pointer, would that be acceptable? We could also do that for now and implement the pointer-swinging later if it turns out to be a problem in practice. What's the verdict on relaxing the live tuple's ctid doesn't change rule? If we did allow that within a page, what would we need to change? Inside the backend we'd have to make sure that whenever a ctid is used, the page is kept pinned. How likely is it that it would break any external projects, and how difficult would it be to detect the broken usage pattern? It would mean that the ctid system column could change within a transaction, unless we change it so that it returns the ctid of the root tuple + xmin + cmin, but it'd be a user-visible change anyhow. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Bruce Momjian [EMAIL PROTECTED] http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] Fixing insecure security definer functions
On Tue, 2007-02-13 at 20:01 -0500, Tom Lane wrote: I would suggest that the search path be added as an explicit parameter to CREATE FUNCTION, with a default of the current setting. The main reason for this is that it's going to be a real PITA for pg_dump if we don't allow an explicit specification. It might also be worth allowing PATH NULL or some such locution to specify the current behavior, for those who really want it. (In particular, most C functions would want this to avoid useless overhead for calls to things that aren't affected by search path.) It might also be useful to allow something such as PATH CURRENT to attach the current schema as the search path for all calls of that function. This would be useful because then SQL scripts for installing 3rd party modules could install nicely into any schema by merely setting search_path before running the script. For instance, PostGIS doesn't support installing into a schema other than public because they want to have a static SQL install script rather than generate one based on your desired search path. Regards, Jeff Davis ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] Plan for compressed varlena headers
Gregory Stark wrote: 2) Replace VARATT* macros to store and retrieve the toast bits in a manner that will work for variable length headers. This either means storing the bits at the least-significant position or using network byte order. If we want to allow storing 1-byte headers unaligned, which I think would be good, then I still think we have to read them using bytewise lookups -- ie by casting to (char*). That means network byte order or using the low order bits is equally efficient. I think the plan was to have the macro code conditional on big/little endian. We can deal with doing 1-byte headers unaligned at some future date if we feel we need it, and the macros will make it transparent. -- Bruce Momjian [EMAIL PROTECTED] http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] HOT WIP Patch - version 1
On Wed, Feb 14, 2007 at 01:56:03PM -0500, Bruce Momjian wrote: Could we insert index tuples for the live tuple and then remove the root tuple, perhaps later? So basically we break the chain at that time. The problem there is that we basically have nothing better than what we have now --- we are just delaying the index insert, and I don't see what that buys us. At some point - inserting into the block would not be possible, as there is no free space. Would that be a good time to do the index insert? Then, a later vacuum would eventually prune out the whole old chain. As long as vacuuming the intermediate entries in the chain keeps the block with free space, there is no need to remove the root tuple. If space ever runs out (vacuum not running frequently enough - many updates performed in the same interval) - fall back to the mechanism that is being used today. I see it buying increased performance for rows that are frequently updated. If it can delay modifying the indices to only once every 10 or more updates, it seems to me that the improvement should be significant. Perhaps PostgreSQL could be used for page hit counters again... :-) Cheers, mark -- [EMAIL PROTECTED] / [EMAIL PROTECTED] / [EMAIL PROTECTED] | Ottawa, Ontario, Canada | One ring to rule them all, one ring to find them, one ring to bring them all and in the darkness bind them... http://mark.mielke.cc/ ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] anyelement2 pseudotype
I wrote: Tom Dunstan [EMAIL PROTECTED] writes: Regarding the type system understanding ANYENUM, most of the type system treats ANYENUM identically to ANYELEMENT, the only parts that really need to understand it are the bits that try to tie down concrete types. The reason I'm feeling annoyed with ANYfoo stuff today is that yesterday I had to put a special hack for ANYARRAY into the ri_triggers code, which you'd think would have no concern with it. Actually ... now that I re-read that remark, I think you may have done the wrong things with ANYENUM. I think that ANYENUM may in fact be closer to ANYARRAY than it is to ANYELEMENT, because ANYELEMENT pretty nearly means anything at all whereas ANYARRAY identifies a subset of types that share some properties, which is an accurate description of ANYENUM as well. In particular, it is sensible to have b-tree index opclasses that are declared to operate on ANYARRAY. If you've got b-tree support for ANYENUM, as I hope you do, then you'll have to patch that same spot in ri_triggers that now knows about ANYARRAY. So you might want to take another pass through the code and see if you shouldn't be modeling ANYENUM more closely on ANYARRAY than ANYELEMENT. regards, tom lane ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] Plan for compressed varlena headers
Bruce Momjian [EMAIL PROTECTED] writes: Gregory Stark wrote: If we want to allow storing 1-byte headers unaligned, which I think would be good, then I still think we have to read them using bytewise lookups -- ie by casting to (char*). That means network byte order or using the low order bits is equally efficient. I think the plan was to have the macro code conditional on big/little endian. We can deal with doing 1-byte headers unaligned at some future date if we feel we need it, and the macros will make it transparent. Forcing bytewise access does not sound like a good plan to me --- you're very much at the mercy of the compiler whether you get good code for that. Plus you can't do it without multiple evaluation of the macro argument, which is something I'd really prefer we not introduce into such a widely-used macro. The only argument in favor is to save a couple bytes of alignment padding, but since this is only going to happen for wide data values, the significance of that is minimal. regards, tom lane ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] Reducing likelihood of deadlocks (was referential Integrity and SHARE locks)
On Tue, 13 Feb 2007, Marc Munro wrote: On Mon, 2007-12-02 at 00:10 -0500, Tom Lane wrote: Marc Munro [EMAIL PROTECTED] writes: Consider a table C containing 2 child records C1 and C2, of parent P. If transaction T1 updates C1 and C2, the locking order of the records will be C1, P, C2. Another transaction, T2, that attempts to update only C2, will lock the records in order C2, P. The locks on C2 and P are taken in different orders by the two transactions, leading to the possibility of deadlock. But the lock on P is shared, hence no deadlock. Doh! Yes, you are right. It is not that simple. For deadlock to occur, we need a transaction that takes an exclusive lock on P as well as on one of the children. Let us replace T2 with a new transaction, T3, which is going to update P and only one of its children. If T3 is going to update P and C1 without the possibility of deadlock against T1, then it must take out the locks in the order C1, P. If, on the other hand, it is going to update P and C2, then the locks must be taken in the order P, C2. This means that there is no single strategy we can apply to T3 that will guarantee to avoid deadlocks with transactions that update only C (ie transactions which, from a developer's point of view, do nothing to P, and so should be unable to deadlock with T3). This scenario would do it, too: Table X has rows representing an object of some kind. These objects contain other objects, which are represented by rows in table Y. Suppose X stores a count of the Y objects it contains in a particular status (because your application needs to get this quickly) and suppose the count is updated by a trigger. The Y objects hold references to the containing X, checked by FK constraints. A query which updates the status of one or more Ys can deadlock with another instance of itself. It first locks a Y row, then shared-locks an X row, then updates the X row (when the trigger runs). 
Two transactions could get to the shared-lock stage simultaneously, then deadlock. I've come across cases a bit like this in my own applications. I'm sure there are many, many others. From an application developer's standpoint there are few options, none of them ideal: 1) Insist on a locking policy that requires updates to first lock their parent records. This is horrible for so many reasons. It should be unnecessary; it causes exclusive locking on parent records, thereby eliminating the gains made by introducing row share locks in 8.1; it is onerous on the developers; it is error-prone; etc. I once tried to define a locking order for rows in a database. It doesn't work (though this was at a time when FK constraint checking used FOR UPDATE locks, which, of course, made things much worse). This wasn't specifically for FK checks, but they were an important cause of deadlocks. Firstly, you have no idea of the order in which rows locked by a statement will be locked. UPDATE d SET counter=counter+1 WHERE d.a=1 could deadlock with UPDATE d SET counter=counter+1 WHERE d.b=1. Secondly, even if you could, defining a usable locking order across all of the rows in your database (not just the tables) is nearly impossible. I suppose you could base it on, say, the order (tablename, id) but you'd have to go to extreme lengths to follow this. Imagine having to determine the id of every row you want to update and the id and table name of every row they'll lock because of FK constraints and then sort a big set of 'SELECT .. FOR SHARE' and 'UPDATE' statements. That's a lot of queries - and a huge variety of useful queries you can't use any more. And once you've done all that you might find your application has race conditions - there are sometimes other reasons for performing queries in a certain order. 2) Remove FK constraints to eliminate the possibility of RI-triggered deadlocks. Ugh. Deadlocks aren't the only problem FK constraints' locks are going to cause you. 
It's quite possible you have, somewhere, a small number of rows referenced via FKs by a huge number of rows. Think about the amount of locking (not just locks on rows, but internal locks on bits of cache) and cache coherency logic going on in a production SMP machine. It'll depend on your load and schema, of course, but I've found the constraints to be mostly impractical in my production systems. Some I can keep, but only if I know that the checks aren't going to be triggered too often. 3) Encapsulate all transactions in some form of retry mechanism that traps deadlocks and retries those transactions. This may not be practicable, and incurs all of the overhead of encountering and trapping deadlocks in the first place. Also, as each deadlock occurs, a number of locks will be left active before deadlock detection kicks in, increasing the window for further deadlocks. On a busy system, the first deadlock may well trigger a cascade of further deadlocks. It's essential to
Re: [HACKERS] integer datetimes
On Wed, Feb 14, 2007 at 12:38:12PM -0500, Andrew Dunstan wrote: Tom Lane wrote: Magnus Hagander [EMAIL PROTECTED] writes: Our docs for the integer datetime option says: Note also that the integer datetimes code is newer than the floating-point code, and we still find bugs in it from time to time. Is the last sentence about bugs really true anymore? At least the buildfarm seems to have a lot *more* machines with it enabled than without. Buildfarm proves only that the regression tests don't expose any bugs, not that there aren't any. (I'm thinking about making it the default for the VC++ build, which is why I came across that) FWIW, there are several Linux distros that build their RPMs that way, so it's not like people aren't using it. But it seems like we find bugs in the datetime/interval stuff all the time, as people trip over different weird edge cases. I think it's disappointing, to say the least, that we treat this code as a sort of second class citizen. BTW, the buildfarm has a majority of machines using it by design - it's in the default set of options in the distributed config file. If we think there are bugs we haven't found, then we need to engage in some sort of analytical effort to isolate them. I don't see any reason in principle why this code should be any more buggy than the float based datetimes, and I see plenty of reason in principle why we should make sure it's right. That was exactly what I thought, which is why I was kinda surprised to see that note in the configure stuff. If we go with that, then we can say that *any* new feature is less tested, no? ;-) //Magnus ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] Day and month name localization uses wrong locale category
Added to TODO: o Use LC_TIME for localized weekday/month names, rather than LC_MESSAGES http://archives.postgresql.org/pgsql-hackers/2006-11/msg00390.php --- Euler Taveira de Oliveira wrote: Peter Eisentraut wrote: What about using strftime()? Then we wouldn't need to worry about gettext translations; everything is localized. Why didn't I think of it before? :-) I'll try to code a patch later today if no one objects. How is this going? Finished. Sorry for the delay; I had some trouble understanding how the backend treats the locale stuff (Neil pointed out the path). Now TM mode is returning strftime() output. It would be nice if in the future we change this to pg_strftime(), but unfortunately the latter is not i18n-aware. :(

template1=# show lc_time;
 lc_time
---------
 pt_BR
(1 registro)

template1=# select to_char(now(), 'TMDay, DD TMMonth YYYY');
         to_char
---------------------------
 Segunda, 20 Novembro 2006
(1 registro)

template1=# set lc_time to 'C';
SET
template1=# select to_char(now(), 'TMDay, DD TMMonth YYYY');
         to_char
--------------------------
 Monday, 20 November 2006
(1 registro)

template1=# set lc_time to 'de_DE';
SET
template1=# select to_char(now(), 'TMDay, DD TMMonth YYYY');
         to_char
--------------------------
 Montag, 20 November 2006
(1 registro)

Comments? -- Euler Taveira de Oliveira http://www.timbira.com/ -- Bruce Momjian [EMAIL PROTECTED] http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
[HACKERS] XML regression test failure
sponge failed the XML regression test once today: http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=sponge&dt=2007-02-14%2007:30:02 If I read that correctly we could just add a simple ORDER BY table_name to that query to get a more reliable result. Stefan
Re: [HACKERS] HOT WIP Patch - version 1
Zeugswetter Andreas ADI SD wrote: A few assumptions: no back pointers; indexes only point at slots marked as roots (and non-HOT tuples). During vacuum, you swap the tuples and keep a stub at the slot that the user's ctid might be pointing at. You mark the stub to detect this situation. When a select/update by ctid comes along it needs to do one step to the root and use that tuple instead. As Pavan pointed out, that's more or less what he ended up doing originally. You need to mark the stub with the current most recent xid, and wait until that's no longer running. Only after that can you remove the stub. It needs a second vacuum (or a per-page vacuum during update) to remove the extra stub when it is dead and not recently dead. Requiring two vacuums to remove the tuple sounds bad at first, but it's actually not so bad since both steps could be done by retail vacuum, or even by normal scans. I fail to see the hole. The only potential problem I can see is how to make sure that a heap scan or a bitmap heap scan doesn't visit the tuple twice. If we make sure that the page is scanned in one go while keeping the buffer pinned, we're good. We already do that except for system catalogs, so I believe we'd have to forbid HOT updates on system tables, like we forbid bitmap scans. To me this sounds like the best idea so far. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] Fixing insecure security definer functions
Andreas, Have you considered hardcoding the schema for each object where it was found at creation time? This seems more intuitive to me. This isn't practical. Consider the schema qualification syntax for operators. -- --Josh Josh Berkus PostgreSQL @ Sun San Francisco
Re: [HACKERS] Plan for compressed varlena headers
Tom Lane [EMAIL PROTECTED] writes: Plus you can't do it without multiple evaluation of the macro argument, which is something I'd really prefer we not introduce into such a widely-used macro. I don't see any way to do VARSIZE without multiply evaluating its argument. It's got to mask out the relevant bits then take the appropriate number of bytes and shift the appropriate number of bits to the right. If we wanted to require GCC we could use temporary variables in macros. Or we could use a global variable and declare that you can't use VARSIZE inside the argument to VARSIZE. (Actually I can't construct a scenario where it would break, perhaps it's safe.) But I just did a find-grep and the most complicated expression I can find is VARATT_SIZE(DatumGetPointer(values[i])). And that's in code I think I'll have to touch anyways. I can't find any instances of anything that would be dangerous or noticeably inefficient. It would be pretty strange code that wanted to know the size of a datum but didn't care enough about the actual contents of the datum to store it in a temporary variable. The only circumstances I could see it happening is if someone wrote code like: len = VARSIZE(datum = DirectFunctionCall()) There are no instances of that form that I can find. The only argument in favor is to save a couple bytes of alignment padding, but since this is only going to happen for wide data values, the significance of that is minimal. Yeah I realized the same thing earlier. At least in the case of four-byte headers padding is logical on all fronts since NUL bytes will have the right bitpattern. And it's a big CPU win on four-byte headers since it affects reading non-compressed varlenas being passed around in the executor. I'm not sure the same logic holds for two-byte headers. They're a) not so expensive in the first place, b) not so large in the first place and c) would require padding with a special pad byte. 
-- Gregory Stark EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] Plan for compressed varlena headers
Gregory Stark [EMAIL PROTECTED] writes: Tom Lane [EMAIL PROTECTED] writes: Plus you can't do it without multiple evaluation of the macro argument, I don't see any way to do VARSIZE without multiply evaluating its argument. Some variant of

#define VARSIZE(x) (ntohl((x)->vl_len) & 0x3fff)
#define VARSIZE(x) ((x)->vl_len >> 2)

The 1-or-4-byte version is a lot harder, but also will be used in a lot fewer places, all of which will get looked at when it gets installed. I'm prepared to put up with multiple eval for that. I *don't* want to assume that existing code can tolerate multiple eval in a macro that has existed forever and never did it before. regards, tom lane
Re: [HACKERS] Variable length varlena headers redux
On Tue, Feb 13, 2007 at 01:32:11PM -0500, Bruce Momjian wrote: Gregory Stark wrote: Alternatively, what does the trailing a in varlena signify? Would this be varlenb? attribute -- Actually varlena stands for variable length array. elein [EMAIL PROTECTED]
Re: [HACKERS] Variable length varlena headers redux
I want to register, just in principle, that I object to changing the structure of a varlena. The idea behind the data type is simple, clean and fast. And it is easily understood by developers and by people developing applications and functions in PostgreSQL. Of course you will do what you will. Be careful out there. elein [EMAIL PROTECTED]
Re: [HACKERS] Variable length varlena headers redux
elein [EMAIL PROTECTED] writes: I want to register, just in principle, that I object to changing the structure of a varlena. Indeed, I'm doing my best to restrain Greg from changing what a datatype-specific function will see by default. The stuff with variable-length headers should be confined to a relatively small number of functions in which we're willing to pay a code-ugliness penalty for speed. regards, tom lane
Re: [HACKERS] XML regression test failure
Stefan Kaltenbrunner [EMAIL PROTECTED] writes: sponge failed the XML regression test once today: http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=sponge&dt=2007-02-14%2007:30:02 If I read that correctly we could just add a simple ORDER BY table_name to that query to get a more reliable result. Done. regards, tom lane