Re: [HACKERS] PGBuildfarm member skylark Branch HEAD Failed at Stage Make

2007-09-29 Thread Magnus Hagander
PG Build Farm wrote:
 The PGBuildfarm member skylark had the following event on branch HEAD:
 
 Failed at Stage: Make
 
 The snapshot timestamp for the build that triggered this notification is: 
 2007-09-29 03:00:01
 
 The specs of this machine are:
 OS:  Windows XP / SP2
 Arch: x64
 Comp: Visual C++ / 14.00.50727.762
 
 For more information, see 
 http://www.pgbuildfarm.org/cgi-bin/show_history.pl?nm=skylark&br=HEAD

I think this just needs a new object added to the libpgport list in
Mkvcbuild.pm at lines 46-50. I can do this Monday, no earlier :-( (I
could do it now, but I'm mobile and can't test it, so I won't.) If
someone else who can actually test it wants to put that in, please do.

//Magnus

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] [COMMITTERS] pgsql: Temporarily modify tsearch regression tests to suppress notice

2007-09-29 Thread Bruce Momjian
Tom Lane wrote:
 Bruce Momjian [EMAIL PROTECTED] writes:
  Tom Lane wrote:
  That's not fixing the problem, unless your proposal includes never
  issuing any warnings at all, for anything.
 
  No warning for * because it is intentional, but warning for actual
  stop words.
 
 No, you're focusing on one symptom not the problem.  The problem is
 that we've got user-visible behavior going on during what's effectively
 a chance event, ie, a cache reload.
 
 One possible real solution would be to tweak the dictionary APIs so
 that the dictionaries can find out whether this is the first load during
 a session, or a reload, and emit notices only in the first case.

Yea, that would work too.  Or just throw an error for a stop word in the
file and then you never get a reload (use * instead).

-- 
  Bruce Momjian  [EMAIL PROTECTED]  http://momjian.us
  EnterpriseDB http://postgres.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



[HACKERS] Something's been bugging me

2007-09-29 Thread Gregory Stark

A while back, in an off-hand comment about packed varlenas, Tom mentioned that
we might want to have more types of toast pointers. Since then the idea of some
alternative column-wise partitioning scheme has come up, and another idea I've
been tossing around is some kind of compression scheme which takes advantage
of information across rows.

The current varvarlena scheme doesn't leave any room for any expansion at all.
Every possible value of varlena headers is meaningful and could potentially
exist in a database. So any such scheme would end any hope of binary
compatibility in the future and probably any hope of a migrator since it would
potentially expand existing data.

Now not every possible idea for these options would need space in the varlena
header but many would.

I'm wondering whether it doesn't make sense to lower VARATT_SHORT_MAX to 0x70
to allow for at least a small number of constant values which could indicate
some special type of datum. That could be used to indicate that a fixed size
pointer like a toast pointer follows. That could be used for something like
common value compression. [*]

I'm almost tempted to suggest lowering it as far as 0x3f which would give us a
whole bit. That would be necessary if we wanted to allow for variable length
fields with some alternate interpretation following the header such as some
very light compression scheme. But I'm pretty loath to give up as much as that
now for only a potential future gain.
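
To make the 0x70 idea concrete, here is a minimal sketch in C. It is an
illustration only, not code from PostgreSQL or from any patch in this thread.
It assumes the simplified big-endian-style layout in which a set high bit
marks a 1-byte header and the low 7 bits hold the length; the names
SHORT_FLAG, SHORT_MAX_LOWERED, HeaderKind and classify_header are
hypothetical.

```c
#include <stdint.h>

/* Assumed, simplified layout: high bit set means a 1-byte header. */
#define SHORT_FLAG        0x80u
/* The proposed lowered VARATT_SHORT_MAX (it is 0x7F without the change). */
#define SHORT_MAX_LOWERED 0x70u

typedef enum
{
    HDR_LONG,      /* high bit clear: one of the 4-byte header forms */
    HDR_SHORT,     /* ordinary short varlena, length 1..0x70 */
    HDR_EXTERNAL,  /* 0x80: a toast pointer follows */
    HDR_RESERVED   /* 0xF1..0xFF: values freed up for special datums */
} HeaderKind;

/* Classify a datum by looking only at its first byte. */
static HeaderKind
classify_header(uint8_t first)
{
    uint8_t len;

    if ((first & SHORT_FLAG) == 0)
        return HDR_LONG;

    len = (uint8_t) (first ^ SHORT_FLAG);   /* low 7 bits */
    if (len == 0)
        return HDR_EXTERNAL;
    if (len <= SHORT_MAX_LOWERED)
        return HDR_SHORT;
    return HDR_RESERVED;
}
```

Under these assumptions the change frees 15 header values (lengths 0x71
through 0x7F), at the price of one more comparison on the short-header path.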

[*] Yes, this does make the idea of keeping the VARHDRSZ/VARSHDRSZ offset in
the varlen header seem pretty silly; hindsight is 20/20 and all that.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com



Re: [HACKERS] PG on NFS may be just a bad idea

2007-09-29 Thread Zdenek Kotala

Tom Lane wrote:



If this is what's happening I'd claim it is a kernel bug, but seeing
that I see it on FC6 and Miya sees it on Solaris 10, it would be a bug
widespread enough that we'd not be likely to get it killed off soon.



I think my colleague was solving a similar issue in JavaDB. IIRC the
problem is in how NFS works, and the conclusion was not to use JavaDB (Derby)
on NFS. I forwarded this issue to our NFS gurus and will send updated
information.


Zdenek



Re: [HACKERS] PGBuildfarm member skylark Branch HEAD Failed at Stage Make

2007-09-29 Thread Tom Lane
Magnus Hagander [EMAIL PROTECTED] writes:
 PG Build Farm wrote:
 For more information, see 
 http://www.pgbuildfarm.org/cgi-bin/show_history.pl?nm=skylark&br=HEAD

 I think this just needs a new object added to the libpgport list in
 Mkvcbuild.pm at line 46-50.

My fault, sorry about that.  But I'm getting less and less satisfied
with the way that the MSVC build system is forcing us to duplicate all
the knowledge in the Makefiles.

One thing I did in the commit that broke this was to move the list of
fixed (platform-independent) members of libpgport out of
Makefile.global.in and into src/port/Makefile.  Is it possible to parse
that to get the list of fixed members, instead of duplicating the list
in Mkvcbuild.pm?  Of course Mkvcbuild.pm would still need a list of
Windows-specific members, but that should be a lot shorter and more
stable.

regards, tom lane



Re: [HACKERS] [COMMITTERS] pgsql: Temporarily modify tsearch regression tests to suppress notice

2007-09-29 Thread Tom Lane
Bruce Momjian [EMAIL PROTECTED] writes:
 Tom Lane wrote:
 One possible real solution would be to tweak the dictionary APIs so
 that the dictionaries can find out whether this is the first load during
 a session, or a reload, and emit notices only in the first case.

 Yea, that would work too.  Or just throw an error for a stop word in the
 file and then you never get a reload (use * instead).

Hm, that's a thought --- it'd be a way to solve the problem without an
API change for dictionaries, which is something to avoid at this late
stage of the 8.3 cycle.  Come to think of it, does the ts_cache stuff
work properly when an error is thrown in dictionary load (ie, is the
cache entry left in a sane state)?

regards, tom lane



Re: [HACKERS] Something's been bugging me

2007-09-29 Thread Tom Lane
Gregory Stark [EMAIL PROTECTED] writes:
 I'm wondering whether it doesn't make sense to lower VARATT_SHORT_MAX to 0x70
 to allow for at least a small number of constant values which could indicate
 some special type of datum. That could be used to indicate that a fixed size
 pointer like a toast pointer follows. That could be used for something like
 common value compression. [*]

I'm not for this because it would complicate the already-too-complicated
inner-loop tests for deciding which form of datum you're looking at.

The idea that I recall mentioning was to expend another byte in TOAST
pointers to make them self-identifying, ie, instead of 0x80 or 0x01
signaling something that *must* be a 17-byte toast pointer, that bit
pattern signals something else and the content of the next byte
lets you know what.  So TOAST pointers would take 18 bytes instead of
17, and there would be room for additions of other sorts of pointers.
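
A minimal sketch of the self-identifying layout, in C, may help. It is an
illustration under stated assumptions, not PostgreSQL code: it uses a single
marker value 0x80 (ignoring the 0x01 form), and the names EXTERNAL_MARKER,
TAG_TOAST_POINTER, TOAST_POINTER_PAYLOAD and external_datum_size, as well as
the tag value 0, are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EXTERNAL_MARKER       0x80u  /* first byte: "something external" */
#define TAG_TOAST_POINTER     0u     /* hypothetical tag value */
#define TOAST_POINTER_PAYLOAD 16u    /* the 17-byte pointer minus its header byte */

/*
 * Return the total size in bytes of an external datum, including the
 * marker byte and the tag byte, or 0 if the datum is not external or
 * carries a tag this sketch does not know.
 */
static size_t
external_datum_size(const uint8_t *datum)
{
    assert(datum != NULL);

    if (datum[0] != EXTERNAL_MARKER)
        return 0;

    switch (datum[1])
    {
        case TAG_TOAST_POINTER:
            return 2u + TOAST_POINTER_PAYLOAD;  /* 18 bytes instead of 17 */
        default:
            return 0;   /* room for other sorts of pointers later */
    }
}
```

The second-level switch is only reached after the first byte has already
identified the datum as external, so the common paths are not affected.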

regards, tom lane



Re: [HACKERS] Something's been bugging me

2007-09-29 Thread Gregory Stark
Tom Lane [EMAIL PROTECTED] writes:

 Gregory Stark [EMAIL PROTECTED] writes:
 I'm wondering whether it doesn't make sense to lower VARATT_SHORT_MAX to 0x70
 to allow for at least a small number of constant values which could indicate
 some special type of datum. That could be used to indicate that a fixed size
 pointer like a toast pointer follows. That could be used for something like
 common value compression. [*]

 I'm not for this because it would complicate the already-too-complicated
 inner-loop tests for deciding which form of datum you're looking at.

 The idea that I recall mentioning was to expend another byte in TOAST
 pointers to make them self-identifying, ie, instead of 0x80 or 0x01
 signaling something that *must* be a 17-byte toast pointer, that bit
 pattern signals something else and the content of the next byte
 lets you know what.  So TOAST pointers would take 18 bytes instead of
 17, and there would be room for additions of other sorts of pointers.

Hm, wouldn't that be just as expensive though? You would still have to look at
the next byte and check it against various values to see what length to skip
over. Hm, unless we put the length in the following byte. Also the difference
between (first-byte ^ 0x80 < 0x70) and (first-byte & 0x80 == 0x80) seems like
it's going to be pretty slight.

I suppose we don't have to decide now. We could just put a 1-byte padding byte
containing 0 (or 17 or 18, though I think 0 is safest) at the front of the
toast pointer structure for now -- we don't have to actually check what it
contains yet.

For that matter we could lower VARATT_SHORT_MAX so we don't generate any short
varlenas over 0x70 in length but not actually check for them in VARATT_IS_1B()
yet either.

The choice of strategy might depend on what we're trying to encode in there. I
was picturing using a single byte for up to 256 common values and for that it
might be unfortunate if we need two more bytes of overhead. On the other hand
something else I was pondering was doing some form of lz compression using
some global dictionary in which case one more byte is not going to matter at
all.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com



[HACKERS] CLUSTER doesn't check indisvalid etc

2007-09-29 Thread Tom Lane
It strikes me that CLUSTER has been broken since CREATE INDEX
CONCURRENTLY was put in, because it doesn't check whether the index
it's been asked to cluster on is valid.  If C.I.C. fails before marking
the index indisvalid, a subsequent CLUSTER would happily cluster using
only the index entries that are present, thereby losing data.

Fixing this in 8.2 seems a simple matter of checking indisvalid.
However I'm not quite sure what to do in HEAD: is it important to
honor indcheckxmin?  Offhand it seems the worst problem we could
have with a not-quite-ready index is that some recently dead tuples
might be scanned out-of-order because they'd be visited via a broken
HOT chain; and we need not worry too much about whether CLUSTER
preserves exact index ordering for such tuples.

Are there any other utility commands besides CLUSTER that should be
checking index validity?  I don't see anything in a quick look, but
maybe I missed something.

Comments?

regards, tom lane



Re: [HACKERS] PGBuildfarm member skylark Branch HEAD Failed at Stage Make

2007-09-29 Thread Andrew Dunstan



Tom Lane wrote:

I'm getting less and less satisfied
with the way that the MSVC build system is forcing us to duplicate all
the knowledge in the Makefiles.
  



I whined about this quite some time ago ...


One thing I did in the commit that broke this was to move the list of
fixed (platform-independent) members of libpgport out of
Makefile.global.in and into src/port/Makefile.  Is it possible to parse
that to get the list of fixed members, instead of duplicating the list
in Mkvcbuild.pm?  Of course Mkvcbuild.pm would still need a list of
Windows-specific members, but that should be a lot shorter and more
stable.


  


I've just been toying around in my head with possibly installing gmake 
on a windows build box and making a target where it outputs the list of 
source files for building a project from. Maybe that would help.


cheers

andrew



Re: [HACKERS] Something's been bugging me

2007-09-29 Thread Tom Lane
Gregory Stark [EMAIL PROTECTED] writes:
 Tom Lane [EMAIL PROTECTED] writes:
 I'm not for this because it would complicate the already-too-complicated
 inner-loop tests for deciding which form of datum you're looking at.
 
 The idea that I recall mentioning was to expend another byte in TOAST
 pointers to make them self-identifying, ie, instead of 0x80 or 0x01
 signaling something that *must* be a 17-byte toast pointer, that bit
 pattern signals something else and the content of the next byte
 lets you know what.  So TOAST pointers would take 18 bytes instead of
 17, and there would be room for additions of other sorts of pointers.

 Hm, wouldn't that be just as expensive though?

No, I don't think so, because it'd be a second-level test that's only
hit after determining that you have a 1B_E kind of datum.  Furthermore,
it'd only be hit if you were actually trying to determine the datum's
value, and not if (for instance) you only wanted to skip past it to the
next field.

 You would still have to look at
 the next byte and check it against various values to see what length to skip
 over. Hm, unless we put the length in the following byte.

Making the second byte contain the length would work well for the sorts
of cases I was envisioning, which was different sorts of fixed-size
pointer-ish objects.  I suppose it would not scale well towards having a
different kind of inline compression method, but I don't see how your
proposal handles that either.
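
As a rough sketch of why a length byte makes skipping cheap, consider the C
below. It is illustrative only and not PostgreSQL code. It assumes the second
byte counts the payload alone (excluding the marker and length bytes), a point
this thread leaves open, and the names EXT_MARKER, ext_total_size and
count_external are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EXT_MARKER 0x80u   /* simplified: one marker value for external datums */

/* Total bytes occupied by an external datum: marker + length byte + payload. */
static size_t
ext_total_size(const uint8_t *datum)
{
    return 2u + datum[1];
}

/*
 * Walk a buffer of back-to-back external datums and count them.
 * No knowledge of what each pointer contains is needed to skip it.
 */
static size_t
count_external(const uint8_t *buf, size_t len)
{
    size_t n = 0;
    size_t off = 0;

    assert(buf != NULL);

    while (off + 2 <= len && buf[off] == EXT_MARKER)
    {
        size_t sz = ext_total_size(buf + off);

        if (sz > len - off)
            break;          /* truncated datum: stop */
        off += sz;
        n++;
    }
    return n;
}
```

Code that only needs to step to the next field reads one extra byte and never
has to compare against a list of known pointer types.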

After a bit of reflection, I'd argue that variant inline compression
methods ought to be implemented within the 4B_C family of datum
representations.  It looks to me like we have a couple of leftover bits
within that, because va_rawsize can't exceed 1G, and so there is room
to include two flag bits in it.
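
The spare-bits observation can be sketched in C as follows. This is an
illustration under assumptions, not the actual PostgreSQL layout: it treats
the raw size as fitting in 30 bits (values below 1G) and places a 2-bit
method code in the top bits of the same 32-bit word. The names RAWSIZE_BITS,
RAWSIZE_MASK, pack_rawsize, unpack_rawsize and unpack_method are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define RAWSIZE_BITS 30
#define RAWSIZE_MASK ((UINT32_C(1) << RAWSIZE_BITS) - 1)   /* 0x3FFFFFFF */

/* Combine a raw size below 1G with a 2-bit compression-method code. */
static uint32_t
pack_rawsize(uint32_t rawsize, unsigned method)
{
    assert(rawsize <= RAWSIZE_MASK);
    assert(method < 4);
    return ((uint32_t) method << RAWSIZE_BITS) | rawsize;
}

/* Recover the raw size from the low 30 bits. */
static uint32_t
unpack_rawsize(uint32_t word)
{
    return word & RAWSIZE_MASK;
}

/* Recover the method code from the top 2 bits. */
static unsigned
unpack_method(uint32_t word)
{
    return (unsigned) (word >> RAWSIZE_BITS);
}
```

With method code 0 the packed word equals the plain raw size, so in this
sketch existing values would keep their meaning.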

regards, tom lane



Re: [HACKERS] Something's been bugging me

2007-09-29 Thread Gregory Stark
Tom Lane [EMAIL PROTECTED] writes:

 Gregory Stark [EMAIL PROTECTED] writes:
 I'm wondering whether it doesn't make sense to lower VARATT_SHORT_MAX to 0x70
 to allow for at least a small number of constant values which could indicate
 some special type of datum. That could be used to indicate that a fixed size
 pointer like a toast pointer follows. That could be used for something like
 common value compression. [*]

 I'm not for this because it would complicate the already-too-complicated
 inner-loop tests for deciding which form of datum you're looking at.

 The idea that I recall mentioning was to expend another byte in TOAST
 pointers to make them self-identifying, ie, instead of 0x80 or 0x01
 signaling something that *must* be a 17-byte toast pointer, that bit
 pattern signals something else and the content of the next byte
 lets you know what.  So TOAST pointers would take 18 bytes instead of
 17, and there would be room for additions of other sorts of pointers.

Here's a patch that does all of the above. 

I accidentally included the debugging ifdef that makes textin produce short
varlenas for testing, since that was in my tree. I thought I would leave it
in: it's just as easy for you to remove it as for me, and it might be useful
to keep since it's inside an ifdef anyways.

Two style issues: 1) the magic constant 0 could either be hard coded into
postgres.h for now or should become a VARATT_EXTERNAL_TYPE_POINTER or
something like that. 2) the test in tuptoaster.c could be

    if (toast_isnull[i] ||
        !VARATT_IS_EXTERNAL(new_value) ||
        VARSIZE_EXTERNAL(old_value) != VARSIZE_EXTERNAL(new_value) ||
        memcmp(VARDATA_SHORT(old_value),
               VARDATA_SHORT(new_value),
               VARSIZE_EXTERNAL(old_value)) != 0)

which optimizes away to the same thing but I thought it was clearer to leave
those niceties out for now. tuptoaster.c would probably always have to know
what type of pointers it's looking at at that point anyways.



varvarlena-room-for-expansion.diff.gz
Description: Binary data

I am sorry for leaving this until so late. It's been bugging me for a while
but I thought it was better not to make trouble. I guess seeing the release
looming near made me realize what the consequences might be of neglecting it.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com



Re: [HACKERS] Something's been bugging me

2007-09-29 Thread Tom Lane
Gregory Stark [EMAIL PROTECTED] writes:
 Tom Lane [EMAIL PROTECTED] writes:
 The idea that I recall mentioning was to expend another byte in TOAST
 pointers to make them self-identifying, ie, instead of 0x80 or 0x01
 signaling something that *must* be a 17-byte toast pointer, that bit
 pattern signals something else and the content of the next byte
 lets you know what.  So TOAST pointers would take 18 bytes instead of
 17, and there would be room for additions of other sorts of pointers.

 Here's a patch that does all of the above. 

I'd be inclined to make the second byte be the length and have
VARSIZE_1B_E depend on that --- any objection?

 2) the test in tuptoaster.c could be

 if (toast_isnull[i] || 
 !VARATT_IS_EXTERNAL(new_value) ||
 VARSIZE_EXTERNAL(old_value) != VARSIZE_EXTERNAL(new_value) ||
 memcmp(VARDATA_SHORT(old_value),
VARDATA_SHORT(new_value),
VARSIZE_EXTERNAL(old_value)) != 0)

Yeah, I'd go with this just to avoid having hardwired knowledge of the
datum size here.

regards, tom lane



Re: [HACKERS] 8.3 beta timing

2007-09-29 Thread Tom Lane
Bruce Momjian [EMAIL PROTECTED] writes:
 I think we need another week to get things ready for beta.

Why?  Other than the lack of release notes, we could wrap on Monday.

regards, tom lane



[HACKERS] 8.3 beta timing

2007-09-29 Thread Bruce Momjian
I think we need another week to get things ready for beta.  I will have
the release notes done mid-week and hopefully we can close out all open
items by the end of the week.

-- 
  Bruce Momjian  [EMAIL PROTECTED]  http://momjian.us
  EnterpriseDB http://postgres.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [HACKERS] Something's been bugging me

2007-09-29 Thread Gregory Stark

Tom Lane [EMAIL PROTECTED] writes:

 I'd be inclined to make the second byte be the length and have
 VARSIZE_1B_E depend on that --- any objection?

On one hand it offends me since it's hard coding an assumption that the size
of a pointer decides what it contains and vice versa. There's nothing saying
we won't have two possible special meanings for a one-byte datum.

And it forecloses any possibility of having a type whose size is at all
variable. I like your idea of using the 4-byte header for variable sized
structures, but what about structures whose size depends on an architecture
detail? We might one day have a pointer which contains a wchar_t or a bigint
and then not have any way to tell whether we have a conflict with some other
pointer structure on some architectures.

On the other hand I suppose you're concerned about the time to do a few
comparisons before knowing which length to skip over? I'm not entirely sure
cycle-counting at that level leads to the correct conclusions. In particular I
think a few checks against constant values followed by a single assignment can
actually be cheaper with speculative execution than having to copy data from
memory and then do subsequent calculations depending on it. I'm not sure we'll
ever know though since I doubt it will be measurable.

(I'm also not entirely clear which length to put in, the entire length
including the header or the length just of the pointer. Personally I think I
would prefer just the pointer. I suppose that makes the macro
VARSIZE_EXTERNAL_EXHDR_EXHDR() :/ )


-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
