Re: [HACKERS] Making PostgreSQL 7.4 (CVS) work properly on OS X 10.3 (7B85)

2003-11-07 Thread Bob Ippolito
On Nov 8, 2003, at 12:31 AM, Tom Lane wrote:

Bob Ippolito [EMAIL PROTECTED] writes:
Here's my notes on how to build PostgreSQL 7.4 (CVS) on OS X 10.3
(7B85) with (seems to be working, but I haven't really ran any tests)
python, tcl, perl, readline.
I have just in the past couple hours realized that ps_status.c is
seriously broken on OS X 10.3.  It appears that Apple has randomly
decided to start #define'ing BSD, which they did not do in any
prior OS X release, and this confuses ps_status.c into choosing the
wrong method of updating the ps-visible command line.  Aside from
causing ps to not show any useful info about postmaster child 
processes,
this seems to cause dynamic loading of libraries to fail in some
cases :-( ... still trying to understand what's happening there ...
__APPLE__ is usually the only define you should really be depending on 
(other than things you pass in yourself via configure).

I noticed today that you guys are just
about to release, so I figure I should bring this up right now.
Indeed.  We need to fix this yesterday.

* I ditched the system.c hack, assuming Apple has fixed them by 10.3 
--
because it breaks tcl and python if you do.
I don't see why system.c would affect the problem I'm seeing --- does
this really fix pltcl for you?
If you're getting the NSLinkModule - mmap crash, then yes.  It's the 
multiply defined symbol that kills it (system in postmaster, system in 
libSystem).  The other crashes I saw were related to the shared mem 
limit that you can only set *once*.. so you have to sudo vi /etc/rc 
instead of /etc/sysctl.conf :(

-bob

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [HACKERS] Making PostgreSQL 7.4 (CVS) work properly on OS X 10.3 (7B85)

2003-11-07 Thread Bob Ippolito
On Nov 8, 2003, at 1:00 AM, Tom Lane wrote:

Bob Ippolito [EMAIL PROTECTED] writes:
On Nov 8, 2003, at 12:31 AM, Tom Lane wrote:
I have just in the past couple hours realized that ps_status.c is
seriously broken on OS X 10.3.  It appears that Apple has randomly
decided to start #define'ing BSD,

__APPLE__ is usually the only define you should really be depending on
It is not that we are failing to recognize Darwin, it is that Darwin is
falsely claiming to be something else.
I don't see why system.c would affect the problem I'm seeing --- does
this really fix pltcl for you?

If you're getting the NSLinkModule - mmap crash, then yes.
No, I'm seeing an NSLinkModule - memcmp crash.  memcmp() is crashing
because it is being passed a null pointer, and the pointer in question
is null because ps_status.c zeroed it via argv[1] = NULL.
I modified ps_status to use the correct status update method
(PS_CLOBBER_ARGV) but it still failed :-(.  It sorta looks like
something in the dynamic loader may be relying on a stale copy of
the environ pointer ... does that ring a bell at all?
Er... I meant memcmp.. Have you tried removing the system.c hack?  
That's what fixed it for me.  I'm pretty sure that the multiply defined 
_system is messing with things.  Is it possible to use execve to 
rewrite argv[0]?  I've had better luck with that.

-bob

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [HACKERS] Making PostgreSQL 7.4 (CVS) work properly on OS X 10.3 (7B85)

2003-11-08 Thread Bob Ippolito
On Nov 8, 2003, at 1:13 PM, Tom Lane wrote:

Bob Ippolito [EMAIL PROTECTED] writes:
On Nov 8, 2003, at 12:31 AM, Tom Lane wrote:
I have just in the past couple hours realized that ps_status.c is
seriously broken on OS X 10.3.

Er... I meant memcmp.. Have you tried removing the system.c hack?
That's what fixed it for me.
AFAICT system.c hasn't got anything to do with the problem that I'm
seeing; it's purely a matter of ps_status.c clobbering argv[] contents
that the dynamic loader depends on for some weird reason.  It looks
like Apple's implementation stores a copy of the original argc count,
and there is a bit of code in the loader that for some reason is
examining each argv string from 0..original_argc-1.  Who knows why :-(
... but where we set save_argv[1] to NULL, we create a null pointer
crash in the loader.  Take that out, no crash.  You would not see this
crash if you started the postmaster with no command-line arguments, 
btw.

I'm planning to change ps_status so that instead of zeroing
save_argv[1], it makes all the original argv strings be pointers to 
except for argv[0].
It may be causing problems because dyld does this thing called 
@executable_path substitution so it can find dylibs relative to the 
executable.  Also, the WindowServer and several other things 
(CoreFoundation, Foundation) use argv[0] to determine whether the 
executable is inside a bundle or not.

I don't think OS X people would make a big fuss about argv[0] not being 
as useful as it is on other platforms, personally I'd just take it out 
if it's not working.

As for getting rid of system.c, I am not eager to do that since it 
would
certainly break compatibility with OS X 10.1.  We could conditionally
compile it out perhaps.  Do you know what #define symbol we could test
for to determine which OS X version we are on?
See /usr/include/AvailabilityMacros.h

-bob

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?
  http://www.postgresql.org/docs/faqs/FAQ.html


Re: [HACKERS] Making PostgreSQL 7.4 (CVS) work properly on OS X 10.3 (7B85)

2003-11-08 Thread Bob Ippolito
On Nov 8, 2003, at 1:44 PM, Tom Lane wrote:

Bob Ippolito [EMAIL PROTECTED] writes:
On Nov 8, 2003, at 1:13 PM, Tom Lane wrote:
As for getting rid of system.c, I am not eager to do that since it
would
certainly break compatibility with OS X 10.1.  We could conditionally
compile it out perhaps.  Do you know what #define symbol we could 
test
for to determine which OS X version we are on?

See /usr/include/AvailabilityMacros.h
I don't see anything there that we can use in the form

#ifdef OSX_VERSION_10_2
or
#if OSX_VERSION = something
My 10.2.6 copy already has MAC_OS_X_VERSION_10_3 in it, so they are
obviously not intending that the highest defined symbol of that
series is the OS version.
MAC_OS_X_VERSION_MIN_REQUIRED

 * If min OS not specified, assume 10.0
 * Note: gcc driver may set MAC_OS_X_VERSION_MIN_REQUIRED based on 
MACOSX_DEPLOYMENT_TARGET environment variable

Compiling software for 10.3 should setenv MACOSX_DEPLOYMENT_TARGET 10.3

-bob

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [HACKERS] Making PostgreSQL 7.4 (CVS) work properly on OS X 10.3 (7B85)

2003-11-08 Thread Bob Ippolito
On Nov 8, 2003, at 3:19 PM, Tom Lane wrote:

Bob Ippolito [EMAIL PROTECTED] writes:
* I ditched the system.c hack, assuming Apple has fixed them by 10.3 
--
because it breaks tcl and python if you do.. I did:
After fixing the ps_status problems, I cannot observe any problem, with
or without system.c.  However, I agree that it's a bad idea to 
propagate
that hack forward when it's no longer needed.  I've #ifdef'd out
system.c for OS X 10.3 and later.  (I was tempted to disable it on
10.2.* as well, but I can only prove system() works in 10.2.6, not
having older OS X handy to test.)
Sounds great

Ok, that's not quite true, so:
I edited src/Makefile.global to have
python_libspec = -framework Python
I'm not going to touch that at this point in the release cycle.  It'll
just have to wait for a future release.
That's fine, does anyone even use plpython?  I only compiled it because 
I knew how to fix it..

-bob

---(end of broadcast)---
TIP 8: explain analyze is your friend


[HACKERS] problem with creating/dropping tables and plpgsql ?

2001-07-20 Thread \(::\) Bob Ippolito

I'm pretty new to postgresql.. I'm using a fresh compile/install of postgresql 7.1.2 
without any special options.. but here's my problem:

semantic=# create temp table ttmptable(lookup_id int, rating int);
CREATE
semantic=# SELECT doEverythingTemp(20706,2507);
 doeverythingtemp 
--
1
(1 row)
semantic=# DROP table ttmptable;
DROP
semantic=# create temp table ttmptable(lookup_id int, rating int);
CREATE
semantic=# SELECT doEverythingTemp(20706,2507);
ERROR:  Relation 4348389 does not exist

--- schema --

CREATE FUNCTION doEverythingTemp(int,int) RETURNS int AS '
DECLARE
  rrec RECORD;
  userid int;
  lookupid int;
  rrating int;
  ruser int;
BEGIN
  userid := $1;
  lookupid := $2;
  FOR rrec IN SELECT webuser_id,rating FROM rating WHERE webuser_id!=userid AND 
lookup_id=lookupid;
  rrating:=rrec.rating;
  ruser:=rrec.webuser_id;
  INSERT INTO ttmptable SELECT lookup_id,rrating*rating FROM rating WHERE 
webuser_id=ruser AND lookup_id!=lookupid;
  END LOOP;
  RETURN 1;
END;' LANGUAGE 'plpgsql'

Table rating
 Attribute  |  Type   | Modifier 
-+-+--
webuser_id  | integer | not null default '0'
category_id | integer | not null default '0'
lookup_id   | integer | not null default '0'
rating  | integer | not null default '0'
rating_id   | integer | not null default nextval('rating_rating_id_seq'::text)
Indices: rating_category_id_idx,
 rating_lookup_id_idx,
 rating_rating_id_key,
 rating_webuser_id_idx


I've tried regular tables, creating the table from within the function, and a few 
other things.. no luck.  Does anyone have ANY idea how I can either redesign this 
query or make the create/drop thing work properly?

Thanks,
(::) Bob Ippolito

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your
message can get through to the mailing list cleanly



Re: [HACKERS] uuid type for postgres

2005-09-06 Thread Bob Ippolito
On Sep 6, 2005, at 12:57 PM, Jonah H. Harris wrote:On 9/6/05, Josh Berkus josh@agliodbs.com wrote: Mark, I suggest that UUID be recommended in place of SERIAL for certain classes of applications, and that it therefore belongs in the core. UUID and SERIAL can be used together (although, once you have a UUID,  it may not be useful to also have a SERIAL).I think that, if you want to push a refactored UUID type for PostgreSQL8.2, that you'd better separate your database design arguments from yourinclusion arguments. For example, you might get my agreement that it would be useful to have aUUID as a core type; you would *never* get my agreement to recommend usingUUID to newbies.   I have seen *far* too many abuses of UUIDs in really bad database design.   People who use them should be experienced enough toknow what they're doing.I agree with Josh on the UUID type, it gets abused far too often and (IMHO) isn't widely enough used to belong in the core.  Couldn't you just fix the problem in pguuid rather than write a whole new type?This sounds like a strawman argument.  People abuse a lot of types to do a lot of things they shouldn't be doing, adding types to the core isn't really going to change that much one way or the other.If the documentation gives the user a good idea of when to use UUID and when not, I think it would be a good addition.  Worst case, it could become part of contrib, so at least it ships with PostgreSQL with the same license.  The GPL license of pguuid is a show-stopper for some (and seems like a silly choice for what it is and who it's for).-bob

Re: [HACKERS] uuid type for postgres

2005-09-06 Thread Bob Ippolito


On Sep 6, 2005, at 2:16 PM, nathan wagner wrote:

On Tue, Sep 06, 2005 at 03:57:55PM -0400, [EMAIL PROTECTED]  
wrote:



I agree with Josh on the UUID type, it gets abused far too often



Out of curiosity, how does it get abused?  It doesn't seem to me
that it would be any more prone to abuse than any other type.



and (IMHO) isn't widely enough used to belong in the core.



Seems like a self-fulfilling cycle.  There isn't one in core
or contrib (pguuid is not in contrib, IIRC), and so one doesn't
get used.  Since no one uses one, it's not in core or contrib.



Couldn't you just fix the
problem in pguuid rather than write a whole new type?



Not sure which you you're addressing here, but I can't fix the license
problem.  Nor can I readily fix the lack of a mac ioctl() to get
the hardware mac address.


It's not that hard to get a MAC address or serial number out of a  
Mac, the problem is I'm not going to bother writing that code for a  
GPL extension to PostgreSQL.  There's a better way, anyhow.  Darwin  
ships with UUID parse/generate/etc functions in its Libc (uuid/ 
uuid.h).


FWIW, that code (the Theodore Ts'o UUID implementation) looks to be  
suitably licensed for PostgreSQL, and it actually it appears that  
someone has already written a PostgreSQL UUID type using this it...  
which is unfortunately LGPL'ed (why?!).


-bob


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] uuid type for postgres

2005-09-06 Thread Bob Ippolito


On Sep 6, 2005, at 3:06 PM, [EMAIL PROTECTED] wrote:


On Tue, Sep 06, 2005 at 05:54:34PM -0400, Tom Lane wrote:


I don't see any big opposition.  People are simply questioning the
idea whether it belongs in core PG.  The reason we don't want to  
accept

everything-and-the-kitchen-sink in core is that we have only limited
manpower to maintain it.  So you've got to justify that we should  
spend

our effort here and not elsewhere.  There's a fair amount of nearly
...
been there awhile.  So one of the questions that's going to be  
asked is

how useful/popular it's really going to be.



Sounds reasonable, and certainly no more than I expected. If Nathan
hadn't raised the issue, it probably would have been a few months
before I raised it myself.



One thing that is raising my own level of concern quite a bit is the
apparent portability issues.  Code that isn't completely portable  
is a

huge maintainability problem; in particular, stuff that requires
system-dependent behavior used nowhere else in Postgres is a real  
pain.
It sounds like the UUID code expects to be able to get at the  
machine's

MAC address, which suggests serious issues in (a) relying on
not-too-standard APIs, (b) possible protection issues (will an
unprivileged process be able to get at the MAC address?), and (c)
ill-defined behavior on machines with more or less than one MAC  
address.

Not to mention that MAC addresses aren't so unique as all that.



I'll try to prepare an answer for this. (I started to write a lot of
information - but is it unverified from memory, and perhaps should be
more authoritative before presented as truth)


Some modern UUID implementations prefer /dev/urandom or similar to  
the time or MAC address unless you really beg them to give you a  
weaker UUID.


You can take a look at the man page for the Theodore Y. Ts'o  
implementation that is in Darwin's Libc here:
http://developer.apple.com/documentation/Darwin/Reference/ManPages/ 
man3/uuid_generate.3.html


Specifically:

   The uuid_generate function creates a new universally unique   
identifier
   (UUID).   The  uuid  will be generated based on high-quality  
randomness
   from  /dev/urandom,  if  available.   If  it  is  not   
available,  then
   uuid_generate  will use an alternative algorithm which uses  
the current
   time, the local ethernet MAC address (if available),  and   
random  data

   generated using a pseudo-random generator.

The Apache Portable Runtime has a apr_os_uuid_get() that supports two  
flavors of UUID for unix (Linux/Mac OS X uuid_generate and FreeBSD's  
uuid_create, may be available elsewhere), and the UuidCreate API on  
Win32.  apr-util's apr_uuid_get() will use apr_os_uuid_get() if  
available, and otherwise will default to a relatively weak mostly- 
timestamp-based UUID.


It would probably be reasonable and easy to do what Apache does  
here.  A platform UUID implementation, if present, is generally going  
to be better than anything included into PostgreSQL itself.


-bob


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] uuid type for postgres

2005-09-06 Thread Bob Ippolito


On Sep 6, 2005, at 6:02 PM, Josh Berkus wrote:


Bob, People,

Let me clarify my stance here, because it seems to be getting
misrepresented.

Mark (and Nathan) pushed at repaired UUID type for possible  
inclusion in
the core PostgreSQL distribution.  I'm not opposed to that,  
provided that
the portability, licensing, and bugs are worked out.  Why not?  We  
have

ipv6 data types, after all.

However, Mark went on to suggest that we should recommend UUID over  
SERIAL
in the docs, and that we could consider dropping SERIAL entirely in  
favor

of UUID:

---quoth Mark--
I suggest that UUID be recommended in place of SERIAL for certain
classes of applications, and that it therefore belongs in the core.
UUID and SERIAL can be used together (although, once you have a
UUID, it may not be useful to also have a SERIAL).
-

This was what I objected to; I believe that the use-case for UUIDs is
actually quite narrow and assert that it's a very bad idea to  
promote them

to most users.


I agree with you (Josh) completely, which is why I said:
If the documentation gives the user a good idea of when to use UUID  
and when not, I think it would be a good addition.


.. the fact that the use-cases are narrow was implicit :)

Everything else I talked about was just implementation details.   
Summary: there are (several) UUID implementations out there that are  
appropriately licensed and easy enough to use, and a lot of OSes ship  
with pretty good implementations already.  Creating a decent UUID  
type should be relatively trivial, as far as those things go.


-bob


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] uuid type for postgres

2005-09-07 Thread Bob Ippolito

On Sep 7, 2005, at 10:04 AM, nathan wagner wrote:


On Wed, Sep 07, 2005 at 09:45:17AM -0700, josh@agliodbs.com wrote:


I think the issue is portability.  Remember that this type needs  
to work on

Windows as well as all POSIX platforms and AIX.



I had forgotten about windows.  I'll see to what extent the library  
i'm

using is portable to windows.


You don't need to use a non-standard library for a lot of platforms,  
you're pretty much guaranteed UUID support for (at least):
Linux/Darwin/Mac OS X: uuid_generate uuid/uuid.h (part of libc, at  
least on Darwin)

FreeBSD: uuid_create uuid.h (again, libc)
Windows: UuidCreate Rpc.h (link to Rpcrt4.dll, IIRC)

Where a native UUID generate is not available, you can include some  
implementation, but surely other platforms also include UUID  
implementations.


I think Windows portability is a non-issue here.

-bob


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] uuid type for postgres

2005-09-08 Thread Bob Ippolito
One reason to use a UUID type over a naively stored hash for this purpose is that it takes up half the space as naively stored MD5 and 40% of the space as naively stored SHA1.  Granted, it's easy enough to pack them, but packed MD5 does have the same storage requirements as UUID and it won't be quite as convenient to use as a native UUID type out of the box.-bobOn Sep 8, 2005, at 9:02 AM, Jonah H. Harris wrote:Mark,  I think what Greg suggested was sha1(number) as the key instead of requiring uuid as the key... it would perform the same function as far as you r use case is concerned.  As a similar example (using MD5):  CREATE SEQUENCE marks_seq START 1 INCREMENT 1; CREATE TABLE your_tbl (     your_key  VARCHAR(32) NOT NULL DEFAULT md5(nextval('marks_seq')),     your_picture  BYTEA NOT NULL,     PRIMARY KEY (your_key));  INSERT INTO your_tbl (your_picture) VALUES ('some bytea data');  select * from your_tbl ;  your_key |  your_picture --+-  c4ca4238a0b923820dcc509a6f75849b | some bytea data  same functionality from your standpoint.  On 9/8/05, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote: On Thu, Sep 08, 2005 at 01:45:10PM -, Greg Sabino Mullane wrote:  For a rather simple example, consider a site that associates a picture   with each member. If the pictures are named 1.jpg, 2.jpg, 3.jpg, etc.  it makes it ridiculously easy to write a script to pull all of the  pictures off the site. This can be bothersome, as the only type of   person who would do this, is the type of person with an illegitimate  motivation. I want the data to be easily and freely accessible as  specific objects, but I do not wish to provide an easy way of   dumping all of the data as a unit. I don't think security through obscurity is a good reason to add something to PostgreSQL. Either make the pictures accessible to everyone, or put some other security mechanism in place. As far as pictures, you could  certainly used the SHA1 checksum as your primary key, which would satisfy all your requirements.I think you missed two parts. The first is, that it isn't really anattempt at security. The pictures are still freely available. There is no need to block accesses.The goal is to prevent a dump of the database. Nobody should be able totrivially download all the pictures at one time.I'm not sure what you are suggesting the SHA1 checksum be applied to. Not the picture, for sure, as this would require that my tables storea primary key, and a picture key, leading me right back to somethingthat seems a little ridiculous.If you mean the SHA1 checksum of some constant secret password, and the primary key, possibly SERIAL, then sure - but that's ignoring theother benefits of UUID. If I'm going to use UUID, I might as well useonly UUID, and use the SHA1 checksum of the constant secret password, and the UUID. So, the idea has merit - and was suggested by anotherperson - but it doesn't mean using the UUID alone, without all theextra layers on top I would have to write, is the wrong choice. Whywould I go out of my way to jump through so many hoops, when I can very easily, use the UUID?You're asking me to do a lot, only so that you can point out that'UUID' is not necessary. Heck, I could use the timestamp, and arandom key as two separate fields. There are always work arounds. Is it good to use work arounds? Or a type that already works finefor the purposes? :-)  In my case, it isn't only pictures. I don't want people pulling all  the data off the site as a dump, and using it how they wish, but I do   wish to make the data freely available, and easily accessible from a  web browser. .htaccess? Cookies? Encrypted data? iptables? All of these are better solutions than random filenames. Nope. The files are publically accessible. No reason the secure theiraccess, and securing their access makes the whole system less efficient.  Not that everybody should rip out SERIAL and replace it with UUID,   but it really isn't that bad, and in some cases, such as mine,  I don't see the point of using both, and choose to instead allow  UUID to solve many of my concerns at the same time, with an   acceptable for me cost in database pages. We're doing something similar to UUIDs here - LSID (Life Science Identifiers). But being as the whole point of LSIDs is to share data, the "random" factor is not needed. If ever PG did do something  like UUID (and despite the paragraphs above, I could see a use for it), I'd rather do something like picking from a predetermined range of values at random and deplete the pool as you go rather than create large  values with a low (but finite) probability of collision.I don't need the random factor. I only need it to be difficult to guess.The timestamps have sufficient granularity, that they are effectively difficult to guess, and a foreach loop would be expensive. If none ofthe bits were random, I would still be happy.If I had LSID support, and no UUID support, 

[HACKERS] PostgreSQL 8.1.0 catalog corruption

2005-11-21 Thread Bob Ippolito
I've been running 8.1.0 on a test box since the FreeBSD port has been  
available, and it appears that the catalog has become corrupted.   
There's plenty of free space on all volumes, so it hasn't run out of  
space or anything.


$ uname -a
FreeBSD shi.mochibot.com 6.0-RELEASE FreeBSD 6.0-RELEASE #2: Mon Nov   
7 14:34:52 PST 2005 [EMAIL PROTECTED]:/usr/obj/usr/src/sys/ 
CUSTOM  i386


$ psql mochi
Welcome to psql 8.1.0, the PostgreSQL interactive terminal.

Type:  \copyright for distribution terms
   \h for help with SQL commands
   \? for help with psql commands
   \g or terminate with semicolon to execute query
   \q to quit

mochi=# drop table ping_1132387200;
ERROR:  table ping_1132387200 does not exist
mochi=# create table ping_1132387200();
ERROR:  type ping_1132387200 already exists

I'm not sure what to do about this..

-bob


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [HACKERS] PostgreSQL 8.1.0 catalog corruption

2005-11-21 Thread Bob Ippolito


On Nov 21, 2005, at 12:44 PM, Tom Lane wrote:


Bob Ippolito [EMAIL PROTECTED] writes:

mochi=# drop table ping_1132387200;
ERROR:  table ping_1132387200 does not exist
mochi=# create table ping_1132387200();
ERROR:  type ping_1132387200 already exists



I'm not sure what to do about this..


Try dropping the type.

We've seen at least one prior report of a table's rowtype not being
dropped with the table, but nothing solid enough to debug.  If you can
find a way to reproduce this, I'm all ears.


I did try that, I guess it didn't make it to the list yet:

mochi=# drop type ping_1132387200;
ERROR:  cache lookup failed for relation 211174567

-bob





---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] PostgreSQL 8.1.0 catalog corruption

2005-11-21 Thread Bob Ippolito


On Nov 21, 2005, at 11:31 AM, Bob Ippolito wrote:

I've been running 8.1.0 on a test box since the FreeBSD port has  
been available, and it appears that the catalog has become  
corrupted.  There's plenty of free space on all volumes, so it  
hasn't run out of space or anything.


$ uname -a
FreeBSD shi.mochibot.com 6.0-RELEASE FreeBSD 6.0-RELEASE #2: Mon  
Nov  7 14:34:52 PST 2005 [EMAIL PROTECTED]:/usr/obj/usr/src/ 
sys/CUSTOM  i386


$ psql mochi
Welcome to psql 8.1.0, the PostgreSQL interactive terminal.

Type:  \copyright for distribution terms
   \h for help with SQL commands
   \? for help with psql commands
   \g or terminate with semicolon to execute query
   \q to quit

mochi=# drop table ping_1132387200;
ERROR:  table ping_1132387200 does not exist
mochi=# create table ping_1132387200();
ERROR:  type ping_1132387200 already exists

I'm not sure what to do about this..


Oh, and trying to drop the type doesn't work either:

mochi=# drop type ping_1132387200;
ERROR:  cache lookup failed for relation 211174567

-bob


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] PostgreSQL 8.1.0 catalog corruption

2005-11-21 Thread Bob Ippolito


On Nov 21, 2005, at 1:14 PM, Tom Lane wrote:


Bob Ippolito [EMAIL PROTECTED] writes:

On Nov 21, 2005, at 12:44 PM, Tom Lane wrote:

Try dropping the type.



I did try that, I guess it didn't make it to the list yet:



mochi=# drop type ping_1132387200;
ERROR:  cache lookup failed for relation 211174567


Hmm, apparently there are still entries in pg_depend for these  
objects.
Would you look to see what is in pg_depend with either objid or  
refobjid

equal to either 211174567 or the type's oid (possibly 211174568, but
check)?


I don't know how to get the oid of a type.. but there are certainly  
entries in pg_depend with the other criteria:


mochi=# select * from pg_depend where objid=211174567;
classid |   objid   | objsubid | refclassid | refobjid  | refobjsubid  
| deptype
-+---+--++--- 
+-+-
1259 | 211174567 |0 |   2615 |  2200 |
0 | n
1259 | 211174567 |0 |   1259 | 103327322 |
0 | n

(2 rows)

mochi=# select * from pg_depend where refobjid=211174567;
classid |   objid   | objsubid | refclassid | refobjid  | refobjsubid  
| deptype
-+---+--++--- 
+-+-
2606 | 212119648 |0 |   1259 | 211174567 |
1 | a
2606 | 212119648 |0 |   1259 | 211174567 |
1 | n
2606 | 212119649 |0 |   1259 | 211174567 |
2 | a
2606 | 212119649 |0 |   1259 | 211174567 |
2 | n
1247 | 211174568 |0 |   1259 | 211174567 |
0 | i
2604 | 211174569 |0 |   1259 | 211174567 |
1 | a
1259 | 211174570 |0 |   1259 | 211174567 |
0 | i
2606 | 211174574 |0 |   1259 | 211174567 |
1 | a
1259 | 211174575 |0 |   1259 | 211174567 |
2 | a
1259 | 211174576 |0 |   1259 | 211174567 |
4 | a
1259 | 211174576 |0 |   1259 | 211174567 |
2 | a
1259 | 211174577 |0 |   1259 | 211174567 |
4 | a
1259 | 211174577 |0 |   1259 | 211174567 |
7 | a
1259 | 211174577 |0 |   1259 | 211174567 |
2 | a
1259 | 211174578 |0 |   1259 | 211174567 |
4 | a
1259 | 211174578 |0 |   1259 | 211174567 |
8 | a
1259 | 211174578 |0 |   1259 | 211174567 |
2 | a

(17 rows)

-bob


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] PostgreSQL 8.1.0 catalog corruption

2005-11-21 Thread Bob Ippolito


On Nov 21, 2005, at 1:43 PM, Tom Lane wrote:


Bob Ippolito [EMAIL PROTECTED] writes:

I don't know how to get the oid of a type.. but there are certainly
entries in pg_depend with the other criteria:


Hmph, looks like you still have a pretty full set of dependencies for
the table.  What about attributes --- try
select attname from pg_attribute where attrelid = 211174567;

It might be interesting to try reindexing pg_class, too ... maybe the
pg_class row is still there but isn't being found because of a corrupt
index?


The attributes look like the names of all the columns in the table,  
and reindexing didn't help.


mochi=# select attname from pg_attribute where attrelid = 211174567;
attname

tableoid
cmax
xmax
cmin
xmin
oid
ctid
id
timestamp
ip_address
mochiTag
mochiGUID
mochiVersion
movieURL
movieURLHost
movieReferrer
movieReferrerHost
movieUserAgent
movieSWFVersion
movieBytesLoaded
movieQuality
movieStageHeight
movieStageWidth
movieBytesTotal
movieStageAlign
movieScaleMode
movieShowMenu
userScreenReader
userCameras
userMicrophones
userSystemCapabilities
userTimeZoneOffset
userTicks
userUTCTime
(34 rows)

mochi=# reindex table pg_class;
REINDEX
mochi=# create table ping_1132387200();
ERROR:  type ping_1132387200 already exists
mochi=# drop table ping_1132387200;
ERROR:  table ping_1132387200 does not exist
mochi=# drop type ping_1132387200;
ERROR:  cache lookup failed for relation 211174567

-bob


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] PostgreSQL 8.1.0 catalog corruption

2005-11-21 Thread Bob Ippolito


On Nov 21, 2005, at 1:59 PM, Tom Lane wrote:


Bob Ippolito [EMAIL PROTECTED] writes:

The attributes look like the names of all the columns in the table,
and reindexing didn't help.


So at this point it seems that the pg_class row disappeared, but there
probably wasn't any actual DROP operation --- you'd think at least  
some

of those other entries would have been deleted by a DROP.

My next guess is that the pg_class row simply got clobbered somehow,
eg its xmin field got set to something ridiculous.  The only way I can
think of to investigate that is to dump out the contents of pg_class
with pg_filedump --- are you game for that?  If so, get the right
version of pg_filedump from
http://sources.redhat.com/rhdb/utilities.html
and run it with the -i -f options (usually produces the most helpful
output, in my experience).


This is 8.1.0, can I use pg_dump 4.0 with that?  The entire database  
is 39GB, there's a way to just get pg_class, right?


-bob


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] PostgreSQL 8.1.0 catalog corruption

2005-11-21 Thread Bob Ippolito


On Nov 21, 2005, at 2:42 PM, Tom Lane wrote:


Bob Ippolito [EMAIL PROTECTED] writes:

Ok, so how do I figure out which file(s) are associated with pg_class
so I can feed this thing?


See contrib/oid2name and/or read
http://www.postgresql.org/docs/8.1/static/storage.html


Ok, here's the pg_filedump for the pg_class table in the mochi  
database that is having the issue:


(it's also there without gz, but the gz is 228KB versus 2352KB  
uncompressed)

http://undefined.org/mochi.pg_class.filedump.gz

-bob


---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [HACKERS] PostgreSQL 8.1.0 catalog corruption

2005-11-21 Thread Bob Ippolito


On Nov 21, 2005, at 3:12 PM, Tom Lane wrote:


Bob Ippolito [EMAIL PROTECTED] writes:

Ok, here's the pg_filedump for the pg_class table in the mochi
database that is having the issue:


Thanks.  I don't see any live tuples that look like they could have  
been

the one we want, but there's a whole lot of deleted rows, which
pg_filedump won't show with those options.  Could you try a  
pg_filedump

with -d option too?


Sure, here it is:

http://undefined.org/mochi.pg_class-1.filedump.gz

-bob


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [HACKERS] PostgreSQL 8.1.0 catalog corruption

2005-11-21 Thread Bob Ippolito


On Nov 21, 2005, at 3:56 PM, Tom Lane wrote:


Bob Ippolito [EMAIL PROTECTED] writes:

Sure, here it is:
http://undefined.org/mochi.pg_class-1.filedump.gz


Well, I count at least a couple hundred deleted versions of that table
row :-(.  What the heck were you doing with it?  As far as I can think
offhand, only a schema modification would cause an update of the
pg_class row.


There's an ETL process that does the following:

- Create a temp table
- COPY into the temp table
- do some transforms on the data
- create a dated table if it doesn't already exist
- fail miserably because the catalog is busted**

**: it normally does something else here, namely inserting a bunch of  
rows into the table


The ETL process keeps trying until it succeeds or someone stops it,  
so I guess that's why there's so much churn in there for that table.   
Kept trying to create it, and ran into the issue.  I'd estimate  
around 1700 to 1800 dead versions of that table, because it ran for  
some time before I noticed and stopped it... this is just a test box  
after all, I don't have 8.1 in production yet (thankfully!).


So what do I do now?  Kill the database, start over from a dump, and  
cross fingers that this doesn't pop up again?


-bob


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [HACKERS] PostgreSQL 8.1.0 catalog corruption

2005-11-21 Thread Bob Ippolito


On Nov 21, 2005, at 4:33 PM, Alvaro Herrera wrote:


Tom Lane wrote:

Bob Ippolito [EMAIL PROTECTED] writes:

On Nov 21, 2005, at 3:56 PM, Tom Lane wrote:
Well, I count at least a couple hundred deleted versions of that  
table

row :-(.  What the heck were you doing with it?



The ETL process keeps trying until it succeeds or someone stops it,
so I guess that's why there's so much churn in there for that table.
Kept trying to create it, and ran into the issue.  I'd estimate
around 1700 to 1800 dead versions of that table, because it ran for
some time before I noticed and stopped it... this is just a test box
after all, I don't have 8.1 in production yet (thankfully!).


Um, no, that theory doesn't seem to explain the evidence.  A failed
insertion would result in a row with an uncommitted XMIN and no XMAX.
All of the entries I'm seeing have both XMIN and XMAX set.  A good- 
size
fraction have the same XMIN and XMAX (but different CMIN and  
CMAX), but
I see some that have different XMIN and XMAX.  It looks to me like  
the

table was definitely created successfully, and it survived across
multiple transactions ... but something was doing a lot of DDL  
changes
on it.  If we could find out what, maybe we could reproduce the  
problem.


Maybe the UPDATE pg_class SET relhastriggers='f' that people is so  
fond

of doing to deactivate triggers?  Or something similar?


I don't touch pg_class at all... this is what I'm doing (over and  
over again).


	-- clone_table is almost always a no-op, but once a day it creates a  
new table

SELECT clone_table('ping', 'ping_%s', '')
SELECT drop_ping_constraints('ping_%s')
-- stuff that doesn't effect DDL
SELECT add_ping_constraints('ping_%s')

and the referenced UDFs are as follows:

CREATE OR REPLACE FUNCTION
clone_table(parent text, child text, extra text) RETURNS boolean
AS $$
DECLARE
tmprec record;
user_index record;
parent_constraint record;
user_index_column record;
indexname text;
i integer;
columns text[];
BEGIN
-- are we done?
FOR tmprec IN
SELECT 1 FROM pg_sysviews.pg_user_tables WHERE  
table_name=child

LOOP

RETURN FALSE;
END LOOP;

-- inherit the table
EXECUTE 'CREATE TABLE '
|| quote_ident(child)
|| '('
|| extra
|| ') INHERITS ('
|| quote_ident(parent)
|| ')';


FOR parent_constraint IN
SELECT *
FROM pg_sysviews.pg_user_table_constraints A
WHERE A.table_name = parent
LOOP
EXECUTE 'ALTER TABLE '
|| quote_ident(child)
|| ' ADD '
|| parent_constraint.definition;
END LOOP;

i := 0;
FOR user_index IN
SELECT *
FROM pg_sysviews.pg_user_indexes A
WHERE
A.table_name = parent
AND A.index_name != (parent || '_pkey')
LOOP

i := i + 1;
indexname := child;
columns := '{}'::text[];
FOR user_index_column IN
SELECT B.column_name, quote_ident(B.column_name) AS col
FROM pg_sysviews.pg_user_index_columns B
WHERE
B.table_name = user_index.table_name
AND B.index_name = user_index.index_name
ORDER BY B.column_position
LOOP

indexname := indexname || '_' ||  
user_index_column.column_name;

columns := array_append(columns, user_index_column.col);
END LOOP;

IF user_index.predicate IS NOT NULL THEN
indexname := indexname || '_p' || i::text;
END IF;

-- this is not complete, but works
-- missing tablespace, index_method, is_clustered,
EXECUTE ('CREATE '
|| (CASE WHEN user_index.is_unique THEN 'UNIQUE ' ELSE  
'' END)

|| 'INDEX '
|| quote_ident(indexname)
|| ' ON '
|| quote_ident(child)
|| ' USING '
|| quote_ident(user_index.index_method)
|| ' ('
|| array_to_string(columns, ',')
|| ')'
|| (CASE WHEN user_index.predicate IS NOT NULL
THEN ' WHERE ' || user_index.predicate
ELSE '' END)
);


END LOOP;

RETURN TRUE;

END
$$ LANGUAGE plpgsql;


CREATE OR REPLACE FUNCTION
drop_ping_constraints(ping_table text) RETURNS void
AS $drop_ping_constraints$
DECLARE
next_sql text;
constraint_rec record;
BEGIN

next_sql := $sql$
SELECT
constraint_name
FROM pg_sysviews.pg_user_table_constraints
WHERE constraint_name IN ($sql$
|| quote_literal(ping_table || '_timestamp_check')
|| ', '
|| quote_literal(ping_table || '_id_check')
|| ')';

-- RAISE NOTICE 'SQL: %', next_sql;
FOR constraint_rec IN EXECUTE next_sql LOOP
next_sql := 'ALTER TABLE

Re: [HACKERS] PostgreSQL 8.1.0 catalog corruption

2005-11-21 Thread Bob Ippolito


On Nov 21, 2005, at 5:50 PM, Tom Lane wrote:


Bob Ippolito [EMAIL PROTECTED] writes:

I don't touch pg_class at all... this is what I'm doing (over and
over again).



-- clone_table is almost always a no-op, but once a day it creates a
new table
 SELECT clone_table('ping', 'ping_%s', '')
 SELECT drop_ping_constraints('ping_%s')
-- stuff that doesn't effect DDL
SELECT add_ping_constraints('ping_%s')


Hm, do the drop/add constraint functions get executed even when
clone_table decides not to make a new table?  If so, that would  
probably

explain the pattern I'm seeing in the dump of many updates of the
pg_class row.


Yes, they do.  The constraints are there for constraint exclusion.

This still doesn't give us a hint why the row disappeared, but  
maybe we

can try running these functions for awhile and see if anyone can
reproduce a failure.


If it matters, I have had the same code running on Bizgres 0.7.4 for  
quite some time with no issues at all.  I may just have to migrate  
the test server to Bizgres 0.8 if we can't figure out why PostgreSQL  
8.1.0 choked here.


-bob


---(end of broadcast)---
TIP 6: explain analyze is your friend