[HACKERS] Re: [GENERAL] Warning: Don't delete those /tmp/.PGSQL.* files

2000-11-29 Thread Tom Lane

"Joel Burton" [EMAIL PROTECTED] writes:
 On 25 Nov 2000, at 17:35, Tom Lane wrote:
 Ugh.  The reason that removing the socket file allowed a second
 postmaster to start up is that we use an advisory lock on the socket
 file as the interlock that prevents two PMs on the same port number.
 Remove the socket file, poof no interlock.
 
 *However*, there is a second line of defense to prevent two
 postmasters in the same directory, and I don't understand why that
 didn't trigger. Unless you are running a version old enough to not
 have it.  What PG version is this, anyway?

 7.1devel, from about 1 week ago.

Ah, I see why the data-directory interlock file wasn't helping: it
wasn't checked until *after* shared memory was set up (read clobbered
:-().  This was not a very bright choice.  I'm still surprised that
the shared-memory reset should've trashed your database so thoroughly,
though.

Over the past two days I've committed changes that should make the data
directory, socket file, and shared memory interlocks considerably more
robust.  In particular, mechanically doing "rm -f /tmp/.s.PGSQL.5432"
should never be necessary anymore.
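
For readers wondering why a plain rm was enough to defeat the old
socket-file interlock, here is a minimal sketch in Python. It is not
PostgreSQL's code: it assumes a regular file standing in for the socket
file, uses flock() as the advisory lock, and the names try_lock and demo
are made up for illustration. An advisory lock belongs to the underlying
file, not to its path name, so once the path is unlinked a second process
simply creates and locks a fresh file.

```python
import fcntl
import os
import tempfile


def try_lock(path):
    """Open path (creating it if needed) and try a non-blocking exclusive
    advisory lock. Return the open file on success, None if already locked."""
    f = open(path, "a")
    try:
        fcntl.flock(f, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError:
        f.close()
        return None
    return f


def demo():
    workdir = tempfile.mkdtemp()
    # Hypothetical stand-in for the socket file (a regular file here).
    path = os.path.join(workdir, ".s.PGSQL.5432")

    first = try_lock(path)    # "postmaster 1" takes the lock
    second = try_lock(path)   # "postmaster 2" is refused while the file exists

    os.unlink(path)           # the equivalent of rm -f on the socket file
    third = try_lock(path)    # a brand-new file is created, so the lock is granted

    outcome = (first is not None, second is None, third is not None)
    for f in (first, second, third):
        if f is not None:
            f.close()
    os.unlink(path)
    os.rmdir(workdir)
    return outcome
```

Calling demo() exercises all three attempts: the first lock is granted,
the second is refused, and the third is granted again after the unlink
even though the first holder never released its lock.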

Sorry about your trouble...

BTW, your original message mentioned something about a recursive view
definition that wasn't being recognized as such.  Could you provide
details on that?

regards, tom lane



[HACKERS] Re: [GENERAL] Warning: Don't delete those /tmp/.PGSQL.* files

2000-11-29 Thread Joel Burton

 Ah, I see why the data-directory interlock file wasn't helping: it
 wasn't checked until *after* shared memory was set up (read clobbered
 :-().  This was not a very bright choice.  I'm still surprised that
 the shared-memory reset should've trashed your database so thoroughly,
 though.
 
 Over the past two days I've committed changes that should make the
 data directory, socket file, and shared memory interlocks considerably
 more robust.  In particular, mechanically doing "rm -f
 /tmp/.s.PGSQL.5432" should never be necessary anymore.

That's fantastic. Thanks for the quick fix. 

 BTW, your original message mentioned something about a recursive view
 definition that wasn't being recognized as such.  Could you provide
 details on that?

I can't. It was a few weeks ago, the database has been in furious
development, and, of course, I didn't bother to save all those views
that crashed my server. I keep trying to re-create it, but can't
figure it out. I'm sorry.

I think it wasn't just two views pointing at each other (it would, of
course, be next to impossible to even create those, unless you
hand-tweaked the system tables), but I think it was a
view-relies-on-a-function-relies-on-a-view kind of problem. If I ever
see it again, I'll save it.

Thanks!

--
Joel Burton, Director of Information Systems -*- [EMAIL PROTECTED]
Support Center of Washington (www.scw.org)



[HACKERS] Re: [GENERAL] Warning: Don't delete those /tmp/.PGSQL.* files

2000-11-29 Thread Tom Lane

"Joel Burton" [EMAIL PROTECTED] writes:
 I think it wasn't just two views pointing at each other (it would, of
 course, be next to impossible to even create those, unless you
 hand-tweaked the system tables), but I think it was a
 view-relies-on-a-function-relies-on-a-view kind of problem.

Oh, OK.  I wouldn't expect the rewriter to realize that that sort of
situation is recursive.  Depending on what your function is doing, it
might or might not be an infinite recursion, so I don't think I'd want
the system arbitrarily preventing you from doing this sort of thing.

Perhaps there should be an upper bound on function-call recursion depth
enforced someplace?  Not sure.
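
One way to picture such an upper bound is a depth counter around each
call. The following is only a sketch in Python, not anything that exists
in the backend: the decorator depth_limited, the exception
RecursionDepthExceeded, the limit of 50, and the function view_lookup are
all assumptions chosen for illustration. A recursion that terminates on
its own is left alone; only one that goes past the limit is stopped.

```python
import functools


class RecursionDepthExceeded(RuntimeError):
    """Raised when a guarded function nests deeper than its limit."""


def depth_limited(max_depth):
    """Decorator that allows at most max_depth nested calls of a function."""
    def decorate(func):
        depth = 0

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            nonlocal depth
            if depth >= max_depth:
                raise RecursionDepthExceeded(
                    f"{func.__name__} nested deeper than {max_depth} calls")
            depth += 1
            try:
                return func(*args, **kwargs)
            finally:
                depth -= 1  # always unwind, even when an error propagates
        return wrapper
    return decorate


@depth_limited(max_depth=50)
def view_lookup(n):
    # Hypothetical stand-in for a view that calls a function that reads
    # the view again; it terminates on its own once n reaches 0.
    return 0 if n == 0 else 1 + view_lookup(n - 1)
```

With this guard, view_lookup(10) completes normally, while
view_lookup(100) raises RecursionDepthExceeded instead of running away.
The counter is per function and not thread-aware, which is enough for a
sketch but not for real use.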

regards, tom lane



Re: [HACKERS] Re: [GENERAL] Warning: Don't delete those /tmp/.PGSQL.* files

2000-11-27 Thread Marko Kreen

On Sat, Nov 25, 2000 at 07:41:52PM -0500, Tom Lane wrote:
 Peter Eisentraut [EMAIL PROTECTED] writes:
  Actually, this turns out to be similar to what you wrote in
  http://www.postgresql.org/mhonarc/pgsql-hackers/1998-08/msg00835.html
 
 Well, we've talked before about moving the socket files to someplace
 safer than /tmp.  The problem is to find another place that's not
 platform-dependent --- else you've got a major configuration headache.

Could this be described in e.g. /etc/postgresql/pg_client.conf,
a la the dbname idea?

I can't remember the exact terminology, but there is a
configuration file for clients, with its location set at compile
time, where the connection params for clients are set.

-

[db_foo]
type=inet
host=srv3.devel.net
port=1234
# there should be a way of specifying dbname later too
database=asdf

[db_baz]
type=unix
socket=/var/lib/postgres/comm/db_baz



Also it should be possible to give another configuration file
with env vars or command-line parameters.
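
To make the proposal concrete, here is a rough sketch in Python of how a
client might read such a file. Nothing here is an existing PostgreSQL
interface: the environment variable name PG_CLIENT_CONF and the functions
conf_path and resolve are assumptions for illustration, and the sample
text just repeats the two sections from the example above.

```python
import configparser
import os

SAMPLE = """\
[db_foo]
type=inet
host=srv3.devel.net
port=1234
# there should be a way of specifying dbname later too
database=asdf

[db_baz]
type=unix
socket=/var/lib/postgres/comm/db_baz
"""

# Compile-time default from the proposal.
DEFAULT_CONF = "/etc/postgresql/pg_client.conf"


def conf_path(environ=os.environ):
    """Pick the config file: a (hypothetical) env var overrides the default."""
    return environ.get("PG_CLIENT_CONF", DEFAULT_CONF)


def resolve(text, name):
    """Turn the section called name into a dict of connection parameters."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    section = parser[name]  # KeyError if the section is missing
    if section["type"] == "unix":
        return {"socket": section["socket"]}
    return {
        "host": section["host"],
        "port": section.getint("port"),
        "database": section.get("database"),  # None when not given
    }
```

A client would then call resolve() on the contents of conf_path() and
connect either to the named Unix socket or to the host and port, so the
socket location never has to be guessed from /tmp.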

Well, just an idea.

-- 
marko




[HACKERS] Re: [GENERAL] Warning: Don't delete those /tmp/.PGSQL.* files

2000-11-25 Thread Larry Rosenman

* Tom Lane [EMAIL PROTECTED] [001125 16:37]:
 "Joel Burton" [EMAIL PROTECTED] writes:
 
 This story does indicate that we need a less fragile interlock against
 starting two postmasters on one database.  I have to admit that it
 hadn't occurred to me that you could break the port-number interlock
 so easily as that :-(.  But obviously you can, so we need a different
 way of representing the interlock.  Hackers, any thoughts?

How about a .pid/.port/.???  file in the /data directory, and a lock on that?


-- 
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 972-414-9812 E-Mail: [EMAIL PROTECTED]
US Mail: 1905 Steamboat Springs Drive, Garland, TX 75044-6749



Re: [HACKERS] Re: [GENERAL] Warning: Don't delete those /tmp/.PGSQL.* files

2000-11-25 Thread Peter Eisentraut

Tom Lane writes:

 There is a related issue on my todo list, though --- didn't we find out
 awhile back that some older Linux kernels crash and burn if one attempts
 to get an advisory lock on a socket file?  (See thread 7/6/00)  Were we
 going to fix that, and if so how?  Or will we just tell people that they
 have to update their kernel to run Postgres?  The current configure
 script "works around" this by disabling the advisory lock on *all*
 versions of Linux, which I regard as a completely unacceptable
 solution...

Firstly, AFAIK there's no official production kernel that fixes this.  
When and if it gets fixed we can change that logic.

I have a simple test program that exhibits the problem (taken from the
kernel mailing list), but

a) You shouldn't run test programs in configure.

b) You really shouldn't run test programs in configure that set up
   networking connections.

c) You definitely shouldn't run test programs in configure that provoke
   kernel exceptions.

We could use flock() on Linux, though.


Maybe we could name the socket file .s.PGSQL.port.pid and make
.s.PGSQL.port a symlink.  Then you can find out whether the postmaster
that created the file is still running.  (You could even put the actual
socket file into the data directory, although that would require
re-thinking the file permissions on the latter.)
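
A rough sketch of the symlink idea in Python, under stated assumptions:
a regular file stands in for the socket, and the helper names publish,
owner_pid, and owner_alive are invented for illustration. The PID is
recovered from the symlink target and probed with signal 0, which checks
for existence without delivering anything.

```python
import os
import tempfile


def publish(directory, port, pid):
    """Create the per-PID file and point the well-known name at it."""
    real = f".s.PGSQL.{port}.{pid}"
    link = os.path.join(directory, f".s.PGSQL.{port}")
    # Regular file as a stand-in for the real socket file.
    open(os.path.join(directory, real), "w").close()
    os.symlink(real, link)
    return link


def owner_pid(link):
    """Return the PID encoded in the symlink target, or None."""
    try:
        target = os.readlink(link)
    except OSError:
        return None  # missing, or not a symlink
    suffix = target.rsplit(".", 1)[-1]
    return int(suffix) if suffix.isdigit() else None


def owner_alive(link):
    """True if the process named in the symlink target still exists."""
    pid = owner_pid(link)
    if pid is None or pid <= 0:
        return False
    try:
        os.kill(pid, 0)  # signal 0: existence check only
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # exists, but belongs to another user
    return True
```

A starting postmaster could call owner_alive() on the well-known name
and refuse to start when it returns True. A PID can be reused by an
unrelated process, so this check can give a false "still running"
answer; the sketch does not try to solve that.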

Actually, this turns out to be similar to what you wrote in
http://www.postgresql.org/mhonarc/pgsql-hackers/1998-08/msg00835.html


We really should be fixing the IPC interlock with IPC_EXCL, but the
code changes look to be non-trivial.
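
The effect of IPC_EXCL is that creating a segment fails outright when
one with the same key already exists, instead of silently attaching to
it. Python's standard library has no shmget(), so the sketch below swaps
in POSIX shared memory via multiprocessing.shared_memory, whose
create=True mode is likewise an exclusive create. It is an analogy, not
the SysV call under discussion, and the segment name and the functions
create_exclusive and demo are assumptions for illustration.

```python
import os
from multiprocessing import shared_memory


def create_exclusive(name, size=1024):
    """Create a segment only if none with this name exists.
    Return the segment, or None when one is already there."""
    try:
        return shared_memory.SharedMemory(name=name, create=True, size=size)
    except FileExistsError:
        return None


def demo():
    # Hypothetical segment name, made unique per process for the demo.
    name = f"pg_interlock_sketch_{os.getpid()}"
    first = create_exclusive(name)    # first creator succeeds
    second = create_exclusive(name)   # second creator is refused
    outcome = (first is not None, second is None)
    if first is not None:
        first.close()
        first.unlink()                # remove the segment again
    return outcome
```

The point of the exclusive create is that the second starter gets a
clear failure and leaves the existing segment untouched, rather than
resetting memory that a live postmaster is still using.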

-- 
Peter Eisentraut  [EMAIL PROTECTED]   http://yi.org/peter-e/




Re: [HACKERS] Re: [GENERAL] Warning: Don't delete those /tmp/.PGSQL.* files

2000-11-25 Thread Tom Lane

Peter Eisentraut [EMAIL PROTECTED] writes:
 Maybe we could name the socket file .s.PGSQL.port.pid and make
 .s.PGSQL.port a symlink.  Then you can find out whether the postmaster
 that created the file is still running.

Or just create a lockfile /tmp/.s.PGSQL.port#.lock, ie, same name as
socket file with ".lock" added (containing postmaster's PID).  Then we
could share code with the data-directory-lockfile case.
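
A minimal sketch of such a PID lockfile in Python, not the code that was
eventually committed: the function names pid_alive, acquire_lockfile,
and demo are invented, and the path passed in could equally be the
socket lockfile or one in the data directory. The file is created with
O_EXCL so only one starter can win, and a leftover file is reclaimed
only when the PID inside it no longer exists.

```python
import os
import tempfile


def pid_alive(pid):
    """True if a process with this PID exists (signal 0 sends nothing)."""
    if pid <= 0:
        return False
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # exists, but owned by someone else
    return True


def acquire_lockfile(path, max_tries=3):
    """Try to create path holding our PID. Return True on success,
    False if a live process already owns it."""
    for _ in range(max_tries):
        try:
            fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
        except FileExistsError:
            try:
                with open(path) as f:
                    owner = int(f.read().strip())
            except (FileNotFoundError, ValueError):
                owner = None  # vanished or unreadable: treated as stale here
            if owner is not None and pid_alive(owner):
                return False
            try:
                os.unlink(path)  # stale lockfile: remove and retry
            except FileNotFoundError:
                pass
            continue
        with os.fdopen(fd, "w") as f:
            f.write(f"{os.getpid()}\n")
        return True
    return False


def demo():
    workdir = tempfile.mkdtemp()
    path = os.path.join(workdir, ".s.PGSQL.5432.lock")
    results = [acquire_lockfile(path)]       # fresh: acquired
    results.append(acquire_lockfile(path))   # owner (this process) is alive
    with open(path, "w") as f:
        f.write("999999999\n")               # simulate a dead owner's PID
    results.append(acquire_lockfile(path))   # stale file is reclaimed
    os.unlink(path)
    os.rmdir(workdir)
    return results
```

Treating an unreadable lockfile as stale leaves a small race with a
starter that has created the file but not yet written its PID; a real
implementation would need to close that window.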

 Actually, this turns out to be similar to what you wrote in
 http://www.postgresql.org/mhonarc/pgsql-hackers/1998-08/msg00835.html

Well, we've talked before about moving the socket files to someplace
safer than /tmp.  The problem is to find another place that's not
platform-dependent --- else you've got a major configuration headache.

 We really should be fixing the IPC interlock with IPC_EXCL, but the
 code changes look to be non-trivial.

AFAIR the previous thread, it wasn't that bad, it was just a matter of
someone taking the time to do it.  Maybe I'll have a go at it...

regards, tom lane