Re: [HACKERS] Ocasional problems !!!!
I really have constraints of foreign keys but not on delete, only on update Stephan Szabo wrote: Did you perhaps have foreign key constraints with an on delete clause defined on a table that referenced this one? Postgres doesn't currently check that the types are comparable before making the constraint. I'm working on adding a check for that now. Or for that matter, any other rules or triggers could do it. On Fri, 15 Dec 2000, Luis Sousa wrote: I just trying to execute a simple query in a table to delete a simpe record or all of them, like: DELETE * FROM table; I have a schema of more or less 25 tables, that are created using a script. When i'm trying to use that table (and only happens in this table) when created by the script i receive the message below. The most strange is that i droped the table and i created again, maintaining the structure created with the script and i didn't had any problems !!! Best Regards Luis Sousa Stephan Szabo wrote: What is the schema of the table involved and what are the queries you are trying to run? Stephan Szabo [EMAIL PROTECTED] On Thu, 14 Dec 2000, Luis Sousa wrote: I'm trying to delete all the records or only one record or insert one record in a table but i'm having this message: ERROR: Unable to identify an operator '=' for types 'int4' and 'text' You will have to retype this query using an explicit cast What's this means ??? Thanks Luis Sousa -- Luis Sousa Tecnico Superior de Informatica Gabinete de Assessoria e Planeamento Universidade do Algarve
[HACKERS] heap page corruption not easy
A heap page corruption is not very likely in PostgreSQL because of the underlying page design. Not even on flakey hardware/ossoftware. (I once read a page design note from pg 4 but don't exactly remember were or when) The point is, that the heap page is only modified in places that were previously empty (except header). All previous row data stays exactly in the same place. Thus if a page is only partly written (any order of page segments) only a new row is affected. But those rows will be fixed during redo anyway. The only source of serious problems is thus a bogus write of a page segment (100 bytes ok 412 bytes chunk actually written to disk), but this case is imho sufficiently guarded or at least detected by disk hardware. (I assume that the page header fits into one atomic block and has no problem with beeing one step behind or ahead of redo). I thus doubt that we really need "physical log" for heap pages in PostgreSQL with the current non-overwrite smgr. If we could detect corruption in index pages we would not need physical log at all, since an index can always be recreated. What do you think ? I ask because "physical log" is a substantial amount of additional IO that we imho only want if it is absolutely necessary. Andreas PS: reposted, did this not make it to the list ?
Re: [HACKERS] Tuple data
Hiroshi Inoue [EMAIL PROTECTED] writes: To make it usable for inherited tables requires truly horrendous kluges (as you well know). Logical/physical attribute numbers solves it naturally. Maybe. At this point that's a theory without experimental evidence to back it up ;-). I'm still concerned about how widespread/intrusive the changes will need to be. regards, tom lane
Re: [HACKERS] Ocasional problems !!!!
Actually, it's not going to matter since all foreign keys have a delete portion (realized after seeing your response) that checks to make sure the one you are deleting is not being referenced. I'm surprised you're not seeing this on inserts into the fk table or on updates to the pk table. What are the types of the columns on both tables? Stephan Szabo [EMAIL PROTECTED] On Mon, 18 Dec 2000, Luis Sousa wrote: I really have constraints of foreign keys but not on delete, only on update Stephan Szabo wrote: Did you perhaps have foreign key constraints with an on delete clause defined on a table that referenced this one? Postgres doesn't currently check that the types are comparable before making the constraint. I'm working on adding a check for that now. Or for that matter, any other rules or triggers could do it. On Fri, 15 Dec 2000, Luis Sousa wrote: I just trying to execute a simple query in a table to delete a simpe record or all of them, like: DELETE * FROM table; I have a schema of more or less 25 tables, that are created using a script. When i'm trying to use that table (and only happens in this table) when created by the script i receive the message below. The most strange is that i droped the table and i created again, maintaining the structure created with the script and i didn't had any problems !!! Best Regards Luis Sousa Stephan Szabo wrote: What is the schema of the table involved and what are the queries you are trying to run? Stephan Szabo [EMAIL PROTECTED] On Thu, 14 Dec 2000, Luis Sousa wrote: I'm trying to delete all the records or only one record or insert one record in a table but i'm having this message: ERROR: Unable to identify an operator '=' for types 'int4' and 'text' You will have to retype this query using an explicit cast What's this means ??? Thanks Luis Sousa -- Luis Sousa Tecnico Superior de Informatica Gabinete de Assessoria e Planeamento Universidade do Algarve
Re: [HACKERS] Re: [SQL] PostgreSQL crashes on me :(
Date: Sun, 17 Dec 2000 22:47:55 -0500 From: Tom Lane [EMAIL PROTECTED] BUT: I think there's a race condition here, at least on systems where errno is not saved and restored around a signal handler. Consider the following scenario: Postmaster is waiting at the select() --- its normal state. Postmaster receives a SIGCHLD signal due to backend exit, so it goes off and does the reaper() thing. On return from reaper() the system arranges to return EINTR error from the select(). Before control can reach the "if (errno..." test, another SIGCHLD comes in. reaper() is invoked again and does its thing. The normal exit condition from reaper() will be errno == ECHILD, because that's what the waitpid() or wait3() call will return after all children are dealt with. If the signal-handling mechanism allows that to be returned to the mainline code, we have a failure. Can any FreeBSD hackers comment on the plausibility of this theory? I'm not a FreeBSD hacker, but I do know how the BSD kernel works unless FreeBSD has changed things. The important facts are: 1) The kernel only delivers signals when a process moves from kernel mode to user mode, after a system call or an interrupt (including a timer interrupt). 2) The errno variable is set in user space after the process has returned to user mode. Therefore, the scenario you describe is possible, but only if there happens to be both a timer interrupt and a SIGCHLD signal within a couple of instructions after the select returns. (I suppose that a page fault instead of a timer interrupt could have the same effect as well, although a page fault here seems quite unlikely unless the system is extremely overloaded.) A quick-and-dirty workaround would be to save and restore errno in reaper() and the other postmaster signal handlers. It might be a better idea in the long run to avoid doing system calls in the signal handlers --- but that would take a more substantial rewrite. Ideally, signal handlers should not make system calls. However, if this is impossible, then signal handlers must save and restore errno. Ian
Re: [HACKERS] Re: [SQL] PostgreSQL crashes on me :(
Ian Lance Taylor [EMAIL PROTECTED] writes: Therefore, the scenario you describe is possible, but only if there happens to be both a timer interrupt and a SIGCHLD signal within a couple of instructions after the select returns. Right. A small failure window would explain the infrequency of the symptom. Ideally, signal handlers should not make system calls. However, if this is impossible, then signal handlers must save and restore errno. I have just committed fixes to do that in current sources (7.1 branch). I doubt we're going to make a 7.0.4 release, but if we do then this ought to be back-patched. regards, tom lane
Re: [HACKERS] Re: [SQL] PostgreSQL crashes on me :(
Date: Mon, 18 Dec 2000 10:41:47 -0500 From: Tom Lane [EMAIL PROTECTED] [EMAIL PROTECTED] (Nathan Myers) writes: Sounds like a TODO list item: eliminate syscalls from signal handlers. After looking at it some more, I think that's a lot easier said than done. We could try writing the postmaster's SIGCHLD routine in the same style currently used for SIGHUP --- ie, signal handler just sets a flag that's examined by the main loop in ServerLoop --- but I don't see any way to guarantee timely response if we do that. If the SIGCHLD happens just before we reach the select() then the select() will block, and we won't respond to the dying child until the next connection request arrives or some other signal happens. That's OK under normal scenarios, but highly not OK when a backend has crashed. Any thoughts on a cleaner solution? One way to avoid this race condition is to set a timeout on the select. What is the maximum acceptable time for a timely response? Would executing a few instructions at that time interval impose an unacceptable system load? (Naturally no timeout should be used if there are no child processes.) Ian
Re: [DOCS] Re: [HACKERS] 7.1 features list
Peter Bierman [EMAIL PROTECTED] writes: Just now I went to http://www.postgresql.org/mhonarc/pgsql-hackers/ typed 'foo' in the search field, and I get a dialog a few seconds later: "The attempt to load:"Accessing URL: http://www.postgresql.org/mhonarc/pgsql-hackers/search.mpl?stuff" (runs offscreen). Odd, the same experiment seems to work fine for me. Maybe a browser dependency? I'm using Netscape 4.75 on HPUX ... Maybe it's some javascript I don't see any javascript on the loaded page. regards, tom lane
Re: [HACKERS] Re: [SQL] PostgreSQL crashes on me :(
Ian Lance Taylor [EMAIL PROTECTED] writes: Any thoughts on a cleaner solution? One way to avoid this race condition is to set a timeout on the select. What is the maximum acceptable time for a timely response? I thought about that, but it doesn't seem like a cleaner solution. Basically you'd have to figure a tradeoff between wasted cycles in the postmaster and time delay to respond to a crashed backend. And there's no good tradeoff there. If you have a backend crash, you want to shut down the other backends ASAP, before they have a chance to propagate any shared-memory corruption that the failed backend might've created. The entire exercise is probably pointless if the postmaster twiddles its thumbs for awhile before killing the other backends. regards, tom lane
Re: [HACKERS] Re: [SQL] PostgreSQL crashes on me :(
Date: Mon, 18 Dec 2000 13:18:26 -0500 From: Tom Lane [EMAIL PROTECTED] Ian Lance Taylor [EMAIL PROTECTED] writes: Any thoughts on a cleaner solution? One way to avoid this race condition is to set a timeout on the select. What is the maximum acceptable time for a timely response? I thought about that, but it doesn't seem like a cleaner solution. Basically you'd have to figure a tradeoff between wasted cycles in the postmaster and time delay to respond to a crashed backend. And there's no good tradeoff there. If you have a backend crash, you want to shut down the other backends ASAP, before they have a chance to propagate any shared-memory corruption that the failed backend might've created. The entire exercise is probably pointless if the postmaster twiddles its thumbs for awhile before killing the other backends. The timeout is only to catch the case where the child dies between checking the flag and calling select. What you really want, of course, is a version of select which lets you atomically control the signal blocking mask. That function is actually specified by the most recent version of POSIX; it's called pselect; it's just like select, but the last argument is a const sigset_t * which is atomically passed to sigprocmask(SIG_SETMASK) around the select (the original signal mask is restored before pselect returns). But I don't know which kernels implement it. There is an implementation on GNU/Linux, but if the kernel doesn't support it then it is not truly atomic. Ian
[HACKERS] Re: Beos update
Thanks. Applied. Ok, sorry but now I use cvs diff instead of difforig (which use -c by default). Here the same with -c. thanks cyril Sorry, I need a context diff, diff -c. It makes sure that the lines are added to the proper places. Thanks. [ Charset ISO-8859-1 unsupported, converting... ] Hi, Here is a patch for the beos port (All regression tests are OK). Updated files are : xlog.c : special case for beos to avoid 'link' which does not work yet beos/sem.c : implementation of new sem_ctl call (GETPID) and a new sem_op flag (IPCNOWAIT) dynloader/beos.c : add a verification of symbol validity (seem that the loader sometime return OK with an invalid symbol) postmaster.c : add beos forking support for the new checkpoint process postgres.c : remove beos special case for getrusage beos.h : Correction of a bas definition of AF_UNIX, misc defnitions thanks cyril [ Attachment, skipping... ] -- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 853-3000 + If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup.| Drexel Hill, Pennsylvania 19026 [ Attachment, skipping... ] -- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 853-3000 + If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup.| Drexel Hill, Pennsylvania 19026
Re: [DOCS] Re: [HACKERS] 7.1 features list
On Mon, 18 Dec 2000, Tom Lane wrote: Peter Bierman [EMAIL PROTECTED] writes: Just now I went to http://www.postgresql.org/mhonarc/pgsql-hackers/ typed 'foo' in the search field, and I get a dialog a few seconds later: "The attempt to load:"Accessing URL: http://www.postgresql.org/mhonarc/pgsql-hackers/search.mpl?stuff" (runs offscreen). Odd, the same experiment seems to work fine for me. Maybe a browser dependency? I'm using Netscape 4.75 on HPUX ... Just went to the above URL using IE 5.5, types in 'foo' and it came back with 909 matches found ... Maybe it's some javascript I don't see any javascript on the loaded page. none used that I'm aware of either ... regards, tom lane Marc G. Fournier ICQ#7615664 IRC Nick: Scrappy Systems Administrator @ hub.org primary: [EMAIL PROTECTED] secondary: scrappy@{freebsd|postgresql}.org
Re: [HACKERS] Re: [SQL] PostgreSQL crashes on me :(
Mathijs Brands [EMAIL PROTECTED] writes: I've worked around the situation by running a small script that continually monitors postgres and takes appropriate action if postgres shuts down. I'm assuming this problem won't lead to any data corruption. Hm. The problem here is that when the postmaster crashes, it probably doesn't take down the old backends with it. So if you auto-restart the postmaster, you may have two sets of backends that don't know about each other. That would be A Bad Thing. If your script kills off all the backends it can find before starting the new postmaster, that should work OK. A better solution would be to back-patch my errno fix into 7.0.3. You can see the diff as it stands for current sources at http://www.postgresql.org/cgi/cvsweb.cgi/pgsql/src/backend/postmaster/postmaster.c.diff?r1=1.198r2=1.199f=c The additions to reaper() are probably the only part you really need. regards, tom lane
RE: [HACKERS] heap page corruption not easy
The point is, that the heap page is only modified in places that were previously empty (except header). All previous row data stays exactly in the same place. Thus if a page is only partly written (any order of page segments) only a new row is affected. Exception: PageRepairFragmentation() and PageIndexTupleDelete() are called during vacuum - they change layout of tuples. But those rows will be fixed during redo anyway. We can't count on this for non-atomic 8K page writes: each page keeps LSN (log sequence number - offset of end of log record for last page modification) - if page LSN = LSN of redo record then recoverer assumes that changes already applied and doesn't try to redo op. We could change this - ie force applying changes. This requires new format of log records (we couldn't use PageAddItem in redo anymore): - for heap we would set pd_lower, pd_upper and line pointer (LP) and copy tuple data from record into page space; - for indices: set pd_lower, pd_upper, copy LPs from newly inserted index tuple LP till last one and copy tuple data from record into page space (in split case it seems better to log contents of both left and right siblings). We would also have to log entire page for two ops above (which change page layout) if op occures first time after checkpoint or insert/update/delete ops (because of redo for insert/update/delete may be forced for improper page layout). Well, this probably will decrease required full page logging but I would think more about this way. For example, I didn't consider upcoming undo op... The only source of serious problems is thus a bogus write of a page segment (100 bytes ok 412 bytes chunk actually written to disk), but this case is imho sufficiently guarded or at least detected by disk hardware. With full page logging after checkpoint we would be safe from this case... (I assume that the page header fits into one atomic block and has no problem with beeing one step behind or ahead of redo). I thus doubt that we really need "physical log" for heap pages in PostgreSQL with the current non-overwrite smgr. As you see we still need in full page backup when we want to reuse space and change page layout for this, so it's mostly issue not of smgr type. I don't know about Informix page design, but overwriting smgr itself doesn't require physical removing tuple from page (ie changing page layout) - something like turning LP_USED off would be enough and page layout could be changed on first insertion of new tuple. Nevertheless they do full page backup. If we could detect corruption in index pages we would not need physical log at all, since an index can always be recreated. I don't like to follow this way on long term - reindex is not option for 24x7x365 usage when index creation takes several minutes. Comments? - full page backup on first after checkpoint modification or - forcing redo and full page backup when changing page layout first time after checkpoint/insert/update/delete Vadim
Re: [HACKERS] Who is a maintainer of GiST code ?
Hannu Krosing [EMAIL PROTECTED] writes: and I can't see why btree stores them (as it seems to do judging by the index file size) - at least it does not use it for searching for "IS NULL" That's another thing that needs improvement ;-). Seems to me it should be able to do that. The reason why btree *has* to be able to deal with null entries is to cope with multi-column indexes; you don't want it refusing to index a row at all just because some of the columns are null. The others don't currently handle multi-column indexes, so they're not really forced to deal with that issue. From a purely semantic point of view I'm not sure why Oleg is worried about being able to store nulls in a GiST index ... seems like leaving them out is OK, modulo the occasional complaint from VACUUM's insufficiently intelligent tuple-count comparison ... regards, tom lane
[HACKERS] Re: [DOCS] 7.1 features list
On Sat, 16 Dec 2000, Bruce Momjian wrote: Here is the list of features in 7.1. One thing that I think ought to be added is that with 7.1, PostgreSQL will compile out of the box (i.e. without any extra patches) for Linux/Alpha. This might not be a big deal for most people, but for those of who run pgsql on Linux/Alpha, it is, and I feel it at least deserves a mention in the 7.1 feature list. I looked for it (i.e. grep -i alpha) in the list, but did not see it. Your choice which heading it goes under. Also, I have not tested any recent snapshots or betas on Linux/Alpha lately, but I plan to shortly and will let the hackers list know of any problems. I have every intention of making sure the 7.1 release does indeed work out of box on Linux/Alpha. Thanks, TTYL. --- | "For to me to live is Christ, and to die is gain."| |--- Philippians 1:21 (KJV) | --- | Ryan Kirkpatrick | Boulder, Colorado | http://www.rkirkpat.net/ | ---
Re: [HACKERS] Tuple data
Hiroshi Inoue wrote : [ ... ] Column order isn't essential in rdbms. Nitpicking A relation (a table) is a subset of the Cartesain cross-product of the definition domains of the attributes (columns). Cartesian product being a commutative operation, "order of columns" does not really exists. Period. If you impose an order relationship, you *add* inforation to the structure. That may be OK, but you can't rely on relational algebra to guarantee your results. You'll have to manage it yourself. (And, yes, there is relevant algebra for this, too ...). /Nitpicking Isn't it well known that it's not preferable to use 'select *','insert' without column list etc.. in production applications ? 100% agreed. Such a notation is an abbreviation. Handy, but dangerous. IMHO, such checking can (should ?) be done by an algorithm checking for column *names* before sending the "insert" command. A partial workaround : inserting in a view containing only the relevant columns, in a suitable (and known) order. [ Back to lurking ... ]
[HACKERS] LM devel GLIB2_2
Hi On LM 7.2-devel pgsql7.03 needs libc.so.6 GLIB2_2. Would anybody reply what packages do I need to download with glibc-2.2.21.rpm ? Thanks Lucian
Re: [HACKERS] Re: Beos update
Thanks. Applied. Here is a patch for the beos port (All regression tests are OK). Cyril, what version(s) of BeOS should be listed for our "ports list"? - Thomas
[HACKERS] OID Implicit limit
Hi folks, Reading the documentation, I see that OIDs are unique through the whole database. But since OIDs are int4, does that limit the number of rows I can have in a database to 2^32 = 4 billion ? Best Regards, Howe