Re: [HACKERS] Recovery from multi trouble
On Mon, 2006-03-27 at 12:14 +0900, OKADA Satoshi wrote:
> Our aim is to give a database administrator a chance to recover the database at PostgreSQL startup time when there is a possibility of data loss due to lost log files.

There is no possibility of data loss because of the loss of a single log file, if you have your hardware configured correctly.

Best Regards,
Simon Riggs
Re: [HACKERS] Nightly builds
> > I for one am still struggling with Kerberos on Windows, and it would be a great help for me to know where to get libraries and headers from.
>
> The pginstaller README might help you - it contains the steps used to build it for the installer...

One of my problems is that it needs VC++ to build, and I don't have that. I tried the precompiled binaries, but the headers declare all functions as __stdcall, which breaks my MinGW build. If I change the headers to omit that __stdcall, libpq.dll builds but crashes.

Any help would be welcome.

Yours,
Laurenz Albe
Re: [HACKERS] Nightly builds
> > > I for one am still struggling with Kerberos on Windows, and it would be a great help for me to know where to get libraries and headers from.
> >
> > The pginstaller README might help you - it contains the steps used to build it for the installer...
>
> One of my problems is that it needs VC++ to build, and I don't have that.

I would expect it to build fine with the free download they have these days, but I haven't tried it.

//Magnus
Re: [HACKERS] Shared memory
Martijn,
I tried a Socket approach. Using the new IO stuff that arrived with Java 1.4 (SocketChannel etc.), the performance is really good. Especially on Linux, where an SMP machine shows a 1 to 1.5 ratio between one process doing ping-pong between two threads and two processes doing ping-pong using a socket. That's acceptable overhead indeed, and I don't think I'll be able to trim it much using a shared memory approach (the thread scenario uses Java monitor locks, which is the most efficient lightweight locking implementation I've come across).

One downside is that on a Windows box, the ratio between the threads and the processes scenario seems to be 1 to 5, which is a bit worse. I've heard that Solaris too is less efficient than Linux in this respect.

The real downside is that a call from SQL to PL/Java using the current in-process approach is really fast. It takes about 5 microseconds on my 2.8GHz i386 box. The overhead of an IPC call on that box is about 18 microseconds on Linux and 64 microseconds on Windows. That's an overhead of between 440% and 1300% due to context switching alone. Yet, for some applications, perhaps that overhead is acceptable? It should be weighed against the high memory consumption that the in-process approach undoubtedly results in (which in turn might lead to less optimal use of CPU caches and, if memory is insufficient, more time spent swapping).

Given those numbers, it would be interesting to hear what the community as a whole thinks about this.

Kind Regards,
Thomas Hallgren

Martijn van Oosterhout wrote:
> On Fri, Mar 24, 2006 at 11:51:30AM +0100, Thomas Hallgren wrote:
> > Hi, I'm currently investigating the feasibility of an alternative PL/Java implementation that would use shared memory to communicate between a JVM and the backend processes. I would very much like to make use of the routines provided in shmem.c, but I'm a bit uncertain how to add a segment for my own use.
>
> I'm wondering if a better way to do it would be similar to the way X does it. The client connects to the X server via a pipe (tcp/ip or unix domain). This is handy because you can block on a pipe. The client then allocates a shared memory segment and sends a message to the server, who can then also connect to it. The neat thing about this is that the client can put data in the shared memory segment, send one byte through the pipe, and then block on a read. The JVM, which has a thread waiting on the other end, wakes up, processes the data, puts the result back, and writes a byte to the pipe and waits. This wakes up the client, who can then read the result. No locking, no semaphores; the standard UNIX semantics on pipes and sockets make sure everything works.
>
> In practice you'd probably end up sending small responses exclusively via the pipe and only use the shared memory for larger blocks of data, but that's your choice. In X this is mostly used for image data and such.
>
> > My questions are:
> > 1. Do you see something right away that invalidates this approach?
>
> Nothing direct, though a single segment just for finding the JVM seems a lot. A socket approach would work better, I think.
>
> > 2. Is using the shared memory functionality that the backend provides a good idea (I'm thinking shmem functions, critical sections, semaphores, etc.)? I'd rather depend on them than have conditional code for different operating systems.
>
> That I don't know. However, ISTM a lock-free approach is better wherever possible. If you can avoid the semaphores altogether...
>
> > 3. Would it be better if the Postmaster allocated the global segment and started the JVM (based on some config parameter)?
>
> I don't know about the segment, but the postmaster should start it. I thought the tsearch guys had an approach using a co-process. I don't know how they start it up, but they connected via pipes.
>
> Hope this helps,
Re: [HACKERS] 8.2 planning features
> - Postgres intrinsic log-shipping replication (we have one to contribute)

Are you saying you have a working, portable (meaning it works well on all platforms) WAL-shipping-based replication already done? Because I was looking into implementing just this one :-) Do you have some details on how it works?

Cheers,
Csaba.
Re: [HACKERS] Shared memory
On Mon, Mar 27, 2006 at 10:57:21AM +0200, Thomas Hallgren wrote:
> Martijn,
> I tried a Socket approach. Using the new IO stuff that arrived with Java 1.4 (SocketChannel etc.), the performance is really good. Especially on Linux, where an SMP machine shows a 1 to 1.5 ratio between one process doing ping-pong between two threads and two processes doing ping-pong using a socket. That's acceptable overhead indeed, and I don't think I'll be able to trim it much using a shared memory approach (the thread scenario uses Java monitor locks, which is the most efficient lightweight locking implementation I've come across).

Yeah, it's fairly well known that the distinction between processes and threads on Linux is much smaller than on other OSes. Windows is pretty bad, which is why threading is much more popular there.

> The real downside is that a call from SQL to PL/Java using the current in-process approach is really fast. It takes about 5 microseconds on my 2.8GHz i386 box. The overhead of an IPC call on that box is about 18 microseconds on Linux and 64 microseconds on Windows. That's an overhead of between 440% and 1300% due to context switching alone. Yet, for some applications,

<snip>

This might take some more measurements, but AIUI the main difference between in-process and inter-process is that one has a JVM per connection, the other one JVM shared. In that case my thoughts are as follows:

- Overhead of starting the JVM. If you can start the JVM in the postmaster, you might be able to avoid this. However, if you have to restart the JVM for each process, that's a cost.

- JIT overhead. For often-used classes, JIT compiling can help a lot with speed. But if every class needs to be reinterpreted each time, maybe that costs more than your IPC.

- Memory overhead. You mentioned this already.

- Are you optimising for many short-lived connections or a few long-lived connections?

My gut feeling is that if someone creates a huge number of server-side Java functions, performance will be better with one always-running JVM with highly JIT-optimised code than with each JVM doing it from scratch. But this will obviously need to be tested.

One other thing is that separate processes give you the ability to parallelize. For example, if a Java function does an SPI query, it can receive and process results in parallel with the backend generating them. This may not be easy to achieve with an in-process JVM.

Incidentally, there are compilers these days that can compile Java to native. Is this Java stuff set up in such a way that you can compile your classes to native and load them directly, for the real speed-freaks? In that case, maybe you should concentrate on reliability and flexibility and still have a way out for functions that *must* be high-performance.

Hope this helps,
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a tool for doing 5% of the work and then sitting around waiting for someone else to do the other 95% so you can sue them.
Re: [HACKERS] Shared memory
Martijn van Oosterhout wrote:
> <snip>
> My gut feeling is that if someone creates a huge number of server-side Java functions, performance will be better with one always-running JVM with highly JIT-optimised code than with each JVM doing it from scratch. But this will obviously need to be tested.

The use case with a huge number of short-lived connections is not feasible at all with PL/Java as it stands today. This is partly the reason for my current research. Another reason is that it's sometimes desirable to share resources between your connections. Dangerous perhaps, but an API that encourages separation and allows sharing in a controlled way might prove very beneficial.

The ideal use-case for PL/Java is a client that utilizes a connection pool, and most servlet containers and EJB servers do. Scenarios where you have just a few and fairly long-lived clients are OK too.

> One other thing is that separate processes give you the ability to parallelize. For example, if a Java function does an SPI query, it can receive and process results in parallel with the backend generating them. This may not be easy to achieve with an in-process JVM.

It is fairly easy to achieve using threads. Only one thread at a time may of course execute an SPI query, but that's true when multiple processes are in place too, since the backend is single-threaded, and since the logical thread in PL/Java must utilize the same backend as where the call originated (to maintain the transaction boundaries). Any result must also, sooner or later, be delivered using that same backend, which further limits the ability to parallelize.

> Incidentally, there are compilers these days that can compile Java to native. Is this Java stuff set up in such a way that you can compile your classes to native and load them directly, for the real speed-freaks?

PL/Java can be used with GCJ, although I don't think the GCJ compiler outranks the JIT compiler in a modern JVM. It can only do static optimizations, whereas the JIT has runtime heuristics to base its optimizations on. In the test results I've seen so far, the GCJ compiler only gets the upper hand in very simple tests. The JIT-generated code is faster when things are more complicated. GCJ is great if you're using short-lived connections (less startup time, and everything is optimized from the very start), but the native code that it produces still needs a JVM of some sort. No interpreter of course, but classes must be initialized, a garbage collector must be running, etc. The shared native code results in some gain in memory consumption, but it's not as significant as one might think.

> In that case, maybe you should concentrate on reliability and flexibility and still have a way out for functions that *must* be high-performance.

Given time and enough resources, I'd like to provide the best of both worlds and give the user a choice of whether or not the JVM should be external. Ideally, this should be controlled using configuration parameters so that it's easy to test which scenario works best. It's a lot of work.
[HACKERS] Remote administration contrib module
A thread on -general (http://archives.postgresql.org/pgsql-general/2006-03/msg01023.php) recently ended in the idea that the pgAdmin 'adminpack' be included as an 'official' PostgreSQL contrib module. Currently the code is an add-on available from the pgAdmin website, unless you run the Windows pgInstaller distribution of PostgreSQL, in which case it is already included.

For those that don't know, the adminpack is a set of additional functions designed to allow pgAdmin (and potentially any other client application) to rewrite server configuration files, browse and read logfiles, etc. The full readme can be found at http://svn.pgadmin.org/cgi-bin/viewcvs.cgi/trunk/pgadmin3/xtra/admin81/README.admin81?rev=5024&view=markup, and the code can be seen at http://svn.pgadmin.org/cgi-bin/viewcvs.cgi/trunk/pgadmin3/xtra/admin81/.

Tom's concern in the -general thread was that the pgAdmin licence is the Artistic License; however, we are now in a position to offer the code under the BSD licence.

So, how would people feel about including this as a contrib module in the future, until(/if) equivalent functionality becomes available in -core in some form?

Regards, Dave.
Re: [HACKERS] Remote administration contrib module
On Monday, 27 March 2006 15:12, Dave Page wrote:
> So, how would people feel about including this as a contrib module in the future, until(/if) equivalent functionality becomes available in -core in some form?

Right now you have got plenty of time to get it in shape for inclusion in core, so we might not even have to take the detour through a contrib module.

-- 
Peter Eisentraut http://developer.postgresql.org/~petere/
Re: [HACKERS] Remote administration contrib module
-----Original Message-----
From: Peter Eisentraut [mailto:[EMAIL PROTECTED]]
Sent: Mon 3/27/2006 3:00 PM
To: pgsql-hackers@postgresql.org
Cc: Dave Page
Subject: Re: [HACKERS] Remote administration contrib module

> On Monday, 27 March 2006 15:12, Dave Page wrote:
> > So, how would people feel about including this as a contrib module in the future, until(/if) equivalent functionality becomes available in -core in some form?
>
> Right now you have got plenty of time to get it in shape for inclusion in core, so we might not even have to take the detour through a contrib module.

As it stands, it was previously rejected for inclusion in -core in its current form. The main objector was Tom, but as it was him who encouraged me to change the licencing for inclusion as a contrib module, I assume he doesn't object to that.

I have submitted a conference discussion proposal to talk about ways of implementing the remote configuration that we want, which was the main thing I think Tom objected to originally, but that's some time away and would likely result in a significantly more complex solution that may or may not get done for 8.2. In addition, there are one or two helper functions in there that were previously rejected more on a 'no use to the end user' basis, iirc (see pg_logdir_ls), which may never end up in -core without reconsideration of the need.

It would be good (for us at least) to get it in as a contrib module, even if we do manage to replace it, or some of it, by release.

Regards, Dave
Re: [HACKERS] [GENERAL] PANIC: heap_update_redo: no block
[ redirecting to a more appropriate mailing list ]

Alex bahdushka [EMAIL PROTECTED] writes:
> LOG: REDO @ D/19176610; LSN D/19176644: prev D/191765E8; xid 81148979: Heap - clean: rel 1663/16386/16559898; blk 0
> LOG: REDO @ D/19176644; LSN D/191766A4: prev D/19176610; xid 81148979: Heap - move: rel 1663/16386/16559898; tid 1/1; new 0/10
> PANIC: heap_update_redo: no block: target blcknum: 1, relation(1663/16386/16559898) length: 1

I think what's happened here is that VACUUM FULL moved the only tuple off page 1 of the relation, then truncated off page 1, and now heap_update_redo is panicking because it can't find page 1 to replay the move. Curious that we've not seen a case like this before, because it seems like a generic hazard for WAL replay.

The simplest fix would be to treat WAL records as no-ops if they refer to nonexistent pages, but that seems much too prone to hide real failure conditions. Another thought is to remember that we ignored this record, and then complain if we don't see a TRUNCATE that would've removed the page. That would be pretty complicated but not impossible. Anyone have a better idea?

regards, tom lane
Re: [HACKERS] Recovery from multi trouble
Simon Riggs [EMAIL PROTECTED] writes:
> On Mon, 2006-03-27 at 12:14 +0900, OKADA Satoshi wrote:
> > Our aim is to give a database administrator a chance to recover the database at PostgreSQL startup time when there is a possibility of data loss due to lost log files.
>
> There is no possibility of data loss because of the loss of a single log file, if you have your hardware configured correctly.

I'm fairly concerned about whether this isn't just replacing one failure mode with another one. See the nearby discussion with Alex Bahdushka for a graphic reminder of why PANICing on any little thing during WAL replay is not necessarily such a great idea.

regards, tom lane
Re: [HACKERS] Shared memory
Thomas Hallgren [EMAIL PROTECTED] writes:
> The real downside is that a call from SQL to PL/Java using the current in-process approach is really fast. It takes about 5 microseconds on my 2.8GHz i386 box. The overhead of an IPC call on that box is about 18 microseconds on Linux and 64 microseconds on Windows. That's an overhead of between 440% and 1300% due to context switching alone. Yet, for some applications, perhaps that overhead is acceptable?

It's only that much difference? Given all the other advantages of separating the JVM from the backends, I'd say you should gladly pay that price.

regards, tom lane
Re: [HACKERS] Shared memory
Tom Lane wrote:
> It's only that much difference? Given all the other advantages of separating the JVM from the backends, I'd say you should gladly pay that price.

If I'm right, and the most common scenario is clients using connection pools, then it's very likely that you don't get any advantages at all. Paying for nothing with a 440% increase in calling time (at best) seems expensive :-)

Regards,
Thomas Hallgren
Re: [HACKERS] Shared memory
Thomas Hallgren [EMAIL PROTECTED] writes:
> Tom Lane wrote:
> > It's only that much difference? Given all the other advantages of separating the JVM from the backends, I'd say you should gladly pay that price.
>
> If I'm right, and the most common scenario is clients using connection pools, then it's very likely that you don't get any advantages at all. Paying for nothing with a 440% increase in calling time (at best) seems expensive :-)

You are focused too narrowly on a few performance numbers. In my mind the primary advantage is that it will *work*. I do not actually believe that you'll ever get the embedded-JVM approach to production-grade reliability, because of the fundamental problems with threading, error processing, etc.

regards, tom lane
Re: [HACKERS] Domains as Subtypes
elein [EMAIL PROTECTED] writes:
> But I like the idea of centralizing the check in the input/output functions. It seems clearer and cleaner.

I remembered the problem with doing it that way: an input function can't enforce a domain NOTNULL constraint, because it won't even get invoked for a null input value. So there seems no way around having a special case for domains in all places where I/O conversion is done.

regards, tom lane
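To make that concrete, a minimal SQL sketch (the domain name "dnn" is made up for illustration): the NOT NULL constraint has to reject a null, yet no input function is ever invoked for a NULL value, so the check cannot live in the input function alone.

CREATE DOMAIN dnn AS integer NOT NULL;
-- should fail with a constraint violation, but no input function
-- runs for a NULL, so the check must happen somewhere else
SELECT NULL::dnn;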
[HACKERS] Why are default encoding conversions namespace-specific?
See $SUBJECT. It seems to me this is a bad idea for much the same reasons that we recently decided default index operator classes should not be namespace-specific: http://archives.postgresql.org/pgsql-hackers/2006-02/msg00284.php

I don't mind having encoding conversions be named within schemas, but I propose that any given encoding pair be allowed to have only one default conversion, period, and that when we are looking for a default conversion we find it by a non-namespace-aware search.

With the existing definition, any change in search_path could theoretically cause a change in client-to-server encoding conversion behavior, and this just seems like a really bad idea. (It's only theoretical because we don't actually redo the conversion function search on a search_path change ... but if you think the existing definition is good then that's a bug.)

Comments?

regards, tom lane
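For illustration, the default conversions and the schemas they live in can be inspected directly in the pg_conversion catalog; nothing there currently prevents two schemas from each holding a "default" conversion for the same encoding pair:

SELECT n.nspname, c.conname,
       pg_encoding_to_char(c.conforencoding) AS source_encoding,
       pg_encoding_to_char(c.contoencoding)  AS target_encoding
FROM pg_conversion c
     JOIN pg_namespace n ON n.oid = c.connamespace
WHERE c.condefault;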
Re: [HACKERS] Shared memory
Tom Lane wrote:
> You are focused too narrowly on a few performance numbers. In my mind the primary advantage is that it will *work*. I do not actually believe that you'll ever get the embedded-JVM approach to production-grade reliability, because of the fundamental problems with threading, error processing, etc.

My focus with PL/Java over the last year has been to make it a production-grade product, and I think I've succeeded pretty well. The current list of open bugs is second to none. What fundamental problems are you thinking of that haven't been solved already?

Regards,
Thomas Hallgren
[HACKERS] proposal - plpgsql: execute using into
Hello,

The current EXECUTE statement doesn't support any other way of parametrisation than concatenating strings. It works well, but it's a little bit unreadable. Oracle's EXECUTE statement has a positional replacement feature. It works similarly to our RAISE statement (where the position holder is %). The EXECUTE position holder has the form :name; the name has only symbolic value and isn't used for anything.

The syntax of the enhanced statements is:

EXECUTE 'format string' USING expr_list

There are some problems with replacing string values in the SQL string. Sometimes we have to enclose the value between spaces or other symbols (apostrophes or double apostrophes), sometimes not. Possible rules:

a) if the position holder is inside a string or identifier, we don't enclose the value;
b) otherwise, numeric values are enclosed in spaces, and other (non-regclass) values in single apostrophes;
c) regclass values are enclosed in double apostrophes.

PL/pgSQL knows three dynamic statements. All will be enhanced. Some examples:

EXECUTE 'SELECT :name||:sp||:surname' USING 'Pavel',' ','Stehule';
EXECUTE e'SELECT \':name :surname' USING 'Pavel','Stehule';
EXECUTE 'SELECT * FROM :tabname' USING 'xb'::regclass;
EXECUTE 'SELECT * FROM :base:num' USING 'mytab',1;

You can test it; I sent a patch to pgsql-patches. I invite any comments.

Pavel Stehule
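As a before/after illustration of the readability argument (the table and variable names here are made up; the second form simply follows the syntax proposed above):

-- today: parameters have to be concatenated into the query string
EXECUTE 'SELECT * FROM ' || quote_ident(tabname)
     || ' WHERE name = ' || quote_literal(username);

-- proposed: positional replacement via USING
EXECUTE 'SELECT * FROM :tab WHERE name = :uname' USING tabname::regclass, username;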
Re: [HACKERS] proposal - plpgsql: execute using into
Pavel Stehule [EMAIL PROTECTED] writes:
> The current EXECUTE statement doesn't support any other way of parametrisation than concatenating strings. It works well, but it's a little bit unreadable. Oracle's EXECUTE statement has a positional replacement feature. ... There are some problems with replacing string values in the SQL string.

Doesn't the Oracle implementation already imply a solution to that? The examples you give look to me like they are escaping problems waiting to happen, especially in view of the upcoming change in default backslash behavior, so this whole thing makes me feel pretty nervous. I think we'd be best off to leave EXECUTE alone, at least until we've converged to the point where almost nobody is using non-standard-compliant strings.

regards, tom lane
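A hypothetical example of the kind of escaping hazard being worried about here: with purely textual placeholders, a colon inside an ordinary string literal in the format string becomes ambiguous.

-- is :30 part of the time literal, or a placeholder to be substituted?
EXECUTE 'SELECT * FROM log WHERE stamp > ''10:30'' AND usename = :u' USING usr;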
Re: [HACKERS] [GENERAL] PANIC: heap_update_redo: no block
Tom Lane [EMAIL PROTECTED] writes:
> I think what's happened here is that VACUUM FULL moved the only tuple off page 1 of the relation, then truncated off page 1, and now heap_update_redo is panicking because it can't find page 1 to replay the move. Curious that we've not seen a case like this before, because it seems like a generic hazard for WAL replay.

This sounds familiar: http://archives.postgresql.org/pgsql-hackers/2005-05/msg01369.php

-- 
greg
Re: [HACKERS] [GENERAL] PANIC: heap_update_redo: no block
Greg Stark [EMAIL PROTECTED] writes:
> This sounds familiar: http://archives.postgresql.org/pgsql-hackers/2005-05/msg01369.php

Hm, I had totally forgotten about that todo item :-(. Time to push it back up the priority list.

regards, tom lane
Re: [HACKERS] Domains as Subtypes
On Mon, Mar 27, 2006 at 11:41:30AM -0500, Tom Lane wrote:
> elein [EMAIL PROTECTED] writes:
> > But I like the idea of centralizing the check in the input/output functions. It seems clearer and cleaner.
>
> I remembered the problem with doing it that way: an input function can't enforce a domain NOTNULL constraint, because it won't even get invoked for a null input value. So there seems no way around having a special case for domains in all places where I/O conversion is done.

The notnull attribute of the pg_type table should be set to not null in the case of a not null constraint on a domain (type). You should not have to invoke the input function to check for that. Or perhaps I'm missing the details.

elein [EMAIL PROTECTED]
Re: [HACKERS] Domains as Subtypes
elein [EMAIL PROTECTED] writes:
> On Mon, Mar 27, 2006 at 11:41:30AM -0500, Tom Lane wrote:
> > I remembered the problem with doing it that way: an input function can't enforce a domain NOTNULL constraint, because it won't even get invoked for a null input value. So there seems no way around having a special case for domains in all places where I/O conversion is done.
>
> The notnull attribute of the pg_type table should be set to not null in the case of a not null constraint on a domain (type). You should not have to invoke the input function to check for that. Or perhaps I'm missing the details.

Well, I can see two problems:

1. If we have to add code to everyplace that calls an input function to do that, then we've failed to achieve the hoped-for goal of solving the problem in just one place.

2. NOTNULL is just the most obvious form of the problem. There could be domain CHECK constraints that fail on null input --- CHECK(VALUE IS NOT NULL) for example, or something more subtle. If we don't run the input function then this means the CHECK constraints also have to be done out-of-band, and then we've lost any leverage whatsoever.

We could push the problem into a domain input function if we abandoned the current rule that input functions are never invoked for nulls (we could check their strictness flag to decide whether to do it). This sort of change seems distinctly cleaner than pushing explicit knowledge about domains into all the places that use input functions, but it's still pretty ugly:

A. We still have to touch everyplace that uses an input function; any code not changed will simply do the Wrong Thing for nulls, which is not a very friendly failure mode. (And we know there are places outside the core that use this stuff, for instance non-core PLs.)

B. C-language input functions for most datatypes will need to be declared strict, else they'll crash on null input, which is an even less friendly behavior. Again, we can't be sure that non-core datatypes get this right at present.

regards, tom lane
Re: [HACKERS] Why are default encoding conversions
> See $SUBJECT. It seems to me this is a bad idea for much the same reasons that we recently decided default index operator classes should not be namespace-specific: http://archives.postgresql.org/pgsql-hackers/2006-02/msg00284.php
>
> I don't mind having encoding conversions be named within schemas, but I propose that any given encoding pair be allowed to have only one default conversion, period, and that when we are looking for a default conversion we find it by a non-namespace-aware search.

That doesn't sound like a good idea to me.

> With the existing definition, any change in search_path could theoretically cause a change in client-to-server encoding conversion behavior, and this just seems like a really bad idea. (It's only theoretical because we don't actually redo the conversion function search on a search_path change ... but if you think the existing definition is good then that's a bug.)

Then why do we have the CREATE DEFAULT CONVERSION command at all?

-- 
Tatsuo Ishii
SRA OSS, Inc. Japan
Re: [HACKERS] Why are default encoding conversions namespace-specific?
Tatsuo Ishii [EMAIL PROTECTED] writes:
> > I don't mind having encoding conversions be named within schemas, but I propose that any given encoding pair be allowed to have only one default conversion, period, and that when we are looking for a default conversion we find it by a non-namespace-aware search.
>
> That doesn't sound like a good idea to me.

What does it mean to have different default encoding conversions in different schemas? Even if this had a sensible interpretation, I don't think the existing code implements it properly.

> Then why do we have the CREATE DEFAULT CONVERSION command at all?

So you can create the one you're allowed to have, of course ...

regards, tom lane
Re: [HACKERS] Why are default encoding conversions namespace-specific?
Tom Lane said:
> Tatsuo Ishii [EMAIL PROTECTED] writes:
> > > I don't mind having encoding conversions be named within schemas, but I propose that any given encoding pair be allowed to have only one default conversion, period, and that when we are looking for a default conversion we find it by a non-namespace-aware search.
> >
> > That doesn't sound like a good idea to me.
>
> What does it mean to have different default encoding conversions in different schemas? Even if this had a sensible interpretation, I don't think the existing code implements it properly.

Perhaps I'm misunderstanding, but why not just resolve the namespace at the time the default conversion is created?

cheers
andrew
Re: [HACKERS] Why are default encoding conversions namespace-specific?
Andrew Dunstan [EMAIL PROTECTED] writes:
> Tom Lane said:
> > What does it mean to have different default encoding conversions in different schemas? Even if this had a sensible interpretation, I don't think the existing code implements it properly.
>
> Perhaps I'm misunderstanding, but why not just resolve the namespace at the time the default conversion is created?

Isn't that the same thing as saying that there can only be one default conversion across all schemas? (Only one for a given source and target encoding pair, of course.) If it isn't the same, please explain more clearly.

regards, tom lane
Re: [HACKERS] Why are default encoding conversions namespace-specific?
Tom Lane said:
> Andrew Dunstan [EMAIL PROTECTED] writes:
> > Perhaps I'm misunderstanding, but why not just resolve the namespace at the time the default conversion is created?
>
> Isn't that the same thing as saying that there can only be one default conversion across all schemas? (Only one for a given source and target encoding pair, of course.) If it isn't the same, please explain more clearly.

Yeah, I guess it is. I was thinking of it more as namespace-specified than as non-namespace-aware. I guess it's a matter of perspective.

cheers
andrew
Re: [HACKERS] [GENERAL] PANIC: heap_update_redo: no block
Greg Stark [EMAIL PROTECTED] writes:
> Tom Lane [EMAIL PROTECTED] writes:
> > I think what's happened here is that VACUUM FULL moved the only tuple off page 1 of the relation, then truncated off page 1, and now heap_update_redo is panicking because it can't find page 1 to replay the move. Curious that we've not seen a case like this before, because it seems like a generic hazard for WAL replay.
>
> This sounds familiar: http://archives.postgresql.org/pgsql-hackers/2005-05/msg01369.php

After further review I've concluded that there is not a systemic bug here, but there are several nearby local bugs. The reason it's not a systemic bug is that this scenario is supposed to be handled by the same mechanism that prevents torn-page writes: the first XLOG record that touches a given page after a checkpoint is supposed to rewrite the entire page, rather than update it incrementally. Since XLOG replay always begins at a checkpoint, this means we should always be able to write a fresh copy of the page, even after relation deletion or truncation. Furthermore, during XLOG replay we are willing to create a table (or even a whole tablespace or database directory) if it's not there when touched. The subsequent replay of the deletion or truncation will get rid of any unwanted data again. Therefore, there is no systemic bug --- unless you are running with full_page_writes=off. I assert that that GUC variable is broken and must be removed.

There are, however, a bunch of local bugs, including these:

* On a symlink-less platform (ie, Windows), TablespaceCreateDbspace is #ifdef'd to be a no-op. This is wrong because it performs the essential function of re-creating a tablespace or database directory if needed during replay. AFAICS the #if can just be removed, giving the same code with or without symlinks.

* log_heap_update decides that it can set XLOG_HEAP_INIT_PAGE instead of storing the full destination page, if the destination contains only the single tuple being moved. This is fine, except it also resets the buffer indicator for the *source* page, which is wrong --- that page may still need to be re-generated from the xlog record. This is the proximate cause of the bug report that started this thread.

* btree_xlog_split passes extend=false to XLogReadBuffer for the left sibling, which is silly because it is going to rewrite that whole page from the xlog record anyway. It should pass true so that there's no complaint if the left sib page was later truncated away. This accounts for one of the bug reports mentioned in the message cited above.

* btree_xlog_delete_page passes extend=false for the target page, which is likewise silly because it's going to init the page (not that there was any useful data on it anyway). This accounts for the other bug report mentioned in the message cited above.

Clearly, we need to go through the xlog code with a fine-tooth comb and convince ourselves that all pages touched by any xlog record will be properly reconstituted if they've later been truncated off. I have not yet examined any of the code except the above.

Notice that these are each, individually, pretty low-probability scenarios, which is why we've not seen many bug reports. If we had had a systemic bug I'm sure we'd be seeing far more.

regards, tom lane
[HACKERS] Tablespaces oddity?
Hi folks, after discussing this on IRC today (thanks G_SabinoMullane!), I'm still surprised by this behaviour on 8.1.3:

pyarra=# create TABLESPACE spctables location '/mnt/pg_tables/data';
CREATE TABLESPACE
pyarra=# create table foo(id int) tablespace spctables;
CREATE TABLE
pyarra=# \d foo
     Table "public.foo"
 Column |  Type   | Modifiers
--------+---------+-----------
 id     | integer |
Tablespace: spctables

So far, so good...

pyarra=# CREATE DATABASE spctest TABLESPACE spctables;
CREATE DATABASE
pyarra=# \c spctest;
You are now connected to database "spctest".
spctest=# create table foo(id int) tablespace spctables;
CREATE TABLE
spctest=# create table bar(id int);
CREATE TABLE
spctest=# \d foo
     Table "public.foo"
 Column |  Type   | Modifiers
--------+---------+-----------
 id     | integer |

spctest=# \d bar
     Table "public.bar"
 Column |  Type   | Modifiers
--------+---------+-----------
 id     | integer |

I hoped that these last two tables would also be listed as being on spctables. I think the issue is that pg_class.reltablespace = 0 where these objects are created on the default tablespace for this database. I can find out which tablespace the objects are really on with:

select relname,
       coalesce(t.spcname,
                (select spcname from pg_tablespace
                  where oid = (select dattablespace from pg_database
                                where datname = current_database()))) as tablespace
from pg_class c left join pg_tablespace t on (t.oid = c.reltablespace);

Someone else might be able to see a better way to write this query, but I think it would be good if \d could show this information when you really want to know which tablespace an object is on. Note also that \l won't show you the tablespace for a DB, so you need to query pg_database to even know which is the default tablespace for a DB. It's not impossible, just harder than it needs to be, I reckon.

Any thoughts?

Regards, Philip.

-- 
"Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it." - Brian W. Kernighan
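On the \l point: the default tablespace of each database can at least be read straight out of the system catalogs; a minimal query along those lines:

select d.datname, t.spcname as default_tablespace
from pg_database d
     join pg_tablespace t on t.oid = d.dattablespace;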