Re: [HACKERS] Recovery from multi trouble

2006-03-27 Thread Simon Riggs
On Mon, 2006-03-27 at 12:14 +0900, OKADA Satoshi wrote:

 Our aim is to give the database administrator a chance to recover the
 database at PostgreSQL startup time when there is a possibility of data
 loss from missing log files.

There is no possibility of data loss because of loss of a single log
file, if you have your hardware configured correctly.

Best Regards, Simon Riggs



Re: [HACKERS] Nightly builds

2006-03-27 Thread Albe Laurenz
 I for one am still struggling with Kerberos on Windows, and 
 it would be a great help for me to know where to get 
 libraries and headers from.
 
 The pginstaller README might help you - it contains the steps used to
 build it for the installer...

One of my problems is that it needs VC++ to build, and I don't have
that.
I tried the precompiled binaries, but the headers declare all functions
as __stdcall which breaks my mingw build.
If I change the headers to omit that __stdcall, libpq.dll builds but
crashes.

Any help would be welcome.
Yours,
Laurenz Albe


Re: [HACKERS] Nightly builds

2006-03-27 Thread Magnus Hagander
  I for one am still struggling with Kerberos on Windows, 
 and it would 
  be a great help for me to know where to get libraries and headers 
  from.
  
  The pginstaller README might help you - it contains the 
 steps used to 
  build it for the installer...
 
 One of my problems is that it needs VC++ to build, and I 
 don't have that.

I would expect it to build fine with the free download they have these
days, but I haven't tried it.

//Magnus


Re: [HACKERS] Shared memory

2006-03-27 Thread Thomas Hallgren

Martijn,

I tried a Socket approach. Using the new IO stuff that arrived with Java 1.4 (SocketChannel
etc.), the performance is really good, especially on Linux, where an SMP machine shows a
1 to 1.5 ratio between one process doing ping-pong between two threads and two processes
doing ping-pong over a socket. That's acceptable overhead indeed, and I don't think I'll be
able to trim it much using a shared memory approach (the thread scenario uses Java monitor
locks, the most efficient lightweight locking implementation I've come across).


One downside is that on a Windows box, the ratio between the thread and the process
scenarios seems to be 1 to 5, which is a bit worse. I've heard that Solaris too is less
efficient than Linux in this respect.
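
For illustration, here is a minimal sketch of the kind of SocketChannel ping-pong
measurement described above. It is not the PL/Java test harness itself: the class name,
port selection and iteration count are made up, and it is written against a modern JDK
rather than the 1.4-era API, but the round trip it times is the same idea.

import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;

// One thread echoes a single byte; the main thread measures the
// round-trip latency over the loopback interface.
public class PingPong {
    public static void main(String[] args) throws Exception {
        ServerSocketChannel server = ServerSocketChannel.open();
        server.bind(new InetSocketAddress("127.0.0.1", 0));   // pick a free port
        int port = ((InetSocketAddress) server.getLocalAddress()).getPort();

        Thread echo = new Thread(() -> {
            try (SocketChannel peer = server.accept()) {
                ByteBuffer b = ByteBuffer.allocate(1);
                while (peer.read(b) > 0) {                     // read one byte ...
                    b.flip();
                    peer.write(b);                             // ... and echo it back
                    b.clear();
                }
            } catch (Exception e) {
                e.printStackTrace();
            }
        });
        echo.start();

        try (SocketChannel client = SocketChannel.open(
                new InetSocketAddress("127.0.0.1", port))) {
            ByteBuffer b = ByteBuffer.allocate(1);
            int rounds = 100000;
            long start = System.nanoTime();
            for (int i = 0; i < rounds; i++) {
                b.clear(); b.put((byte) 1); b.flip();
                client.write(b);
                b.clear();
                client.read(b);                                // block until the echo arrives
            }
            long ns = System.nanoTime() - start;
            System.out.printf("%.1f us per round trip%n", (ns / 1000.0) / rounds);
        }
        echo.join();
        server.close();
    }
}

Running the same exchange between two threads with a monitor lock (wait/notify) instead
of the socket gives the in-process baseline that the 1 to 1.5 ratio above refers to.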


The real downside is that a call from SQL to PL/Java using the current in-process approach
is really fast. It takes about 5 microseconds on my 2.8GHz i386 box. The overhead of an
IPC call on that box is about 18 microseconds on Linux and 64 microseconds on Windows.
That's an overhead of between 440% and 1300% due to context switching alone. Yet, for some
applications, perhaps that overhead is acceptable? It should be weighed against the high
memory consumption that the in-process approach undoubtedly results in (which in turn might
lead to less optimal use of CPU caches and, if memory is insufficient, more time spent
swapping).


Given those numbers, it would be interesting to hear what the community as a whole thinks 
about this.


Kind Regards,
Thomas Hallgren


Martijn van Oosterhout wrote:

On Fri, Mar 24, 2006 at 11:51:30AM +0100, Thomas Hallgren wrote:

Hi,
I'm currently investigating the feasibility of an alternative PL/Java 
implementation that would use shared memory to communicate between a JVM 
and the backend processes. I would very much like to make use of the 
routines provided in shmem.c but I'm a bit uncertain how to add a segment 
for my own use.


I'm wondering if a better way to do it would be similar to the way X
does it. The client connects to the X server via a pipe (tcp/ip or unix
domain). This is handy because you can block on a pipe. The client then
allocates a shared memory segment and sends a message to the server,
who can then also connect to it.

The neat thing about this is that the client can put data in the shared
memory segment and send one byte through the pipe and then block on a
read. The JVM which has a thread waiting on the other end wakes up,
processes the data, puts the result back and writes a byte to the pipe
and waits. This wakes up the client who can then read the result.

No locking, no semaphores, the standard UNIX semantics on pipes and
sockets make sure everything works.

In practice you'd probably end up sending small responses exclusively
via the pipe and only use the shared memory for larger blocks of data
but that's your choice. In X this is mostly used for image data and
such.
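
To make the doorbell pattern concrete, here is a minimal Java sketch of it: the payload
travels through a shared mapping while a single byte on the socket wakes the peer. The
class name, message framing and temp-file path are made up, a memory-mapped temp file
stands in for a real shared memory segment, and the two sides run as threads in one
process purely to keep the example self-contained; the design described above would use
two processes and a real segment.

import java.io.File;
import java.io.RandomAccessFile;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.nio.charset.StandardCharsets;

// Doorbell pattern: the payload goes through a shared mapping, one byte on
// the socket tells the peer that the payload is ready.
public class SharedMemDoorbell {

    static MappedByteBuffer map(File f) throws Exception {
        try (RandomAccessFile raf = new RandomAccessFile(f, "rw")) {
            raf.setLength(4096);
            // The mapping stays valid after the channel is closed.
            return raf.getChannel().map(FileChannel.MapMode.READ_WRITE, 0, 4096);
        }
    }

    static void put(MappedByteBuffer shm, String s) {        // length-prefixed write
        byte[] b = s.getBytes(StandardCharsets.UTF_8);
        shm.clear();
        shm.putInt(b.length).put(b);
    }

    static String get(MappedByteBuffer shm) {                 // length-prefixed read
        shm.clear();
        byte[] b = new byte[shm.getInt()];
        shm.get(b);
        return new String(b, StandardCharsets.UTF_8);
    }

    static void ring(SocketChannel ch) throws Exception {     // send the one-byte doorbell
        ch.write(ByteBuffer.wrap(new byte[]{1}));
    }

    static void await(SocketChannel ch) throws Exception {    // block until the peer rings
        ch.read(ByteBuffer.allocate(1));
    }

    public static void main(String[] args) throws Exception {
        File file = File.createTempFile("shm_demo", ".bin");  // stand-in for a shared segment
        ServerSocketChannel listener = ServerSocketChannel.open();
        listener.bind(new InetSocketAddress("127.0.0.1", 0));
        int port = ((InetSocketAddress) listener.getLocalAddress()).getPort();

        Thread worker = new Thread(() -> {                     // plays the JVM side
            try (SocketChannel peer = listener.accept()) {
                MappedByteBuffer shm = map(file);              // its own mapping of the file
                await(peer);                                   // wait for the doorbell
                System.out.println("worker got: " + get(shm));
                put(shm, "pong");                              // result back via shared memory
                ring(peer);
            } catch (Exception e) {
                e.printStackTrace();
            }
        });
        worker.start();

        try (SocketChannel backend = SocketChannel.open(       // plays the backend side
                new InetSocketAddress("127.0.0.1", port))) {
            MappedByteBuffer shm = map(file);
            put(shm, "ping");                                  // data into shared memory ...
            ring(backend);                                     // ... one byte down the pipe ...
            await(backend);                                    // ... block until the reply byte
            System.out.println("backend got: " + get(shm));
        }
        worker.join();
        listener.close();
        file.delete();
    }
}

As noted above, small replies could just as well travel entirely over the socket; the
shared mapping only pays off for larger blocks of data.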


My questions are:
1. Do you see something right away that invalidates this approach?


Nothing direct, though a single segment just for finding the JVM seems
a lot. A socket approach would work better I think.

2. Is using the shared memory functionality that the backend provides a
good idea (I'm thinking of the shmem functions, critical sections, semaphores,
etc.)? I'd rather depend on them than have conditional code for different
operating systems.


That I don't know. However, ISTM a lock-free approach is better
wherever possible. If you can avoid the semaphores altogether...

3. Would it be better if the Postmaster allocated the global segment and 
started the JVM (based on some config parameter)?


I don't know about the segment, but the postmaster should start it. I
thought the tsearch guys had an approach using a co-process. I don't
know how they start it up, but they connected via pipes.

Hope this helps,



Re: [HACKERS] 8.2 planning features

2006-03-27 Thread Csaba Nagy
 - Postgres intrinsic log-shipping replication (we have one to contribute)

Are you saying you have a working, portable (meaning it works well on all
platforms) WAL-shipping based replication already done? Because I was
looking into implementing just that :-)

Do you have some details on how it works?

Cheers,
Csaba.



Re: [HACKERS] Shared memory

2006-03-27 Thread Martijn van Oosterhout
On Mon, Mar 27, 2006 at 10:57:21AM +0200, Thomas Hallgren wrote:
 Martijn,
 
 I tried a Socket approach. Using the new IO stuff that arrived with Java 
 1.4 (SocketChannel etc.), the performance is really good. Especially on 
 Linux where an SMP machine show a 1 to 1.5 ratio between one process doing 
 ping-pong between two threads and two processes doing ping-pong using a 
 socket. That's acceptable overhead indeed and I don't think I'll be able to 
 trim it much using a shared memory approach (the thread scenario uses Java 
 monitor locks. That's the most efficient lightweight locking implementation 
 I've come across).

Yeah, it's fairly well known that the distinction between processes
and threads on Linux is much smaller than on other OSes. Windows is
pretty bad, which is why threading is much more popular there.

 The real downside is that a call from SQL to PL/Java using the current 
 in-process approach is really fast. It takes about 5 micro secs on my 
 2.8GHz i386 box. The overhead of an IPC-call on that box is about 18 micro 
 secs on Linux and 64 micro secs on Windows. That's an overhead of between 
 440% and 1300% due to context switching alone. Yet, for some applications, 

snip

This might take some more measurements, but AIUI the main difference
between the in-process and out-of-process approaches is that one has a JVM
per connection and the other a single shared JVM. In that case my thoughts
are as follows:

- Overhead of starting the JVM. If you can start the JVM in the postmaster
you might be able to avoid this. However, if you have to restart the
JVM for each process, that's a cost.

- JIT overhead. For often-used classes, JIT compiling can help a lot
with speed. But if every class needs to be reinterpreted each time,
maybe that costs more than your IPC.

- Memory overhead. You mentioned this already.

- Are you optimising for many short-lived connections or a few
long-lived connections?

My gut feeling is that if someone creates a huge number of server-side
Java functions, performance will be better with one always-running JVM
with highly JIT-optimised code than with each JVM doing it from scratch.
But this will obviously need to be tested.

One other thing is that separate processes give you the ability to
parallelize. For example, if a Java function does an SPI query, it can
receive and process results in parallel with the backend generating
them. This may not be easy to achieve with an in-process JVM.

Incidentally, there are compilers these days that can compile Java to
native code. Is this Java stuff set up in such a way that you can compile
your classes to native and load them directly, for the real speed-freaks?
In that case, maybe you should concentrate on reliability and flexibility
and still have a way out for functions that *must* be high-performance.

Hope this helps,
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 tool for doing 5% of the work and then sitting around waiting for someone
 else to do the other 95% so you can sue them.


Re: [HACKERS] Shared memory

2006-03-27 Thread Thomas Hallgren

Martijn van Oosterhout wrote:

On Mon, Mar 27, 2006 at 10:57:21AM +0200, Thomas Hallgren wrote:

Martijn,

I tried a Socket approach. Using the new IO stuff that arrived with Java 
1.4 (SocketChannel etc.), the performance is really good. Especially on 
Linux where an SMP machine show a 1 to 1.5 ratio between one process doing 
ping-pong between two threads and two processes doing ping-pong using a 
socket. That's acceptable overhead indeed and I don't think I'll be able to 
trim it much using a shared memory approach (the thread scenario uses Java 
monitor locks. That's the most efficient lightweight locking implementation 
I've come across).


Yeah, it's fairly well known that the distinctions between processes
and threads on linux is much smaller than on other OSes. Windows is
pretty bad, which is why threading is much more popular there.

The real downside is that a call from SQL to PL/Java using the current 
in-process approach is really fast. It takes about 5 micro secs on my 
2.8GHz i386 box. The overhead of an IPC-call on that box is about 18 micro 
secs on Linux and 64 micro secs on Windows. That's an overhead of between 
440% and 1300% due to context switching alone. Yet, for some applications, 


snip

This might take some more measurements but AIUI the main difference
between in-process and intra-process is that one has a JVM per
connection, the other one JVM shared. In that case might thoughts are
as follows:

- Overhead of starting JVM. If you can start the JVM in the postmaster
you might be able to avoid this. However, if you have to restart the
JVM each process, that's a cost.

- JIT overhead. For often used classes JIT compiling can help a lot
with speed. But if every class needs to be reinterpreted each time,
maybe that costs more than your IPC.

- Memory overhead. You meantioned this already.

- Are you optimising for many short-lived connections or a few
long-lived connections?

My gut feeling is that if someone creates a huge number of server-side
java functions that performence will be better by having one always
running JVM with highly JIT optimised code than having each JVM doing
it from scratch. But this will obviously need to be tested.

The use case with a huge number of short-lived connections is not feasible at all with 
PL/Java as it stands today. This is partly the reason for my current research. Another 
reason is that it's sometimes desirable to share resources between your connections. 
Dangerous perhaps, but an API that encourages separation and allows sharing in a controlled 
way might prove very beneficial.


The ideal use case for PL/Java is a client that utilizes a connection pool, and most servlet
containers and EJB servers do. Scenarios where you have just a few, fairly long-lived
clients are OK too.



One other thing is that seperate processes give you the ability to
parallelize. For example, if a Java function does an SPI query, it can
receive and process results in parallel with the backend generating
them. This may not be easy to acheive with an in-process JVM.



It is fairly easy to achieve using threads. Only one thread at a time may of course execute
an SPI query, but that's true when multiple processes are in place too, since the backend is
single-threaded, and since the logical thread in PL/Java must utilize the same backend that
the call originated from (to maintain the transaction boundaries). Any result must also
sooner or later be delivered using that same backend, which further limits the ability to
parallelize.




Incidently, there are compilers these days that can compile Java to
native. Is this Java stuff setup in such a way that you can compile your
classes to native and load directly for the real speed-freaks?


PL/Java can be used with GCJ although I don't think the GCJ compiler outranks the JIT 
compiler in a modern JVM. It can only do static optimizations whereas the JIT has runtime 
heuristics to base its optimizations on. In the test results I've seen so far, the GCJ 
compiler only gets the upper hand in very simple tests. The JIT generated code is faster 
when things are more complicated.


GCJ is great if you're using short-lived connections (less startup time and everything is 
optimized from the very start) but the native code that it produces still needs a JVM of 
some sort. No interpreter of course but classes must be initialized, a garbage collector 
must be running etc. The shared native code results in some gain in memory consumption but 
it's not as significant as one might think.




In that
case, maybe you should concentrate on relibility and flexibility and
still have a way out for functions that *must* be high-performance.



Given time and enough resources, I'd like to provide the best of both worlds and give the
user a choice whether or not the JVM should be external. Ideally, this should be controlled
using configuration parameters so that it's easy to test which scenario works best. It's
a lot of work 

[HACKERS] Remote administration contrib module

2006-03-27 Thread Dave Page
A thread on -general
(http://archives.postgresql.org/pgsql-general/2006-03/msg01023.php)
recently ended in the idea that the pgAdmin 'adminpack' be included as
an 'official' PostgreSQL contrib module. Currently the code is an add-on
available from the pgAdmin website, unless you run the Windows
pgInstaller distribution of PostgreSQL in which case it is already
included.

For those that don't know, the adminpack is a set of additional
functions designed to allow pgAdmin (and potentially any other client
application) the ability to rewrite server configuration files and
browse and read logfiles etc. The full readme can be found at
http://svn.pgadmin.org/cgi-bin/viewcvs.cgi/trunk/pgadmin3/xtra/admin81/README.admin81?rev=5024&view=markup,
and the code can be seen at
http://svn.pgadmin.org/cgi-bin/viewcvs.cgi/trunk/pgadmin3/xtra/admin81/.

Tom's concern in the -general thread was that the pgAdmin licence is the
Artistic License; however, we are now in a position to offer the code
under the BSD licence. So, how would people feel about including this as
a contrib module in the future, until(/if) equivalent functionality
becomes available in -core in some form?

Regards, Dave.


Re: [HACKERS] Remote administration contrib module

2006-03-27 Thread Peter Eisentraut
On Monday, 27 March 2006 15:12, Dave Page wrote:
 So, how would people feel about including this as
 a contrib module in the future, until(/if) equivalent functionality
 becomes available in -core in some form?

Right now you have got plenty of time to get it in shape for inclusion in 
core, so we might not even have to take the detour through a contrib module.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/


Re: [HACKERS] Remote administration contrib module

2006-03-27 Thread Dave Page



-Original Message-
From: Peter Eisentraut [mailto:[EMAIL PROTECTED]
Sent: Mon 3/27/2006 3:00 PM
To: pgsql-hackers@postgresql.org
Cc: Dave Page
Subject: Re: [HACKERS] Remote administration contrib module
 
On Monday, 27 March 2006 15:12, Dave Page wrote:
  So, how would people feel about including this as
  a contrib module in the future, until(/if) equivalent functionality
  becomes available in -core in some form?

 Right now you have got plenty of time to get it in shape for inclusion in 
 core, so we might not even have to take the detour through a contrib module.

As it stands, it was previously rejected for inclusion in -core in its current
form. The main objector was Tom, but as it was he who encouraged me to change
the licencing for inclusion as a contrib module, I assume he doesn't object to
that.

I have submitted a conference discussion proposal to talk about ways of
implementing the remote configuration that we want, which was the main thing I
think Tom objected to originally; but that's some time away and would likely
result in a significantly more complex solution that may or may not get done
for 8.2. In addition, there are one or two helper functions in there that were
previously rejected more on the 'no use to the end user' basis, iirc (see
pg_logdir_ls), which may never end up in -core without reconsideration of the
need. It would be good (for us at least) to get it in as a contrib module, even
if we do manage to replace it, or some of it, by release.

Regards, Dave



Re: [HACKERS] [GENERAL] PANIC: heap_update_redo: no block

2006-03-27 Thread Tom Lane
[ redirecting to a more appropriate mailing list ]

Alex bahdushka [EMAIL PROTECTED] writes:

 LOG:  REDO @ D/19176610; LSN D/19176644: prev D/191765E8; xid 81148979: Heap 
 - clean: rel 1663/16386/16559898; blk 0
 LOG:  REDO @ D/19176644; LSN D/191766A4: prev D/19176610; xid 81148979: Heap 
 - move: rel 1663/16386/16559898; tid 1/1; new 0/10
 PANIC:  heap_update_redo: no block: target blcknum: 1, 
 relation(1663/16386/16559898) length: 1

I think what's happened here is that VACUUM FULL moved the only tuple
off page 1 of the relation, then truncated off page 1, and now
heap_update_redo is panicking because it can't find page 1 to replay the
move.  Curious that we've not seen a case like this before, because it
seems like a generic hazard for WAL replay.

The simplest fix would be to treat WAL records as no-ops if they refer
to nonexistent pages, but that seems much too prone to hide real failure
conditions.  Another thought is to remember that we ignored this record,
and then complain if we don't see a TRUNCATE that would've removed the
page.  That would be pretty complicated but not impossible.  Anyone have
a better idea?

regards, tom lane


Re: [HACKERS] Recovery from multi trouble

2006-03-27 Thread Tom Lane
Simon Riggs [EMAIL PROTECTED] writes:
 On Mon, 2006-03-27 at 12:14 +0900, OKADA Satoshi wrote:
 Our aim is to give the database administrator a chance to recover the
 database at PostgreSQL startup time when there is a possibility of data
 loss from missing log files.

 There is no possibility of data loss because of loss of a single log
 file, if you have your hardware configured correctly.

I'm fairly concerned about whether this isn't just replacing one failure
mode with another one.  See nearby discussion with Alex Bahdushka for a
graphic reminder why PANICing on any little thing during WAL replay is
not necessarily such a great idea.

regards, tom lane


Re: [HACKERS] Shared memory

2006-03-27 Thread Tom Lane
Thomas Hallgren [EMAIL PROTECTED] writes:
 The real downside is that a call from SQL to PL/Java using the current
 in-process approach is really fast. It takes about 5 micro secs on my
 2.8GHz i386 box. The overhead of an IPC-call on that box is about 18
 micro secs on Linux and 64 micro secs on Windows. That's an overhead
 of between 440% and 1300% due to context switching alone. Yet, for
 some applications, perhaps that overhead is acceptable?

It's only that much difference?  Given all the other advantages of
separating the JVM from the backends, I'd say you should gladly pay
that price.

regards, tom lane


Re: [HACKERS] Shared memory

2006-03-27 Thread Thomas Hallgren

Tom Lane wrote:

It's only that much difference?  Given all the other advantages of
separating the JVM from the backends, I'd say you should gladly pay
that price.

If I'm right, and the most common scenario is clients using connection pools, then it's very 
likely that you don't get any advantages at all. Paying for nothing with a 440% increase in 
calling time (at best) seems expensive :-)


Regards,
Thomas Hallgren



Re: [HACKERS] Shared memory

2006-03-27 Thread Tom Lane
Thomas Hallgren [EMAIL PROTECTED] writes:
 Tom Lane wrote:
 It's only that much difference?  Given all the other advantages of
 separating the JVM from the backends, I'd say you should gladly pay
 that price.
 
 If I'm right, and the most common scenario is clients using connection pools, 
 then it's very 
 likely that you don't get any advantages at all. Paying for nothing with a 
 440% increase in 
 calling time (at best) seems expensive :-)

You are focused too narrowly on a few performance numbers.  In my mind
the primary advantage is that it will *work*.  I do not actually believe
that you'll ever get the embedded-JVM approach to production-grade
reliability, because of the fundamental problems with threading, error
processing, etc.

regards, tom lane


Re: [HACKERS] Domains as Subtypes

2006-03-27 Thread Tom Lane
elein [EMAIL PROTECTED] writes:
 But I like the idea of centralizing the check in the input/output
 functions.  It seems clearer and cleaner.

I remembered the problem with doing it that way: an input function can't
enforce a domain NOTNULL constraint, because it won't even get invoked
for a null input value.  So there seems no way around having a special
case for domains in all places where I/O conversion is done.

regards, tom lane


[HACKERS] Why are default encoding conversions namespace-specific?

2006-03-27 Thread Tom Lane
See $SUBJECT.  It seems to me this is a bad idea for much the same
reasons that we recently decided default index operator classes should
not be namespace-specific:
http://archives.postgresql.org/pgsql-hackers/2006-02/msg00284.php

I don't mind having encoding conversions be named within schemas,
but I propose that any given encoding pair be allowed to have only
one default conversion, period, and that when we are looking for
a default conversion we find it by a non-namespace-aware search.

With the existing definition, any change in search_path could
theoretically cause a change in client-to-server encoding conversion
behavior, and this just seems like a really bad idea.  (It's only
theoretical because we don't actually redo the conversion function
search on a search_path change ... but if you think the existing
definition is good then that's a bug.)

Comments?

regards, tom lane


Re: [HACKERS] Shared memory

2006-03-27 Thread Thomas Hallgren

Tom Lane wrote:

Thomas Hallgren [EMAIL PROTECTED] writes:
  

Tom Lane wrote:


It's only that much difference?  Given all the other advantages of
separating the JVM from the backends, I'd say you should gladly pay
that price.

  
If I'm right, and the most common scenario is clients using connection pools, then it's very 
likely that you don't get any advantages at all. Paying for nothing with a 440% increase in 
calling time (at best) seems expensive :-)



You are focused too narrowly on a few performance numbers.  In my mind
the primary advantage is that it will *work*.  I do not actually believe
that you'll ever get the embedded-JVM approach to production-grade
reliability, because of the fundamental problems with threading, error
processing, etc.
  
My focus with PL/Java over the last year has been to make it a
production-grade product, and I think I've succeeded pretty well. The
current list of open bugs is second to none. What fundamental problems
are you thinking of that haven't been solved already?


Regards,
Thomas Hallgren



[HACKERS] proposal - plpgsql: execute using into

2006-03-27 Thread Pavel Stehule

Hello

The current EXECUTE statement doesn't support any means of parametrisation other
than concatenating strings. It works well but it's a little bit unreadable. Oracle's
EXECUTE statement has a positional replacement feature. It works similarly to our
RAISE statement (where the position holder is %). The EXECUTE position holder has
the form :name. The name has only symbolic value and isn't used for anything. The
syntax of the enhanced statement is:


EXECUTE 'format string' USING expr_list

There are some problems with replacing string values in the SQL string.
Sometimes we have to enclose the value between spaces or other symbols
(apostrophes or double apostrophes), sometimes not. Possible rules:
 a) if the position holder is inside a string or an identifier, we don't enclose
the value;
 b) otherwise, numeric values are enclosed in spaces and other (non-regclass)
values in single apostrophes;
 c) regclass values are enclosed in double apostrophes.

PL/pgSQL knows three dynamic statements. All will be enhanced.

Some examples:

EXECUTE 'SELECT :name||:sp||:surname' USING 'Pavel',' ','Stehule';
EXECUTE e'SELECT \':name :surname' USING 'Pavel','Stehule';
EXECUTE 'SELECT * FROM :tabname' USING 'xb'::regclass;
EXECUTE 'SELECT * FROM :base:num' USING 'mytab',1;

You can test it. I sent a patch to pg_patches.

I invite any comments.

Pavel Stehule


Re: [HACKERS] proposal - plpgsql: execute using into

2006-03-27 Thread Tom Lane
Pavel Stehule [EMAIL PROTECTED] writes:
 Current EXECUTE statemtn doesn't support other way for parametrisation than 
 concating strings. It works well but it's little bit unreadable. Oracle's 
 statement EXECUTE has positional replacement feature.
 ...
 There are some problems about replacing string values in the SQL string. 

Doesn't the Oracle implementation already imply a solution to that?

The examples you give look to me like they are escaping problems waiting
to happen, especially in view of the upcoming change in default
backslash behavior, so this whole thing makes me feel pretty nervous.
I think we'd be best off to leave EXECUTE alone, at least until we've
converged to the point where almost nobody is using non-standard-compliant
strings.

regards, tom lane


Re: [HACKERS] [GENERAL] PANIC: heap_update_redo: no block

2006-03-27 Thread Greg Stark

Tom Lane [EMAIL PROTECTED] writes:

 I think what's happened here is that VACUUM FULL moved the only tuple
 off page 1 of the relation, then truncated off page 1, and now
 heap_update_redo is panicking because it can't find page 1 to replay the
 move.  Curious that we've not seen a case like this before, because it
 seems like a generic hazard for WAL replay.

This sounds familiar

http://archives.postgresql.org/pgsql-hackers/2005-05/msg01369.php


-- 
greg



Re: [HACKERS] [GENERAL] PANIC: heap_update_redo: no block

2006-03-27 Thread Tom Lane
Greg Stark [EMAIL PROTECTED] writes:
 This sounds familiar
 http://archives.postgresql.org/pgsql-hackers/2005-05/msg01369.php

Hm, I had totally forgotten about that todo item :-(.  Time to push it
back up the priority list.

regards, tom lane


Re: [HACKERS] Domains as Subtypes

2006-03-27 Thread elein
On Mon, Mar 27, 2006 at 11:41:30AM -0500, Tom Lane wrote:
 elein [EMAIL PROTECTED] writes:
  But I like the idea of centralizing the check in the input/output
  functions.  It seems clearer and cleaner.
 
 I remembered the problem with doing it that way: an input function can't
 enforce a domain NOTNULL constraint, because it won't even get invoked
 for a null input value.  So there seems no way around having a special
 case for domains in all places where I/O conversion is done.
 

The notnull attribute of the pg_type table should be set to not null
in the case of a not null constraint on a domain (type).
You should not have to invoke the input function to check for that.  
Or perhaps I'm missing the details.

   regards, tom lane
 

elein
[EMAIL PROTECTED]


Re: [HACKERS] Domains as Subtypes

2006-03-27 Thread Tom Lane
elein [EMAIL PROTECTED] writes:
 On Mon, Mar 27, 2006 at 11:41:30AM -0500, Tom Lane wrote:
 I remembered the problem with doing it that way: an input function can't
 enforce a domain NOTNULL constraint, because it won't even get invoked
 for a null input value.  So there seems no way around having a special
 case for domains in all places where I/O conversion is done.

 The notnull attribute of the pg_type table should be set to not null
 in the case of a not null constraint on a domain (type).
 You should not have to invoke the input function to check for that.  
 Or perhaps I'm missing the details.

Well, I can see two problems:

1. If we have to add code to everyplace that calls an input function to
do that, then we've failed to achieve the hoped-for goal of solving the
problem in just one place.

2. NOTNULL is just the most obvious form of the problem.  There could be
domain CHECK constraints that fail on null input --- CHECK(VALUE IS NOT
NULL) for example, or something more subtle.  If we don't run the input
function then this means the CHECK constraints also have to be done
out-of-band, and then we've lost any leverage whatsoever.


We could push the problem into a domain input function if we abandoned
the current rule that input functions are never invoked for nulls (we
could check their strictness flag to decide whether to do it).  This
sort of change seems distinctly cleaner than pushing explicit knowledge
about domains into all the places that use input functions, but it's
still pretty ugly:

A. We still have to touch everyplace that uses an input function; any
code not changed will simply do the Wrong Thing for nulls, which is not
a very friendly failure mode.  (And we know there are places outside the
core that use this stuff, for instance non-core PLs.)

B. C-language input functions for most datatypes will need to be
declared strict, else they'll crash on null input, which is an even
less friendly behavior.  Again, we can't be sure that non-core datatypes
get this right at present.

regards, tom lane


Re: [HACKERS] Why are default encoding conversions

2006-03-27 Thread Tatsuo Ishii
 See $SUBJECT.  It seems to me this is a bad idea for much the same
 reasons that we recently decided default index operator classes should
 not be namespace-specific:
 http://archives.postgresql.org/pgsql-hackers/2006-02/msg00284.php
 
 I don't mind having encoding conversions be named within schemas,
 but I propose that any given encoding pair be allowed to have only
 one default conversion, period, and that when we are looking for
 a default conversion we find it by a non-namespace-aware search.

That doesn't sound like a good idea to me.

 With the existing definition, any change in search_path could
 theoretically cause a change in client-to-server encoding conversion
 behavior, and this just seems like a really bad idea.  (It's only
 theoretical because we don't actually redo the conversion function
 search on a search_path change ... but if you think the existing
 definition is good then that's a bug.)

Then why do we have CREATE DEFAULT CONVERSION command at all?
--
Tatsuo Ishii
SRA OSS, Inc. Japan


Re: [HACKERS] Why are default encoding conversions namespace-specific?

2006-03-27 Thread Tom Lane
Tatsuo Ishii [EMAIL PROTECTED] writes:
 I don't mind having encoding conversions be named within schemas,
 but I propose that any given encoding pair be allowed to have only
 one default conversion, period, and that when we are looking for
 a default conversion we find it by a non-namespace-aware search.

 That doesn't sound good idea to me.

What does it mean to have different default encoding conversions in
different schemas?  Even if this had a sensible interpretation, I don't
think the existing code implements it properly.

 Then why do we have CREATE DEFAULT CONVERSION command at all?

So you can create the one you're allowed to have, of course ...

regards, tom lane


Re: [HACKERS] Why are default encoding conversions namespace-specific?

2006-03-27 Thread Andrew Dunstan
Tom Lane said:
 Tatsuo Ishii [EMAIL PROTECTED] writes:
 I don't mind having encoding conversions be named within schemas, but
 I propose that any given encoding pair be allowed to have only one
 default conversion, period, and that when we are looking for a
 default conversion we find it by a non-namespace-aware search.

 That doesn't sound good idea to me.

 What does it mean to have different default encoding conversions in
 different schemas?  Even if this had a sensible interpretation, I don't
 think the existing code implements it properly.

perhaps I'm misunderstanding, but why not just resolve the namespace at the
time the default conversion is created?

cheers

andrew







Re: [HACKERS] Why are default encoding conversions namespace-specific?

2006-03-27 Thread Tom Lane
Andrew Dunstan [EMAIL PROTECTED] writes:
 Tom Lane said:
 What does it mean to have different default encoding conversions in
 different schemas?  Even if this had a sensible interpretation, I don't
 think the existing code implements it properly.

 perhaps I'm misunderstanding, but why not just resolve the namespace at the
 time the default conversion is created?

Isn't that the same thing as saying that there can only be one default
conversion across all schemas?  (Only one for a given source and
target encoding pair, of course.)  If it isn't the same, please explain
more clearly.

regards, tom lane


Re: [HACKERS] Why are default encoding conversions namespace-specific?

2006-03-27 Thread Andrew Dunstan
Tom Lane said:
 Andrew Dunstan [EMAIL PROTECTED] writes:
 Tom Lane said:
 What does it mean to have different default encoding conversions in
 different schemas?  Even if this had a sensible interpretation, I
 don't think the existing code implements it properly.

 perhaps I'm misunderstanding, but why not just resolve the namespace
 at the time the default conversion is created?

 Isn't that the same thing as saying that there can only be one default
 conversion across all schemas?  (Only one for a given source and
 target encoding pair, of course.)  If it isn't the same, please explain
 more clearly.



Yeah, I guess it is. I was thinking of it more as namespace-specified than
as non-namespace-aware. I guess it's a matter of perspective.

cheers

andrew





Re: [HACKERS] [GENERAL] PANIC: heap_update_redo: no block

2006-03-27 Thread Tom Lane
Greg Stark [EMAIL PROTECTED] writes:
 Tom Lane [EMAIL PROTECTED] writes:
 I think what's happened here is that VACUUM FULL moved the only tuple
 off page 1 of the relation, then truncated off page 1, and now
 heap_update_redo is panicking because it can't find page 1 to replay the
 move.  Curious that we've not seen a case like this before, because it
 seems like a generic hazard for WAL replay.

 This sounds familiar
 http://archives.postgresql.org/pgsql-hackers/2005-05/msg01369.php

After further review I've concluded that there is not a systemic bug
here, but there are several nearby local bugs.  The reason it's not
a systemic bug is that this scenario is supposed to be handled by the
same mechanism that prevents torn-page writes: the first XLOG record
that touches a given page after a checkpoint is supposed to rewrite
the entire page, rather than update it incrementally.  Since XLOG replay
always begins at a checkpoint, this means we should always be able to
write a fresh copy of the page, even after relation deletion or
truncation.  Furthermore, during XLOG replay we are willing to create
a table (or even a whole tablespace or database directory) if it's not
there when touched.  The subsequent replay of the deletion or truncation
will get rid of any unwanted data again.

Therefore, there is no systemic bug --- unless you are running with
full_page_writes=off.  I assert that that GUC variable is broken and
must be removed.

There are, however, a bunch of local bugs, including these:

* On a symlink-less platform (ie, Windows), TablespaceCreateDbspace is
#ifdef'd to be a no-op.  This is wrong because it performs the essential
function of re-creating a tablespace or database directory if needed
during replay.  AFAICS the #if can just be removed and have the same
code with or without symlinks.

* log_heap_update decides that it can set XLOG_HEAP_INIT_PAGE instead
of storing the full destination page, if the destination contains only
the single tuple being moved.  This is fine, except it also resets the
buffer indicator for the *source* page, which is wrong --- that page
may still need to be re-generated from the xlog record.  This is the
proximate cause of the bug report that started this thread.

* btree_xlog_split passes extend=false to XLogReadBuffer for the left
sibling, which is silly because it is going to rewrite that whole page
from the xlog record anyway.  It should pass true so that there's no
complaint if the left sib page was later truncated away.  This accounts
for one of the bug reports mentioned in the message cited above.

* btree_xlog_delete_page passes extend=false for the target page,
which is likewise silly because it's going to init the page (not that
there was any useful data on it anyway).  This accounts for the other
bug report mentioned in the message cited above.

Clearly, we need to go through the xlog code with a fine tooth comb
and convince ourselves that all pages touched by any xlog record will
be properly reconstituted if they've later been truncated off.  I have
not yet examined any of the code except the above.

Notice that these are each, individually, pretty low-probability
scenarios, which is why we've not seen many bug reports.  If we had had
a systemic bug I'm sure we'd be seeing far more.

regards, tom lane


[HACKERS] Tablespaces oddity?

2006-03-27 Thread Philip Yarra
Hi folks, after discussing this on IRC today (thanks G_SabinoMullane!), I'm
still surprised by this behaviour on 8.1.3:

pyarra=# create TABLESPACE spctables location '/mnt/pg_tables/data';
CREATE TABLESPACE
pyarra=# create table foo(id int) tablespace spctables;
CREATE TABLE
pyarra=# \d foo
     Table "public.foo"
 Column |  Type   | Modifiers
--------+---------+-----------
 id     | integer |
Tablespace: "spctables"

So far, so good...

pyarra=# CREATE DATABASE spctest TABLESPACE spctables;
CREATE DATABASE
pyarra=# \c spctest;
You are now connected to database spctest.
spctest=# create table foo(id int) tablespace spctables;
CREATE TABLE
spctest=# create table bar(id int);
CREATE TABLE
spctest=# \d foo
     Table "public.foo"
 Column |  Type   | Modifiers
--------+---------+-----------
 id     | integer |

spctest=# \d bar
     Table "public.bar"
 Column |  Type   | Modifiers
--------+---------+-----------
 id     | integer |

I hoped that these last two tables would also be listed as being on spctables.

I think the issue is that pg_class.reltablespace = 0 where these objects are 
created on the default tablespace for this database. I can find out which 
tablespace the objects are really on with:

select relname, COALESCE(t.spcname, (select spcname from pg_tablespace where
oid = (select dattablespace from pg_database where datname
= current_database()))) as tablespace from pg_class c left join pg_tablespace
t on (t.oid = c.reltablespace)

Someone else might be able to see a better way to write this query, but I 
think it would be good if \d could show this information, when you really 
want to know which tablespace an object is on. 

Note also that \l won't show you the tablespace for a DB, so you need to query 
pg_database to even know which is the default tablespace for a DB. It's not 
impossible, just harder than it needs to be, I reckon.

Any thoughts?

Regards, Philip.

-- 

Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it. - Brian W. Kernighan
