Re: [HACKERS] buildfarm breakage

2010-02-15 Thread Zdenek Kotala
Andrew Dunstan wrote on Mon, 08. 02. 2010 at 20:07 -0500:

> 
> Our Solaris *moth members seem to have stopped building. Have we lost them?

Hi Andrew,

The answer is not simple. Yes, we lost the Solaris 8 and 9 machines; they
were reinstalled and are now used for a different purpose. This was planned
before April and I announced it a long time ago. It unfortunately happened
now, so the timing looks strange. And I have not found a replacement yet.

I already have a replacement for the nevada/x86 machine, but I still need to
set it up, which is one item on my very long TODO list :(. Solaris 10
SPARC/x86 and nevada SPARC are covered at the moment.

Zdenek




[HACKERS] codlin_moth is up and complains - PL/Python crash

2010-02-17 Thread Zdenek Kotala

I revived codlin_moth and it fails during the PL/Python test:

http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=codlin_moth&dt=2010-02-16%2015:09:05


TRAP: BadArgument("!(((context) != 0 && ((((Node *)((context)))->type) == T_AllocSetContext)))", File: "mcxt.c", Line: 641)



 feaf5005 _lwp_kill (1, 6, 80459c8, fea9bbde) + 15
 fea9bbea raise(6, 0, 8045a18, fea725aa) + 22
 fea725ca abort(8046670, 8361f80, 8045a48, 8719ccf, 89021f0, 89021e4) + f2

 086d07c0 ExceptionalCondition (89021f0, 89021e4, 89021dc, 281) + 58
 08719ccf MemoryContextSwitchTo (89264ac, 0, 0, 8045a7c) + 47
 fec21990 PLy_spi_execute (0, 8b141cc, 80460f8, fe84abde) + 750
 fe84ad6e PyCFunction_Call (8b0ff6c, 8b141cc, 0, fe8a8d92) + 19e
 fe8a91a0 call_function (80461bc, 1, 610f2d31, fe8a3206) + 41c
 fe8a6221 PyEval_EvalFrameEx (8b5798c, 0, 8b0cbdc, 0) + 3029
 fe8a9310 fast_function (8b05144, 80462fc, 0, 0, 0, fe91c63c) + 108
 fe8a8e72 call_function (80462fc, 0, 80462d8, fe8a3206) + ee
 fe8a6221 PyEval_EvalFrameEx (8b576a4, 0, 8b0cbdc, 8b0cbdc) + 3029
 fe8a7cd0 PyEval_EvalCodeEx (8ab4770, 8b0cbdc, 8b0cbdc, 0, 0, 0) + 91c
 fe8a3102 PyEval_EvalCode (8ab4770, 8b0cbdc, 8b0cbdc, fec17831) + 32
 fec1799c PLy_function_handler (8046980, 8b5d508, 8046880, fec1480f) + 17c
 fec14b92 plpython_call_handler (8046980, 8046bb0, 8046be8, 8323774) + 3aa
 08324393 ExecEvalFunc (8a033b0, 8a0329c, 8a0390c, 8a039b8) + e33
 0832b1bc ExecProject (8a03920, 8046c6c, 2, 8977abc) + 834
 08348785 ExecResult (8a03210, 8a03184, 0, 1) + 9d
 0831f66f ExecProcNode (8a03210, 1, 8a037ec, 8731314) + 227
 0831a186 ExecutorRun (8a02d7c, 1, 0, 8719ad4) + 2de
 084d7778 PortalRun (898effc, 7fff, 1, 8977b38, 8977b38) + 450
 084ceae9 exec_simple_query (8976984, 0, 80473b8, 84d5185) + ba9
 084d51a2 PostgresMain (2, 8973b4c, 897398c, 893d00c, 893d008, 130d7661) + 7fa

 0844aded BackendRun (898c3d0) + 1cd
 084440f3 ServerLoop (1, 89561d4, 3, fea7bb7e, 5c54, feb83cd8) + 973
 08443004 PostmasterMain (3) + 119c
 0837db12 main (3, 8047b14, 8047b24, 80fa21f) + 1ea
 080fa27d _start   (3, 8047be8, 8047fb0, 8047fb0, 0, 8047c35) + 7d

It seems that the problem is caused by aggressive compiler optimization. I
changed it to a lower level and now it works fine. Interestingly, the
MemoryContext corruption appears only with PL/Python.


Zdenek



Re: [HACKERS] codlin_moth is up and complains - PL/Python crash

2010-02-18 Thread Zdenek Kotala

On 17.02.10 18:39, Peter Eisentraut wrote:

On Wed, 2010-02-17 at 11:26 -0500, Tom Lane wrote:

But the behavior gcc appears to exhibit is that it won't warn about
variables that are only assigned once before the PG_TRY is entered,
and that seems reasonable to me since such a variable ought to have
the correct value either way.


FWIW, this is a Sun Studio build that is complaining here.



Yes, it is SS12. I added the volatile keyword and the problem disappears.
The code difference is the following:



< PLy_spi_execute+0x742:  83 ec 0c   subl   $0xc,%esp
< PLy_spi_execute+0x745:  ff b5 b8 f9 ff ff  pushl  0xf9b8(%ebp)
< PLy_spi_execute+0x74b:  e8 fc ff ff ff call   MemoryContextSwitch


> PLy_spi_execute+0x742:  8b 85 cc f9 ff ff  movl   0xf9cc(%ebp),%eax

> PLy_spi_execute+0x748:  83 ec 0c   subl   $0xc,%esp
> PLy_spi_execute+0x74b:  50 pushl  %eax
> PLy_spi_execute+0x74c:  e8 fc ff ff ff call   MemoryContextSwitch

It is worth mentioning that SS inlines PLy_spi_execute_query() into
PLy_spi_execute(), because it is its only caller.
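
For illustration, the pattern involved looks roughly like this (a simplified
sketch with hypothetical code, not the actual PL/Python source): a value
saved before PG_TRY() and read again in PG_CATCH() is used after a possible
longjmp(), so it has to be declared volatile, otherwise an aggressive
optimizer may hand the cleanup code a register copy that is no longer valid.

/* Simplified sketch -- not the real PLy_spi_execute_query() */
#include "postgres.h"
#include "utils/memutils.h"

static void
run_in_temp_context(void)
{
    /*
     * volatile is required: the value is read in PG_CATCH(), i.e. after a
     * possible longjmp(), where a register-cached copy may be stale.
     */
    volatile MemoryContext oldcontext = CurrentMemoryContext;

    PG_TRY();
    {
        /* ... work that may elog(ERROR) ... */
    }
    PG_CATCH();
    {
        MemoryContextSwitchTo(oldcontext);  /* must see the saved value */
        PG_RE_THROW();
    }
    PG_END_TRY();
}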



Zdenek



[HACKERS] psql with GSS can crash

2010-02-25 Thread Zdenek Kotala

Hi all,

I got the following stack:

 fd7ffed14b70 strlen () + 40
 fd7ffed71665 snprintf () + e5
 fd7fff36d088 pg_GSS_startup () + 88
 fd7fff36d43a pg_fe_sendauth () + 15a
 fd7fff36e557 PQconnectPoll () + 3b7
 fd7fff36e152 connectDBComplete () + a2
 fd7fff36dc32 PQsetdbLogin () + 1b2
 0041e96d main () + 30d
 0041302c  ()

It seems that the connection is not fully configured and krbsrvname or
pghost is not filled in. The following code in fe-auth.c, pg_GSS_startup(),
causes the crash:


440 maxlen = NI_MAXHOST + strlen(conn->krbsrvname) + 2;
441 temp_gbuf.value = (char *) malloc(maxlen);
442 snprintf(temp_gbuf.value, maxlen, "%s@%s",
443  conn->krbsrvname, conn->pghost);
444 temp_gbuf.length = strlen(temp_gbuf.value);

And the following code in fe-connect.c, fillPGconn(), fills in a NULL value:

571 tmp = conninfo_getval(connOptions, "krbsrvname");
572 conn->krbsrvname = tmp ? strdup(tmp) : NULL;

I think that pg_GSS_startup() should sanity-check its input.
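
Something along these lines is what I have in mind -- only a sketch of the
kind of check, not a tested patch:

    /* Sketch only: fail cleanly instead of crashing in snprintf() */
    if (!(conn->pghost && conn->pghost[0] != '\0'))
    {
        printfPQExpBuffer(&conn->errorMessage,
                          libpq_gettext("host name must be specified\n"));
        return STATUS_ERROR;
    }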

Zdenek



Re: [HACKERS] psql with GSS can crash

2010-03-01 Thread Zdenek Kotala
Magnus Hagander wrote on Thu, 25. 02. 2010 at 15:17 +0100:
> On Thu, Feb 25, 2010 at 15:04, Zdenek Kotala  wrote:
> > Hi all,
> >
> > I got following stack:
> >
> >  fd7ffed14b70 strlen () + 40
> >  fd7ffed71665 snprintf () + e5
> >  fd7fff36d088 pg_GSS_startup () + 88
> >  fd7fff36d43a pg_fe_sendauth () + 15a
> >  fd7fff36e557 PQconnectPoll () + 3b7
> >  fd7fff36e152 connectDBComplete () + a2
> >  fd7fff36dc32 PQsetdbLogin () + 1b2
> >  0041e96d main () + 30d
> >  0041302c  ()
> >
> > It seems that connection is not fully configured and krbsrvname or pghost is
> > not filled. Following code in fe-auth.c pg_GSS_startup() causes a crash:
> >
> >440 maxlen = NI_MAXHOST + strlen(conn->krbsrvname) + 2;
> >441 temp_gbuf.value = (char *) malloc(maxlen);
> >442 snprintf(temp_gbuf.value, maxlen, "%s@%s",
> >443  conn->krbsrvname, conn->pghost);
> >444 temp_gbuf.length = strlen(temp_gbuf.value);
> >
> > And following code in fe-connect.c fillPGconn() fill NULL value.
> >
> >571 tmp = conninfo_getval(connOptions, "krbsrvname");
> >572 conn->krbsrvname = tmp ? strdup(tmp) : NULL;
> >
> > I think that pg_GSS_startup should sanity the input.
> 
> How did you get NULL in there? :-)
> There's a default set for that one that's PG_KRB_SRVNAM, so it really
> should never come out as NULL, I think...

Yeah, you are right. conn->krbsrvname is "postgres" and conn->pghost is
NULL.

> As for pghost, that certainly seems to be a bug. We check that one in
> krb5 and SSPI, but for some reason we seem to be missing it in GSSAPI.

Yes. The check should be in GSSAPI too.

However, what I see in pg_hba.conf is the following line:

local   all all   gss

GSS is used on the local Unix socket, which probably causes the problem:
conn->pghost is not filled in when psql connects over the socket.

Thanks, Zdenek





Re: [HACKERS] psql with GSS can crash

2010-03-07 Thread Zdenek Kotala
Magnus Hagander wrote on Mon, 01. 03. 2010 at 16:55 +0100:
> 2010/3/1 Zdenek Kotala :
> > Magnus Hagander wrote on Thu, 25. 02. 2010 at 15:17 +0100:
> >> On Thu, Feb 25, 2010 at 15:04, Zdenek Kotala  wrote:
> >> > Hi all,
> >> >
> >> > I got following stack:
> >> >
> >> >  fd7ffed14b70 strlen () + 40
> >> >  fd7ffed71665 snprintf () + e5
> >> >  fd7fff36d088 pg_GSS_startup () + 88
> >> >  fd7fff36d43a pg_fe_sendauth () + 15a
> >> >  fd7fff36e557 PQconnectPoll () + 3b7
> >> >  fd7fff36e152 connectDBComplete () + a2
> >> >  fd7fff36dc32 PQsetdbLogin () + 1b2
> >> >  0041e96d main () + 30d
> >> >  0041302c  ()
> >> >
> >> > It seems that connection is not fully configured and krbsrvname or 
> >> > pghost is
> >> > not filled. Following code in fe-auth.c pg_GSS_startup() causes a crash:
> >> >
> >> >440 maxlen = NI_MAXHOST + strlen(conn->krbsrvname) + 2;
> >> >441 temp_gbuf.value = (char *) malloc(maxlen);
> >> >442 snprintf(temp_gbuf.value, maxlen, "%s@%s",
> >> >443  conn->krbsrvname, conn->pghost);
> >> >444 temp_gbuf.length = strlen(temp_gbuf.value);
> >> >
> >> > And following code in fe-connect.c fillPGconn() fill NULL value.
> >> >
> >> >571 tmp = conninfo_getval(connOptions, "krbsrvname");
> >> >572 conn->krbsrvname = tmp ? strdup(tmp) : NULL;
> >> >
> >> > I think that pg_GSS_startup should sanity the input.
> >>
> >> How did you get NULL in there? :-)
> >> There's a default set for that one that's PG_KRB_SRVNAM, so it really
> >> should never come out as NULL, I think...
> >
> > Yeah, you are right. conn->krbsrvname is "postgres" and conn->pghost is
> > null
> 
> Ah, good. We should defend against that then.
> 
> 
> >> As for pghost, that certainly seems to be a bug. We check that one in
> >> krb5 and SSPI, but for some reason we seem to be missing it in GSSAPI.
> >
> > Yes. The check should be in GSSAPI too.
> >
> > However what I see in pg_hba.conf is following line:
> >
> > local   all all   gss
> >
> > Gss is used on local unix socket which probably cause a problem that
> > conn->pghost is not filled when psql tries to connect.
> 
> So there are really two errors - because we should disallow that.
> 
> See attached patch - can you confirm it removes the crash with just
> the client side applied, and then that it properly rejects GSS with
> the server side applied as well?

I tested it, but I cannot reproduce the crash because I cannot set up the
illegal combination any more ;-). I think it is OK.

Thanks Zdenek





Re: [HACKERS] gothic_moth, codlin_moth failures on REL8_2_STABLE

2010-03-11 Thread Zdenek Kotala

Hi Tom,

I'm sorry that I did not look at it earlier. I played with it and here are
some facts. gothic (SPARC) and codlin (x86) use Sun Studio 12, and I set
them up to use very high optimization.


Gothic:
---
-xalias_level=basic -xarch=native -xdepend -xmemalign=8s -xO5 
-xprefetch=auto,explicit


Codlin:
---
-xalias_level=basic -xarch=native -xdepend -xO4 -xprefetch=auto,explicit

-xO5 is the highest optimization level; -xO4 is a little lower.

I played with the flags and found that:

"-xO4 -xalias_level=basic" generates problem.

"-xO3 -xalias_level=basic" works fine

"-xO5" works fine


As the documentation says (cited from the Sun Studio compiler guide):
http://docs.sun.com/app/docs/doc/819-5265/bjapp?a=view


xalias_level=basic
--
If you use the -xalias_level=basic option, the compiler assumes that 
memory references that involve different C basic types do not alias each 
other. The compiler also assumes that references to all other types can 
alias each other as well as any C basic type. The compiler assumes that 
references using char * can alias any other type.


For example, at the -xalias_level=basic level, the compiler assumes that 
a pointer variable of type int * is not going to access a float object. 
Therefore it is safe for the compiler to perform optimizations that 
assume a pointer of type float * will not alias the same memory that is 
referenced with a pointer of type int *.


-xO4
-
Performs automatic inlining of functions contained in the same file in 
addition to performing -xO3 optimizations. This automatic inlining 
usually improves execution speed, but sometimes makes it worse. In 
general, this level results in increased code size.





I redefined the bitfields in HLWORD to char and it works, so your guess is
correct. But the question remains where exactly the bitfields misbehave.
Any idea where I should look?


IIRC, I also had this problem on HEAD when I tried to fix the tsearch
regression test for the Czech locale. The problem appears and disappears.
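
For reference, the pattern in question is roughly this (a mock-up with
made-up names, not the actual HLWORD definition):

typedef struct
{
    char        *word;
    unsigned int len:16;
    unsigned int skip:1;        /* set on the compound word */
    unsigned int selected:1;
} MockHLWord;

static void
mark_compound(MockHLWord *w)
{
    w->skip = 1;                /* the write that appears to get lost */
}

static int
keep_word(const MockHLWord *w)
{
    return w->skip == 0;        /* the test that misbehaves at -xO4 */
}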


Zdenek




On 11.03.10 00:37, Tom Lane wrote:

Since the buildfarm is mostly green these days, I took some time to look
into the few remaining consistent failures.  One is that gothic_moth and
codlin_moth fail on contrib/tsearch2 in the 8.2 branch, with a
regression diff like this:

*** 2453,2459 

Sea  view wowfoo  bar  qq
http://www.google.com/foo.bar.html"; target="_blank">YES 
!   ff-bg

   document.write(15);

--- 2453,2459 

Sea  view wowfoo  bar  qq
http://www.google.com/foo.bar.html"; target="_blank">YES 
!  ff-bgff-bg

   document.write(15);


These animals are not testing any branches older than 8.2.  The same
test appears in newer branches and passes, but the code involved got
migrated to core and probably changed around a bit.

I traced through this test on my own machine and determined that the
way it's supposed to work is like this: the tsearch parser breaks the
string into a series of tokens that include these:

ff-bg   compound word
ff  compound word element
-   punctuation
bg  compound word element

The highlight function is then supposed to set skip = 1 on the compound
word, causing it to be skipped when genhl() reconstructs the text.
The failure looks to me like the compound word is not getting skipped.
Both the setting and the testing of the flag seem to be absolutely
straightforward portable code; although the "skip" struct field is a
bitfield, which is a C feature we don't use very heavily.

My conclusion is that this is probably a compiler bug.  Both buildfarm
animals appear to be using Sun Studio, although on different
architectures which weakens the compiler-bug theory a bit.  Even though
we are not seeing the failure in later PG branches, it would probably be
worth looking into more closely, because if it's biting 8.2 it could
start biting newer code as well.  But without access to a machine
showing the problem it is difficult to do much.

regards, tom lane





Re: [HACKERS] gothic_moth, codlin_moth failures on REL8_2_STABLE

2010-03-11 Thread Zdenek Kotala

On 11.03.10 16:24, Greg Stark wrote:

On Wed, Mar 10, 2010 at 11:37 PM, Tom Lane  wrote:

My conclusion is that this is probably a compiler bug.  Both buildfarm
animals appear to be using Sun Studio, although on different
architectures which weakens the compiler-bug theory a bit.  Even though
we are not seeing the failure in later PG branches, it would probably be
worth looking into more closely, because if it's biting 8.2 it could
start biting newer code as well.  But without access to a machine
showing the problem it is difficult to do much.



Could be this:

http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6750087

It's fixed in patch 124861-11 which came out Feb 23, 2009. Is this
patch missing on both gothic-moth and codlin-moth?


It seems like our case. See the compiler versions:

Gothic:
-bash-3.2$ cc -V
cc: Sun C 5.9 SunOS_sparc Patch 124867-09 2008/11/25

Codlin
-bash-4.0$ cc -V
cc: Sun C 5.9 SunOS_i386 Patch 124868-10 2009/04/30


I can apply the patch on Gothic, but Codlin has to wait, because I don't
have control over the compiler version there. I will try to find an SS12
update somewhere on the disk/network.


The patch you refer to does not fix cc itself but some other
binaries/libraries that cc uses.


I will try to update Gothic and we will see what happens.


I suppose it's possible to have a configure test to check for whether
this patch is present but I'm not sure how much it's worthwhile given
that it'll only help people who happen to recompile their 8.2 server
after the next Postgres patch. And I'm not sure we can check for
patches without assuming the CC is the OS-shipped cc. Does cc itself
have an option to list which patches it has applied to it?



cc is not shipped with Solaris; you have to install it separately. And the
bug appears only when you use high optimization (see my earlier email).
Running cc -V shows a patch version, but only for the compiler binary itself.


Zdenek




Re: [HACKERS] gothic_moth, codlin_moth failures on REL8_2_STABLE

2010-03-11 Thread Zdenek Kotala

On 11.03.10 17:37, Tom Lane wrote:

Zdenek Kotala  writes:

"-xO4 -xalias_level=basic" generates problem.
"-xO3 -xalias_level=basic" works fine
"-xO5" works fine



As documentation say:



Cite from Sun studio compiler guide:
http://docs.sun.com/app/docs/doc/819-5265/bjapp?a=view



xalias_level=basic
--
If you use the -xalias_level=basic option, the compiler assumes that
memory references that involve different C basic types do not alias each
other. The compiler also assumes that references to all other types can
alias each other as well as any C basic type. The compiler assumes that
references using char * can alias any other type.



For example, at the -xalias_level=basic level, the compiler assumes that
a pointer variable of type int * is not going to access a float object.
Therefore it is safe for the compiler to perform optimizations that
assume a pointer of type float * will not alias the same memory that is
referenced with a pointer of type int *.


I think you need to turn that off.  On gcc we use -fno-strict-aliasing
which disables the type of compiler assumption that this is talking about.
I'm not sure exactly how that might create the specific failure we are
seeing here, but I can point you to lots and lots of places in the
sources where such an assumption would break things.


OK. I will first try to update the compiler to the latest version to see if
that helps, and if not I will remove the aliasing optimization.
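
For the record, the kind of assumption the documentation describes bites
code like this textbook sketch (hypothetical helper; the backend's pervasive
casts between Node * and specific node structs are the practical analogue):

static int
alias_example(int *ip, float *fp)
{
    *ip = 1;
    *fp = 2.0f;     /* if ip and fp really point to the same memory, */
    return *ip;     /* a strict-aliasing / -xalias_level compiler may
                     * still return 1 here, because it assumes an
                     * int * and a float * never overlap */
}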


Thanks Zdenek



Re: [HACKERS] gothic_moth, codlin_moth failures on REL8_2_STABLE

2010-03-11 Thread Zdenek Kotala
Tom Lane wrote on Thu, 11. 03. 2010 at 11:37 -0500:
> Zdenek Kotala  writes:
> > "-xO4 -xalias_level=basic" generates problem.
> > "-xO3 -xalias_level=basic" works fine
> > "-xO5" works fine
> 
> > As documentation say:
> 
> > Cite from Sun studio compiler guide:
> > http://docs.sun.com/app/docs/doc/819-5265/bjapp?a=view
> 
> > xalias_level=basic
> > --
> > If you use the -xalias_level=basic option, the compiler assumes that 
> > memory references that involve different C basic types do not alias each 
> > other. The compiler also assumes that references to all other types can 
> > alias each other as well as any C basic type. The compiler assumes that 
> > references using char * can alias any other type.
> 
> > For example, at the -xalias_level=basic level, the compiler assumes that 
> > a pointer variable of type int * is not going to access a float object. 
> > Therefore it is safe for the compiler to perform optimizations that 
> > assume a pointer of type float * will not alias the same memory that is 
> > referenced with a pointer of type int *.
> 
> I think you need to turn that off.  On gcc we use -fno-strict-aliasing
> which disables the type of compiler assumption that this is talking about.
> I'm not sure exactly how that might create the specific failure we are
> seeing here, but I can point you to lots and lots of places in the
> sources where such an assumption would break things.

Reconfigured, and both animals are green.

Zdenek





[HACKERS] Japanese translation breaks Solaris build

2010-05-13 Thread Zdenek Kotala
http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=comet_moth&dt=2010-05-13%2021:06:01


msgfmt -o po/ja.mo po/ja.po
WARNING: the string after closing " is ignored at line number 11.
Error, No space after directive at line number 2008.
ERROR: Exiting...
gmake[2]: *** [po/ja.mo] Error 2


The problem is that the file contains a mix of DOS and Unix line endings.


Zdenek




Re: [HACKERS] Japanese translation breaks Solaris build

2010-05-14 Thread Zdenek Kotala
Takahiro Itagaki wrote on Fri, 14. 05. 2010 at 19:38 +0900:
> Zdenek Kotala  wrote:
> 
> > http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=comet_moth&dt=2010-05-13%2021:06:01
> > The problem is that it contains mix of DOS/Unix end of lines.
> 
> I removed two CRs in ja.po.

Thanks. gothic_moth is green again:

http://www.pgbuildfarm.org/cgi-bin/show_history.pl?nm=gothic_moth&br=HEAD

The rest of the Solaris machines will recover overnight.

Zdenek




Re: [HACKERS] Crash bug in 8.2.3 on Solaris 10/Sparc

2007-03-23 Thread Zdenek Kotala

Zoltan Boszormenyi wrote:

Hi,

we have found that psql in PostgreSQL 8.2.3
has problems connecting to the server
running on Solaris 10/Sun SPARC.

$ uname -a
SunOS dev-machine 5.10 Generic_118833-36 sun4u sparc SUNW,Sun-Fire-V440

It seems that somehow the system provided
GCC 3.4.3 miscompiles timestamptz_send()
and it segfaults. The default function looks like this:



Can you send me how you compiled Postgres (configure switches, LDFLAGS,
...), and would it be possible to get a core file?


Did you try compiling with different optimization flags, or did you try the
Sun Studio compiler?


Zdenek



Re: [HACKERS] Crash bug in 8.2.3 on Solaris 10/Sparc

2007-03-29 Thread Zdenek Kotala

Zoltan Boszormenyi wrote:



We compiled GCC-4.1.2 on this machine, recompiled PostgreSQL
with the new GCC without --enable-integer-datetimes and it fixed
the problem we experienced. It seems that my suspicion was right:
GCC-3.4.3 on Solaris 10/Sparc is buggy.



I tried the original S10 gcc (3.4.3) on two different machines with
different kernel updates and both work fine. In light of our off-list
communication and Tom's remark, it looks more like a problem in
linking/loading, maybe some library mismatch. I'm not able to say more
without a core file.


Zdenek




[HACKERS] Questions about pid file creation code

2007-04-02 Thread Zdenek Kotala
I'm looking at the pid file creation code (src/backend/utils/init/miscinit.c,
CreateLockFile) and I have a couple of questions:


1) Is there still some reason to have a negative value in postmaster.pid?
It happens only when the backend runs in single-user mode, and I think it
is no longer necessary. There are also some confusing messages about
postgres/postmaster. See:


errhint("Is another postgres (PID %d) running in data directory \"%s\"?",
(int) other_pid, refName) :
errhint("Is another postmaster (PID %d) running in data directory \"%s\"?",
(int) other_pid, refName)) :

2) Why 100? What race condition is supposed to happen here? This piece of
code looks like a kind of magic.


3) Why is the pid checking and cleanup in postgres at all? I think it is
the role of pg_ctl or the init scripts.



4) The following condition is buggy, because atoi() does not have a defined
result if the parameter is not a valid number. (OK, in most implementations
it really does return 0.)


 if (other_pid <= 0)
 elog(FATAL, "bogus data in lock file \"%s\": \"%s\"",
  filename, buffer)

I think using strtol() there would be better.
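
Roughly what I have in mind (a sketch only, with a hypothetical helper name):

#include <errno.h>
#include <stdlib.h>

/* Sketch: parse the pid from the lock file with real error checking */
static long
parse_lockfile_pid(const char *buffer)
{
    char   *endptr;
    long    pid;

    errno = 0;
    pid = strtol(buffer, &endptr, 10);
    if (errno != 0 || endptr == buffer || pid <= 0)
        return -1;      /* caller reports "bogus data in lock file" */

    /* pid > 0 also keeps kill() away from its 0 / -1 special cases */
    return pid;
}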


Zdenek



Re: [HACKERS] Is this portable?

2007-04-02 Thread Zdenek Kotala

Alvaro Herrera wrote:

Can I declare a struct in a function's declaration section?  Something
like this:

static void
foobar(void)
{
struct foo {
Oid foo;
int bar;
};

struct foo baz;

baz.foo = InvalidOid;
baz.bar = 42;

}

I tried here and GCC does not complain, with -std=c89 -pedantic.



It works fine with Sun Studio 11.

Zdenek



Re: [HACKERS] Questions about pid file creation code

2007-04-02 Thread Zdenek Kotala

Tom Lane wrote:

Zdenek Kotala <[EMAIL PROTECTED]> writes:

1) Is there still some reason have negative value in postmaster.pid?


Just to distinguish postmasters from standalone backends in the error
messages.  I think that's still useful.


I'm not sure what you mean. It is used only in the CreatePidFile function,
and if the directory is locked by some process, I don't see any useful
reason to know whether it is a postmaster or a standalone backend.


(PS: Is a standalone backend the same thing as the --single switch?)

2) Why 100? What race condition should happen? This piece of code looks 
like kind of magic.


There are at least two race cases identified in the comments in the
loop.


Yes, there are. But it does not make sense to me. If I want to open the
file and another process removes it, why would I try to create it again
when that other process is about to do the same?


There is only one other reason, namely that a user deleted the file from
the system manually, but in that case I don't believe the administrator
would hit exactly that moment.


And if it still makes sense to do it this way, I would expect some sleep
there instead of a loop whose duration depends on CPU speed.


3) Why pid checking and cleanup is in postgres? I think it is role of 
pg_ctl or init scripts.


Let's see, instead of one place in the postgres code we should do it in
N places in different init scripts, and just trust to luck that a
particular installation is using an init script that knows to do that?
I don't think so.  Besides, how is the init script going to remove it
again?  It won't still be running when the postmaster exits.


I'm sorry, I meant: why is the cleanup of a pid file left behind by a
crashed postmaster done in postgres itself? Many applications only check
"OK, there is a pid file" and exit, and leave the rest to the start script
or some other monitoring facility.


4) The following condition is buggy, because atoi function does not have 
defined result if parameter is not valid number.



  if (other_pid <= 0)


It's not actually trying to validate the syntax of the lock file, only
to make certain it doesn't trigger any unexpected behavior in kill().


I'm not sure we are talking about the same place; kill() is called after
this if. Unless I'm missing that atoi() need not return 0 when it fails,
the following condition would be more accurate:


  if (other_pid == 0)



I don't think I've yet seen any reports that suggest that more syntax
checking of the lock file would be a useful activity.


Yes, I agree.

Zdenek



Re: [HACKERS] Questions about pid file creation code

2007-04-03 Thread Zdenek Kotala

Tom Lane wrote:

Zdenek Kotala <[EMAIL PROTECTED]> writes:

Tom Lane wrote:

Just to distinguish postmasters from standalone backends in the error
messages.  I think that's still useful.


I'm not sure what you mean. It is used only in CreatePidFile function 
and I think that if directory is locked by some process, I don't see any 
useful reason to know if it is postmaster or standalone backend.


You don't?  Consider the decisions the user needs to take upon seeing
the message --- should he kill that other process or not, and if so how?
Knowing whether it's a postmaster seems pretty important to me.


If somebody wants to kill a process, he must know what he is doing anyway.
How many PostgreSQL users know the difference between "postmaster" and
"postgres" in an error message?


And another problem: if some other application (e.g. pg_migrator) wants to
lock this directory to prevent data corruption, how should it do that? How
much sense does this message make in that case?


I suggest removing this behavior and modifying the message.

Yes there are. But it does not sense for me. If I want to open file and 
another process remove it, why I want to try created it again when 
another process going to do it?


That could be the track of another postmaster just now shutting down.
There's no reason to fail to start in such a scenario.  The looping
logic is necessary anyway (to guard against races involving two
postmasters trying to start at the same time), so we might as well let
it handle this case too.


OK, I now understand (I hope) what this loop tries to handle. However, if
one server goes down and another comes up, there is only a really small
window between the first open attempt and the second one. I guess in this
case we could simply tell the starting postmaster to stop. To me that is
better than making one hundred iterations, whose timing depends on CPU
speed, and rechecking again. I think that in this case postgres duplicates
the role of the startup scripts.


There is also another issue that can occur. Suppose you have two nodes with
access to one shared filesystem: one node is for backup, and somebody runs
postgres on the second node. In this case postgres removes the file and
creates its own, and two postmasters on one database cluster is not a good
idea. A good clustering solution protects against this situation, but it
can happen if somebody runs it manually.


I'm sorry, I meant why there is a pid cleanup which stays there after 
another postmaster crash. Many application only check OK there is some 
pid file -> exit. And rest is on start script or some other monitoring 
facility.


The start script does not typically have the intelligence to get this
right, particularly not the is-shmem-still-in-use part.  If you check
the archives you will find many of us on record telling people who think
they should remove the pidfile in their start script that they're crazy.


That is true, but the question is which way is better: keep all the logic
in the postmaster, or improve pg_ctl to share more information and leave
the responsibility to the start scripts or a monitoring tool, which has
more information about the system as a whole.



It's not actually trying to validate the syntax of the lock file, only
to make certain it doesn't trigger any unexpected behavior in kill().



I not sure if we talk about same place.


Yes, we are.  Read the kill(2) man page and note the special behaviors
for pid = 0 or -1.  The test is just trying to be darn certain we don't
invoke those behaviors.


No, we aren't :-). I mean the code a few lines up, right after the atoi().

with regards Zdenek





Re: [HACKERS] [PATCHES] Fix for large file support

2007-04-06 Thread Zdenek Kotala

Tom Lane wrote:

[ redirecting to -hackers for wider comment ]

Zdenek Kotala <[EMAIL PROTECTED]> writes:

Tom Lane wrote:
LET_OS_MANAGE_FILESIZE is good way. I think one problem of this option I 
fixed. It is size of offset. I went thru the code and did not see any 
other problem there. However, how you mentioned it need more testing. I 
going to take server with large disk array and I will test it.


I would like to add --enable-largefile switch to configure file to 
enable access to wide group of users. What you think about it?


Yeah, I was going to suggest the same thing --- but not with that switch
name.  We already use enable/disable-largefile to control whether 64-bit
file access is built at all (this mostly affects pg_dump at the moment).


Hmm :( It looks like the ./configure largefile detection does not work on
Solaris.



I think the clearest way might be to flip the sense of the variable.
I never found "LET_OS_MANAGE_FILESIZE" to be a good name anyway.  I'd
suggest "USE_SEGMENTED_FILES", which defaults to "on", and you can
turn it off via --disable-segmented-files if configure confirms your
OS has largefile support (thus you could not specify both this and
--disable-largefile).


That sounds good. There is one thing to clarify (for the present): how to
handle buffile? It does not currently support non-segmented files. I
suggest using the same switch to enable/disable segments there.
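
Just to make sure we are talking about the same thing, the two modes differ
in this kind of arithmetic (a sketch using the default build constants, not
the actual md.c code):

#define BLCKSZ      8192        /* default block size */
#define RELSEG_SIZE 131072      /* blocks per 1GB segment in the default build */

typedef unsigned int BlockNumber;

/* segmented: file "relfilenode.<segno>", per-segment offsets stay small */
static void
segmented_location(BlockNumber blkno, int *segno, long *seekpos)
{
    *segno = blkno / RELSEG_SIZE;
    *seekpos = (long) (blkno % RELSEG_SIZE) * BLCKSZ;
}

/* non-segmented: one big file, so the seek offset itself must be 64-bit */
static long long
nonsegmented_location(BlockNumber blkno)
{
    return (long long) blkno * BLCKSZ;
}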


Zdenek



Re: [HACKERS] [PATCHES] Fix for large file support

2007-04-16 Thread Zdenek Kotala

Tom Lane wrote:

Jim Nasby <[EMAIL PROTECTED]> writes:
If we expose LET_OS_MANAGE_FILESIZE, should we add a flag to the  
control file so that you can't start a backend that has that defined  
against a cluster that was initialized without it?


I imagine we'd flag that as relsegsize = 0 or some such.


Yes, I have that in my patch. I put relsegsize = 0 in the control file when
non-segmented mode is enabled.


Zdenek



Re: [HACKERS] Adjusting index special storage for pg_filedump's convenience

2007-04-16 Thread Zdenek Kotala

Heikki Linnakangas wrote:



That's a clever trick, but I can't help thinking we really should have 
an explicit field in the page header to indicate what kind of a page it 
is. It would make life simpler for any external tools that want to peek 
into pages, including migration utilities after a release or two. We've 
also been talking about setting hint bits and doing some kind of retail 
vacuuming in bgwriter with HOT. To do that, we need to identify heap 
pages in the bgwriter. While heap pages can currently be identified by 
the fact that they don't have a special area, it feels hackish, and we 
might want to do something like that for index pages too in the future.


We now have a 16-bit pd_flags field in the page header. We could use a 
few bits from that.




+1

or add one extra field
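
Using a few bits of pd_flags would look roughly like this (the flag names
below are made up for illustration; they are not bits defined in bufpage.h):

#include <stdint.h>

#define PD_PAGE_HEAP    0x0100      /* hypothetical "heap page" bit */
#define PD_PAGE_BTREE   0x0200      /* hypothetical "btree page" bit */

static int
page_is_heap(uint16_t pd_flags)
{
    return (pd_flags & PD_PAGE_HEAP) != 0;
}

static uint16_t
mark_heap_page(uint16_t pd_flags)
{
    return (uint16_t) (pd_flags | PD_PAGE_HEAP);
}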

Zdenek



Re: [HACKERS] Feature freeze progress report

2007-04-30 Thread Zdenek Kotala

Dave Page wrote:



In my original message I described my thinking:

- Developer submits patch, with additional info through a web interface.

- The web interface formats an email containing the patch description,
patch and any other info entered, assigns it a patch number, and
forwards it to the patches list.

- Discussion ensues on list as per normal. The tracking system monitors
the list and automatically attaches any posts to the patch that have the
appropriate reference in the subject line.

- Community members and committers can review the entire discussion
through the systems web interface. Updated patches could be submitted
online.

- Committers log into the system when necessary to alter the status
(committed, rejected, awaiting revision, under review etc), or the queue
name (8.3, 8.4 etc). This could also be done automagically via email
keywords from specific addresses.

You would no longer need to manually manage the queue, and the
committers would simply need to tweak the status flag as required.




You can look at the Apache Derby project. I think the Derby project already
uses everything you mentioned.


See http://db.apache.org/derby/

Zdenek



Re: [HACKERS] Feature freeze progress report

2007-04-30 Thread Zdenek Kotala

Dave Page wrote:

Stefan Kaltenbrunner wrote:

This means that there is a huge rush of new code in pgAdmin's
development cycle, right at the time when we should be testing - making
the release process more and more rushed as each release of PostgreSQL
gets more efficient and adds more and more new features.

this is indeed an issue - but there is nothing that says that pgAdmin III
has to support all the new features of a backend the day it gets released.
pgAdmin is decoupled from the main development cycle anyway, so adding
support for new features a few months later does not sound like too much of
an issue.


No it's not decoupled; it runs almost exactly in sync. Our most popular
platform ships with pgAdmin as standard because thats what the users of
that platform expect. Not shipping with pgAdmin (or a functionally
complete one) would be like shipping SQL Server without Enterprise Manager.


And from another point of view, Postgres and the related version of pgAdmin
must fit into the same release-cycle windows of the OS distributions. Once
an OS release is out, new features are usually not accepted, so the OS
would have to ship with an old version of pgAdmin.


A bigger problem than new features in pgAdmin is pg_upgrade/pg_migrator.
The upgrade procedure must be finished at the same time as the new release,
but some upgrade functions can be coded only after feature freeze, or once
all affected patches are committed.


If we want to have this feature (upgrade) in postgres, we will have to
adjust the release cycle anyway.


Zdenek



Re: [HACKERS] [PATCHES] Fix for large file support (nonsegment mode support)

2007-04-30 Thread Zdenek Kotala

Tom Lane wrote:

[ redirecting to -hackers for wider comment ]

Zdenek Kotala <[EMAIL PROTECTED]> writes:

Tom Lane wrote:
LET_OS_MANAGE_FILESIZE is good way. I think one problem of this option I 
fixed. It is size of offset. I went thru the code and did not see any 
other problem there. However, how you mentioned it need more testing. I 
going to take server with large disk array and I will test it.


I would like to add --enable-largefile switch to configure file to 
enable access to wide group of users. What you think about it?


Yeah, I was going to suggest the same thing --- but not with that switch
name.  We already use enable/disable-largefile to control whether 64-bit
file access is built at all (this mostly affects pg_dump at the moment).

I think the clearest way might be to flip the sense of the variable.
I never found "LET_OS_MANAGE_FILESIZE" to be a good name anyway.  I'd
suggest "USE_SEGMENTED_FILES", which defaults to "on", and you can
turn it off via --disable-segmented-files if configure confirms your
OS has largefile support (thus you could not specify both this and
--disable-largefile).



Here is the latest version of the non-segmented support patch. I changed
LET_OS_MANAGE_FILESIZE to USE_SEGMENTED_FILES and added a
--disable-segmented-files switch to configure. I kept the tuplestore
behavior, so it still splits files in both modes.


I also cleaned up some other datatypes a little (e.g. int -> mode_t).
autoconf and autoheader must be run after applying the patch.

I tested it with a 9GB table and both modes work fine.

Please let me know your comments.

Zdenek


nonseg.patch.gz
Description: GNU Zip compressed data



Re: [HACKERS] how pgsql writes tuples on disk

2007-04-30 Thread Zdenek Kotala

See http://www.postgresql.org/docs/8.2/interactive/storage.html

In the code it is, for example, in:

src/backend/storage/...
src/backend/utils/adt/...
src/backend/access/...

and also very good is
src/include/storage/bufpage.h


I hope it helps

Zdenek

jorge alberto wrote:

Hello!
 
I wanna know where can I find the source code that writes tuples on 
disk, right now I'm reading about object persistence and I wanna see how 
it is done in pgsql
 
I hope you can help me

regards
jorge





Re: [HACKERS] Heap page diagnostic functions

2007-05-01 Thread Zdenek Kotala

Heikki Linnakangas wrote:

Tom Lane wrote:

Bruce Momjian <[EMAIL PROTECTED]> writes:

Heikki Linnakangas wrote:

Any suggestions? pgdiagnostics?



Yes, I like "diagnostics", or "internals".  I just think forensics isn't
going to be understood by the average native English speaker, let alone
non-English speakers.


"diagnostics" is a two-dollar word also.  It might be a bit more widely
known than "forensics", but it's longer and harder to spell.  Not that
I have any great suggestions to make.  Maybe "pgtestfuncs"?


I thought about pgdebugfuncs myself at first. But that's so generic and 
pgtestfuncs even more so. Diagnostics is pretty good, that word is in 
use in other languages as well as a medical term, so it's meaning would 
be clear to a lot of people even if they've never encountered the 
English word before.


Besides, the audience for these functions is quite narrow: people 
working on PostgreSQL internals, so I don't think the language issue 
matters that much.


I did not find "forensics" in my translation dictionary; it is in the
Oxford dictionary, but the explanation is not clear to me. I agree with
Bruce that it is not a good name. What about the short form of diagnostics,
"diag"?



Zdenek



Re: [HACKERS] Heap page diagnostic functions

2007-05-02 Thread Zdenek Kotala

Heikki Linnakangas wrote:

I'm going to go with pgdiagnostics. We could short it to just "pgdiag", 
but that feels too short :). We could make it "pgdiagfuncs", but that's 
not much shorter than pgdiagnostics.


pgdiagfn? It is only 8 chars long ;).


Zdenek



Re: [HACKERS] Boatload of warnings in CVS HEAD :-(

2007-05-03 Thread Zdenek Kotala

Tom Lane wrote:


We can fix this for gcc by putting __attribute__((noreturn)) on the
declaration of pg_re_throw(), but what about other compilers?



Sun Studio also complains about it :(.

Zdenek



Re: [HACKERS] Boatload of warnings in CVS HEAD :-(

2007-05-04 Thread Zdenek Kotala

Zdenek Kotala wrote:

Tom Lane wrote:


We can fix this for gcc by putting __attribute__((noreturn)) on the
declaration of pg_re_throw(), but what about other compilers?



Sun studio also complains about it :(.



I'm sorry, it was too late at night for me. I rechecked it and Sun Studio
is happy :-) and does not complain about it. However, there are a lot of
warning messages (not relevant to this issue). Most of them are about the
following construct:

switch(..)
{

  case x :
return(..);
break;
...

Is there a reason for keeping this in the code? Another kind of construct is:

#define PG_RETURN_NULL()  \
do { fcinfo->isnull = true; return (Datum) 0; } while (0)

It looks strange to me. Why is it used?

or

for(;;) { ... break;} see e.g
http://doxygen.postgresql.org/postgres_8c-source.html#l00198

or
why is there a while ... break instead of an if?
http://doxygen.postgresql.org/comment_8c-source.html#l00221

Thanks for any explanation,
Zdenek




Re: [HACKERS] New idea for patch tracking

2007-05-05 Thread Zdenek Kotala

I would like to add one point:

Bruce Momjian wrote:



Patch committers check several things before applying a patch:

1)  Follows the SQL standard or community agreed-upon behavior
2)  Style merges seamlessly into the surrounding code
3)  Written as simply and efficiently as possible
4)  Uses the available PostgreSQL subsystems properly
5)  Contains sufficient comments
6)  Contains code that works on all supported operating systems
7)  Has proper documentation
8)  Passes all regression tests


  8.5) Contains regression test(s) that cover the changes made


9)  Behaves as expected, even under unusual cirumstances
10)  Contains no reliability risks
11)  Does not overly complicate the source code
12)  If performance-related, it should have a measureable performance benefit
13)  Is of sufficient usefulness to the average PostgreSQL user
14)  Follows existing PostgreSQL coding standards




Zdenek




Re: [HACKERS] Integer datetimes

2007-05-05 Thread Zdenek Kotala

Neil Conway wrote:


So, are there any corresponding benefits to providing both FP and
integer datetimes? AFAIK the following differences in user-visible
behavior exist:



There could also be a problem with the floating-point implementation on the
client and server side. For example, if somebody uses floating-point
optimization (the -fast switch in Sun Studio) to compile the server and a
client connects from another machine with standard floating-point behavior,
the result could be wrong.




P.S. One thing to verify is that the performance of integer datetimes is
no worse than the perf. of FP datetimes. I'd intuitively expect this to
be true, but it would be worth investigating.


Some multi-core/multi-threaded CPUs have only one FPU (e.g. Niagara).


Zdenek



Re: [HACKERS] Integer datetimes

2007-05-06 Thread Zdenek Kotala

Bruce Momjian wrote:

Neil Conway wrote:

On Sat, 2007-05-05 at 20:52 -0400, Bruce Momjian wrote:

What?  We don't pass float as a binary to clients.

Sure we do, if the client is sending or receiving data in binary format.


But in those cases, we assume the client and server have the same
configuration, right?



It is a correct assumption, but I did not find it in the documentation; the
floating-point data type description mentions non-IEEE 754 platforms, but
says nothing about this assumption.


I think IEEE 754 compliance must be required on all platforms.


Zdenek



Re: [HACKERS] New idea for patch tracking

2007-05-07 Thread Zdenek Kotala

Jim Nasby wrote:


People have suggested different trackers that have varying amounts of 
email capability, but I don't think any of them have had the full 
capability that we'd need. At best they might accept comments on a 
bug/issue via email, but to work for the community they'd need to go 
beyond that. You'd have to be able to resolve via email (preferably tied 
to -commiters). You'd need to be able to make a bug as invalid. You'd 
need to be able to open a new issue via email. And change status. And 
assign it to someone. And it would have to actually thread discussion to 
be useful. Probably some other things as well.


As I wrote a few days ago, you can look at how the Apache Derby community
does this and what they use. I guess they have already solved most of the
issues mentioned, and we can take some of their ideas. However, I still
miss a list of tracker requirements - what a tracker MUST have and what is
nice to have.


And you describe the current processes, which are based on email
communication; but if we set up a tracker, some processes will change. I
think the first step is to determine what we really want, and after that we
can discuss how to reach it.



Since a system like that doesn't exist I think it's going to be up to us 
to create one. When it comes to the full set of features you'd expect 
out of an issue tracker, it would probably make sense to start with an 
existing project and try and add this functionality. But right now I 
don't think such an effort would work well, because we don't know well 
enough how all these new features should work.


Creating our own tracker is reinventing the wheel and a waste of time.
There are a lot of trackers, and I believe that one of them fits the
postgres requirements. I agree with your idea to try one, and if necessary
we can add some functionality. But I think the requirements are not clear,
and I am also afraid that the core team does not have a unified view on
this.



I suggest the following process:

1) create a list of requirements - MUST HAVE / NICE TO HAVE
2) create a list of trackers
3) reject the trackers that do not meet the MUST HAVE items
4) each member of the core team creates his own ranking
5) the results are put together and one tracker is selected for testing.

Zdenek






[HACKERS] race condition in pgplsql call handler?

2007-05-11 Thread Zdenek Kotala
I'm looking into the PL/pgSQL code, and the following code in
plpgsql_call_handler() looks strange:


/* Find or compile the function */
func = plpgsql_compile(fcinfo, false);

/* Mark the function as busy, so it can't be deleted from under us */
func->use_count++;


I don't have deep knowledge of this part of the code, but what happens if,
in parallel execution, "func" is deleted between these two lines?
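
As I read it, use_count is meant as a simple pin/reference count, roughly
like this (a simplified sketch with made-up names, not the real plpgsql
code):

typedef struct CachedFunc
{
    int     use_count;      /* executions currently using this compiled body */
    /* ... compiled function body ... */
} CachedFunc;

static void
call_cached(CachedFunc *func)
{
    func->use_count++;      /* pin: a "delete" must not free it now */

    /* ... execute; may re-enter call_cached() recursively ... */

    func->use_count--;      /* the delete path frees it only at use_count == 0 */
}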



Zdenek



Re: [HACKERS] race condition in pgplsql call handler?

2007-05-11 Thread Zdenek Kotala

Tom Lane wrote:

Zdenek Kotala <[EMAIL PROTECTED]> writes:
I'm looking into PG/PLSql code and following code in 
plpgsql_call_handler looks strange:



 /* Find or compile the function */
 func = plpgsql_compile(fcinfo, false);


 /* Mark the function as busy, so it can't be deleted from under 
us */

 func->use_count++;



I don't have deep knowledge about this part of code. But what happen if 
in parallel execution "func" will be deleted between these two lines?


This is not a race condition because the backend is single-threaded.


I see. Each backend has its own function cache, and use_count handles
recursive usage. Is my assumption correct?


thanks Zdenek



Re: [HACKERS] Performance monitoring

2007-05-14 Thread Zdenek Kotala

Heikki Linnakangas wrote:

Jim C. Nasby wrote:




There is two counters for checkpoints in pgstats, the number of timed 
(triggered by checkpoint_timeout) and requested (triggered by 
checkpoint_segments) checkpoints.


Maybe we should improve the stats system so that we can collect events 
with timestamps and durations, but in my experience log files actually 
are the most reliable and universal way to collect real-time performance 
information. Any serious tool has a generic log parser. The other 
alternative is SNMP. I welcome the efforts on pgsnmpd..


What about adding some DTrace probes?

Zdenek




Re: [HACKERS] Lack of urgency in 8.3 reviewing

2007-05-18 Thread Zdenek Kotala

Heikki Linnakangas wrote:


- heap page diagnostic functions


I would like to take this review (after PGCon).


Zdenek



Re: [HACKERS] Lack of urgency in 8.3 reviewing

2007-05-18 Thread Zdenek Kotala

Heikki Linnakangas wrote:

Zdenek Kotala wrote:

Heikki Linnakangas wrote:


- heap page diagnostic functions


I would like to take this review (after PGCon).


Too late, Bruce applied it already :).


Yes ... Never mind


What about

- full page writes improvement

but I will have time after PGCon.

Zdenek



Re: [HACKERS] Idea that might inspire more patch reviewing.

2007-05-20 Thread Zdenek Kotala

Ron Mayer wrote:

Bruce Momjian wrote:

In talking to people who are assigned to review patches or could review
patches, I often get the reply, "Oh, yea, I need to do that".


Would it inspire more people to learn enough to become patch
reviewers if patch authors scheduled walkthroughs of their
patches with question and answer sessions over IRC or maybe
even some voice conferencing system like skype?


It is common inside a single company, but I'm not sure it is workable in an
open source community.


I think the following tool looks like a good solution for patch review:

http://www.chipx86.com/blog/?p=222
http://code.google.com/p/reviewboard/

Zdenek



Re: [HACKERS] [GSOC] - I ntegrity check algorithm for data files

2007-05-23 Thread Zdenek Kotala

Robert Mach wrote:



I see different ways of delivering this functionality to Postgresql. The 
best of course would be to become part of Postgresql release either as a 
PostgreSQL command (like UPDATE) or as an postgresql server application 
like vacuumdb.
Other possibility is to create a freestanding program that would be 
called with location of datafiles as arguments.
Last possibility is to create an administrative console access (single 
user mode) to the database in which this integrity check could be fired.


I discussed it with Neil, and for logical tests it should be a PostgreSQL
function. However, for a physical-layer test the database cluster may be in
an inconsistent state and the PostgreSQL startup process fails. That means
there are only limited possibilities:


1) A standalone binary, linked mostly from PostgreSQL object files, which
replaces main() and adds the new testing functionality. This is what Neil
recommends.


2) Add a new postmaster context/mode - repair and recovery - entered when
postgres is run as "postmaster --recovery".


3) My personal favorite is to create a management console, which would
allow performing the check without physical access to the local machine.
The management console could also be used for other purposes - for example
disabling/enabling databases in the cluster, upgrading the data layer to a
new version, killing sessions, updating postgresql.conf and so on...


This solution has a significant impact on current postgres behavior, but I
think it would be a big win for postgres.



Zdenek




Re: [HACKERS] [GSOC] - I ntegrity check algorithm for data files

2007-05-23 Thread Zdenek Kotala

Richard Huxton wrote:

Zdenek Kotala wrote:
3) My personal favorite is create management console - which allows to 
perform check without physically access to local machine. Management 
console should be use for another purpose - for example disable/enable 
databases from cluster, perform upgrade of datalayer to the new 
version, kill sessions, update postgresql.conf and so on...


However this solution has significant impact on current postgres 
behavior, but I think it should be big deal for postgres.


I believe pgadmin lets you update postgresql.conf remotely,


IIRC, that is possible only when you have an additional module/function
installed on the server, and it works only for pgAdmin, not for psql.


and I know 
you can kill sessions from within PG ( pg_cancel_backend() ).


If you look at how killing a session is implemented, it does not look very safe.


For the rest, that's what ssh is for imho.


And what will you do when you don't have ssh access to the machine and port
5432 is the only way to administer the server? (Windows is another story.)


Zdenek




Re: [HACKERS] [GSOC] - I ntegrity check algorithm for data files

2007-05-23 Thread Zdenek Kotala

Richard Huxton wrote:

Zdenek Kotala wrote:

Richard Huxton wrote:


For the rest, that's what ssh is for imho.


And what you will do when you don't have ssh access on this machine 
and 5432 is only one way how to administrate a server? (Windows is 
another story.)


If I've not got ssh access to the machine, then I'm not the one 
responsible for editing configuration files.


I'm not sure that ssh access correlates with database administration. On
some Unix systems you can grant a user the right to start and stop postgres
(RBAC, sudo, ...), but there is no easy way to let him access the
configuration file other than sharing the postgres user password.


Zdenek







Re: [HACKERS] Users specific to a Database

2007-05-29 Thread Zdenek Kotala

Sahoo, Ranjan Rashmi wrote:

Hi All,

 

I am new to this postgres and now struggling with creation of a user 
specific to a database, so the user can login to that specific database 
and can do the activities as per the given credentials and can not touch 
or see any information from other databases. Can any one please help me 
in this regard?




Start here: http://www.postgresql.org/docs/8.2/interactive/user-manag.html

And please send future questions to the general or novice mailing list; 
this list is focused on postgres development.


Zdenek



Re: [HACKERS] Backend crash during explain

2007-05-31 Thread Zdenek Kotala

Grant Finnemore napsal(a):

CrashReporter trace:

Date/Time:  2007-05-31 10:21:39.285 +0200
OS Version: 10.4.9 (Build 8P2137)
Report Version: 4

Command: postmaster
Path:./bin/postmaster
Parent:  postmaster [23091]

Version: ??? (???)

PID:23096
Thread: 0

Exception:  EXC_BAD_ACCESS (0x0001)
Codes:  KERN_PROTECTION_FAILURE (0x0002) at 0x0018

Thread 0 Crashed:
0   postmaster 0x00116ec6 ExecSetSlotDescriptor + 77 (execTuples.c:344)
1   postmaster 0x001182f9 ExecAssignScanTypeFromOuterPlan + 33 
(execUtils.c:771)

2   postmaster 0x001240c8 ExecInitSort + 168 (nodeSort.c:211)


It looks like tupDesc contains an invalid pointer. I found a strange 
assignment in the ExecAssignScanTypeFromOuterPlan function; see the comment 
below. outerPlanState expects a PlanState structure but is given a ScanState.


00762 ExecAssignScanTypeFromOuterPlan(ScanState *scanstate)
00763 {
00764 PlanState  *outerPlan;
00765 TupleDesc   tupDesc;
00766
00767 outerPlan = outerPlanState(scanstate);
^
scanstate->ps ??

00768 tupDesc = ExecGetResultType(outerPlan);
00769
00770 ExecAssignScanType(scanstate, tupDesc);
00771 }
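
For reference, a heavily simplified sketch of the relationship the macro 
relies on - these are illustrative declarations, not the real executor headers:

typedef struct PlanState
{
    struct PlanState *lefttree;     /* "outer" subplan */
    struct PlanState *righttree;    /* "inner" subplan */
    /* ... many more fields in the real structure ... */
} PlanState;

typedef struct ScanState
{
    PlanState   ps;                 /* PlanState is the first member */
    /* ... scan-specific fields ... */
} ScanState;

/* outerPlanState(node) casts to PlanState * and reads lefttree, so passing
 * a ScanState * relies on the embedded first member. */
#define outerPlanState(node)  (((PlanState *) (node))->lefttree)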


Zdenek



Re: [HACKERS] Command tags in create/drop scripts

2007-06-01 Thread Zdenek Kotala

Peter Eisentraut wrote:

The messages output by the "scripts" always seemed unnecessary to me, e.g.,

$ createdb foo
CREATE DATABASE

A Unix program (hi Magnus) shouldn't need to say anything if the requested 
action succeeded.


I believe the history of this output is actually that these scripts simply 
used to call psql and no one really questioned why the output should be this 
way.  There are also some inconsistencies, e.g., createlang doesn't output 
anything, and createuser outputs CREATE ROLE.


I would just remove all this, but I suppose this will be controversial?


And what about replacing all the "scripts" with one command, e.g. pg_cmd, with 
the following interface:


pg_cmd create database <...>
   create role ...
   drop   database ...
   list  database
...


It solves potential name collisions (for example, createuser could collide 
with another unix command) and the output is easily maintainable in one 
application.



Zdenek





Re: [HACKERS] Command tags in create/drop scripts

2007-06-01 Thread Zdenek Kotala

Tom Lane wrote:

Andrew Dunstan <[EMAIL PROTECTED]> writes:

Zdenek Kotala wrote:
And what about replace all "scripts" by one command e.g pg_cmd with 
following interface:


pg_cmd create database <...>


And we'll break a million applications that rely on the command names. 


Compatibility could be solved by parsing argv[0]. If an old name appears 
there, the command line would be parsed in the old style... This functionality 
would be kept for some transition interval. The packager would be responsible 
for deciding whether to create links for backward compatibility or not.



If you want that, it already exists: it's pronounced "psql -c ...".


No, that is not a solution, because you still have binaries with strange 
names (e.g. createuser) on disk.


Zdenek



Re: [HACKERS] Command tags in create/drop scripts

2007-06-04 Thread Zdenek Kotala

Tom Lane napsal(a):

Andrew Dunstan <[EMAIL PROTECTED]> writes:

Zdenek Kotala wrote:
And what about replace all "scripts" by one command e.g pg_cmd with 
following interface:


Well, I don't think rolling up the miscellaneous commands into a single 
binary with behaviour dependent on arg[0] is a bad idea.


I don't think that responds to Zdenek's complaint though.  He's unhappy
about polluting /usr/bin with commands like "createuser" --- which is
not an unreasonable gripe.  AFAICS it doesn't help if "createuser"
is a link to a single executable rather than a file of its own.


Yes, it was one idea (after midnight :-) for keeping backward compatibility 
for the next few releases, but a better solution is to keep the old binaries 
for a couple of release cycles.



But in the light of backwards-compatibility requirements, I can't see us
removing "createuser" from the distribution ... not in 8.4, nor several
releases thereafter.


I agree; I only want to start a discussion about a new command which would 
replace the old binaries. The old binaries would be marked as obsolete and 
removed, e.g. in 9.0. Until we have the new command we cannot start thinking 
about removing the old ones. I think we can have the new command ready for 8.4.



Zdenek





Re: [HACKERS] Command tags in create/drop scripts

2007-06-05 Thread Zdenek Kotala

Tom Lane wrote:

Bruce Momjian <[EMAIL PROTECTED]> writes:

Is this a TODO?


I don't think so; there is no demand from anybody but Zdenek to remove
those programs.  Has it ever even come up before?



Tom, Bruce

I started with postgres 6.5 as an administrator, and since that version the 
names of these utilities have been a little bit confusing and annoying to me. 
I kept hoping it would be changed in the future; that has been my personal 
feeling since 1999. I would prefer to add it to the TODO list, and a related 
question is why we don't have:


"pg_ctl -D  init" instead of initdb command?


Zdenek



Re: [HACKERS] Autovacuum launcher doesn't notice death of postmaster immediately

2007-06-12 Thread Zdenek Kotala

Alvaro Herrera wrote:

Zeugswetter Andreas ADI SD escribió:

The launcher is set up to wake up in autovacuum_naptime
seconds 

at most.

Imho the fix is usually to have a sleep loop.
This is what we have.  The sleep time depends on the schedule 
of next vacuum for the closest database in time.  If naptime 
is high, the sleep time will be high (depending on number of 
databases needing attention).

No, I meant a "while (sleep 1(or 10) and counter < longtime) check for
exit" instead of "sleep longtime".


Ah; yes, what I was proposing (or thought about proposing, not sure if I
posted it or not) was putting a upper limit of 10 seconds in the sleep
(bgwriter sleeps 10 seconds if configured to not do anything).  Though
10 seconds may seem like an eternity for systems like the ones Peter was
talking about, where there is a script trying to restart the server as
soon as the postmaster dies.


There is also one "wild" solution: the postmaster and bgwriter would be 
connected with a socket/pipe, and select() would be used instead of sleep. If 
the connection unexpectedly fails, select() finishes immediately and we can 
handle the issue asap. The socket could also be used in special cases when we 
need to wake the process up faster.
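
A minimal sketch of the idea, assuming the postmaster keeps the write end of a 
pipe open and the background process select()s on the read end - illustrative 
code, not the real postmaster/bgwriter sources:

#include <sys/select.h>
#include <sys/time.h>
#include <stdbool.h>

/* Returns true when the postmaster has closed its end of the pipe
 * (the read end becomes readable at EOF), false on an ordinary timeout. */
static bool
postmaster_died_or_timeout(int death_pipe_fd, int naptime_secs)
{
    fd_set          rfds;
    struct timeval  tv;

    FD_ZERO(&rfds);
    FD_SET(death_pipe_fd, &rfds);
    tv.tv_sec = naptime_secs;
    tv.tv_usec = 0;

    /* select() returns > 0 as soon as the pipe becomes readable */
    return select(death_pipe_fd + 1, &rfds, NULL, NULL, &tv) > 0;
}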



Zdenek




Re: [HACKERS] Autovacuum launcher doesn't notice death of postmaster immediately

2007-06-12 Thread Zdenek Kotala

Magnus Hagander wrote:

On Tue, Jun 12, 2007 at 12:23:50PM +0200, Zdenek Kotala wrote:

Alvaro Herrera wrote:

Zeugswetter Andreas ADI SD escribió:

The launcher is set up to wake up in autovacuum_naptime
seconds 

at most.

Imho the fix is usually to have a sleep loop.
This is what we have.  The sleep time depends on the schedule 
of next vacuum for the closest database in time.  If naptime 
is high, the sleep time will be high (depending on number of 
databases needing attention).

No, I meant a "while (sleep 1(or 10) and counter < longtime) check for
exit" instead of "sleep longtime".

Ah; yes, what I was proposing (or thought about proposing, not sure if I
posted it or not) was putting a upper limit of 10 seconds in the sleep
(bgwriter sleeps 10 seconds if configured to not do anything).  Though
10 seconds may seem like an eternity for systems like the ones Peter was
talking about, where there is a script trying to restart the server as
soon as the postmaster dies.
There is also one "wild" solution. Postmaster and bgwriter will connect 
 with socket/pipe and select command will be used instead sleep. If 
connection unexpectedly fails, select finish immediately and we are able 
to handle this issue asap. This socket should be used also in some 
special case when we need wake up it faster.


Given the amount of problems we've had with pipes on win32, let's try to
avoid adding extra ones unless they're really necessary. If split-sleep
works, that seems a safer bet.


OK, that could be a problem. But I'm afraid split-sleep is not a good solution 
either; it would generate a lot of race conditions in start/stop scripts and 
monitoring tools. A much better approach would be to improve pg_ctl to perform 
cleanup ("pg_ctl cleanup") when the postmaster fails.


I think we must offer packagers and integrators a deterministic way to 
handle this issue.


Zdenek



Re: [HACKERS] Command tags in create/drop scripts

2007-06-12 Thread Zdenek Kotala

David Fetter wrote:

On Fri, Jun 08, 2007 at 08:12:22PM -0500, Jim C. Nasby wrote:

On Tue, Jun 05, 2007 at 05:52:39PM -, Andrew Hammond wrote:

On Jun 5, 9:19 am, [EMAIL PROTECTED] (Alvaro Herrera) wrote:

Zdenek Kotala wrote:

Tom Lane wrote:

Bruce Momjian <[EMAIL PROTECTED]> writes:

Is this a TODO?

I don't think so; there is no demand from anybody but Zdenek to remove
those programs.  Has it ever even come up before?

Personally I found really strange to have "createuser" and "createdb"
shipped by Postgres when I started using it.  I just didn't complain.

+1. Given the prevalence of the pg_foo convention, those names are
clunky. So is initdb. I'm less creative than Zdenek, so I'd suggest
simply renaming to pg_createuser and friends with the same command
line options as the originals. Have the binaries check $0 and emit a
warning about using the deprecated name to STDERR if called by a name
that doesn't have the pg_ prefix. Default to symlinking the old names
for backwards compatibility until 9.0.

+1


+1

It's a lot easier just to prefix the names than to do something
"clever."


I agree that it is easier to implement. But my original idea was to create 
one command that would be easily expandable. For example, add a LIST 
command which allows getting the list of users, roles, languages, and databases 
(instead of psql -l). There is also no way to create a tablespace - will we 
add a pg_createtablespace command? I think it is better to keep it all in 
one binary instead of extending the list of shipped binaries.


I think it is easier for people to remember pg_cmd --help than to look 
for a pg_something command. And a lot of the code is shared anyway. A nice 
example is the ZFS implementation: you only need to know two commands (ZPOOL 
and ZFS) to configure everything, including NFS sharing.



Zdenek



What does Page Layout version mean? (Was: Re: [HACKERS] Reducing NUMERIC size for 8.3)

2007-06-19 Thread Zdenek Kotala

Simon Riggs wrote:



The objections to applying this patch originally were:
1. it changes on-disk format (we've done this, so argument is void)


I'm a little bit confused about when we introduce a new page layout version. I 
expect a new version comes with changes to the page header, tuple 
header or data encoding (varlena/TOAST ...). But when a data type's internal 
implementation changed, there was no reason to update the 
version (see inet/cidr between 8.1 -> 8.2). Can somebody clarify this?



However, from an upgrade point of view it will be necessary to keep the 
information about which PostgreSQL version created a page, to correctly track 
the data type implementation (or special structures). The information in the 
pg_control file and PG_VERSION files is unusable for upgrade purposes.


Zdenek



Re: What does Page Layout version mean? (Was: Re: [HACKERS] Reducing NUMERIC size for 8.3)

2007-06-21 Thread Zdenek Kotala

Andrew Sullivan wrote:

On Wed, Jun 20, 2007 at 12:34:21PM -0400, Robert Treat wrote:
FWIW pg_migrator is a pretty good swing at an in-place upgrade tool for 
8.1->8.2.   Unfortunately until the PGDG decides that in-place upgrade is a 
constraint their willing to place on development, I see them a good 
chicken/egg away from making it a continually usefull tool. 


Or maybe cart/horse.  It seems to me that the rule more likely needs
to be that the migrator follow the development of the database than
that the database engine be strongly constrained by the needs of an
upgrade tool.  I agree that some commitment is needed, though.


I don't think that upgrade will strongly constrain the database engine, but we 
can expect some constraints. Any new development in some areas will have to 
take care about upgrade; if you break the "upgrade coding rules" you will not 
be able to create an upgrade procedure, or the upgrade process will be risky.


For example, the upgrade between 8.1 -> 8.2 is risky, because you cannot 
easily determine which format of cidr/inet is used on a page. If something 
fails during the table structure upgrade, it is not easy to recover from that 
situation.



Zdenek



Re: What does Page Layout version mean? (Was: Re: [HACKERS] Reducing NUMERIC size for 8.3)

2007-06-22 Thread Zdenek Kotala

Heikki Linnakangas wrote:
Since we're discussing upgrades, let me summarize the discussions we had 
over dinner in Ottawa for the benefit of all:




Thanks for the summary.


As before, someone just needs to step up and do it.


I'm now working on a proposal. I hope it will be ready soon.

Zdenek



[HACKERS] New Zealand - TZ change

2007-06-25 Thread Zdenek Kotala
I would like to inform you that New Zealand has changed its DST rules and new 
timezone files are available. See 
http://www.dia.govt.nz/diawebsite.nsf/wpg_URL/Services-Daylight-Saving-Daylight-saving-to-be-extended


Patch for head attached. I kept zic.c untouched, but I think it would be 
nice to update it as well.


Are there any updated releases scheduled for 8.0-8.2?


Zdenek

? tz.diff
Index: src/timezone/data/africa
===
RCS file: /projects/cvsroot/pgsql/src/timezone/data/africa,v
retrieving revision 1.5
diff -c -r1.5 africa
*** src/timezone/data/africa	19 Apr 2007 22:44:32 -	1.5
--- src/timezone/data/africa	25 Jun 2007 10:26:50 -
***
*** 1,4 
! # @(#)africa	8.7
  # 
  
  # This data is by no means authoritative; if you think you know better,
--- 1,4 
! # @(#)africa	8.8
  # 
  
  # This data is by no means authoritative; if you think you know better,
***
*** 416,421 
--- 416,435 
  # Namibia
  # The 1994-04-03 transition is from Shanks & Pottenger.
  # Shanks & Pottenger report no DST after 1998-04; go with IATA.
+ 
+ # From Petronella Sibeene (2007-03-30) in
+ # :
+ # While the entire country changes its time, Katima Mulilo and other
+ # settlements in Caprivi unofficially will not because the sun there
+ # rises and sets earlier compared to other regions.  Chief of
+ # Forecasting Riaan van Zyl explained that the far eastern parts of
+ # the country are close to 40 minutes earlier in sunrise than the rest
+ # of the country.
+ # 
+ # From Paul Eggert (2007-03-31):
+ # Apparently the Caprivi Strip informally observes Botswana time, but
+ # we have no details.  In the meantime people there can use Africa/Gaborone.
+ 
  # RULE	NAME	FROM	TO	TYPE	IN	ON	AT	SAVE	LETTER/S
  Rule	Namibia	1994	max	-	Sep	Sun>=1	2:00	1:00	S
  Rule	Namibia	1995	max	-	Apr	Sun>=1	2:00	0	-
Index: src/timezone/data/australasia
===
RCS file: /projects/cvsroot/pgsql/src/timezone/data/australasia,v
retrieving revision 1.5
diff -c -r1.5 australasia
*** src/timezone/data/australasia	19 Apr 2007 22:44:32 -	1.5
--- src/timezone/data/australasia	25 Jun 2007 10:26:51 -
***
*** 1,4 
! # @(#)australasia	8.6
  # 
  
  # This file also includes Pacific islands.
--- 1,4 
! # @(#)australasia	8.7
  # 
  
  # This file also includes Pacific islands.
***
*** 348,357 
  Rule	Chatham	1976	1989	-	Mar	Sun>=1	2:45s	0	S
  Rule	NZ	1989	only	-	Oct	Sun>=8	2:00s	1:00	D
  Rule	Chatham	1989	only	-	Oct	Sun>=8	2:45s	1:00	D
! Rule	NZ	1990	max	-	Oct	Sun>=1	2:00s	1:00	D
! Rule	Chatham	1990	max	-	Oct	Sun>=1	2:45s	1:00	D
! Rule	NZ	1990	max	-	Mar	Sun>=15	2:00s	0	S
! Rule	Chatham	1990	max	-	Mar	Sun>=15	2:45s	0	S
  # Zone	NAME		GMTOFF	RULES	FORMAT	[UNTIL]
  Zone Pacific/Auckland	11:39:04 -	LMT	1868 Nov  2
  			11:30	NZ	NZ%sT	1946 Jan  1
--- 348,361 
  Rule	Chatham	1976	1989	-	Mar	Sun>=1	2:45s	0	S
  Rule	NZ	1989	only	-	Oct	Sun>=8	2:00s	1:00	D
  Rule	Chatham	1989	only	-	Oct	Sun>=8	2:45s	1:00	D
! Rule	NZ	1990	2006	-	Oct	Sun>=1	2:00s	1:00	D
! Rule	Chatham	1990	2006	-	Oct	Sun>=1	2:45s	1:00	D
! Rule	NZ	1990	2007	-	Mar	Sun>=15	2:00s	0	S
! Rule	Chatham	1990	2007	-	Mar	Sun>=15	2:45s	0	S
! Rule	NZ	2007	max	-	Sep	lastSun	2:00s	1:00	D
! Rule	Chatham	2007	max	-	Sep	lastSun	2:45s	1:00	D
! Rule	NZ	2008	max	-	Apr	Sun>=1	2:00s	0	S
! Rule	Chatham	2008	max	-	Apr	Sun>=1	2:45s	0	S
  # Zone	NAME		GMTOFF	RULES	FORMAT	[UNTIL]
  Zone Pacific/Auckland	11:39:04 -	LMT	1868 Nov  2
  			11:30	NZ	NZ%sT	1946 Jan  1
***
*** 1146,1151 
--- 1150,1161 
  # transitions at 2:45 local standard time; this confirms that Chatham
  # is always exactly 45 minutes ahead of Auckland.
  
+ # From Colin Sharples (2007-04-30):
+ # DST will now start on the last Sunday in September, and end on the
+ # first Sunday in April.  The changes take effect this year, meaning
+ # that DST will begin on 2007-09-30 2008-04-06.
+ # http://www.dia.govt.nz/diawebsite.nsf/wpg_URL/Services-Daylight-Saving-Daylight-saving-to-be-extended
+ 
  ###
  
  
Index: src/timezone/data/northamerica
===
RCS file: /projects/cvsroot/pgsql/src/timezone/data/northamerica,v
retrieving revision 1.7
diff -c -r1.7 northamerica
*** src/timezone/data/northamerica	19 Apr 2007 22:44:32 -	1.7
--- src/timezone/data/northamerica	25 Jun 2007 10:26:54 -
***
*** 1,4 
! # @(#)northamerica	8.16
  # 
  
  # also includes Central America and the Caribbean
--- 1,4 
! # @(#)northamerica	8.17
  # 
  
  # also includes Central America and the Caribbean
***
*** 2325,2330 
--- 2325,2333 
  #
  # The reason seems to be an energy crisis.
  
+ # From Stephen Colebourne (2007-02-22):
+ # Some IATA info: Hai

Re: [HACKERS] [PATCHES] New Zealand - TZ change

2007-06-25 Thread Zdenek Kotala

Tom Lane wrote:

Zdenek Kotala <[EMAIL PROTECTED]> writes:
I would like to inform, that New Zealand changed DST rules and new 
timezone files are available.  See 
http://www.dia.govt.nz/diawebsite.nsf/wpg_URL/Services-Daylight-Saving-Daylight-saving-to-be-extended



Patch for head attached.


We do not "patch" those files; the update technique is "download, untar,
commit", and there's really no point in doing it except when a release
is imminent.  See src/timezone/README and src/tools/RELEASE_CHANGES.


Thanks for the explanation. However, there is still the question of when we will 
release new 8.0-8.2 versions. The change takes effect in about two months, and 
that does not leave much room for packagers to create and release packages.


Zdenek



[HACKERS] Proposal: In-Place upgrade concept

2007-07-03 Thread Zdenek Kotala


I attach the In-Place Upgrade project concept. Technical details about the 
implementation of each part will be sent later (after the concept is accepted).


Please, let me know your comments.

thanks Zdenek
In-place Upgrade project
---

Overview

The PostgreSQL community releases a new version of PostgreSQL at least once a year, 
but the released version is not able to run with old data files. For a user who 
wants to upgrade his database server it is a nightmare to upgrade a bigger database, 
and it is impossible to upgrade a very large database. This project will try to 
solve this issue and provide functionality for PostgreSQL which covers 
all users' requirements.


Current status
--
1) Data upgrade is supported only by exporting and importing data using pg_dump
   - slow and causes a long downtime
   - extra disk space needed
   - requires both versions (old & new)
2) pg_migrator
   - faster than 1), but it still has a long downtime when the on-disk structure has changed
   - the previous version of PostgreSQL is needed
   - no downgrade
   - requires both versions (old & new)


Project Goals
-
1) Minimize the database downtime.
2) Do not require extra disk space.
3) Do not require an old version of PostgreSQL to be installed.
4) Allow a fallback to the original PostgreSQL version if something goes wrong.
5) Support all data types, including user-defined types.
6) Be easy to use.


Functional specification

There are several possible approaches to performing an upgrade. We assume that 
we need to convert a very big database: if we upgrade the data files offline, 
the database downtime might be critical, so we should rather try to convert the 
data online. This brings the requirement that the new major version of PostgreSQL 
must understand the old data structures and be able to work with them.

The next important requirement is the ability to fall back (downgrade) to the 
previous PostgreSQL version. When the new major version does not work well for the 
user, he needs to be able to go back to the previous version with minimal 
downtime. This brings the requirement that PostgreSQL should also be able to 
write data in the old format. However, we might expect some performance penalty.

The above requirements lead to PostgreSQL running in three modes:

compatibility mode (compat) - a new version is running on the old data files. 
It is possible to read and write data as in a previous version. Features of a 
new version might not be available. Fallback to the previous version is 
possible.

upgrade mode - a conversion of the data files is running in the background. 
PostgreSQL has to be able to work with both the new data file format and the 
old data file format. Features of the new version might not be available. 
PostgreSQL has to be able to recover and continue in upgrade mode in case 
of a power/system failure. Fallback to the previous version is not possible.

normal mode - all required data files were upgraded to the new format. Features 
of a new version are available. Fallback to the previous version is not 
possible.

Project Roadmap
-
1) Create a detailed design for each affected part
2) Design a testing environment/process
3) Implementation
4) Change the development/release processes - the patch author has to be responsible 
for providing the required upgrade operations/functions. Some more upgrade-related 
development might also be done after feature freeze. In-place upgrade should be 
available starting with the x.y.1 version.


List of affected parts
--

1) Data Page Structure
1.1) Page Layout
1.2) Tuple header and structure
1.3) Data encoding - data type changes
2) Control File
3) WAL
4) Catalog
5) Configuration files
7) Others (Procedure language, Communication protocol ...)





Re: [HACKERS] Proposal: In-Place upgrade concept

2007-07-03 Thread Zdenek Kotala

Heikki Linnakangas wrote:

Zdenek Kotala wrote:

Project Goals
-
...
3) Do not require an old version of PostgreSQL to be installed.


Why not? Having two versions installed at the same time doesn't seem 
like a problem to me. You can remove the old version as soon as the 
upgrade is done.


It could be a problem if you want to upgrade the operating system together 
with the DB and the new operating system does not ship that version. This 
requirement is based on operating system vendor needs and also on my 
experience with an Oracle upgrade on a Tru64 cluster, where two installed 
Oracle versions did not work well together - OK, postgresql is better in this 
regard :-).



4) Allow a fallback to original PostgreSQL version if something would 
go wrong.

...
compatibility mode (compat) - a new version is running on the old data 
files. It is possible to read and write data as in a previous version. 
Features of a new version might not be available. Fallback to the 
previous version is possible.


That's madness. Why would you want to do that? The complexity it adds is 
just mind-boggling.


I don't think it increases complexity much. You will have a conversion 
function to convert pages/data from version x to x+1, and the same mechanism 
could be used to convert data back. It needs more investigation, but I'm not 
afraid of the complexity (so far :-).


You still wouldn't be able to downgrade after you switch from 
compatibility mode, and it seems likely that any problems, whatever they 
might be, would not appear until you switch. That means you'll need to 
be prepared to downgrade using a backup anyway, so the compatibility 
mode doesn't buy you much.


One kind of problem is user-defined types/functions and extensions. You can 
test a lot of things on test machines, but some problems only occur with big 
tables/storage; not many postgresql users have a multi-terabyte disk array 
for testing.


A second kind of problem is an operating system upgrade. If you upgrade the OS 
and postgresql together and something fails (not only the new version of 
postgresql, but many other things), the admin wants to revert to the old OS 
version with the old postgres version. On Solaris this is called the Live 
Upgrade feature.


This feature makes it possible to do that easily. Backup/restore is a possible 
solution too - unfortunately not a very comfortable one. I think it is a 
nice-to-have requirement: if it turns out to be possible we can do it, and if 
it brings a lot of complications we can drop it.


upgrade mode - a conversion of data files is running in the 
background. PostgreSQL have to be able to work with both the new data 
file format and the old data file format. Features of a new version 
might not be available. PostgreSQL have to be able to recover and 
continue in the upgrade mode in case of a power/system failure. 
Fallback to the previous version is not possible.


That's feasible, though I don't see why new features wouldn't be available.


For example, Oracle 10 implemented a new incremental backup feature, but you 
can use it only once you have converted the database to the proper format. I'm 
not Oracle and I cannot predict which new postgresql features will require, 
for example, all pages to be updated. I don't expect that the upgrade mode 
will affect many new features, but we must keep it in mind.



Zdenek



Re: [HACKERS] Proposal: In-Place upgrade concept

2007-07-03 Thread Zdenek Kotala

Heikki Linnakangas wrote:

Gregory Stark wrote:




Conceivably we could grab another infomask bit to indicate "uses 
new-style
varlenas" and then have heaptuple.c understand how to convert them in 
place.

But that leads to a ton of memory management or page locking problems.


My thinking is that when a page in the old format is read in, it's 
converted to the new format before doing anything else with it.


Yes, I agree with Heikki. Doing the conversion anywhere else would create a 
big maintenance burden.
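
For illustration, a minimal sketch of what such an on-read conversion step 
might look like - the converter table and names are hypothetical, not the real 
ReadBuffer code:

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

#define CURRENT_PAGE_LAYOUT_VERSION 4      /* assumed current layout version */

typedef struct PageConverter
{
    uint16_t    from_version;              /* layout version found on disk */
    void      (*convert) (char *page);     /* rewrites the page in place   */
} PageConverter;

/* Called right after the page is read into shared buffers, before any
 * other code looks at it. */
static void
maybe_convert_page(char *page, uint16_t page_version,
                   const PageConverter *converters, int nconverters)
{
    int     i;

    if (page_version == CURRENT_PAGE_LAYOUT_VERSION)
        return;                            /* already in the new format */

    for (i = 0; i < nconverters; i++)
    {
        if (converters[i].from_version == page_version)
        {
            converters[i].convert(page);   /* must stay within one block */
            return;
        }
    }

    fprintf(stderr, "unsupported page layout version %u\n",
            (unsigned) page_version);
    abort();
}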


Zdenek





Re: [HACKERS] Proposal: In-Place upgrade concept

2007-07-03 Thread Zdenek Kotala

Martijn van Oosterhout wrote:

On Tue, Jul 03, 2007 at 12:05:07PM +0100, Heikki Linnakangas wrote:

This is actually a bit of a problem. We would need to know when we
read in a page what the tupledescriptor for that relation looks like
to know which fields are varlena. I'm not sure how easy it would be
to arrange for the tuple descriptor to be passed down that far.
Speaking of on-the-fly upgrading, ReadBuffer is already passed the 
Relation, which contains the TupleDesc, so I don't think that's a 
problem. Not sure how easy that would be to do in an external program 
like pg_migrator.


My reading of this thread so far is that we're intending to upgrade
pages on the fly? Which means that at any point in time, some pages
will be converted and some not, so the tuple descriptor isn't going to
help, surely you need some identifier on the page telling you if it's
upgraded or not?


Currently there is a page layout version number, but this information is not 
useful for conversion from 8.1 or 8.2, because both versions have the same 
number even though some data types (inet/cidr) have a different storage format.


There is a short discussion about the page version here:

http://archives.postgresql.org/pgsql-hackers/2007-06/msg00745.php

Page identification is one of the things that must be improved.


Zdenek



Re: [HACKERS] Proposal: In-Place upgrade concept

2007-07-03 Thread Zdenek Kotala

Tom Lane wrote:

Heikki Linnakangas <[EMAIL PROTECTED]> writes:
My thinking is that when a page in the old format is read in, it's 
converted to the new format before doing anything else with it.


Yeah, I'm with Heikki on this.  What I see as a sane project definition
is:

* pg_migrator or equivalent to convert the system catalogs
* a hook in ReadBuffer to allow a data page conversion procedure to
  be applied, on the basis of checking for old page layout version.



pg_migrator is a separate tool which requires the old postgres version, and I 
would like to have a solution in the postgres binary that works without the 
old version being present. Very often the new postgres version is installed in 
the same location (e.g. /usr/bin) and normal users could have a problem.


I see three possible solutions there:

1) a special postgres startup mode - postgres --upgrade-catalog
2) automatic conversion - postgres converts the catalog automatically on the 
first startup on an old db cluster
3) (in compat mode) the catalog is converted on the fly (read/write) until 
upgrade mode is started


> I think insisting on a downgrade option is an absolutely certain way
> of guaranteeing that the project will fail.

As I mentioned before, this is a nice-to-have requirement. I would like to 
keep it in mind, and when it starts causing a complexity explosion we can 
remove it from the requirement list.



I'm not sure it's feasible to expect that we can change representations
of user-defined types, either.  I don't see how you would do that
without catalog access (to look up the UDT), and the page conversion
procedure is going to have to be able to operate without catalog
accesses.  (Thought experiment: a page is read in during crash recovery
or PITR slave operation, and discovered to have the old format.)


The idea for handling a change in a data type's on-disk representation is to 
keep both the old and the new data type in/out functions. Newly created tables 
will contain the new type implementation, and old tables could be converted 
with an ALTER TABLE command at the user's request. The old data type could be 
stored in a compat library.



BTW, I thought of a likely upgrade problem that we haven't discussed
(AFAIR) in any of the many threads on this subject.  What about an index
access method change that involves an index-wide restructuring, such
that it can't be done one page at a time?  A plausible example is
changing hash indexes to have multiple buckets per page.  Presumably
you can fix the index with REINDEX, but that doesn't meet the goal of
limited downtime, if the index is big.  Is there another way?



Yes, there is a way: keep both the old and the new index implementation, each 
with a different OID. The primary key for the pg_am table would be 
name+pg_version - similar to the UDT solution. CREATE INDEX as well as 
REINDEX would use the current implementation.



Zdenek



Re: [HACKERS] Proposal: In-Place upgrade concept

2007-07-03 Thread Zdenek Kotala

Martijn van Oosterhout wrote:

On Tue, Jul 03, 2007 at 07:13:29PM +0200, Zdenek Kotala wrote:
pg_migrator is separate tool which requires old postgres version and I 
would like to have solution in postgres binary without old version 
presence. Very often new postgres version is store in same location 
(e.g. /usr/bin) and normal users could have a problem.


Possibly. But you have to demonstrate it actually works and it's best
to do that with a seperate process. Or fix pg_migrator to not require
the old version, either way will do.


pg_migrator uses pg_dump to dump the catalog, and its main concept is based on 
the presence of the old postgres version. I think fixing it means rewriting it.



I think insisting on a downgrade option is an absolutely certain way
of guaranteeing that the project will fail.
How I mentioned before. This is nice to have requirement. I would like 
to have in the mind and when it starts complexity explosion we can 
remove it from the requirement list.


It seems to me that if every step of the process is WAL logged, then
downgrading is simply a matter of restoring and using PITR. Downgrades
are hard work, primarily because in the long run it's going to be
totally untested code because hardly anyone is going to need it.


It is not a downgrade. It is about keeping the old structure until the user 
says to convert to the new data structure.


Yes, there is way to keep old and new implementation of index and each 
will have different oid. Primary key for pg_am table will be 
name+pg_version - It is similar to UDT solution. CREATE INDEX as a 
REINDEX will use actual implementation.


Bad idea. Indexes are already complicated peices of code, there's no
guarentee the old code will even work with the new version. Seems like
too much work considering REINDEX will simply fix the problem outright.


I do not expect the old code to work with the new index structure. I want to 
keep both implementations: an old index will be processed by the old code and 
a new one by the new implementation. Each will have a different OID, and 
pg_class.relam will point to the correct implementation. The obsolete index 
methods could be located in a separate compat lib.


Zdenek



Re: [HACKERS] Proposal: In-Place upgrade concept

2007-07-03 Thread Zdenek Kotala

Tom Lane wrote:



Again, you are setting yourself up for complete failure if you insist
on having every possible nicety in the first version.  An incremental
approach is far more likely to succeed than a "big bang".


Yes, I know. I don't want to solve everything in one patch; I'm just looking 
ahead and trying to summarize the complexity of the problem. I don't want to 
end up on a dead-end road and say "oops" after a couple of months of 
development ...


Zdenek



Re: [HACKERS] Proposal: In-Place upgrade concept

2007-07-04 Thread Zdenek Kotala

Tom Lane wrote:



I do not expect that old code will work with new index structure. I want 
to keep both implementation and old index will be processed by old code 
and new one will be processed by new implementation. Each will have 
different OID and pg_class.relam will point to correct implementation. 


I don't think it's quite that easy when you consider user-defined
datatypes.  Where are you going to get two sets of opclasses from?


Good point. I think that if the new index implementation does not change the 
meaning of the strategy and support function numbers, the old opclass set 
could be used. And this matters only for newly created or recreated indexes; 
existing indexes will work fine with the old opclass set.



Zdenek



Re: [HACKERS] Proposal: In-Place upgrade concept

2007-07-04 Thread Zdenek Kotala

Martijn van Oosterhout wrote:

On Tue, Jul 03, 2007 at 09:51:59PM -0400, Tom Lane wrote:

Out-of-line datums aren't the only issue, either: consider inline
compressed datums.  A data representation change, even one that is known
not to increase the ordinary uncompressed size of the datum, could
easily render it slightly less compressible, resulting in a risk that
the tuple doesn't fit on the page anymore.  It hardly seems practical
(maybe not even possible) to guarantee that this cannot happen.


Does the converted page have to fit into a single page? It's a bit of a
corner-case, there's slack on every page. Wouldn't it be possible to
replace an oversize tuple with a pointer to a new tuple on a new page
(indexes are the problem here).


I think it is the same problem Tom mentioned about catalog access, which you 
probably need to handle index modification. I have a wild idea how to handle 
it, but it needs more experiments.



So maybe we are up against the conclusion that in-place updates cannot
support datatype representation changes, at least not for toastable
datatypes.  We could still handle 'em by the expedient suggested
upthread for user-defined types, ie the "new representation" is treated
as a whole new type.  That's not terribly appetizing though; I had
expected we could be more efficient for the case of changes in built-in
types.


Well, it seems to me 99% of cases can be handled easily, we just might
have to have a backup method for the cases where the simple cases
doesn't work...


The question is how to check whether a database falls into the 1% of corner 
cases. One suggested solution was to run a pre-upgrade check on the old 
version, but that raises the problem of how to avoid "ugly" modifications of 
already-checked pages.



Zdenek



Re: [HACKERS] compiler warnings on the buildfarm

2007-07-13 Thread Zdenek Kotala

Heikki Linnakangas wrote:

Zdenek Kotala wrote:
If I look there 
http://www.opengroup.org/onlinepubs/008329799/chap5.htm#tagcjh_06


in "Call Back Information" section. The structure is defined as

struct pam_conv{ int (*conv) (int, struct pam_message **, struct 
pam_response **, void *); void *appdata_ptr; };



I don't see any "const" keyword there.


Right after that:

 where int conv(int num_msg, const struct pam_message **msg, 
struct pam_response **resp, void *appdata_ptr); 


Oops, I overlooked it.


How confusing...


Yes, I agree.

Zdenek





Re: [HACKERS] compiler warnings on the buildfarm

2007-07-13 Thread Zdenek Kotala

Stefan Kaltenbrunner wrote:

Zdenek Kotala wrote:



For sun studio -erroff=E_STATEMENT_NOT_REACHED is useful there. If you
want to determine warning tags for each warning add -errtags.


Is that supported on all versions of sun studio(Sun WorkShop 6, Sun
Studio 8,11) we have on the farm ?


Yes. Also on SS12.

Zdenek



Re: [HACKERS] compiler warnings on the buildfarm

2007-07-13 Thread Zdenek Kotala

Kris Jurka wrote:



On Fri, 13 Jul 2007, Zdenek Kotala wrote:


Tom Lane wrote:

Kris Jurka <[EMAIL PROTECTED]> writes:

So pam_message ** isn't const.


Ah, thanks.  I see luna_moth is giving the same warning, so it's still
not const in Solaris 11 either.

Is it worth working around this?  It's strictly cosmetic AFAICS.

The main issue in my mind would be how to determine whether to use
const or not.  If all Solaris releases are like this, and can be
expected to stay that way,


I think yes. It is defined as X/Open standard says.



Not according to the link you sent earlier.  My reading says that 
Solaris has it defined wrong and pg has it right.


If I look at 
http://www.opengroup.org/onlinepubs/008329799/chap5.htm#tagcjh_06


in the "Call Back Information" section, the structure is defined as

struct pam_conv{ int (*conv) (int, struct pam_message **, struct 
pam_response **, void *); void *appdata_ptr; };



I don't see any "const" keyword there.


Zdenek




Re: [HACKERS] compiler warnings on the buildfarm

2007-07-13 Thread Zdenek Kotala

Tom Lane wrote:

Kris Jurka <[EMAIL PROTECTED]> writes:

So pam_message ** isn't const.


Ah, thanks.  I see luna_moth is giving the same warning, so it's still
not const in Solaris 11 either.

Is it worth working around this?  It's strictly cosmetic AFAICS.

The main issue in my mind would be how to determine whether to use
const or not.  If all Solaris releases are like this, and can be
expected to stay that way,


I think yes; it is defined the way the X/Open standard says.


I'd be inclined to just put a "#define
PAM_CONV_PROC_NOT_CONST" in include/port/solaris.h and drive the
function declaration off that.  If there's a version dependency
involved then it gets a lot more complicated, and might not be worth
the trouble.


The following patch works for me, but I did not test it on other platforms.


retrieving revision 1.153
diff -r1.153 auth.c
61c61
<   &pam_passwd_conv_proc,
---
>   (int (*)())pam_passwd_conv_proc,
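
For comparison, a sketch of the define-driven approach Tom suggests above; 
PAM_CONV_PROC_NOT_CONST is his proposed symbol, not an existing PostgreSQL 
define:

#include <security/pam_appl.h>

#ifdef PAM_CONV_PROC_NOT_CONST
/* Solaris-style prototype: msg is not const */
static int pam_passwd_conv_proc(int num_msg, struct pam_message **msg,
                                struct pam_response **resp, void *appdata_ptr);
#else
/* Linux-PAM / standard prototype: msg is const */
static int pam_passwd_conv_proc(int num_msg, const struct pam_message **msg,
                                struct pam_response **resp, void *appdata_ptr);
#endif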


Zdenek



Re: [HACKERS] compiler warnings on the buildfarm

2007-07-13 Thread Zdenek Kotala

Stefan Kaltenbrunner wrote:

Peter Eisentraut wrote:

Am Donnerstag, 12. Juli 2007 15:25 schrieb Stefan Kaltenbrunner:

a lot of those are simply noise (like the LOOP VECTORIZED stuff from the
icc boxes or the "statement not reached" spam from the sun compilers)
but others might indicate real issues.
To find warnings that might be a real problem we might want to look into
 suppressing those - if possible -  using compiler switches.
It would be good to determine an appropriate set of compiler switches to 
reduce the warnings to a reasonable level.


yeah once we have determined that this whole experiment is useful it
should be pretty easy to tweak the compiler switches for the non-gcc
compilers (mostly icc and sun studio seem to be the ones that generate
excessive output).



For sun studio -erroff=E_STATEMENT_NOT_REACHED is useful there. If you 
want to determine warning tags for each warning add -errtags.



Zdenek



Re: [HACKERS] compiler warnings on the buildfarm

2007-07-13 Thread Zdenek Kotala

Tom Lane wrote:

Heikki Linnakangas <[EMAIL PROTECTED]> writes:

Zdenek Kotala wrote:

I don't see any "const" keyword there.



Right after that:


where int conv(int num_msg, const struct pam_message **msg, struct pam_response **resp, void *appdata_ptr); 



How confusing...


And the pam_start page he cited earlier has a different set of typos in
its version of the struct :-(.  Still, that's two out of three places
that say it's const, and Solaris appears to be the only implementation
that has chosen to read it as not const.


Yes, agreed. I will try to send a request to the security team for an 
explanation. Maybe the original author overlooked it too, as I did today :-).


Zdenek



Re: [HACKERS] compiler warnings on the buildfarm

2007-07-13 Thread Zdenek Kotala

Kris Jurka wrote:



On Thu, 12 Jul 2007, Tom Lane wrote:

static int pam_passwd_conv_proc(int num_msg, const struct pam_message 
** msg,

struct pam_response ** resp, void *appdata_ptr);

which exactly matches what my Fedora 6 pam header file says it should
be.  What is it on those Solaris machines?


struct pam_conv {
int (*conv)(int, struct pam_message **,
struct pam_response **, void *);
void *appdata_ptr;  /* Application data ptr */
};

So pam_message ** isn't const.



Yes, according to the X/Open XSSO standard - see 
http://www.opengroup.org/onlinepubs/008329799/

http://www.opengroup.org/onlinepubs/008329799/pam_start.htm#tagcjh_07_32

Zdenek



Re: [HACKERS] GSSAPI on Solaris - psql segfault

2007-07-13 Thread Zdenek Kotala

Stefan Kaltenbrunner wrote:

I just took a look at adding gssapi build support on solaris (solaris
10/x86_64, sun studio 10, 64bit build) which seemed easy enough by
educating configure to look for -lgss but while it compiles just fine
the resulting tree will not be able to complete a make check due to
psql(!) segfaulting as soon as it tries to connect to the backend.
the backtrace looks similiar to:


program terminated by signal SEGV (no mapping at the fault address)
0x7fd3d401: _memcpy+0x00e1: movq %rax,(%rdi)
Current function is pqPutMsgBytes
  475   memcpy(conn->outBuffer + conn->outMsgEnd, buf, len);
(dbx) where
  [1] _memcpy(0x4, 0x4709a0, 0x67, 0x2, 0x646c697562677000,
0x7361626174616400), at 0x7fd3d401
=>[2] pqPutMsgBytes(buf = 0x4709a0, len = 103U, conn = 0x4682f0), line
475 in "fe-misc.c"
  [3] pqPutnchar(s = 0x4709a0 "", len = 103U, conn = 0x4682f0), line 189
in "fe-misc.c"
  [4] pqPacketSend(conn = 0x4682f0, pack_type = '\0', buf = 0x4709a0,
buf_len = 103U), line 2439 in "fe-connect.c"
  [5] PQconnectPoll(conn = 0x4682f0), line 1299 in "fe-connect.c"
  [6] connectDBComplete(conn = 0x4682f0), line 936 in "fe-connect.c"
  [7] PQsetdbLogin(pghost = (nil), pgport = (nil), pgoptions = (nil),
pgtty = (nil), dbName = 0xfd7fffdffbeb "postgres", login = (nil),
pwd = (nil)), line 660 in "fe-connect.c"
  [8] main(argc = 3, argv = 0xfd7fffdff9b8), line 211 in "startup.c"


ideas ?


Are you also using a 64-bit psql? Can you send me your ./configure setup? 
I also see a potential integer overflow there, because len is size_t and 
conn->outMsgEnd is defined as int.


 Zdenek



Re: [HACKERS] compiler warnings on the buildfarm

2007-07-16 Thread Zdenek Kotala

Stefan Kaltenbrunner wrote:

Zdenek Kotala wrote:

Stefan Kaltenbrunner wrote:

Zdenek Kotala wrote:

For sun studio -erroff=E_STATEMENT_NOT_REACHED is useful there. If you
want to determine warning tags for each warning add -errtags.

Is that supported on all versions of sun studio(Sun WorkShop 6, Sun
Studio 8,11) we have on the farm ?

Yes. Also on SS12.


hmm - sure about that ? I was about to submit a patch to disable some
compiler warnings but then I noted the following discussion thread:

http://forum.java.sun.com/thread.jspa?threadID=5163903&messageID=9637391

which seems to indicate that at least the compiler installed on kudu:

http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=kudu&dt=2007-07-15%2003:30:01

does NOT support turning of specific warnings.



I tested it on cc version 5.3 and it works. See

-bash-3.00$ SUNWspro/SC6.2/bin/cc -V
cc: Sun WorkShop 6 update 2 C 5.3 Patch 111680-09 2003/05/18
usage: cc [ options] files.  Use 'cc -flags' for details
-bash-3.00$ /ws/onnv-tools/SUNWspro/SC6.2/bin/cc pokus.c
"pokus.c", line 5: warning: statement not reached
-bash-3.00$ /ws/onnv-tools/SUNWspro/SC6.2/bin/cc -errtags pokus.c
"pokus.c", line 5: warning: statement not reached (E_STATEMENT_NOT_REACHED)
-bash-3.00$ /ws/onnv-tools/SUNWspro/SC6.2/bin/cc 
-erroff=E_STATEMENT_NOT_REACHED pokus.c

-bash-3.00$


It has worked since Sun Workshop 4.2 (cc: WorkShop Compilers 4.2 26 Jun 1997 
C 4.2 patch 105062-01). I also tested it on Sun Workshop 2.0.1 and it does not 
work there, but I believe nobody uses a ten-year-old compiler :-).


Please, can you send me the cc -V output? (I think it should be added to the log.)

Zdenek



Re: [HACKERS] compiler warnings on the buildfarm

2007-07-17 Thread Zdenek Kotala

Stefan Kaltenbrunner napsal(a):

Zdenek Kotala wrote:

Stefan Kaltenbrunner wrote:

Zdenek Kotala wrote:

Stefan Kaltenbrunner wrote:

Zdenek Kotala wrote:

For sun studio -erroff=E_STATEMENT_NOT_REACHED is useful there. If you
want to determine warning tags for each warning add -errtags.

Is that supported on all versions of sun studio(Sun WorkShop 6, Sun
Studio 8,11) we have on the farm ?

Yes. Also on SS12.

hmm - sure about that ? I was about to submit a patch to disable some
compiler warnings but then I noted the following discussion thread:

http://forum.java.sun.com/thread.jspa?threadID=5163903&messageID=9637391

which seems to indicate that at least the compiler installed on kudu:

http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=kudu&dt=2007-07-15%2003:30:01


does NOT support turning of specific warnings.


I tested it on cc version 5.3 and it works. See


ah cool - thanks for testing!

so on my box we would need to add
-erroff=E_EMPTY_TRANSLATION_UNIT,E_STATEMENT_NOT_REACHED,E_END_OF_LOOP_CODE_NOT_REACHED,E_FUNC_HAS_NO_RETURN_STMT,E_LOOP_NOT_ENTERED_AT_TOP

to CFLAGS to get down to the following 2 warnings:

"pgstat.c", line 652: warning: const object should have initializer:
all_zeroes (E_CONST_OBJ_SHOULD_HAVE_INITIZR)
"pgstat.c", line 2118: warning: const object should have initializer:
all_zeroes (E_CONST_OBJ_SHOULD_HAVE_INITIZR)

the open question is if that is what want or if we would be simply
adding (unnecessary) complexity (or confusion).

comments ?


E_STATEMENT_NOT_REACHED, E_END_OF_LOOP_CODE_NOT_REACHED and 
E_EMPTY_TRANSLATION_UNIT are probably OK to ignore.
E_FUNC_HAS_NO_RETURN_STMT is there because main() is left via exit() instead 
of return; in any other case it would be a legitimate warning.


I think a good solution is to compare the previous warning log with the latest 
build and make a diff. If a new warning appears, it is probably a legitimate 
warning.


Zdenek



Re: [HACKERS] compiler warnings on the buildfarm

2007-07-17 Thread Zdenek Kotala

Tom Lane napsal(a):

Zdenek Kotala <[EMAIL PROTECTED]> writes:
E_FUNC_HAS_NO_RETURN_STMT is there because main is leaved by exit() instead 
return. And In another case It should be regular warning.


That should be gone now; I changed the two places that triggered it.
I'd suggest not disabling that warning.


Yes, I agree. Did you also clean this up on the old branches? If not, I think 
we can live with this warning on the old branches.



Zdenek



[HACKERS] pgcrypto & strong ciphers limitation

2007-07-24 Thread Zdenek Kotala
Stefan reported to me that the pgcrypto regression test fails on Solaris 10 
with openssl support. I investigated this problem and the result is that 
Solaris 10 only delivers support for short keys up to 128 bits. Strong crypto 
(the SUNWcry and SUNWcryr packages) is available on the web download pages. 
(This is a result of US crypto export policy.)


However, on a default installation (which is commonly used) this is a 
problem. The regression test cannot be fixed because it tests strong 
ciphers, but there are two very strange issues:


1) The first issue is the Blowfish cipher. Because pgcrypto uses the old 
interface instead of the new "EVP" API, it calls the BF_set_key function, 
which does not report errors and silently truncates the key if it is too long. See 
http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/common/openssl/crypto/bf/bf_skey.c

line 84.

If the user later installs strong crypto, he will not be able to decrypt data 
which was encrypted before.


The fix for this issue is ugly, because there is no way to verify the 
supported key length with the old openssl API; only the new API returns an 
error if the length is not supported.



2) The AES cipher crashes when the key is longer. It happens because the 
return value from AES_set_encrypt_key is ignored and AES_encrypt is called 
with an uninitialized structure.



I attach a patch which fixes both issues, but the main problem remains that 
the old openssl API is used and the supported key lengths are hardcoded. I 
think we can add rewriting pgcrypto to use the EVP openssl interface to the 
TODO list.



Any comments?

Zdenek

Index: openssl.c
===
RCS file: /projects/cvsroot/pgsql/contrib/pgcrypto/openssl.c,v
retrieving revision 1.30
diff -c -r1.30 openssl.c
*** openssl.c	4 Oct 2006 00:29:46 -	1.30
--- openssl.c	24 Jul 2007 11:20:02 -
***
*** 380,385 
--- 380,399 
  {
  	ossldata   *od = c->ptr;
  
+ 	/* Test if key len is supported. BF_set_key silently cut large keys and it could be
+ 	 be a problem when user transfer crypted data from one server to another. */
+ 	EVP_CIPHER_CTX ctx;
+ 	EVP_CIPHER_CTX_init(&ctx);
+ 	EVP_EncryptInit_ex(&ctx, EVP_bf_cbc(), NULL, NULL, NULL);
+ 	EVP_CIPHER_CTX_set_key_length(&ctx,klen);
+ 	if( !EVP_EncryptInit_ex(&ctx,NULL, NULL, key, NULL) )
+ 	{
+ 		EVP_CIPHER_CTX_cleanup(&ctx);
+ 		return PXE_KEY_TOO_BIG;
+ 	}
+ 	EVP_CIPHER_CTX_cleanup(&ctx);
+ 
+ 	/* Key len is supported. We can use it. */
  	BF_set_key(&od->u.bf.key, klen, key);
  	if (iv)
  		memcpy(od->iv, iv, BF_BLOCK);
***
*** 692,705 
  	return 0;
  }
  
! static void
  ossl_aes_key_init(ossldata * od, int type)
  {
  	if (type == AES_ENCRYPT)
! 		AES_set_encrypt_key(od->key, od->klen * 8, &od->u.aes_key);
  	else
! 		AES_set_decrypt_key(od->key, od->klen * 8, &od->u.aes_key);
! 	od->init = 1;
  }
  
  static int
--- 706,728 
  	return 0;
  }
  
! static int
  ossl_aes_key_init(ossldata * od, int type)
  {
+ 	int err;
+ 	/* Strong key support could miss on some openssl installation, we must
+ 		check return value, from set key function.
+ 	*/ 
  	if (type == AES_ENCRYPT)
! 	err = AES_set_encrypt_key(od->key, od->klen * 8, &od->u.aes_key);
  	else
! 		err = AES_set_decrypt_key(od->key, od->klen * 8, &od->u.aes_key);
! 
! 	if (err == 0)
! 		od->init = 1;
! 	else 
! 		od->init = 0;
! 	return err;
  }
  
  static int
***
*** 711,717 
  	const uint8 *end = data + dlen - bs;
  
  	if (!od->init)
! 		ossl_aes_key_init(od, AES_ENCRYPT);
  
  	for (; data <= end; data += bs, res += bs)
  		AES_ecb_encrypt(data, res, &od->u.aes_key, AES_ENCRYPT);
--- 734,741 
  	const uint8 *end = data + dlen - bs;
  
  	if (!od->init)
! 		if( ossl_aes_key_init(od, AES_ENCRYPT) )
! 			return PXE_KEY_TOO_BIG;
  
  	for (; data <= end; data += bs, res += bs)
  		AES_ecb_encrypt(data, res, &od->u.aes_key, AES_ENCRYPT);
***
*** 727,733 
  	const uint8 *end = data + dlen - bs;
  
  	if (!od->init)
! 		ossl_aes_key_init(od, AES_DECRYPT);
  
  	for (; data <= end; data += bs, res += bs)
  		AES_ecb_encrypt(data, res, &od->u.aes_key, AES_DECRYPT);
--- 751,758 
  	const uint8 *end = data + dlen - bs;
  
  	if (!od->init)
! 		if( ossl_aes_key_init(od, AES_DECRYPT) )
! 			return PXE_KEY_TOO_BIG;
  
  	for (; data <= end; data += bs, res += bs)
  		AES_ecb_encrypt(data, res, &od->u.aes_key, AES_DECRYPT);
***
*** 741,748 
  	ossldata   *od = c->ptr;
  
  	if (!od->init)
! 		ossl_aes_key_init(od, AES_ENCRYPT);
! 
  	AES_cbc_encrypt(data, res, dlen, &od->u.aes_key, od->iv, AES_ENCRYPT);
  	return 0;
  }
--- 766,774 
  	ossldata   *od = c->ptr;
  
  	if (!od->init)
! 		if( ossl_aes_key_init(od, AES_ENCRYPT) )
! 			return PXE_KEY_TOO_BIG;
! 	
  	AES_cbc_encrypt(data, res, dlen, &od->u.aes_key, od->iv, AES_ENCRYPT);
  	return 0;
  }
***
*** 754,760 
  	ossldata   *od = c->ptr;
  
  	if (!od->init)
! 		ossl_aes_key_init(od, AES_DECRYPT);
  
  	AES_cbc_encrypt(data, res, dlen, &od->u.aes_key, od->iv, AES_DE

Re: [HACKERS] pgcrypto & strong ciphers limitation

2007-07-24 Thread Zdenek Kotala

Marko Kreen wrote:

On 7/24/07, Zdenek Kotala <[EMAIL PROTECTED]> wrote:




However, on a default installation (which is commonly used) it is a
problem. The regression test cannot be fixed because it tests strong
ciphers, but there are two very strange issues:

1) The first issue is the Blowfish cipher. Because pgcrypto uses the old
interface instead of the new "evp" one, it calls the BF_set_key function,
which does not return any status and silently cuts the key if it is too long.
See
http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/common/openssl/crypto/bf/bf_skey.c
line 84.

If a user installs strong crypto, he will not be able to decrypt data
which was encrypted before.

The fix for this issue is ugly, because there is no way to verify the
supported key length with the old OpenSSL API; only the new API returns
an error if the length is not supported.


NAK.  The fix is broken because it uses EVP interface.  EVP is not
a general-purpose interface because not all valid keys for cipher
pass thru it.  Only key-lengths used in SSL will work...


I'm not an OpenSSL expert, but if you look at how the EVP setkey call is 
implemented, you can see that it finally calls BF_set_key; there is only 
one extra layer. See 
http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/common/openssl/crypto/evp/e_bf.c




Could you rework the fix that it uses the BF_* interface,
does a test-encoding with full-length key and compares it to
expected result.  And does it just once, not on each call.


OK, I can do that, but it is not a general solution; it will work only 
in our case, because we know 128 bits is the restricted limit.



That should be put into separate function probably.


yes
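
For example, roughly like this (a sketch only; the key/data/expected bytes
below are placeholders, not a real Blowfish test vector, and a real check
must use a published full-length-key vector):

#include <string.h>
#include <openssl/blowfish.h>

/*
 * Run once: check whether this OpenSSL build really accepts a full-length
 * (448-bit) Blowfish key, using the plain BF_* interface.  On a crippled
 * build BF_set_key silently uses only part of the key, so the test
 * encryption will not match the known full-key ciphertext.
 */
static int
bf_check_supported_key_len(void)
{
	static const unsigned char key[56] = {0};	/* placeholder 448-bit key */
	static const unsigned char data[8] = {0};	/* placeholder plaintext */
	static const unsigned char expect[8] = {0};	/* placeholder ciphertext */
	unsigned char	out[8];
	BF_KEY		bf_key;

	BF_set_key(&bf_key, 56, key);
	BF_ecb_encrypt(data, out, &bf_key, BF_ENCRYPT);

	return memcmp(out, expect, 8) == 0;		/* 1 = full keys supported */
}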


2) The AES cipher crashes when the key is too long. It happens because the
return value from AES_set_encrypt_key is ignored and AES_encrypt is called
with an uninitialized structure.


ACK, error checking is good.  But please return PXE_KEY_TOO_BIG
directly from ossl_aes_key_init.


OK.
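
So ossl_aes_key_init would look roughly like this (a sketch only, reusing
the ossldata struct and the PXE_KEY_TOO_BIG error code from pgcrypto):

static int
ossl_aes_key_init(ossldata *od, int type)
{
	int		err;

	/*
	 * AES_set_*_key returns 0 on success and a negative value on error,
	 * e.g. when a crippled OpenSSL build rejects a long key.
	 */
	if (type == AES_ENCRYPT)
		err = AES_set_encrypt_key(od->key, od->klen * 8, &od->u.aes_key);
	else
		err = AES_set_decrypt_key(od->key, od->klen * 8, &od->u.aes_key);

	if (err != 0)
	{
		od->init = 0;
		return PXE_KEY_TOO_BIG;		/* callers just propagate this */
	}

	od->init = 1;
	return 0;
}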


I must admit the internal API for ciphers is clumsy and could
need rework to something saner.  This shows here.


I attach a patch which fixes both issues, but the main problem remains that
the old OpenSSL API is used and the supported key lengths are hardcoded. I
think we can add rewriting pgcrypto to use the EVP OpenSSL interface to the
TODO list.


pgcrypto _was_ written using EVP, but I needed to rewrite it
when I found out EVP supports only key lengths used in SSL.


Is that still correct? It seems that Blowfish accepts the whole key range, 
but as I mentioned, I'm not an OpenSSL guru and the documentation is very bad :(.


Zdenek






Re: [HACKERS] strange buildfarm failure on lionfish

2007-07-24 Thread Zdenek Kotala

Stefan Kaltenbrunner wrote:
http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=lionfish&dt=2007-07-24%2005:30:13 




any ideas ?



This test is very sensitive to floating-point behavior. Was there any 
gcc or libc update on this machine?


Zdenek



Re: [HACKERS] crypting prosrc in pg_proc

2007-08-09 Thread Zdenek Kotala

Hans-Juergen Schoenig wrote:
the idea is basically to hide codes - many companies want that and ask 
for it again and again.


i would suggest keys to reside in $PGDATA. we do this for SSL and so 
already.


initdb could create such keys so that they are unique to every database 
instance.
decrypting could be avoided as much as possible; basically we should just 
decrypt on first call and when it changes.


But companies also want to hide code because they distribute their software.
If you store the key somewhere on the server, the user will be able to decrypt
the original code. If I remember correctly, Oracle's wrap generates something
like bytecode which every Oracle installation is able to understand, but it is
not possible to decode it back to the original form.



My suggestion is to extend the PL API so that each PL language offers a wrap
or encrypt function which generates encrypted code, and this code will be
stored in pg_proc. The PL language will be responsible for detecting whether
the code is raw or encrypted. pg_dump will dump the encrypted procedure, and
the author is responsible for keeping the unencrypted version in a source
repository.
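
A purely hypothetical sketch of what such a per-language entry point might
look like (neither plsample_wrap nor the trivial XOR "wrapping" below exists
anywhere; they only illustrate the shape of the idea using the normal fmgr
conventions):

#include "postgres.h"
#include "fmgr.h"

PG_MODULE_MAGIC;

PG_FUNCTION_INFO_V1(plsample_wrap);

Datum
plsample_wrap(PG_FUNCTION_ARGS)
{
	text	   *src = PG_GETARG_TEXT_P(0);
	int32		len = VARSIZE(src) - VARHDRSZ;
	bytea	   *out = (bytea *) palloc(VARHDRSZ + len);
	char	   *s = VARDATA(src);
	char	   *d = VARDATA(out);
	int32		i;

	SET_VARSIZE(out, VARHDRSZ + len);

	/* placeholder transformation; a real implementation would encrypt */
	for (i = 0; i < len; i++)
		d[i] = s[i] ^ 0x5a;

	PG_RETURN_BYTEA_P(out);
}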



Zdenek



[HACKERS] Question about change in page/tuple header (v4)

2007-08-14 Thread Zdenek Kotala

I'm now comparing different versions of the page layout, and I have two questions:

1) We now store only the low 16 bits of the TLI, but the name in the structure 
stays the same. Maybe pg_tli_lo would be better.


2) HASOID has been moved into the infomask and its original place is unused.
Is there some reason for that? This change complicates the tuple header
upgrade a little. If there is no real reason to have it in the current place,
I recommend moving it back.



thanks Zdenek



Re: [HACKERS] PAM authentication fails for local UNIX users

2007-08-20 Thread Zdenek Kotala

Andrew Dunstan wrote:



Dhanaraj M wrote:




The non-root user does not have the permission to read other unix 
local user password.

I found two solutions:

1. usermod -K defaultpriv=Basic,file_dac_read  postgres

 - Gives privilege to read all files. This solution works. Is it the 
right way to do?


2. chmod +s processName

 - This does not work, because postgres never allows this.


Is there anyother solution to this problem?


Usage questions really don't belong on -hackers - in future please use 
-general. Both your proposed solutions are utterly insecure.


The problem Dhanaraj is trying to address is how to securely solve the problem
with PAM and local users. Other servers (e.g. sshd) run the master process
under root (with limited privileges) and the forked per-connection processes
under a normal user, but postgresql requires starting as a non-root user,
which rules out that common pattern.
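
For illustration, the usual root-then-drop pattern looks roughly like this (a
minimal POSIX sketch, not PostgreSQL code; error handling trimmed):

#include <pwd.h>
#include <grp.h>
#include <unistd.h>
#include <stdlib.h>

/*
 * After the master (running as root) has done its privileged work,
 * e.g. reading the shadow file for PAM, the child drops to an
 * unprivileged account before touching user-controlled data.
 */
static void
drop_to_user(const char *name)
{
	struct passwd *pw = getpwnam(name);

	if (pw == NULL)
		exit(1);

	/* Order matters: supplementary groups, then gid, then uid. */
	if (initgroups(pw->pw_name, pw->pw_gid) != 0 ||
		setgid(pw->pw_gid) != 0 ||
		setuid(pw->pw_uid) != 0)
		exit(1);
}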

There is an important question:

Is the current requirement to run postgresql as non-root OK? If yes, then we
must update the PAM documentation to explain this situation, which will never
work securely. If we say no, it is an unnecessary limitation (in cases where
UID 0 says nothing about the user's privileges), and we must start a
discussion about a solution.




See  http://itc.musc.edu/wiki/PostgreSQL for some discussion of using 
PAM for postgres auth.


It also offers the same insecure solution of adding read permission on the
shadow file for the postgresql user.



Zdenek



Re: [HACKERS] PAM authentication fails for local UNIX users

2007-08-20 Thread Zdenek Kotala

Andrew Dunstan wrote:



Zdenek Kotala wrote:


The problem what Dhanaraj tries to address is how to secure solve 
problem with PAM and local user. Other servers (e.g. sshd) allow to 
run master under root (with limited privileges) and forked process 
under normal user. But postgresql

requires start as non-root user. It limits to used common pattern.

There is important question:

Is current requirement to run postgresql under non-root OK? If yes, 
than we must update PAM documentation to explain this situation which 
will never works secure. Or if we say No, it is stupid limitation (in 
case when UID 0 says nothing about user's privileges) then we must 
start discussion about solution.





For now I think we should update the docs. 


I agree.


I suspect 
the changes involved in allowing us to  run as root and then give up 
privileges safely would be huge, and the gain quite small.


The main problem there is that there are a lot of different ways to do it 
and no standard. For example, on Solaris applications use RBAC 
functionality to handle privileges, and that is not available on other 
platforms, and so on...



I'd rather see an HBA fallback mechanism, which I suspect might overcome 
most of the  problems being encountered here.


The question is why not use the fallback functionality provided by PAM 
and the naming services. It seems that only a fallback to or from password 
auth makes sense; the rest could be handled by PAM/naming.




Zdenek



Re: [HACKERS] [COMMITTERS] pgsql: Add configure option --with-system-tzdata to use operating system

2007-08-23 Thread Zdenek Kotala

Gregory Stark wrote:

"Tom Lane" <[EMAIL PROTECTED]> writes:





Why would --with-zoneinfo want to use a symlink though? Shouldn't it just
compile the binary to use the path specified directly? Symlinks are fine for a
sysadmin or a packager but if it's going to be supported by Postgres code
directly why not do it directly?


I sent a patch which hardwires the path into postgres a couple of months ago.
The discussion is here:


http://archives.postgresql.org/pgsql-patches/2007-03/msg00293.php

with the conclusion to do it as a symlink... However, I am still convinced
that the hardwired solution causes fewer problems than the symlink solution.


Zdenek






Re: [HACKERS] [COMMITTERS] pgsql: Add configure option --with-system-tzdata to use operating system

2007-08-23 Thread Zdenek Kotala

Tom Lane wrote:

[ catching up on today's email ]

Gregory Stark <[EMAIL PROTECTED]> writes:



Why would --with-zoneinfo want to use a symlink though? Shouldn't it just
compile the binary to use the path specified directly?


AFAICS that just moves the problem to a different place, one where an
admin *can't* fix it without recompiling ...


Is there a real reason why an admin should do that? Timezone files are stored
in a defined location where "all" software expects them. If it is important to
be able to change the timezone file location, we could add an extra command
line parameter or a GUC variable, but I don't see any reason to make it
configurable at run time.



Zdenek




Re: [HACKERS] [COMMITTERS] pgsql: Add configure option --with-system-tzdata to use operating system

2007-08-23 Thread Zdenek Kotala

Tom Lane wrote:



I think we've already found out that the opinions *aren't* equal.
So far the score seems to be:
Red Hat: will use relative symlink
Solaris: will use hardwired path in program


In the end, because my original patch was not accepted, Solaris also uses 
a relative symlink for PostgreSQL 8.2. An absolute symlink causes some 
problems during packaging and also breaks Live Upgrade.



Zdenek



Re: [HACKERS] [COMMITTERS] pgsql: Add configure option --with-system-tzdata to use operating system

2007-08-23 Thread Zdenek Kotala

Alvaro Herrera wrote:



It would be even better to use --with-system-tzdata=/usr/share/zoneinfo
which enables lookup in the specified dir, hardwired at compile time in
the executable (I'm not aware if the patch already accepts a path
argument -- seems like the only sane choice).  No symlink needed.


Yes, a path is accepted (if I remember correctly :-). One thing should be 
added: a default path based on the OS.


Zdenek



Re: [HACKERS] [COMMITTERS] pgsql: Add configure option --with-system-tzdata to use operating system

2007-08-23 Thread Zdenek Kotala

Tom Lane wrote:

Peter Eisentraut <[EMAIL PROTECTED]> writes:
FWIW, hardwiring plus an environment variable would seem to address all 
currently known und unknown requirements, and is not so totally 
different from solutions to previous, related problems.


I think we are converging on the recognition that Zdenek was right the
first time.  I'd go for plain hardwired --- I think an environment
variable is more likely to be a foot-gun than anything useful.
(Recall the original argument that choosing to use an external TZ
database is not something to be done lightly.)


Tom,

Let me know if there is anything that should be adjusted in my patch. I 
would like to do it tomorrow, because I will be offline for the next two weeks.


thanks Zdenek



Re: [HACKERS] pg_ctl configurable timeout

2007-08-23 Thread Zdenek Kotala

Peter Eisentraut wrote:
I'm having trouble with the hardcoded 60 second timeout in pg_ctl.  pg_ctl 
sometimes just times out and there is no way to make it wait a little longer.  
I would like to add an option to be able to change that, say 
pg_ctl -w --timeout=120.  Comments?


+1

I played with 2GB of shared buffers and the stop action took 10-20s. On a 
system with more memory, 60s is not enough.



Zdenek




Re: [HACKERS] [COMMITTERS] pgsql: Add configure option --with-system-tzdata to use operating system

2007-08-23 Thread Zdenek Kotala

Peter Eisentraut wrote:

Zdenek Kotala wrote:

Let me know if there something what should be adjusted on my patch. I
would like to do it tomorrow, because I will be offline for next two
weeks.


I can try to fit your patch into what's there now, if you'd rather just 
start your holiday or whatever.


OK, thanks

Zdenek








Re: [HACKERS] pg_ctl configurable timeout

2007-08-23 Thread Zdenek Kotala

Joshua D. Drake wrote:

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Zdenek Kotala wrote:

Peter Eisentraut wrote:
I'm having trouble with the hardcoded 60 second timeout in pg_ctl. 
pg_ctl sometimes just times out and there is no way to make it wait a

little longer.  I would like to add an option to be able to change
that, say pg_ctl -w --timeout=120.  Comments?

+1

I played with 2GB shared buffers and stop action takes 10-20s. On system
with more memory 60s is not enough.


Huh? I have never seen this problem.



It happened when I stopped the server after a heavy performance test. I 
expect that postgres was checking for dirty pages in the buffers, but I 
did not investigate it.


Zdenek



Re: [HACKERS] Per-function search_path => per-function GUC settings

2007-09-11 Thread Zdenek Kotala

Tom Lane wrote:



I thought about ways to include GUC settings directly into CREATE
FUNCTION, but it seemed pretty ugly and inconsistent with the
existing syntax.  So I'm thinking of supporting only the above
syntaxes, meaning it'll take at least two commands to create a secure
SECURITY DEFINER function.

Comments?


I have a question about what happens if the search path is not defined 
for a SECURITY DEFINER function. My expectation is that a SECURITY DEFINER 
function should get an empty search_path in this case. This behavior would 
be similar to how the dynamic linker treats setuid binaries (ignoring 
LD_LIBRARY_PATH and so on).



Zdenek




Re: [HACKERS] pgcrypto related backend crash on solaris 10/x86_64

2007-09-11 Thread Zdenek Kotala

Marko Kreen wrote:

On 9/9/07, Stefan Kaltenbrunner <[EMAIL PROTECTED]> wrote:

I brought back clownfish(still a bit dubious about the unexplained
failures which seem vmware emulation bugs but this one seems to be
easily reproduceable) onto the buildfarm and enabled --with-openssl
after the the recent openssl/pgcrypto related fixes but I'm still
getting a backend crash during the pgcrypto regression tests:

http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=clownfish&dt=2007-09-09%2012:14:50



backtrace looks like:

program terminated by signal SEGV (no mapping at the fault address)
0xfd7fff241b61: AES_encrypt+0x0241: xorq (%r15,%rdx,8),%rbx
(dbx) where
=>[1] AES_encrypt(0x5, 0x39dc9a7a, 0xf560e7b50e, 0x90ca350d49,
0xf560e7b50ea90dfb, 0x6b6b6b6b), at 0xfd7fff241b61
  [2] 0x0(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0x0


This is crashing because of the crippled OpenSSL on some version
of Solaris.  Zdenek Kotala posted a workaround for that, I am
cleaning it but have not found the time to finalize it.

I'll try to post v03 of Zdenek's patch ASAP.



However, I guess there will still be a problem with the regression tests, 
because pgcrypto will report an error when a user tries to use a stronger 
cipher, and that generates a diff between the expected and real output.


I don't know if it is possible to select a different expected output based 
on a test of whether strong crypto is installed or not. Maybe some magic in 
the Makefile/configure. The test would be:


# ldd /usr/postgres/8.2/lib/pgcrypto.so  | grep libcrypto_extra
#   libcrypto_extra.so.0.9.8 =>  (file not found)

If the output contains "(file not found)", the library is not installed or 
is not in the path (/usr/sfw/lib).



Zdenek



Re: [HACKERS] Per-function search_path => per-function GUC settings

2007-09-11 Thread Zdenek Kotala

Tom Lane wrote:

Zdenek Kotala <[EMAIL PROTECTED]> writes:
I have a question about what does happen if search path is not defined 
for SECURITY DEFINER function. My expectation is that SECURITY DEFINER 
function should defined empty search patch in this case.


Your expectation is incorrect.  We are not in the business of breaking
every application in sight, which is what that would do.  


Oh, I see. From that point of view, I suggest adding a warning about the 
potential security issue when a SECURITY DEFINER function is created 
without a preset search_path. I'm aware that a lot of developers will 
forget to modify their applications.



Zdenek



Re: [HACKERS] pgcrypto related backend crash on solaris 10/x86_64

2007-09-11 Thread Zdenek Kotala

Marko Kreen wrote:

On 9/11/07, Zdenek Kotala <[EMAIL PROTECTED]> wrote:

Marko Kreen wrote:

This is crashing because of the crippled OpenSSL on some version
of Solaris.  Zdenek Kotala posted a workaround for that, I am
cleaning it but have not found the time to finalize it.

I'll try to post v03 of Zdenek's patch ASAP.



However, I guess there still will be a problem with regression tests,
because pg_crypto will reports error in case when user tries to use
stronger cipher, but it generates diff between expected and real output.

I don't know if is possible select different output based on test if
strong crypto is installed or not. Maybe some magic in
Makefile/Configure. Test should be:

# ldd /usr/postgres/8.2/lib/pgcrypto.so  | grep libcrypto_extra
#   libcrypto_extra.so.0.9.8 =>  (file not found)

if output contains (file not found) library is not installed or not in
path (/usr/sfw/lib).


Failing regression tests are fine - it is good if user can
easily see that the os is broken.



But if a buildfarm machine still complains about the problem, we can easily 
overlook other problems. There are two possible solutions: 1) modify the 
regression test, or 2) recommend installing the crypto package on all 
affected buildfarm machines.


Anyway, I plan to add a mention to the Solaris FAQ when we have the final 
patch. I also think it would be good to mention it in the pgcrypto README, 
or to add a comment to the regression test's expected output file that 
will be visible in regression.diffs.



Zdenek




[HACKERS] Timezones change - never ending story

2007-09-17 Thread Zdenek Kotala
Just for information: Venezuela is going to have a new timezone change (a 
30-minute shift) this weekend. The change is not yet integrated in the 
latest version of the Olson database. (The original announcement said it 
would happen on 1.1.2008.)


More info:

http://www.reuters.com/article/oddlyEnoughNews/idUSN2328980320070824


Zdenek

PS: Can we add information about the Olson database version somewhere 
(e.g. the README in the src/timezone directory, or in the release notes)?






Re: [HACKERS] Timezones change - never ending story

2007-09-20 Thread Zdenek Kotala

Gregory Stark wrote:

"Zdenek Kotala" <[EMAIL PROTECTED]> writes:


Just for information. Venezuela is going to have new timezone change (30minutes
shift) on this weekend. This change is not yet integrated in the last version
in Olson database. (Original announcement said it happens on 1.1.2008)


Is there still time to slip this into the upcoming 8.2.5?

I haven't been able to find anything which specifies precisely when it'll
happen though. Just knowing the week or even day isn't enough.


The latest info is here:

http://www.worldtimezone.com/dst_news/dst_news_venezuela02.html

but I think that until the Olson database is updated, we can do nothing. 
Maybe put some info on the webpage.



Zdenek





Re: [HACKERS] Reducing NUMERIC size for 8.3

2007-09-25 Thread Zdenek Kotala

Tom Lane wrote:

Gregory Stark <[EMAIL PROTECTED]> writes:

We previously discussed compressing the numeric data type for small values:
http://archives.postgresql.org/pgsql-hackers/2007-06/msg00715.php



We didn't do this for 8.3 but in any case Tom did suggest we ought to reverse
the weight and sign/dscale so we could do this sometime without introducing
another incompatibility.


I had forgotten about that, but it does seem like a good idea to do it now.
Any objections?


For in-place upgrade purposes it would be good to also change the OID of 
the numeric type and to preserve the current OID for the current 
implementation on upgraded systems.



Zdenek



Re: [HACKERS] pgcrypto & strong ciphers limitation

2007-09-26 Thread Zdenek Kotala

Stefan Kaltenbrunner wrote:

Bruce Momjian wrote:

Just confirming, this should be applied to 8.3, right?


I think marko is working on an updated patch for this:

http://archives.postgresql.org/pgsql-hackers/2007-09/msg00386.php

without that the backend will coredump if one uses strong ciphers with 
pgcrypto on a default solaris install, so it seems like a thing we should 
fix for 8.3.


Yes, I would also like to have a backport for 8.2 and 8.1, because those 
branches are also affected. (I think the backport is easy; there are not 
many changes between 8.1 and 8.3.)



Zdenek



Re: [HACKERS] Reducing NUMERIC size for 8.3

2007-09-26 Thread Zdenek . Kotala

Andrew Dunstan wrote:




Zdenek Kotala wrote:


Tom Lane wrote:


Gregory Stark <[EMAIL PROTECTED]> writes:

We previously discussed compressing the numeric data type for small 
values:

http://archives.postgresql.org/pgsql-hackers/2007-06/msg00715.php



We didn't do this for 8.3 but in any case Tom did suggest we ought 
to reverse
the weight and sign/dscale so we could do this sometime without 
introducing

another incompatibility.



I had forgotten about that, but it does seem like a good idea to do 
it now.

Any objections?



For in-place upgrade purpose It would be good change also OID for 
numeric type and preserve current OID for current implementation on 
updated system.



   



If we want to get into that game we need a better way of allocating 
Oids. Right now anything not currently used is liable to be grabbed, 
so there's a high risk of reuse.


Yes, that will be necessary. Or maybe a second way is to create only the 
really basic types (name, int, bool, ...) during bootstrap, with the other 
types created in the standard manner by CREATE TYPE, CREATE OPERATOR, ... 
commands. A third way is to not remove old data types but only rename 
them, e.g. to numeric_old1 ...


  Zdenek



