Re: [HACKERS] PGBuildfarm member skylark Branch HEAD Failed at Stage Make

2007-09-28 Thread Magnus Hagander
PG Build Farm wrote:
> The PGBuildfarm member skylark had the following event on branch HEAD:
> 
> Failed at Stage: Make
> 
> The snapshot timestamp for the build that triggered this notification is: 
> 2007-09-29 03:00:01
> 
> The specs of this machine are:
> OS:  Windows XP / SP2
> Arch: x64
> Comp: Visual C++ / 14.00.50727.762
> 
> For more information, see 
> http://www.pgbuildfarm.org/cgi-bin/show_history.pl?nm=skylark&br=HEAD

I think this just needs a new object added to the libpgport list in
Mkvcbuild.pm at line 46-50. I can do this monday, no earlier :-( (I
could do it now, but I'm mobile so I can't test it, so I won't do it) If
someone else who can actually test it wants to put that in, please do.

//Magnus

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Getting to 8.3 beta1

2007-09-28 Thread Greg Smith

On Thu, 27 Sep 2007, Tom Lane wrote:

Also, I spent a dreary two or three hours this afternoon examining the 
CVS commit logs since 8.3 branched...I tried to post that info to 
pgsql-docs but it broke the list's message size limits (even gzipped, 
it's about 90K).


I just dumped a copy of Tom's file on my personal page and posted a much 
slimmed down version that's missing the names of the files touched onto 
the wiki at http://developer.postgresql.org/index.php/8.3_Changelog (with 
a pointer to the full text in case anyone needs to dig back into the 
details to figure out what a commit touched).


There was already an outline for building the release notes into at 
http://developer.postgresql.org/index.php/8.3release


What I was thinking might be a useful way for multiple people to hack away 
at this problem is to start fleshing out the standard release note 
one-line summaries onto the one page, then delete the relevant commits 
from the other.  When the changelog page is empty, then everything is 
documented.


I didn't actually start doing this though as I didn't want to dump any 
more time into a process that may not actually be used.


--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


[HACKERS] PG on NFS may be just a bad idea

2007-09-28 Thread Tom Lane
I spent a bit of time tonight poking at the issue reported here:
http://archives.postgresql.org/pgsql-novice/2007-08/msg00123.php

It turns out to be quite easy to reproduce, at least for me: start CVS
HEAD on an NFS-mounted $PGDATA directory, and run the contrib regression
tests ("make installcheck" in contrib/).  I see more than half of the
DROP DATABASE commands complaining in exactly the way Miya describes.
This failure rate might be an artifact of the particular environment
(I tested NFS client = Fedora Core 6, server = HPUX 10.20 on a much
slower machine) but the problem is clearly real.

In the earlier thread I cited suggestions that this behavior comes from
client programs holding files open longer than they should.  However,
strace'ing this behavior shows no evidence at all that that is happening
in Postgres.  I have an strace that shows conclusively that the bgwriter
never opened any file in the target database at all, and all earlier
backends exited before the one doing the DROP DATABASE began its dirty
work, and yet:

[pid 19211] 22:50:30.517077 rmdir("base/18193") = -1 ENOTEMPTY (Directory not 
empty)
[pid 19211] 22:50:30.517863 write(2, "WARNING:  could not remove file "..., 
79WARNING:  could not remove file or directory "base/18193": Directory not empty
) = 79
[pid 19211] 22:50:30.517974 sendto(7, "N\0\0\0rSWARNING\0C01000\0Mcould not 
"..., 115, 0, NULL, 0) = 115

After some googling I think that the damage may actually be getting done
at the kernel level.  According to
http://www.time-travellers.org/shane/papers/NFS_considered_harmful.html
it is fairly common for NFS clients to cache writes, meaning that the
kernel itself may be holding an old write and not sending it to the NFS
server until after the file deletion command has been sent.

(I don't have the network-fu needed to prove that this is happening by
sniffing the network traffic; anyone want to try?)

If this is what's happening I'd claim it is a kernel bug, but seeing
that I see it on FC6 and Miya sees it on Solaris 10, it would be a bug
widespread enough that we'd not be likely to get it killed off soon.

Maybe we need to actively discourage people from running Postgres
against NFS-mounted data directories.  Shane Kerr's paper cited above
mentions some other rather scary properties, including O_EXCL file
creation not really working properly.

regards, tom lane

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PATCHES] [HACKERS] Add function for quote_qualified_identifier?

2007-09-28 Thread Brendan Jurd
On 9/29/07, Bruce Momjian <[EMAIL PROTECTED]> wrote:
> Has anyone every asked for this functionality?

I searched the list archives for previous mentions of the topic, and
didn't find any.  So the answer to your question is "yes", but so far
it seems to be just me.

Cheers,
BJ

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] Getting to 8.3 beta1

2007-09-28 Thread Tom Lane
Simon Riggs <[EMAIL PROTECTED]> writes:
> On Thu, 2007-09-27 at 13:01 -0400, Tom Lane wrote:
>> Anyway, if you can test this tomorrow that'll be great.  I have enough
>> other things to do today ...

> Looks good to me. I was and am still nervous of weird knock-on effects,
> but I think its the right patch to apply.

Me too --- committed and back-patched.

> We need to reword the doc section about time travel, but I'll let you
> tackle that bit, at least for now.

Right, done.

BTW, I realized that the reason I objected to your "option #4"
originally was that I thought you were proposing to make *every*
recovery start a new timeline.  It's not hard to imagine
sorcerer's-apprentice problems in a repeated crash and restart scenario.
But actually this is just starting one new timeline per (successful)
archive recovery, and each one of those will require manual intervention
to kick it off; so it's pretty much impossible to believe an
installation would ever create an untenable number of timelines.

regards, tom lane

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] Turn off vacuum in pgbench?

2007-09-28 Thread Tom Lane
Bruce Momjian <[EMAIL PROTECTED]> writes:
> Tom Lane wrote:
>> I'd be inclined to leave it there, simply because you'll be changing
>> the conditions of the benchmark if you take it out.  I have not noticed
>> any particular problems with it...

> I wonder if autovacuum itself is going to add more variability to the
> test (like we don't have enough already).

Of course it will, which means that people will likely turn off autovac
when trying to obtain repeatable pgbench numbers, which is another
reason not to take out the built-in vacuum step.

regards, tom lane

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] Turn off vacuum in pgbench?

2007-09-28 Thread Bruce Momjian
Tom Lane wrote:
> Tatsuo Ishii <[EMAIL PROTECTED]> writes:
> > Now that PostgreSQL 8.3 enables autovacuum by default, I think pgbench
> > should stop issuing vacuum in pgbench -i since an ordinary vacuum will
> > take very long time under autovacuum running. If there's no objection,
> > I will remove vacuum from pgbench.
> 
> I'd be inclined to leave it there, simply because you'll be changing
> the conditions of the benchmark if you take it out.  I have not noticed
> any particular problems with it...

I wonder if autovacuum itself is going to add more variability to the
test (like we don't have enough already).

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>http://momjian.us
  EnterpriseDB http://postgres.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Enforcing database encoding and locale match

2007-09-28 Thread Tom Lane
Zdenek Kotala <[EMAIL PROTECTED]> writes:
> The another question is what do when we know that this codeset/encoding 
> is not supported by postgres.

Ah, I finally grasped what you were on about here.  As CVS HEAD stands,
if you run initdb in an unrecognized locale, you get something like

$ LANG=zh_CN.GB18030 initdb
The files belonging to this database system will be owned by user "tgl".
This user must also own the server process.

The database cluster will be initialized with locale zh_CN.GB18030.
could not determine encoding for locale "zh_CN.GB18030": codeset is "GB18030"
initdb: could not find suitable encoding for locale "zh_CN.GB18030"
Rerun initdb with the -E option.
Try "initdb --help" for more information.
$

which is OK, but if you override it incorrectly, it'll let you do so:

$ LANG=zh_CN.GB18030 initdb -E utf8
The files belonging to this database system will be owned by user "tgl".
This user must also own the server process.

The database cluster will be initialized with locale zh_CN.GB18030.
could not determine encoding for locale "zh_CN.GB18030": codeset is "GB18030"
... but it presses merrily along ...

leading to a database which is in fact broken.

To prevent this, I think it would be sufficient to add entries to the
table for our known frontend-only encodings.  It's reasonable to assume
that anyone who wants to run Postgres will probably have a default
locale that uses *some* encoding that we support; otherwise he's going
to have a pretty unpleasant experience anyway.  If the function returns
a frontend-only encoding value then initdb will fail in a good way,
since it won't let the user select that as a database encoding.  So I
don't think we need an explicit concept of an unsupported encoding in
the table, just some more entries.

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Enforcing database encoding and locale match

2007-09-28 Thread Tom Lane
Zdenek Kotala <[EMAIL PROTECTED]> writes:
> On Solaris I got following problematic locales: [...]

I tried this program on Mac OS X 10.4.10 (the current release) and found
out that what that OS mostly returns is the encoding portion of the
locale name, for instance

sv_SE.ISO8859-15... ISO8859-15 - OK
sv_SE.UTF-8 ... UTF-8  - OK
tr_TR   ...- NO MATCH
tr_TR.ISO8859-9 ... ISO8859-9  - OK
tr_TR.UTF-8 ... UTF-8  - OK
uk_UA   ...- NO MATCH
uk_UA.ISO8859-5 ... ISO8859-5  - OK
uk_UA.KOI8-U... KOI8-U - NO MATCH
uk_UA.UTF-8 ... UTF-8  - OK
zh_CN   ...- NO MATCH
zh_CN.eucCN ... eucCN  - OK
zh_CN.GB18030   ... GB18030- NO MATCH
zh_CN.GB2312... GB2312 - OK
zh_CN.GBK   ... GBK- NO MATCH
zh_CN.UTF-8 ... UTF-8  - OK
zh_HK   ...- NO MATCH
zh_HK.Big5HKSCS ... Big5HKSCS  - NO MATCH
zh_HK.UTF-8 ... UTF-8  - OK
zh_TW   ...- NO MATCH
zh_TW.Big5  ... Big5   - NO MATCH
zh_TW.UTF-8 ... UTF-8  - OK
C   ... US-ASCII   - NO MATCH
POSIX   ... US-ASCII   - NO MATCH

They didn't *quite* hard-wire it that way, as evidenced by the C/POSIX
results, but certainly the empty-string results are entirely useless.
Perhaps we should file a bug with Apple.  However, some poking around
in /usr/share/locale indicates that there's a consistent interpretation
to be made:

g42:/usr/share/locale tgl$ ls -l ??_??/LC_CTYPE
lrwxr-xr-x   1 root  wheel17 Apr 26  2006 af_ZA/LC_CTYPE@ -> 
../UTF-8/LC_CTYPE
-r--r--r--   1 root  wheel  3272 Mar 20  2005 am_ET/LC_CTYPE
lrwxr-xr-x   1 root  wheel17 Apr 26  2006 be_BY/LC_CTYPE@ -> 
../UTF-8/LC_CTYPE
lrwxr-xr-x   1 root  wheel17 Apr 26  2006 bg_BG/LC_CTYPE@ -> 
../UTF-8/LC_CTYPE
lrwxr-xr-x   1 root  wheel17 Apr 26  2006 ca_ES/LC_CTYPE@ -> 
../UTF-8/LC_CTYPE
lrwxr-xr-x   1 root  wheel17 Apr 26  2006 cs_CZ/LC_CTYPE@ -> 
../UTF-8/LC_CTYPE
lrwxr-xr-x   1 root  wheel17 Apr 26  2006 da_DK/LC_CTYPE@ -> 
../UTF-8/LC_CTYPE
lrwxr-xr-x   1 root  wheel17 Apr 26  2006 de_AT/LC_CTYPE@ -> 
../UTF-8/LC_CTYPE
lrwxr-xr-x   1 root  wheel17 Apr 26  2006 de_CH/LC_CTYPE@ -> 
../UTF-8/LC_CTYPE
lrwxr-xr-x   1 root  wheel17 Apr 26  2006 de_DE/LC_CTYPE@ -> 
../UTF-8/LC_CTYPE
lrwxr-xr-x   1 root  wheel17 Apr 26  2006 el_GR/LC_CTYPE@ -> 
../UTF-8/LC_CTYPE
lrwxr-xr-x   1 root  wheel17 Apr 26  2006 en_AU/LC_CTYPE@ -> 
../UTF-8/LC_CTYPE
lrwxr-xr-x   1 root  wheel17 Apr 26  2006 en_CA/LC_CTYPE@ -> 
../UTF-8/LC_CTYPE
(etc etc)

The only one that's not actually a symlink to the standard UTF-8 ctype
is am_ET/LC_CTYPE, which is identical to am_ET.UTF-8/LC_CTYPE.
So I think we can get away with something like

#ifdef __darwin__
if (strlen(sys) == 0)
// assume UTF8
#endif

I suppose we'll need a few more hacks like this as the beta-test results
begin to roll in ...

regards, tom lane

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PATCHES] [HACKERS] Add function for quote_qualified_identifier?

2007-09-28 Thread Bruce Momjian
Tom Lane wrote:
> "Brendan Jurd" <[EMAIL PROTECTED]> writes:
> > Patch includes documentation and new regression tests.  While I was in
> > there I also added regression tests for quote_ident(), which appeared
> > to be absent.
> 
> This seems rather pointless, since it's equivalent to
>   quote_ident(schemaname) || '.' || quote_ident(relname).

Has anyone every asked for this functionality?

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>http://momjian.us
  EnterpriseDB http://postgres.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Getting to 8.3 beta1

2007-09-28 Thread Bricklen Anderson

Simon Riggs wrote:

...knock-on...
tackle


Been watching the Rugby World Cup?  :)

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [HACKERS] TODO/exotic features/sql*net

2007-09-28 Thread Bruce Momjian
Peter Eisentraut wrote:
> Am Freitag, 21. September 2007 schrieb Abhijit Menon-Sen:
> > Regarding this item in the TODO:
> >
> > SQL*Net listener that makes PostgreSQL appear as an Oracle database
> > to clients
> 
> > (IMO, the TODO item should be dropped.)
> 
> Yeah, if at all, this should be an external proxy server.

Removed from TODO.  Thanks.

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>http://momjian.us
  EnterpriseDB http://postgres.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] Enforcing database encoding and locale match

2007-09-28 Thread Zdenek Kotala

Tom Lane wrote:

Zdenek Kotala <[EMAIL PROTECTED]> writes:

On Solaris I got following problematic locales:



C   ... 646- NO MATCH
POSIX   ... 646- NO MATCH
cs  ... 646- NO MATCH
da  ... 646- NO MATCH
et  ... 646- NO MATCH
it  ... 646- NO MATCH
ja_JP.PCK   ... PCK- NO MATCH
ko  ... 646- NO MATCH
no  ... 646- NO MATCH
ru  ... 646- NO MATCH
sl  ... 646- NO MATCH
sv  ... 646- NO MATCH
tr  ... 646- NO MATCH
zh.GBK  ... GBK- NO MATCH
zh_CN.GB18030   ... GB18030- NO MATCH
[EMAIL PROTECTED]... GB18030- NO MATCH
[EMAIL PROTECTED]   ... GB18030- NO MATCH
[EMAIL PROTECTED]... GB18030- NO MATCH
zh_CN.GBK   ... GBK- NO MATCH
[EMAIL PROTECTED]... GBK- NO MATCH
[EMAIL PROTECTED]   ... GBK- NO MATCH
[EMAIL PROTECTED]... GBK- NO MATCH


Not sure what 646 or PCK are, but we don't need to worry about GB18030
or GBK, because those aren't allowed backend encodings.



PCK is Japanese Shift-JIS encoding. (see
http://www.inter-locale.com/whitepaper/learn/learn_to_type.html)

http://en.wikipedia.org/wiki/Shift_JIS

646 looks like ISO646. I will check it.

http://en.wikipedia.org/wiki/ISO646



The another question is what do when we know that this codeset/encoding 
is not supported by postgres.


I don't really see a need to worry about this case.  The proposed encoding
will already have been checked to be sure it's one that the backend supports.
All we need is to be able to recognize any variant spelling of the
encodings we allow.


OK. Maybe would be good put mapping into text file (e.g. encoding.map) 
into share directory. (Similar to tz_abbrev)


Zdenek

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [HACKERS] Getting to 8.3 beta1

2007-09-28 Thread Simon Riggs
On Thu, 2007-09-27 at 13:01 -0400, Tom Lane wrote:
> Simon Riggs <[EMAIL PROTECTED]> writes:
> > AFAICS the correct test would be
> > if (InArchiveRecovery)
> > since needNewTimeLine can only be true iff InArchiveRecovery is true.
> 
> > It's often a good idea to disable archive_mode when doing a recovery to
> > avoid trying to send files to the same archive as the primary, which
> > would then also fail. So requiring XLogArchivingActive() also may not be
> > desirable.
> 
> Well, that I think is exactly the core of the issue: the "input" archive
> area might or might not be the same as the "output" one.  If they're
> different then this isn't a critical problem; but we have no good way
> to know that.
> 
> But your simplification may be a good idea anyway --- the fewer
> behaviors to think about, the better.

Amen to that.

> Anyway, if you can test this tomorrow that'll be great.  I have enough
> other things to do today ...

Looks good to me. I was and am still nervous of weird knock-on effects,
but I think its the right patch to apply.

There's a whole wedge of new functionality there, so beta should be fun.

We need to reword the doc section about time travel, but I'll let you
tackle that bit, at least for now.

-- 
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com


---(end of broadcast)---
TIP 6: explain analyze is your friend


[HACKERS] msvc >= VC7 understands __FUNCTION__

2007-09-28 Thread Hannes Eder

Hi,

Starting from version VC7 msvc supports __FUNCTION__, so I think this 
could be enabled in pg_config.h.win32, see attached diff.


-Hannes




*** ../pgsql-cvshead/src/include/pg_config.h.win32	Mon Apr 16 20:39:19 2007
--- src/include/pg_config.h.win32	Fri Sep 28 22:32:50 2007
***
*** 4,12 
  /* This file is generated from MingW ./configure, and with the following
   * changes to be valid for Visual C++ (and compatible):
   *
!  * HAVE_CBRT, HAVE_FUNCNAME_FUNC, HAVE_FUNCNAME_FUNCTION, HAVE_GETOPT,
!  * HAVE_GETOPT_H, HAVE_GETOPT_LONG, HAVE_RINT, HAVE_STRINGS_H, 
!  * HAVE_STRTOLL, HAVE_STRTOULL, HAVE_STRUCT_OPTION, ENABLE_THREAD_SAFETY
   *
   */
  
--- 4,12 
  /* This file is generated from MingW ./configure, and with the following
   * changes to be valid for Visual C++ (and compatible):
   *
!  * HAVE_CBRT, HAVE_FUNCNAME_FUNC, HAVE_GETOPT, HAVE_GETOPT_H,
!  * HAVE_GETOPT_LONG, HAVE_RINT, HAVE_STRINGS_H, HAVE_STRTOLL,
!  * HAVE_STRTOULL, HAVE_STRUCT_OPTION, ENABLE_THREAD_SAFETY
   *
   */
  
***
*** 134,140 
  //#define HAVE_FUNCNAME__FUNC 1
  
  /* Define to 1 if your compiler understands __FUNCTION__. */
! #undef HAVE_FUNCNAME__FUNCTION
  
  /* Define to 1 if you have getaddrinfo(). */
  /* #undef HAVE_GETADDRINFO */
--- 134,140 
  //#define HAVE_FUNCNAME__FUNC 1
  
  /* Define to 1 if your compiler understands __FUNCTION__. */
! #define HAVE_FUNCNAME__FUNCTION 1
  
  /* Define to 1 if you have getaddrinfo(). */
  /* #undef HAVE_GETADDRINFO */

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Enforcing database encoding and locale match

2007-09-28 Thread Tom Lane
Zdenek Kotala <[EMAIL PROTECTED]> writes:
> On Solaris I got following problematic locales:

> C   ... 646- NO MATCH
> POSIX   ... 646- NO MATCH
> cs  ... 646- NO MATCH
> da  ... 646- NO MATCH
> et  ... 646- NO MATCH
> it  ... 646- NO MATCH
> ja_JP.PCK   ... PCK- NO MATCH
> ko  ... 646- NO MATCH
> no  ... 646- NO MATCH
> ru  ... 646- NO MATCH
> sl  ... 646- NO MATCH
> sv  ... 646- NO MATCH
> tr  ... 646- NO MATCH
> zh.GBK  ... GBK- NO MATCH
> zh_CN.GB18030   ... GB18030- NO MATCH
> [EMAIL PROTECTED]... GB18030- NO MATCH
> [EMAIL PROTECTED]   ... GB18030- NO MATCH
> [EMAIL PROTECTED]... GB18030- NO MATCH
> zh_CN.GBK   ... GBK- NO MATCH
> [EMAIL PROTECTED]... GBK- NO MATCH
> [EMAIL PROTECTED]   ... GBK- NO MATCH
> [EMAIL PROTECTED]... GBK- NO MATCH

Not sure what 646 or PCK are, but we don't need to worry about GB18030
or GBK, because those aren't allowed backend encodings.

> The another question is what do when we know that this codeset/encoding 
> is not supported by postgres.

I don't really see a need to worry about this case.  The proposed encoding
will already have been checked to be sure it's one that the backend supports.
All we need is to be able to recognize any variant spelling of the
encodings we allow.

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Enforcing database encoding and locale match

2007-09-28 Thread Zdenek Kotala

Tom Lane wrote:

Andrew Dunstan <[EMAIL PROTECTED]> writes:

Gregory Stark wrote:

"Tom Lane" <[EMAIL PROTECTED]> writes:

Another possibility is to treat the case as a WARNING if you're
superuser and an ERROR if you're not.  This would satisfy people
who are uncomfortable with the idea that CREATEDB privilege comes
with a built-in denial-of-service attack, while still leaving a
loophole for anyone for whom the test didn't work properly.

That sounds like a good combination

+1


After further experimentation I want to change the proposal a bit.
AFAICS, if we recognize the nl_langinfo(CODESET) result, there is
no reason not to trust the answer, so we might as well throw an
error always.  


Agree. Code seems to be OK and on POSIX compatible OS it should be work. 
 I attached testing code. With following command


 for LOCALE in `locale -a`; do ./a.out $LOCALE ; done

is should be possible to verify status on all unix OS.

On Solaris I got following problematic locales:

C   ... 646- NO MATCH
POSIX   ... 646- NO MATCH
cs  ... 646- NO MATCH
da  ... 646- NO MATCH
et  ... 646- NO MATCH
it  ... 646- NO MATCH
ja_JP.PCK   ... PCK- NO MATCH
ko  ... 646- NO MATCH
no  ... 646- NO MATCH
ru  ... 646- NO MATCH
sl  ... 646- NO MATCH
sv  ... 646- NO MATCH
tr  ... 646- NO MATCH
zh.GBK  ... GBK- NO MATCH
zh_CN.GB18030   ... GB18030- NO MATCH
[EMAIL PROTECTED]... GB18030- NO MATCH
[EMAIL PROTECTED]   ... GB18030- NO MATCH
[EMAIL PROTECTED]... GB18030- NO MATCH
zh_CN.GBK   ... GBK- NO MATCH
[EMAIL PROTECTED]... GBK- NO MATCH
[EMAIL PROTECTED]   ... GBK- NO MATCH
[EMAIL PROTECTED]... GBK- NO MATCH



The case that is problematic is where we can get a
CODESET string but we don't recognize it.  In this case it seems
appropriate to do

ereport(WARNING,
(errmsg("could not determine encoding for locale \"%s\": codeset is 
\"%s\"",
ctype, sys),
 errdetail("Please report this to <[EMAIL PROTECTED]>.")));

and then let the user do what he wants.


The another question is what do when we know that this codeset/encoding 
is not supported by postgres. Maybe extend encoding match structure to


struct encoding_match
{
enum pg_enc pg_enc_code;
const char *system_enc_name;
bool supported;
};

and in case when it is unsupported then generates error. In case when 
codeset does not match anyway then generates only warning.



Zdenek
#include 
#include 
#include "postgres_fe.h"
//#include "miscadmin.h"
#include "mb/pg_wchar.h"

/*
 * Checks whether the encoding selected for PostgreSQL and the
 * encoding used by the system locale match.
 */

struct encoding_match
{
	enum pg_enc pg_enc_code;
	const char *system_enc_name;
};

static const struct encoding_match encoding_match_list[] = {
	{PG_EUC_JP, "EUC-JP"},
	{PG_EUC_JP, "eucJP"},
	{PG_EUC_JP, "IBM-eucJP"},
	{PG_EUC_JP, "sdeckanji"},

	{PG_EUC_CN, "EUC-CN"},
	{PG_EUC_CN, "eucCN"},
	{PG_EUC_CN, "IBM-eucCN"},
	{PG_EUC_CN, "GB2312"},
	{PG_EUC_CN, "dechanzi"},

	{PG_EUC_KR, "EUC-KR"},
	{PG_EUC_KR, "eucKR"},
	{PG_EUC_KR, "IBM-eucKR"},
	{PG_EUC_KR, "deckorean"},
	{PG_EUC_KR, "5601"},

	{PG_EUC_TW, "EUC-TW"},
	{PG_EUC_TW, "eucTW"},
	{PG_EUC_TW, "IBM-eucTW"},
	{PG_EUC_TW, "cns11643"},

#ifdef NOT_VERIFIED
	{PG_JOHAB, "???"},
#endif

	{PG_UTF8, "UTF-8"},
	{PG_UTF8, "utf8"},

	{PG_LATIN1, "ISO-8859-1"},
	{PG_LATIN1, "ISO8859-1"},
	{PG_LATIN1, "iso88591"},

	{PG_LATIN2, "ISO-8859-2"},
	{PG_LATIN2, "ISO8859-2"},
	{PG_LATIN2, "iso88592"},

	{PG_LATIN3, "ISO-8859-3"},
	{PG_LATIN3, "ISO8859-3"},
	{PG_LATIN3, "iso88593"},

	{PG_LATIN4, "ISO-8859-4"},
	{PG_LATIN4, "ISO8859-4"},
	{PG_LATIN4, "iso88594"},

	{PG_LATIN5, "ISO-8859-9"},
	{PG_LATIN5, "ISO8859-9"},
	{PG_LATIN5, "iso88599"},

	{PG_LATIN6, "ISO-8859-10"},
	{PG_LATIN6, "ISO8859-10"},
	{PG_LATIN6, "iso885910"},

	{PG_LATIN7, "ISO-8859-13"},
	{PG_LATIN7, "ISO8859-13"},
	{PG_LATIN7, "iso885913"},

	{PG_LATIN8, "ISO-8859-14"},
	{PG_LATIN8, "ISO8859-14"},
	{PG_LATIN8, "iso885914"},

	{PG_LATIN9, "ISO-8859-15"},
	{PG_LATIN9, "ISO8859-15"},
	{PG_LATIN9, "iso885915"},

	{PG_LATIN10, "ISO-8859-16"},
	{PG_LATIN10, "ISO8859-16"},
	{PG_LATIN10, "iso885916"},

	{PG_WIN1252, "CP1252"},
	{PG_WIN1253, "CP1253"},
	{PG_WIN1254, "CP1254"},
	{PG_WIN1255, "CP1255"},
	{PG_WIN1256, "CP1256"},
	{PG_WIN1257, "CP1257"},
	{PG_WIN1258, "CP1258"},
#ifdef NOT_VERIFIED
	{PG_WIN874, "???"},
#endif
	{PG_KOI8R, "KOI8-R"},
	{PG_WIN1251, "CP1251"},
	{PG_WIN866, "CP866"},

	{PG_ISO_8859_5, "ISO-8859-5"},
	{PG_ISO_8859_5, "ISO8859-5"},
	{PG_

Re: [HACKERS] Enforcing database encoding and locale match

2007-09-28 Thread Tom Lane
Andrew Dunstan <[EMAIL PROTECTED]> writes:
> Gregory Stark wrote:
>> "Tom Lane" <[EMAIL PROTECTED]> writes:
>>> Another possibility is to treat the case as a WARNING if you're
>>> superuser and an ERROR if you're not.  This would satisfy people
>>> who are uncomfortable with the idea that CREATEDB privilege comes
>>> with a built-in denial-of-service attack, while still leaving a
>>> loophole for anyone for whom the test didn't work properly.
>> 
>> That sounds like a good combination
> +1

After further experimentation I want to change the proposal a bit.
AFAICS, if we recognize the nl_langinfo(CODESET) result, there is
no reason not to trust the answer, so we might as well throw an
error always.  The case that is problematic is where we can get a
CODESET string but we don't recognize it.  In this case it seems
appropriate to do

ereport(WARNING,
(errmsg("could not determine encoding for locale \"%s\": codeset is 
\"%s\"",
ctype, sys),
 errdetail("Please report this to <[EMAIL PROTECTED]>.")));

and then let the user do what he wants.

There need to be two exceptions to the error-on-mismatch policy.

First off, if the locale is C/POSIX then we can allow any encoding.

Second, it appears that we have to allow SQL_ASCII encoding to be
selected regardless of locale; if we don't, the "make installcheck"
regression tests fail, because they try to do exactly that; and I'm
sure that there are other users out there who don't (think they)
care about encoding.  This is not quite as bad as the generic mismatch
case, because the backend will never try to do encoding conversion
and so the recursive-error panic can't happen.  But you could still
have unexpected sorting behavior and probably index corruption.

What I propose is that we allow SQL_ASCII databases to be created
when the locale is not C, but only by superusers.

Comments?

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] Hash index todo list item

2007-09-28 Thread Bruce Momjian

This has been saved for the 8.4 release:

http://momjian.postgresql.org/cgi-bin/pgpatches_hold

---

Tom Raney wrote:
> We are pleased to announce an upcoming patch to the hash index code
> which improves build time and index size, based on this item in the
> TODO list:
> During index creation, pre-sort the tuples to improve build speed
> http://archives.postgresql.org/pgsql-hackers/2007-03/msg01199.php
> 
> Using our implementation, build times and index sizes are
> comparable with btree index build times and index sizes.
> For example, for a particular 12 million row relation, the
> 8.2.4 release requires over 2.8 hours to build the hash index. 
> With our patch, the index is built in 80 seconds.
> Here is the link for a graph, showing a comparative analysis of
> btree and hash index builds and describing the benchmark data.
> http://web.cecs.pdx.edu/~raneyt/pg/
> 
> We are currently cleaning up the patch and will submit it asap.
> 
> Regards,
> Shreya Bhargava <[EMAIL PROTECTED]>
> Tom Raney <[EMAIL PROTECTED]>
> 
> 
> Kenneth Marshall wrote:
> > Dear PostgreSQL Hackers:
> >
> > After following the hackers mailing list for quite a while,
> > I am going to start investigating what will need to be done
> > to improve hash index performance. Below are the pieces of
> > this project that I am currently considering:
> >
> > 1. Characterize the current hash index implementation against
> >the BTree index, with a focus on space utilization and
> >lookup performance against a collection of test data. This
> >will give a baseline performance test to evaluate the impact
> >of changes. I initially do not plan to bench the hash creation
> >process since my initial focus will be on lookup performance.
> >
> > 2. Evaluate the performance of different hash index implementations
> >and/or changes to the current implementation. My current plan is
> >to keep the implementation as simple as possible and still provide
> >the desired performance. Several hash index suggestions deal with
> >changing the layout of the keys on a page to improve lookup
> >performance, including reducing the bucket size to a fraction of
> >a page or only storing the hash value on the page, instead of
> >the index value itself. My goal in this phase is to produce one
> >or more versions with better performance than the current BTree.
> >
> > 3. Look at build time and concurrency issues with the addition of
> >some additional tests to the test bed. (1)
> >
> > 4. Repeat as needed.
> >
> > This is the rough plan. Does anyone see anything critical that
> > is missing at this point? Please send me any suggestions for test
> > data and various performance test ideas, since I will be working
> > on that first.
> >
> > Regards,
> > Ken Marshall 
> >
> > ---(end of broadcast)---
> > TIP 5: don't forget to increase your free space map settings
> >
> 
> 
> ---(end of broadcast)---
> TIP 2: Don't 'kill -9' the postmaster

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>http://momjian.us
  EnterpriseDB http://postgres.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] [FEATURE REQUEST] Streaming Onlinebackup (Maybe OFFTOPIC)

2007-09-28 Thread Kevin Grittner
>>> On Thu, Sep 27, 2007 at  4:59 PM, in message
<[EMAIL PROTECTED]>, "Kevin Grittner"
<[EMAIL PROTECTED]> wrote: 
>  
> By the way, I realize that the error messages are still lame.
> I'm going to do something about that.
 
Attached is a version as good as I know how to get it.
It works for us, so barring any problems as we use it, I'm done.
 
I confirmed with management that this code can be contributed to the
PostgreSQL community at large.  It was written by myself as an employee
of the Wisconsin Supreme Court, Consolidated Court Automation Programs.
It is is distributed under the terms of the license of the University of
California as currently referenced here:
 
http://www.postgresql.org/docs/8.2/interactive/LEGALNOTICE.html
 
The only other code I looked at to derive technique was also distributed
under that license.  I gratefully acknowledge the examples provided by the
authors of the code I examined: Tom Lane and Kevin Fall; although any
errors are my own.
 
I hope that others may find this filter useful.
 
-Kevin J. Grittner
 




pg_clearxlogtail.c
Description: Binary data

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] [FEATURE REQUEST] Streaming Onlinebackup (MaybeOFFTOPIC)

2007-09-28 Thread Tom Lane
"Kevin Grittner" <[EMAIL PROTECTED]> writes:
> Not quite as good.  Since the archiver process can't actually deliver
> this number in a lightweight manner, all it goes to show is that the
> filter code compares reasonably well in performance with dd and cat.

I'd definitely vote for leaving it as a filter, given that there's
not a large performance penalty for that.  It just seems a lot safer
and cleaner in that form.

regards, tom lane

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Enforcing database encoding and locale match

2007-09-28 Thread Andrew Dunstan



Gregory Stark wrote:

"Tom Lane" <[EMAIL PROTECTED]> writes:

  

Another possibility is to treat the case as a WARNING if you're
superuser and an ERROR if you're not.  This would satisfy people
who are uncomfortable with the idea that CREATEDB privilege comes
with a built-in denial-of-service attack, while still leaving a
loophole for anyone for whom the test didn't work properly.



That sounds like a good combination

  


+1

cheers

andrew

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [pgsql-packagers] [HACKERS] Getting to 8.3 beta1

2007-09-28 Thread Zdenek Kotala

Heikki Linnakangas wrote:

Zdenek Kotala wrote:

I'm Sorry for confusion, I overlooked it. You have right. Unfortunately
struct Port has been modified and by my opinion it means we must bump
major version. See
http://developer.postgresql.org/cvsweb.cgi/pgsql/src/include/libpq/libpq-be.h.diff?r1=1.62;r2=1.63


That header file is *not* part of the libpq interface, see the comment
at the beginning of the file. So no major version bump required.



I see. You are right. I rechecked also install and this file is not 
delivered. OK green for minor bumping.



Zdenek

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [pgsql-packagers] [HACKERS] Getting to 8.3 beta1

2007-09-28 Thread Tom Lane
Heikki Linnakangas <[EMAIL PROTECTED]> writes:
> Zdenek Kotala wrote:
>> struct Port has been modified and by my opinion it means we must bump
>> major version. See
>> http://developer.postgresql.org/cvsweb.cgi/pgsql/src/include/libpq/libpq-be.h.diff?r1=1.62;r2=1.63

> That header file is *not* part of the libpq interface, see the comment
> at the beginning of the file. So no major version bump required.

Right, and even pqcomm.h (to which the comment refers) is material that
is known to libpq (and other implementors of the FE/BE protocol), but is
not exposed to client applications.

This comment in pqcomm.c might be helpful:

 * At one time, libpq was shared between frontend and backend, but now
 * the backend's "backend/libpq" is quite separate from "interfaces/libpq".
 * All that remains is similarities of names to trap the unwary...


The only material that is officially part of the libpq client API is
libpq-fe.h (and postgres_ext.h which it includes).  Anyone including
other headers does so at their own risk.

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Enforcing database encoding and locale match

2007-09-28 Thread Gregory Stark
"Tom Lane" <[EMAIL PROTECTED]> writes:

> Another possibility is to treat the case as a WARNING if you're
> superuser and an ERROR if you're not.  This would satisfy people
> who are uncomfortable with the idea that CREATEDB privilege comes
> with a built-in denial-of-service attack, while still leaving a
> loophole for anyone for whom the test didn't work properly.

That sounds like a good combination

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] [FEATURE REQUEST] Streaming Onlinebackup (MaybeOFFTOPIC)

2007-09-28 Thread Kevin Grittner
>>> On Fri, Sep 28, 2007 at  9:38 AM, in message
<[EMAIL PROTECTED]>, "Kevin Grittner"
<[EMAIL PROTECTED]> wrote: 
 On Fri, Sep 28, 2007 at  5:53 AM, in message
> <[EMAIL PROTECTED]>, "Zeugswetter
> Andreas ADI SD" <[EMAIL PROTECTED]> wrote: 
 
>> archive_command=dd if=%p of=/backup/WAL/%f bs=1 count=%b
 
> I think
> your example would need to cat the dd you showed with one which drew
> from /dev/zero.  I'll run a few tests with full and nearly empty files
> using hand-generated values and see how the performance of this in
> front of gzip compares to the filter.
 
After Tom's email, this is pretty academic; but here are the results
for our "best case" example:
 
pg_clearxlogtail | gzip:
real0m0.132s
user0m0.119s
sys 0m0.024s
 
(dd if=00010004001A bs=1 count=132 ; dd if=/dev/zero bs=1 
count=16777084) | gzip > ../kjgtest2/00010004001A.2.gz
132+0 records in
132+0 records out
16777084+0 records in
16777084+0 records out

real0m19.243s
user0m3.211s
sys 0m27.135s
 
That's a lot worse.  I switched the bs and count:
 
(dd if=00010004001A bs=132 count=1 ; dd if=/dev/zero bs=16777084 
count=1) | gzip > ../kjgtest2/00010004001A.3.gz
1+0 records in
1+0 records out
1+0 records in
1+0 records out

real0m0.196s
user0m0.173s
sys 0m0.025s
 
The filter code still wins.
 
The "worst case" example:
 
pg_clearxlogtail | gzip:
real0m1.073s
user0m1.018s
sys 0m0.063s
gz size: 4554307
 
ADAMS-PG:/var/pgsql/data/kjgtest # time dd if=0001000300F0 
bs=16777216 count=1 | gzip > ../kjgtest2/0001000300F0.3.gz
1+0 records in
1+0 records out
 
Marginal improvement.
 
real0m1.001s
user0m0.923s
sys 0m0.081s
 
ADAMS-PG:/var/pgsql/data/kjgtest # time cat 0001000300F0 | gzip > 
../kjgtest2/0001000300F0.4.gz

real0m1.109s
user0m1.055s
sys 0m0.062s
 
Not quite as good.  Since the archiver process can't actually deliver
this number in a lightweight manner, all it goes to show is that the
filter code compares reasonably well in performance with dd and cat.
 
-Kevin
 


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


[HACKERS] Enforcing database encoding and locale match

2007-09-28 Thread Tom Lane
I was reminded again just now of the bad consequences of selecting a
database encoding that is not compatible with your LC_CTYPE setting:
http://archives.postgresql.org/pgsql-bugs/2007-09/msg00158.php
Aside from that one, which is perilously close to being a denial of
service attack, there are known problems with sorting, upper()/lower()
behavior, etc etc.  We're going to keep hearing those types of
complaints until we do something about enforcing that people don't use
an incompatible encoding.

This has been discussed before, of course, and has foundered on the
problem that there's no very reliable/portable way to determine what
encoding is implied by LC_CTYPE.  We do have code in initdb that
purports to determine this on common platforms, but I've never trusted
it very much, because it isn't stressed hard in common use.  So the
problem is how to develop some trust in it.

It occurs me that what we could do is put that code into CREATE
DATABASE, but have it throw a WARNING not an ERROR if it thinks the
encoding doesn't match the locale.  That would be sufficiently in
people's faces that we'd hear about it if it didn't work.  After a
release cycle or so of not hearing complaints, we could promote the
warning to an error.

Another possibility is to treat the case as a WARNING if you're
superuser and an ERROR if you're not.  This would satisfy people
who are uncomfortable with the idea that CREATEDB privilege comes
with a built-in denial-of-service attack, while still leaving a
loophole for anyone for whom the test didn't work properly.

Comments?

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [pgsql-packagers] [HACKERS] Getting to 8.3 beta1

2007-09-28 Thread Andrew Dunstan



Zdenek Kotala wrote:

Stephen Frost wrote:

* [EMAIL PROTECTED] ([EMAIL PROTECTED]) wrote:
I'm for bumbing. Because if we use same number it also means that 
new binary will able to use old library. But if there are two new 
functions number must be increased. Standard practice how ELF loader 
works is following:


Each library could have tree numbers libxxx-X.Y.Z. Loader/Linker 
ignores Z number. It means any binaries can be linked e.g. with 
X.Y.Z+1 or X.Y.Z-1. This is used for bugfixing. Middle number Y 
means that binaries which requires Y can also use Y+1 (and linker 
takes it), but not Y-1. It is used for adding new thing into 
interface - backward compatible. Change in major number X means it 
is not backward compatible libraries.


Right, so bump the minor and leave the major (and the overall 'soname')
the same.

In PostgreSQL perspective, we use only major number.  We can 
increase main number (X) or best way is add Y and keep major number 
same. But I don't know if it is possible in current infrastructure 
and if it will work everywhere.


I'm confused by this.  I see both in Makefile.shlib and on my system
that we have a minor version so I don't entirely follow when you say "we
use only major number".  


I'm Sorry for confusion, I overlooked it. You have right. 
Unfortunately struct Port has been modified and by my opinion it means 
we must bump major version. See 
http://developer.postgresql.org/cvsweb.cgi/pgsql/src/include/libpq/libpq-be.h.diff?r1=1.62;r2=1.63 



   


Uh, that's the backend, not the client lib, no?

cheers

andrew

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [pgsql-packagers] [HACKERS] Getting to 8.3 beta1

2007-09-28 Thread Heikki Linnakangas
Zdenek Kotala wrote:
> I'm Sorry for confusion, I overlooked it. You have right. Unfortunately
> struct Port has been modified and by my opinion it means we must bump
> major version. See
> http://developer.postgresql.org/cvsweb.cgi/pgsql/src/include/libpq/libpq-be.h.diff?r1=1.62;r2=1.63

That header file is *not* part of the libpq interface, see the comment
at the beginning of the file. So no major version bump required.

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [pgsql-packagers] [HACKERS] Getting to 8.3 beta1

2007-09-28 Thread Zdenek Kotala

Stephen Frost wrote:

* [EMAIL PROTECTED] ([EMAIL PROTECTED]) wrote:
I'm for bumbing. Because if we use same number it also means that new 
binary will able to use old library. But if there are two new functions 
number must be increased. Standard practice how ELF loader works is 
following:


Each library could have tree numbers libxxx-X.Y.Z. Loader/Linker ignores Z 
number. It means any binaries can be linked e.g. with X.Y.Z+1 or X.Y.Z-1. 
This is used for bugfixing. Middle number Y means that binaries which 
requires Y can also use Y+1 (and linker takes it), but not Y-1. It is used 
for adding new thing into interface - backward compatible. Change in major 
number X means it is not backward compatible libraries.


Right, so bump the minor and leave the major (and the overall 'soname')
the same.

In PostgreSQL perspective, we use only major number.  We can increase main 
number (X) or best way is add Y and keep major number same. But I don't 
know if it is possible in current infrastructure and if it will work 
everywhere.


I'm confused by this.  I see both in Makefile.shlib and on my system
that we have a minor version so I don't entirely follow when you say "we
use only major number".  


I'm Sorry for confusion, I overlooked it. You have right. Unfortunately 
struct Port has been modified and by my opinion it means we must bump 
major version. See 
http://developer.postgresql.org/cvsweb.cgi/pgsql/src/include/libpq/libpq-be.h.diff?r1=1.62;r2=1.63 



Zdenek

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [HACKERS] [FEATURE REQUEST] Streaming Onlinebackup (MaybeOFFTOPIC)

2007-09-28 Thread Kevin Grittner
>>> On Fri, Sep 28, 2007 at  5:53 AM, in message
<[EMAIL PROTECTED]>, "Zeugswetter
Andreas ADI SD" <[EMAIL PROTECTED]> wrote: 
 
> I think you misunderstood what I meant.
> The actual archive command is constructed by expanding certain
> placeholders.
> I am suggesting to add such a placeholder for the size of the filled
> part of the log.
> 
> A hypothetical example (note suggested %b placeholder for size in
> bytes):
> archive_command=dd if=%p of=/backup/WAL/%f bs=1 count=%b
> 
> This allows to avoid unnecessary io for the backup of partially filled
> logs.
 
I did understand what you were suggesting regarding the size placeholder.
What didn't click is that the filter might not be necessary at all if we
had that.  Thanks for clarifying that with an example.
 
Are you also suggesting that any code which depends on the log segment
files being at the full size should be changed, too?  If not, I think
your example would need to cat the dd you showed with one which drew
from /dev/zero.  I'll run a few tests with full and nearly empty files
using hand-generated values and see how the performance of this in
front of gzip compares to the filter.
 
-Kevin
 



---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] [FEATURE REQUEST] Streaming Onlinebackup (MaybeOFFTOPIC)

2007-09-28 Thread Tom Lane
"Zeugswetter Andreas ADI SD" <[EMAIL PROTECTED]> writes:
> I am suggesting to add such a placeholder for the size of the filled
> part of the log.

The archiver has not got that information, and can't compute it any
faster than the called command could.

regards, tom lane

---(end of broadcast)---
TIP 6: explain analyze is your friend


[HACKERS] Unclarity of configure options

2007-09-28 Thread Nikolay Samokhvalov
The current (CVS version) configure script has the following options
(among many others):

  --enable-dtrace build with DTrace support
  --with-ossp-uuidbuild with OSSP UUID library for UUID generation
  --with-libxml   build with XML support
  --with-libxslt  build with XSLT support

One could think that adding any of this option to ./configure before
building Postgres from sources, he will have corresponding support
after installation and initdb process. But what we have now is the
huge difference between "--with-libxml" and "--with-libxslt": while
the first one adds XML support to the core, the second one doesn't
provide anything automatically, it allows only using contirb/xml2
(what is unclear because the help message is the same as for
--with-libxml -- "build with ... support").

Also, comparing --enable-dtrace and --with-libxml I cannot see any
difference in its semantics: --enable-dtrace also depends on external
library and configure process fails if the system doesn't have it. So
why "--enable-" is used in the first case and "--with-" in the second
one?

-- 
Best regards,
Nikolay

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] [FEATURE REQUEST] Streaming Onlinebackup (MaybeOFFTOPIC)

2007-09-28 Thread Zeugswetter Andreas ADI SD

> A nice improvement on that would be to have a "rearchive_command" to
> allow to sync the new bytes written since a previous archive_command
(so
> it needs a new placeholder "start from this byte").  This allows
writing
> dd seek=%s skip=%s count=%b bs=1

But after a log switch nothing is filling that rest anymore.
Maybe this goes too much in the direction of a "streaming the log"
implementation,
which is imho better suited to ship transactions somewhere else as soon
as possible.

Andreas

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] [FEATURE REQUEST] Streaming Onlinebackup (MaybeOFFTOPIC)

2007-09-28 Thread Alvaro Herrera
Zeugswetter Andreas ADI SD wrote:
> 
> > > The probably useful next step would be to pass the current length to
> the
> > > archive_command,
> > > so it can write the filled part of the file without the need for a
> > > filter.
> >  
> > I can see that helping a lot, but not by writing onto the file on
> disk.
> > If the file is nearly empty, that would be a lot of disk I/O which
> doesn't
> > need to happen.
> 
> I think you misunderstood what I meant.
> The actual archive command is constructed by expanding certain
> placeholders.
> I am suggesting to add such a placeholder for the size of the filled
> part of the log.
> 
> A hypothetical example (note suggested %b placeholder for size in
> bytes):
> archive_command=dd if=%p of=/backup/WAL/%f bs=1 count=%b
> 
> This allows to avoid unnecessary io for the backup of partially filled
> logs.

A nice improvement on that would be to have a "rearchive_command" to
allow to sync the new bytes written since a previous archive_command (so
it needs a new placeholder "start from this byte").  This allows writing
dd seek=%s skip=%s count=%b bs=1

(I had suggested something like this when PITR was just invented, but it
was disregarded because it was too complex for the first cut or the
feature).

-- 
Alvaro Herrera http://www.amazon.com/gp/registry/DXLWNGRJD34J
"Sallah, I said NO camels! That's FIVE camels; can't you count?"
(Indiana Jones)

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] proposal casting from XML[] to int[], numeric[], text[]

2007-09-28 Thread Pavel Stehule
2007/9/28, Nikolay Samokhvalov <[EMAIL PROTECTED]>:
> On 9/28/07, Pavel Stehule <[EMAIL PROTECTED]> wrote:
> > > We would create wrappers returning int[], bool[], string[], but there
> > > are several issues with such functions:
> > >   - if the type of the data located on nodes that match XPath
> > > expression differs from what is expected, what should we do?
> >
> > raise exception
>
> Will it be convenient for cases when there are many different (various
> structures) XMLs in one column (no single DTD)?
>

I don't know

> >
> > >   - in XML world, if you request for a text under some node, all
> > > descendants should be involved in generating result string (example:
> > > what should be returned for XML like "PostgreSQL
> > > is a powerful, open source relational database system" if user
> > > requests for text under "em" node? In XML world, the correct answer is
> > > "PostgreSQL  is a powerful, open source relational database system" --
> > > concatenation of all strings from the node itself and all its
> > > descendants, in the correct order. Will be this expected for RDBMS
> > > users?).
> >
> > It is corect. Or we can disallow any nested elements in casting array.
> > It's poblem only for text type. Numeric types are clear.
>
> Actually, casting to numeric types might seem to be odd. But there is
> some sense from practical point of view -- it works and that's better
> that nothing (like now). But it's too late for 8.3, isn't it?
>

I thing so SQL based casting like my cust functions are relative
simple for adding to core now.
> >
> > > Regarding GIN indexes, alternative approach would be creating opclass
> > > for xml[], it should be pretty simple (and better than creating
> > > implicit CASTs for xml[]<->int[], xml[]<->bool[], etc). Can we do this
> > > for 8.3 or it's too late? It would be very helpful feature.
> >
> > It's not practic. If I would to use it for functional indexes for
> > xpath functions I need constructor for xml[], and I have not it
> > currently:
> >
> > xpath('/root/id/text()', column)::int[] @< ARRAY[199,2200,222]
>
> I do not understand. Do you mean that there is no equality comparison
> operator for type xml yet?
>

No, I mean some different. Nobody will construct special xml nodes for
quality comparision with xpath function when expect xpath's result as
int[], or float. So when result of xpath is xml[] but is with possible
casting to int[] it's more simple do casting and build index on int[]
because I can search int[].

> To implement GIN for xml[] we need to have comparison operator for
> xml. Standard says "XML values are not comparable" (subclause 4.2.4 of
> the latest draft from wiscorp.com), but without that cannot implement
> straight GIN support, what is not good :-/
>

I belive so xml values are not comparable, but I belive so the are
transferable to some of base types.

Pavel

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] Getting to 8.3 beta1

2007-09-28 Thread Stephen Frost
* Heikki Linnakangas ([EMAIL PROTECTED]) wrote:
> Gregory Stark wrote:
> > What we want to know is that things like pgadmin can connect properly to
> > either 8.3, 8.2, and even 8.1 using the new libraries regardless of how the
> > server authentication is configured. Do they work correctly if the server
> > tries to do password authentication, ident, kerberos, etc.
> 
> That's a matter of protocol-compatibility, and that's not the issue
> here. We're talking about *ABI* compatibility. In a nutshell, if you
> have a an program that's compiled with 8.2 libpq-library, will it work
> when linked to 8.3 library at runtime? And if you have a program
> compiled with 8.3 library, does it work with a 8.2 library?

Right.

> But yeah, the regression tests won't help much with that. I suppose you
> could try to use an 8.2 psql with an 8.3 library and see if it works,
> but I don't know how much of the library interface psql exercises.

Eh, I figure it'll test the more common calls, though I agree that it
doesn't hit every symbol...  Perhaps we should look at expanding the
regression tests to try and cover everything exported?  That might
require a new binary if we don't want to clutter up psql, but it strikes
me as a pretty decent idea in general.

> I think it's easier and more reliable to just go through the commit logs
> for libpq, and see if anything has changed. Even that isn't bullet-proof
> though. For example, if there's an incompatible change to a struct or a
> typedef that's used as a function parameter, that breaks binary
> compatibility as well.

I agree that this is certainly the best approach, if practical.  Another
option would be to just check the symbol list using objdump.  Sounds
like Tom already did something similar though by looking through the
exports file.

> In the future, we should try to keep this in mind during the development
> cycle, and bump the minor version number the first time a
> backwards-compatible change, like adding a new function, is made, and
> bump the major version number the first time an incompatible change is
> made. Like we do for catalog version, except that we only want to bump
> the libpq version number once per release. (though we do sometimes
> forget to bump the catalog version number as well)

Agreed.

Thanks,

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] proposal casting from XML[] to int[], numeric[], text[]

2007-09-28 Thread Nikolay Samokhvalov
On 9/28/07, Pavel Stehule <[EMAIL PROTECTED]> wrote:
> > We would create wrappers returning int[], bool[], string[], but there
> > are several issues with such functions:
> >   - if the type of the data located on nodes that match XPath
> > expression differs from what is expected, what should we do?
>
> raise exception

Will it be convenient for cases when there are many different (various
structures) XMLs in one column (no single DTD)?

>
> >   - in XML world, if you request for a text under some node, all
> > descendants should be involved in generating result string (example:
> > what should be returned for XML like "PostgreSQL
> > is a powerful, open source relational database system" if user
> > requests for text under "em" node? In XML world, the correct answer is
> > "PostgreSQL  is a powerful, open source relational database system" --
> > concatenation of all strings from the node itself and all its
> > descendants, in the correct order. Will be this expected for RDBMS
> > users?).
>
> It is corect. Or we can disallow any nested elements in casting array.
> It's poblem only for text type. Numeric types are clear.

Actually, casting to numeric types might seem to be odd. But there is
some sense from practical point of view -- it works and that's better
that nothing (like now). But it's too late for 8.3, isn't it?

>
> > Regarding GIN indexes, alternative approach would be creating opclass
> > for xml[], it should be pretty simple (and better than creating
> > implicit CASTs for xml[]<->int[], xml[]<->bool[], etc). Can we do this
> > for 8.3 or it's too late? It would be very helpful feature.
>
> It's not practic. If I would to use it for functional indexes for
> xpath functions I need constructor for xml[], and I have not it
> currently:
>
> xpath('/root/id/text()', column)::int[] @< ARRAY[199,2200,222]

I do not understand. Do you mean that there is no equality comparison
operator for type xml yet?

To implement GIN for xml[] we need to have comparison operator for
xml. Standard says "XML values are not comparable" (subclause 4.2.4 of
the latest draft from wiscorp.com), but without that cannot implement
straight GIN support, what is not good :-/

-- 
Best regards,
Nikolay

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] [FEATURE REQUEST] Streaming Onlinebackup (MaybeOFFTOPIC)

2007-09-28 Thread Zeugswetter Andreas ADI SD

> > The probably useful next step would be to pass the current length to
the
> > archive_command,
> > so it can write the filled part of the file without the need for a
> > filter.
>  
> I can see that helping a lot, but not by writing onto the file on
disk.
> If the file is nearly empty, that would be a lot of disk I/O which
doesn't
> need to happen.

I think you misunderstood what I meant.
The actual archive command is constructed by expanding certain
placeholders.
I am suggesting to add such a placeholder for the size of the filled
part of the log.

A hypothetical example (note suggested %b placeholder for size in
bytes):
archive_command=dd if=%p of=/backup/WAL/%f bs=1 count=%b

This allows to avoid unnecessary io for the backup of partially filled
logs.

Andreas

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] proposal casting from XML[] to int[], numeric[], text[]

2007-09-28 Thread Pavel Stehule
> We would create wrappers returning int[], bool[], string[], but there
> are several issues with such functions:
>   - if the type of the data located on nodes that match XPath
> expression differs from what is expected, what should we do?

raise exception

>   - in XML world, if you request for a text under some node, all
> descendants should be involved in generating result string (example:
> what should be returned for XML like "PostgreSQL
> is a powerful, open source relational database system" if user
> requests for text under "em" node? In XML world, the correct answer is
> "PostgreSQL  is a powerful, open source relational database system" --
> concatenation of all strings from the node itself and all its
> descendants, in the correct order. Will be this expected for RDBMS
> users?).

It is corect. Or we can disallow any nested elements in casting array.
It's poblem only for text type. Numeric types are clear.

> Regarding GIN indexes, alternative approach would be creating opclass
> for xml[], it should be pretty simple (and better than creating
> implicit CASTs for xml[]<->int[], xml[]<->bool[], etc). Can we do this
> for 8.3 or it's too late? It would be very helpful feature.

It's not practic. If I would to use it for functional indexes for
xpath functions I need constructor for xml[], and I have not it
currently:

xpath('/root/id/text()', column)::int[] @< ARRAY[199,2200,222]

>
> Without that, the only way to have indexes is to use functional btree
> indexes over XPath expression (smth like "...btree(((xpath('...',
> field)[1]::text))" -- pretty ugly construction...)

It's not usefull, if xpath returns more values

Regards
Pavel Stehule

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Getting to 8.3 beta1

2007-09-28 Thread Nikolay Samokhvalov
On 9/27/07, Tom Lane <[EMAIL PROTECTED]> wrote:
> * Draft release notes --- can't really ship a beta without these,
> else beta testers won't know what to test.  Traditionally this has
> taken a fair amount of time, but I wonder whether we couldn't use
> http://developer.postgresql.org/index.php/WhatsNew83
> for at least the first cut.

I've modified XML part of wiki page:

XML Support
* This new data type (XML) validates input for well-formedness and
has a set of type-safe operations.
* SQL/XML publishing functions, per SQL:2003
* xpath() function for XPath 1.0 expressions evaluation (with
Namespaces support)
* Alternative XML export function

-- 
Best regards,
Nikolay

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] Getting to 8.3 beta1

2007-09-28 Thread Heikki Linnakangas
Gregory Stark wrote:
> What we want to know is that things like pgadmin can connect properly to
> either 8.3, 8.2, and even 8.1 using the new libraries regardless of how the
> server authentication is configured. Do they work correctly if the server
> tries to do password authentication, ident, kerberos, etc.

That's a matter of protocol-compatibility, and that's not the issue
here. We're talking about *ABI* compatibility. In a nutshell, if you
have a an program that's compiled with 8.2 libpq-library, will it work
when linked to 8.3 library at runtime? And if you have a program
compiled with 8.3 library, does it work with a 8.2 library?

But yeah, the regression tests won't help much with that. I suppose you
could try to use an 8.2 psql with an 8.3 library and see if it works,
but I don't know how much of the library interface psql exercises.

I think it's easier and more reliable to just go through the commit logs
for libpq, and see if anything has changed. Even that isn't bullet-proof
though. For example, if there's an incompatible change to a struct or a
typedef that's used as a function parameter, that breaks binary
compatibility as well.

In the future, we should try to keep this in mind during the development
cycle, and bump the minor version number the first time a
backwards-compatible change, like adding a new function, is made, and
bump the major version number the first time an incompatible change is
made. Like we do for catalog version, except that we only want to bump
the libpq version number once per release. (though we do sometimes
forget to bump the catalog version number as well)

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] proposal casting from XML[] to int[], numeric[], text[]

2007-09-28 Thread Nikolay Samokhvalov
The problem with contrib/xml2's xpath_* functions (that return
scalars) was that they are very specific. If XPath expression
evaluation returns array of values (set of XML pieces), but the
function returns only the first, significant information is lost,
while there is no any gain in speed at all.

The key idea was to create only one generic function at the first
stage -- xpath(), returning an array of XML pieces.

We would create wrappers returning int[], bool[], string[], but there
are several issues with such functions:
  - if the type of the data located on nodes that match XPath
expression differs from what is expected, what should we do?
  - in XML world, if you request for a text under some node, all
descendants should be involved in generating result string (example:
what should be returned for XML like "PostgreSQL
is a powerful, open source relational database system" if user
requests for text under "em" node? In XML world, the correct answer is
"PostgreSQL  is a powerful, open source relational database system" --
concatenation of all strings from the node itself and all its
descendants, in the correct order. Will be this expected for RDBMS
users?).

Regarding GIN indexes, alternative approach would be creating opclass
for xml[], it should be pretty simple (and better than creating
implicit CASTs for xml[]<->int[], xml[]<->bool[], etc). Can we do this
for 8.3 or it's too late? It would be very helpful feature.

Without that, the only way to have indexes is to use functional btree
indexes over XPath expression (smth like "...btree(((xpath('...',
field)[1]::text))" -- pretty ugly construction...)

On 9/25/07, Peter Eisentraut <[EMAIL PROTECTED]> wrote:
> Am Dienstag, 25. September 2007 schrieb Pavel Stehule:
> > Current result from xpath function isn't indexable. It cannot be
> > problem with possibility cast it to some base types.
>
> Nikolay might be able to remind us what happened to the proposed functions
> xpath_bool, xpath_text, etc.
>
> --
> Peter Eisentraut
> http://developer.postgresql.org/~petere/
>


-- 
Best regards,
Nikolay

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Getting to 8.3 beta1

2007-09-28 Thread Gregory Stark
"Stephen Frost" <[EMAIL PROTECTED]> writes:

> This is where I was suggesting doing something like running the
> regression tests using old client libraries linked against the new
> library.  If there's a binary-incompatible change then the path is
> clear.  If the regression tests work fine then I'd feel comfortable
> just bumping the minor version and leaving the real 'soname' alone.

Unfortunately the regression tests don't really test the library interface.
They test that various forms of SQL work but all with a single client and a
single form of authentication.

What we want to know is that things like pgadmin can connect properly to
either 8.3, 8.2, and even 8.1 using the new libraries regardless of how the
server authentication is configured. Do they work correctly if the server
tries to do password authentication, ident, kerberos, etc.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Getting to 8.3 beta1

2007-09-28 Thread Zeugswetter Andreas ADI SD

> > * Do we bump the .so major version number for libpq?  I think we
should
> > because there are two new exported functions since 8.2, and on at
least
> > some platforms there's nothing else than major number to
disambiguate
> > whether a client needs these or not.  Comments?

-1. You don't bump major if the old api can be used 1:1 with the new
lib.
New functions is not a reason for a major bump.
The major version business is to protect you from ruining currently
running
(old) programs, not from using a too old lib with newly compiled
programs. 

Andreas

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster