Re: [HACKERS] Policy on pulling in code from other projects?

2011-07-23 Thread Dave Page
On Saturday, July 23, 2011, Christopher Browne cbbro...@gmail.com wrote:
 I generally agree, Josh, but I think readline is getting pointed at a bit too 
 much.  Yeah, it's a bad one, but we also include other stuff like zlib that 
 doesn't commonly come up as an issue.
 I'd argue something just a wee bit different...
 By the time we would add in:
 - autoconf rules to detect it,
 - makefile rules to link it in
 - include file changes
 - wrappers to ensure use of palloc
 - Debian guys add build dependencies
 -  rpm dependencies get added
 - BSD ports dependencies
 That is likely rather more code than 1 not terribly large file of C needed to 
 do it ourselves.  And this code is rather worse, as it is in a bunch of 
 languages, spread all over.
 If we were gaining a lot of extra functionality for free it would be one 
 thing.  That is true for libssl, and likely zlib, but not here.


Also consider if the library is widely available on common distros or
not. If not, packagers are going to have to start packaging that
first, in order to build the PostgreSQL packages. This is a *huge*
issue for us if we want to use wxWidgets addon libraries with
pgAdmin.

-- 
Dave Page
Blog: http://pgsnake.blogspot.com
Twitter: @pgsnake

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Questions and experiences writing a Foreign Data Wrapper

2011-07-23 Thread Albe Laurenz
Robert Haas wrote:
On Fri, Jul 22, 2011 at 12:32 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Fri, Jul 22, 2011 at 12:13 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Well, if you read it that way, then CREATE USER MAPPING with an empty
 option set is a no-op: the behavior of the FDW would be the same whether
 you'd executed it or not.  Which doesn't seem to me to satisfy the
 principle of least surprise, nor the letter of the spec.

 I think what they're saying is that they expect the credentials to be
 stored in the user mapping.  But that seems like a fairly silly
 requirement, since it's not difficult to imagine wanting all of your
 local users to connect to the remote side with the same set of
 credentials ...

 But if you want that, you'd do CREATE USER MAPPING FOR PUBLIC.  What
 disturbs me about this approach is that it'd have the effect of a public
 mapping with no options existing by default, and being in fact
 impossible to remove.  Now, depending on what the FDW chooses to require
 in the way of options, that might not be insecure; but it sure seems
 like a foot-gun waiting to fire on somebody.

 Maybe.  On the other hand, I think there's a pretty strong usability
 argument against the way it works right now.

There is no specific way it works right now; in effect it's up to
the implementor of the foreign data wrapper to give these constructs
any possible meaning.
What could and should be done is document how we *intend* these things
to get used so that implementors can adhere to that.

I don't like to think of a user mapping as a means to restrict access
to the foreign data source, because in effect that is the same as
restricting access to the foreign table, which is the ACL's job.
Moreover, that would contradict the way file_fdw is currently
implemented.

After reading the standard, I'm inclined to think that lack of
user mapping is the same as having no foreign credentials. This might
be appropriate in some cases, e.g. where you give the PostgreSQL
OS user permission to connect without credentials (think "trust").
That might be desirable if you want to avoid storing passwords in
system catalogs. So I think that there should be no
"user mapping not found" error; instead you should get "could not
authenticate" from the remote.

CREATE USER MAPPING FOR PUBLIC would be a no-op in this case.
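
As a concrete illustration of the shared-credentials case, a PUBLIC
mapping might look like this (the server name, FDW name, and options
below are invented for illustration, not taken from any actual setup):

```sql
-- Hypothetical sketch: all local users share one set of remote
-- credentials via a single PUBLIC user mapping.  Names are made up.
CREATE SERVER remote_srv FOREIGN DATA WRAPPER some_fdw;

CREATE USER MAPPING FOR PUBLIC SERVER remote_srv
    OPTIONS (user 'appuser', password 'sekret');
```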

Yours,
Laurenz Albe

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] cataloguing NOT NULL constraints

2011-07-23 Thread Dean Rasheed
On 22 July 2011 22:28, Robert Haas robertmh...@gmail.com wrote:
 On Fri, Jul 22, 2011 at 4:39 PM, Alvaro Herrera
 alvhe...@commandprompt.com wrote:
 Excerpts from Robert Haas's message of vie jul 22 12:14:30 -0400 2011:
 On Thu, Jul 21, 2011 at 7:51 PM, Alvaro Herrera
 alvhe...@commandprompt.com wrote:
  I think that there probably ought to be a way to display the NOT NULL
  constraint names (perhaps through \d+). For example, if you're
  planning to support NOT VALID on top of this in the future, then there
  needs to be a way to get the constraint's name to validate it.
 
  Absolutely true.  Another thing that needs to be done here is to let the
  ALTER TABLE and ALTER DOMAIN commands use the constraint names; right
  now, they simply let you add the constraint but not specify the name.
  That should probably be revisited.

 That, at least, seems like something that should be fixed before commit.

 Hmm, which point, Dean's or mine?  Dean was saying that the name should
 be displayed by some flavor of \d;

 That might not be 100% necessary for the initial commit, but seems
 easy to fix, so why not?


Agreed.


 mine was that we need a command such
 as

 ALTER TABLE foo ALTER COLUMN bar SET NOT NULL name_of_notnull_constr

 where the last bit is what's new.

 Well, if you don't have that, I don't see how you have any chance of
 pg_dump working correctly.

Ah yes, pg_dump's dumpConstraint() needs a clause to alter a table
adding a named NOT NULL constraint (and the DOMAIN case should be
preserving the constraint's name too). So it looks like some new
syntax for ALTER TABLE to add named NOT NULL constraints is probably
needed before this can be committed.


  Though I think it should use the table
 constraint syntax:

 CONSTRAINT name_of_notnull_constr constraint_definition

 I'm not exactly sure what to propose for the constraint_definition.
 Perhaps just:

 CONSTRAINT name_of_notnull_constr NOT NULL column_name


That looks wrong to me, because a NOT NULL constraint is a column
constraint not a table constraint. The CREATE TABLE syntax explicitly
distinguishes these 2 cases, and only allows NOT NULLs in column
constraints. So from a consistency point-of-view, I think that ALTER
TABLE should follow suit.

So the new syntax could be:

ALTER TABLE table_name ALTER [COLUMN] col_name ADD column_constraint

where column_constraint is the same as in CREATE TABLE (i.e., allowing
all the other constraint types at the same time).

It looks like that approach would probably lend itself to more
code-reusability too, especially once we start adding options to the
constraint.
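
A sketch of what that might look like, purely hypothetical since none
of this syntax exists yet (table, column, and constraint names are
invented):

```sql
-- Proposed, not-yet-implemented syntax sketch: adding a named NOT NULL
-- constraint through the column-constraint form of ALTER TABLE.
ALTER TABLE foo
    ALTER COLUMN bar
    ADD CONSTRAINT foo_bar_not_null NOT NULL;

-- With NOT VALID support later on, the name would let you validate it:
-- ALTER TABLE foo VALIDATE CONSTRAINT foo_bar_not_null;
```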

Regards,
Dean

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Policy on pulling in code from other projects?

2011-07-23 Thread Andrew Dunstan



On 07/22/2011 10:51 AM, Joshua D. Drake wrote:

On 07/21/2011 11:13 AM, Tom Lane wrote:

Joshua D. Drake j...@commandprompt.com  writes:

So I am looking intently at what it is going to take to get the URI
patch done for psql [1] and was digging around the web and have a URI
parser library. It is under the New BSD license and is strictly RFC
3986 [2] compliant.


Surely we do not need a whole library to parse URIs.

regards, tom lane



Any other comments? I don't want to do a bunch of work just to have it 
tossed on a technicality.





1. I think the proposed use is of very marginal value at best, and 
certainly not worth importing an external library for.


2. Even if we have the feature, we do not need to parse URIs generally. 
A small amount of hand written C code should suffice. If it doesn't that 
would itself be an argument against it.


cheers

andrew


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Questions and experiences writing a Foreign Data Wrapper

2011-07-23 Thread Andrew Dunstan



On 07/22/2011 11:34 AM, Robert Haas wrote:




No, you can specify connection details at per-server and
per-foreign-table level too. The FDW implementation is free to accept or
reject options where-ever it wants.

Well, if we are going to take that viewpoint, then not having a user
mapping *shouldn't* be an error, for any use-case.  What would be an
error would be not having the foreign-user-name-or-equivalent specified
anywhere in the applicable options, but it's up to the FDW to notice and
complain about that.

+1.


What does the standard say?

You can get around most of the inconvenience with an empty PUBLIC user 
mapping, although it's mildly annoying if you've forgotten to make one.



cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] cataloguing NOT NULL constraints

2011-07-23 Thread Robert Haas
On Sat, Jul 23, 2011 at 4:37 AM, Dean Rasheed dean.a.rash...@gmail.com wrote:
 That looks wrong to me, because a NOT NULL constraint is a column
 constraint not a table constraint. The CREATE TABLE syntax explicitly
 distinguishes these 2 cases, and only allows NOT NULLs in column
 constraints. So from a consistency point-of-view, I think that ALTER
 TABLE should follow suit.

 So the new syntax could be:

 ALTER TABLE table_name ALTER [COLUMN] col_name ADD column_constraint

 where column_constraint is the same as in CREATE TABLE (i.e., allowing
 all the other constraint types at the same time).

 It looks like that approach would probably lend itself to more
 code-reusability too, especially once we start adding options to the
 constraint.

So you'd end up with something like this?

ALTER TABLE foo ALTER COLUMN bar ADD CONSTRAINT somename NOT NULL

That works for me.  I think sticking the name of the constraint in
there at the end of the line as Alvaro proposed would be terrible for
future syntax extensibility - we'll be much less likely to paint
ourselves into a corner with something like this.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Problem with pg_upgrade's directory write check on Windows

2011-07-23 Thread Bruce Momjian
Pg_upgrade writes temporary files (e.g. _dumpall output) into the
current directory, rather than a temporary directory or the user's home
directory.  (This was decided by community discussion.)

I have a check in pg_upgrade 9.1 to make sure pg_upgrade has write
permission in the current directory:

if (access(".", R_OK | W_OK
#ifndef WIN32

/*
 * Do a directory execute check only on Unix because execute permission on
 * NTFS means "can execute scripts", which we don't care about.  Also, X_OK
 * is not defined in the Windows API.
 */
           | X_OK
#endif
           ) != 0)
    pg_log(PG_FATAL,
           "You must have read and write access in the current directory.\n");

Unfortunately, I have received a bug report from EnterpriseDB testing
that this does not trigger the FATAL exit on Windows even if the user
does not have write permission in the current directory, e.g. C:\.

I think I see the cause of the problem.  access() on Windows is
described here:

http://msdn.microsoft.com/en-us/library/1w06ktdy%28v=vs.80%29.aspx

It specifically says:

When used with directories, _access determines only whether the
specified directory exists; in Windows NT 4.0 and Windows 2000, all
directories have read and write access.
...
This function only checks whether the file and directory are read-only
or not, it does not check the filesystem security settings.  For
that you need an access token.

We do use access() in a few other places in our code, but not for
directory permission checks.

Any ideas on a solution?  Will checking stat() work?  Do I have to try
creating a dummy file and delete it?

-- 
  Bruce Momjian  br...@momjian.us  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Problem with pg_upgrade's directory write check on Windows

2011-07-23 Thread Andrew Dunstan



On 07/23/2011 08:45 AM, Bruce Momjian wrote:

Pg_upgrade writes temporary files (e.g. _dumpall output) into the
current directory, rather than a temporary directory or the user's home
directory.  (This was decided by community discussion.)

I have a check in pg_upgrade 9.1 to make sure pg_upgrade has write
permission in the current directory:

if (access(".", R_OK | W_OK
#ifndef WIN32

/*
 * Do a directory execute check only on Unix because execute permission on
 * NTFS means "can execute scripts", which we don't care about.  Also, X_OK
 * is not defined in the Windows API.
 */
           | X_OK
#endif
           ) != 0)
    pg_log(PG_FATAL,
           "You must have read and write access in the current directory.\n");

Unfortunately, I have received a bug report from EnterpriseDB testing
that this does not trigger the FATAL exit on Windows even if the user
does not have write permission in the current directory, e.g. C:\.

I think I see the cause of the problem.  access() on Windows is
described here:

http://msdn.microsoft.com/en-us/library/1w06ktdy%28v=vs.80%29.aspx

It specifically says:

When used with directories, _access determines only whether the
specified directory exists; in Windows NT 4.0 and Windows 2000, all
directories have read and write access.
...
This function only checks whether the file and directory are read-only
or not, it does not check the filesystem security settings.  For
that you need an access token.

We do use access() in a few other places in our code, but not for
directory permission checks.

Any ideas on a solution?  Will checking stat() work?  Do I have to try
creating a dummy file and delete it?


That looks like the obvious solution - it's what came to my mind even 
before reading this sentence.


cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Problem with pg_upgrade's directory write check on Windows

2011-07-23 Thread Bruce Momjian
Andrew Dunstan wrote:
  We do use access() in a few other places in our code, but not for
  directory permission checks.
 
  Any ideas on a solution?  Will checking stat() work?  Do I have to try
  creating a dummy file and delete it?
 
 That looks like the obvious solution - it's what came to my mind even 
 before reading this sentence.

Well, the easy fix is to see if ALL_DUMP_FILE
(pg_upgrade_dump_all.sql) exists, and if so remove it, and if not,
create it and remove it.

Should I fix this in pg_upgrade 9.1 for Windows or just in 9.2?  The
check works fine on non-Windows.

-- 
  Bruce Momjian  br...@momjian.us  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Problem with pg_upgrade's directory write check on Windows

2011-07-23 Thread Robert Haas
On Sat, Jul 23, 2011 at 9:14 AM, Bruce Momjian br...@momjian.us wrote:
 Andrew Dunstan wrote:
  We do use access() in a few other places in our code, but not for
  directory permission checks.
 
  Any ideas on a solution?  Will checking stat() work?  Do I have to try
  creating a dummy file and delete it?

 That looks like the obvious solution - it's what came to my mind even
 before reading this sentence.

 Well, the easy fix is to see if ALL_DUMP_FILE
 (pg_upgrade_dump_all.sql) exists, and if so remove it, and if not,
 create it and remove it.

 Should I fix this in pg_upgrade 9.1 for Windows or just in 9.2?  The
 check works fine on non-Windows.

Seems worth back-patching to me.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Questions and experiences writing a Foreign Data Wrapper

2011-07-23 Thread Tom Lane
Andrew Dunstan and...@dunslane.net writes:
 What does the standard say?

Well, there is not a statement in so many words that you have to have a
relevant USER MAPPING to use a foreign table.  But the spec does specify
that an FDW's ConnectServer function takes a UserHandle as one input
parameter and should throw an exception if that handle isn't valid.
And as far as I can tell a UserHandle can only be created from a
relevant USER MAPPING entry.  So I think the behavior I'm arguing for
would emerge from an FDW that was built using the spec-defined API.
We only have an opportunity to argue about it because we chose to
invent our own FDW API.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Questions and experiences writing a Foreign Data Wrapper

2011-07-23 Thread Andrew Dunstan



On 07/23/2011 10:42 AM, Tom Lane wrote:

Andrew Dunstan and...@dunslane.net  writes:

What does the standard say?

Well, there is not a statement in so many words that you have to have a
relevant USER MAPPING to use a foreign table.  But the spec does specify
that an FDW's ConnectServer function takes a UserHandle as one input
parameter and should throw an exception if that handle isn't valid.
And as far as I can tell a UserHandle can only be created from a
relevant USER MAPPING entry.  So I think the behavior I'm arguing for
would emerge from an FDW that was built using the spec-defined API.
We only have an opportunity to argue about it because we chose to
invent our own FDW API.




In that case I think I'm in favor of the suggestion of an implied empty 
user mapping for PUBLIC, as long as it can be overridden.


It does seem to be very late in the day to be arguing about such 
details, though, unless we're talking about changing it in the 9.2 cycle.


cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Questions and experiences writing a Foreign Data Wrapper

2011-07-23 Thread Tom Lane
Albe Laurenz laurenz.a...@wien.gv.at writes:
 I don't like to think of a user mapping as a means to restrict access
 to the foreign data source, because in effect that is the same as
 restricting access to the foreign table, which is the ACL's job.

No, the standard is quite clear that those are distinct things.
See the NOTE I quoted upthread.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Questions and experiences writing a Foreign Data Wrapper

2011-07-23 Thread Tom Lane
Andrew Dunstan and...@dunslane.net writes:
 In that case I think I'm in favor of the suggestion of an implied empty 
 user mapping for PUBLIC, as long as it can be overridden.

But how would you do that (override it)?  All you can do is create an
explicit mapping, and then you still have a mapping that allows access
to PUBLIC.  (Or not, but if an empty one does, you're stuck.)

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GENERAL] Dropping extensions

2011-07-23 Thread Tom Lane
Marc Munro m...@bloodnok.com writes:
 In postgres 9.1 I have created 2 extensions, veil and veil_demo.  When I
 install veil, it creates a default (not very useful) version of a
 function: veil_init().

 When I create veil_demo, it replaces this version of the function with
its own (useful) version.

 If I drop the extension veil_demo, I am left with the veil_demo version
 of veil_init().

 Is this a feature or a bug?  Is there a work-around?

Hmm.  I don't think we have any code in there to prohibit the same
object from being made a member of two different extensions ... but this
example suggests that maybe we had better check that.

In general, though, it is not intended that extension creation scripts
use CREATE OR REPLACE, which I gather you must be doing.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: a validator for configuration files

2011-07-23 Thread Dimitri Fontaine
Florian Pflug f...@phlo.org writes:
 A variant of this would be to allow extensions (in the CREATE EXTENSION
 sense) to declare custom GUCs in their control file. Then we'd only
 need to load those files, which seems better than loading a shared
 library.

The original patch for the extensions had that feature.  It had to be
removed, though.  The control file value has to be stored in the
catalogs and now only backends can look that up once connected to a
database.  This part worked well.

  
http://git.postgresql.org/gitweb/?p=postgresql-extension.git;a=commit;h=480fa10f4ec7b495cf4f434e6f44a50b94487326
  
http://git.postgresql.org/gitweb/?p=postgresql-extension.git;a=commit;h=98d802aa1ee12c77c5270c7bc9ed7479360f35b9

IIRC the problem is that now the postmaster is not seeing those custom
variable classes (cvc) at all, and you can have there some custom
parameters that set up shared memory segments, which only the postmaster
will take care of.  An example of that is the pg_stat_statements contrib
module.

I would love that we find a way around that, btw.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] XPATH vs. server_encoding != UTF-8

2011-07-23 Thread Florian Pflug
Hi

The current thread about JSON and the ensuing discussion about the
XML types' behaviour in non-UTF8 databases made me try out how well
XPATH() copes with that situation. The code, at least, looks
suspicious - XPATH neither verifies that the server encoding is UTF-8,
nor does it pass the server encoding on to libxml's xpath functions.

So I created a database with encoding ISO-8859-1 (LATIN1), and did
(with a client encoding matching my terminal's settings)

  CREATE TABLE X (d XML);
  INSERT INTO X VALUES ('<r a="ä"/>');

i.e., I inserted the XML document <r a="&auml;"/>, but without using
an entity reference for the german Umlaut-A. Then I attempted to extract
the length of r's attribute a with the XPATH /r/@a, both with the XPath
function string-length (which works now! yay!) and with postgres'
LENGTH() function.

  SELECT
(XPATH('string-length(/r/@a)', d))[1] AS xpath_length,
LENGTH((XPATH('/r/@a', d))[1]::text) AS pg_length
  FROM X;

The XPATH() function itself doesn't complain, but libxml does - it expects
UTF-8 encoded data, and screams bloody murder when it encounters the
ISO-8859-1-encoded Umlaut-A

  ERROR:  could not parse XML document
  DETAIL:  line 1: Input is not proper UTF-8, indicate encoding !
  Bytes: 0xE4 0x22 0x2F 0x3E
  <r a="ä"/>

That might seem fine on the surface - we did, after all, error out instead
of producing potentially non-sensical results. However, libxml's ability to
detect this error relies on its ability to distinguish between UTF-8 and
non-UTF-8 encoded strings. Which, of course, doesn't work in the general case.

So for my next try, I deliberately set client_encoding to ISO-8859-1, even
though my terminal uses UTF-8, removed all data from table X, and did

  INSERT INTO X VALUES ('<r a="ä"/>');

again. The effect is that X now contains ISO-8859-1 encoded data
which *happens* to look like valid UTF-8. After changing the client_encoding
back to UTF-8, the value we just inserted looks like that

  <r a="ä"/>

Now I invoked the XPATH query from above again.

  SELECT
(XPATH('string-length(/r/@a)', d))[1] AS xpath_length,
LENGTH((XPATH('/r/@a', d))[1]::text) AS pg_length
  FROM X;

As predicted, it doesn't raise an error this time, since libxml is unable
to distinguish the ISO-8859-1 string '<r a="ä"/>' from valid UTF-8. But the
result is still wrong, since the string-length() function counts 'ä' as just
one character, when in reality it of course contains two.

 xpath_length | pg_length 
--------------+-----------
            1 |         2

The easiest way to fix this would be to make XPATH() flat-out refuse to
do anything if the server encoding isn't UTF-8. But that seems a bit harsh -
things actually do work correctly as long as the XML document contains only
ASCII characters, and existing applications might depend on that.

So what I think we should do is tell libxml that the encoding is ASCII
if the server encoding isn't UTF-8. With that change, the query above
produces

  ERROR:  could not parse XML document
  DETAIL:  encoder error

which seems sane. Replacing the data in X with ASCII-only data makes the
error go away, and the result is then correct also.

  DELETE FROM X;
  INSERT INTO X VALUES ('<r a="a"/>');
  SELECT
(XPATH('string-length(/r/@a)', d))[1] AS xpath_length,
LENGTH((XPATH('/r/@a', d))[1]::text) AS pg_length
  FROM X;

gives

 xpath_length | pg_length 
--------------+-----------
            1 |         1

Proof-of-concept patch attached, but doesn't yet include documentation
updates.

Comments? Thoughts? Suggestions?

best regards,
Florian Pflug


xpath_nonutf8.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] XPATH vs. server_encoding != UTF-8

2011-07-23 Thread Florian Pflug
[Resent with pgsql-hackers re-added to the recipient list.
I presume you didn't remove it on purpose]

On Jul23, 2011, at 18:11 , Joey Adams wrote:
 On Sat, Jul 23, 2011 at 11:49 AM, Florian Pflug f...@phlo.org wrote:
 So what I think we should do is tell libxml that the encoding is ASCII
 if the server encoding isn't UTF-8. With that change, the query above
 produces
 
 I haven't had time to digest this situation, but there is a function
 called pg_encoding_to_char for getting a string representation of the
 encoding.  However, it might not produce a string that libxml
 understands in all cases.
 
 Would it be better to tell libxml the server encoding, whatever it may be?

Ultimately, yes. However, I figured if it was as easy as translating our
encoding names to those of libxml, the current code would probably do that
instead of converting the XML to UTF-8 before validating it.
(Validation and XPATH processing use a different code path there!)

I'm also not aware of any actual complaints about XPATH's restriction
to UTF-8, and it's not a case that I personally care for, so I'm
a bit hesitant to put in the time and energy required to extend it to
other encodings.

But once I had stumbled over this, I didn't want to ignore it altogether,
so I looked for a simple way to make the current behaviour more bullet-proof.
The patch accomplishes that, I think, and without any major change in
behaviour. You only observe the difference if you indeed have non-UTF-8
XMLs which look like valid UTF-8.

 In the JSON encoding discussion, the last idea (the one I was planning
 to go with) was to allow non-ASCII characters in any server encoding
 (like ä in ISO-8859-1), but not allow non-ASCII escapes (like \u00E4)
 unless the server encoding is UTF-8.

Yeah, that's how I understood your proposal, and it seems sensible.

 I think your patch would more
 closely match the opposite: allow any escapes, but only allow ASCII
 text if the server encoding is not UTF-8.

Yeah, but only for XPATH(). XML input validation uses a different
code path, and seems to convert the XML to UTF-8 before verifying
its well-formedness with libxml (as you already discovered previously).

The difference between JSON and XML here is that the XML type has to
live with libxml's idiosyncrasies and restrictions. If we could make
libxml use our encoding and text handling infrastructure, the UTF-8
restrictions would probably not exist. But as it stands, libxml has
its own machinery for dealing with encodings...

I wonder, BTW, what happens if you attempt to store an XML containing a
character not representable in UNICODE. If the conversion to UTF-8 simply
replaces it with a placeholder, we'd be fine, since just a replacement
cannot affect the well-formedness of an XML. If OTOH it raised an error,
that'd be a bit unfortunate...

best regards,
Florian Pflug


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GENERAL] Dropping extensions

2011-07-23 Thread Marc Munro
On Sat, 2011-07-23 at 11:08 -0400, Tom Lane wrote:
  If I drop the extension veil_demo, I am left with the veil_demo version
  of veil_init().
 
  Is this a feature or a bug?  Is there a work-around?
 
 Hmm.  I don't think we have any code in there to prohibit the same
 object from being made a member of two different extensions ... but this
 example suggests that maybe we had better check that.
 
 In general, though, it is not intended that extension creation scripts
 use CREATE OR REPLACE, which I gather you must be doing.

That's right.  Ultimately I'd like to be able to create a number of
extensions, all further extending the base functionality of veil, with
each one further extending veil_init().  I could consider a more
generalised callback mechanism but that adds more complexity and
overhead without really buying me anything functionally.  I will look
into it though.

While it would be great to be able to return functions to their previous
definition automatically, other simpler mechanisms might suffice.  For
me, being able to run a post-drop script would probably be adequate.
For now, I will just add some notes to the documentation.

Thanks for the response.

__
Marc





Re: [HACKERS] Policy on pulling in code from other projects?

2011-07-23 Thread Joshua D. Drake

On 07/22/2011 05:00 PM, Josh Berkus wrote:

Arguments in favor of coding from scratch:

1) Does not introduce new dependencies into postgresql-client packages.
(note how much of a problem Readline has been)


Readline has license issues, this doesn't.



2) keeps psql as lightweight as possible



Agreed on that one.



3) We don't need to be able to parse any potential URI, just the ones we
accept.


True.



Arguments against:

a) waste of time coding a parser

b) eventual feature creep as people ask us for more and more syntax support


Yep.



Overall, though, I'd say that argument (1) against is pretty powerful.
Count the number of complaints and build bug reports about readline /
libedit on the lists.



That is a good argument.



--
Command Prompt, Inc. - http://www.commandprompt.com/
PostgreSQL Support, Training, Professional Services and Development
The PostgreSQL Conference - http://www.postgresqlconference.org/
@cmdpromptinc - @postgresconf - 509-416-6579

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Policy on pulling in code from other projects?

2011-07-23 Thread Joshua D. Drake

On 07/23/2011 03:39 AM, Andrew Dunstan wrote:


1. I think the proposed use is of very marginal value at best, and
certainly not worth importing an external library for.

2. Even if we have the feature, we do not need to parse URIs generally.
A small amount of hand-written C code should suffice. If it doesn't, that
would itself be an argument against it.


Well, the one area where I think this might be useful in the future is 
pg_hba.conf, but that is for a whole other thread.


JD



cheers

andrew








Re: [HACKERS] XPATH vs. server_encoding != UTF-8

2011-07-23 Thread Peter Eisentraut
On lör, 2011-07-23 at 17:49 +0200, Florian Pflug wrote:
 The current thread about JSON and the ensuing discussion about the
 XML types' behaviour in non-UTF8 databases made me try out how well
 XPATH() copes with that situation. The code, at least, looks
 suspicious - XPATH neither verifies that the server encoding is UTF-8,
nor does it pass the server encoding on to libxml's xpath functions.

This issue is on the Todo list, and there are some archive links there.




[HACKERS] Re: [COMMITTERS] pgsql: Looks like we can't declare getpeereid on Windows anyway.

2011-07-23 Thread Andrew Dunstan



On 06/02/2011 05:27 PM, Tom Lane wrote:

Looks like we can't declare getpeereid on Windows anyway.

... for lack of the uid_t and gid_t typedefs.  Per buildfarm.




This has given rise to a bunch of warnings on Windows. I'm confused 
about how we can compile with a signature that includes uid_t and gid_t 
but not have a prototype because of those. In any case, why are we even 
compiling it on Windows since it's only called when HAVE_UNIX_SOCKETS is 
true?


cheers

andrew





Re: [HACKERS] XPATH vs. server_encoding != UTF-8

2011-07-23 Thread Florian Pflug
On Jul23, 2011, at 22:49 , Peter Eisentraut wrote:

 On lör, 2011-07-23 at 17:49 +0200, Florian Pflug wrote:
 The current thread about JSON and the ensuing discussion about the
 XML types' behaviour in non-UTF8 databases made me try out how well
 XPATH() copes with that situation. The code, at least, looks
 suspicious - XPATH neither verifies that the server encoding is UTF-8,
 nor does it pass the server encoding on to libxml's xpath functions.
 
 This issue is on the Todo list, and there are some archive links there.

Thanks for the pointer, but I think the discussion there doesn't
really apply here.

First, I didn't suggest (or implement) full support for XPATH() together
with server encodings other than UTF-8. My suggested patch simply
closes a hole in the implementation of the current behaviour. Instead of
relying on libxml to be able to detect that the encoding isn't UTF-8, it
relies on it only to detect that the encoding isn't ASCII. Since supported
server encodings are supersets of ASCII, the latter is trivial.

xml.c also seems to have changed quite a bit since this was last
discussed. Tom Lane argued against the proposed patch on the grounds
that there are many more places in xml.c which pass strings to libxml
without charset conversion. However, looking at it now, it seems that
all XML validation goes through xml_parse(), which actually converts
the XML to UTF-8. Only XPATH contains a separate code path, and chooses
to ignore encoding issues altogether.
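For illustration, the ASCII-only gate described here could look roughly like
this (a sketch only; `string_is_ascii` is a hypothetical helper name, not
taken from the actual patch, which works inside xml.c's XPATH code path):

```c
#include <stdbool.h>
#include <stddef.h>

/* Sketch of the check described above: every supported server encoding
 * is a superset of ASCII, so a string whose bytes all have the high bit
 * clear can be handed to libxml regardless of the database encoding.
 * Hypothetical helper; not taken from the actual patch. */
bool
string_is_ascii(const char *s, size_t len)
{
    for (size_t i = 0; i < len; i++)
        if ((unsigned char) s[i] > 0x7F)
            return false;
    return true;
}
```

Strings failing this check would still need the existing UTF-8-only error
path; only pure-ASCII input can bypass encoding conversion safely.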

best regards,
Florian Pflug






Re: pgbench cpu overhead (was Re: [HACKERS] lazy vxid locks, v1)

2011-07-23 Thread Jeff Janes
On Mon, Jun 13, 2011 at 7:03 AM, Stefan Kaltenbrunner
ste...@kaltenbrunner.cc wrote:
 On 06/13/2011 01:55 PM, Stefan Kaltenbrunner wrote:

 [...]

 all those tests are done with pgbench running on the same box - which
 has a noticeable impact on the results because pgbench is using ~1 core
 per 8 cores of the backend tested in CPU resources - though I don't think
 it causes any changes in the results that would show the performance
 behaviour in a different light.

 actual testing against sysbench with the very same workload shows the
 following performance behaviour:

 with 40 threads(aka the peak performance point):

 pgbench:        223308 tps
 sysbench:       311584 tps

 with 160 threads (backend contention dominated):

 pgbench:        57075
 sysbench:       43437


 so it seems that sysbench has significantly less overhead than
 pgbench, and the lower throughput at the higher concurrency seems to be
 caused by sysbench being able to stress the backend even more than
 pgbench can.


 for those curious - the profile for pgbench looks like:

 samples  %        symbol name
 29378    41.9087  doCustom
 17502    24.9672  threadRun
 7629     10.8830  pg_strcasecmp
 5871      8.3752  compareVariables
 2568      3.6633  getVariable
 2167      3.0913  putVariable
 2065      2.9458  replaceVariable
 1971      2.8117  parseVariable
 534       0.7618  xstrdup
 278       0.3966  xrealloc
 137       0.1954  xmalloc

Hi Stefan,

How was this profile generated?  I get a similar profile using
--enable-profiling and gprof, but I find it not believable.  The
complete absence of any calls to libpq is not credible.  I don't know
about your profiler, but with gprof they should be listed in the call
graph even if they take a negligible amount of time.  So I think
pgbench is linking to libpq libraries that do not themselves support
profiling (I have no idea how that could happen though).  If the calls
graphs are not getting recorded correctly, surely the timing can't be
reliable either.

(I also tried profiling pgbench with perf, but in that case I get
nothing other than kernel and libc calls showing up.  I don't know
what that means)

To support this, I've dummied up doCustom so that does all the work of
deciding what needs to happen, executing the metacommands,
interpolating the variables into the SQL string, but then simply
refrains from calling the PQ functions to send and receive the query
and response.  (I had to make a few changes around the select loop in
threadRun to support this).

The result is that the dummy pgbench is charged with only 57% more CPU
time than the stock one, but it gets over 10 times as many
transactions done.  I think this supports the notion that the CPU
bottleneck is not in pgbench.c, but somewhere in the libpq or the
kernel.


Cheers,

Jeff



Re: Initial Review: JSON contrib modul was: Re: [HACKERS] Another swing at JSON

2011-07-23 Thread Joey Adams
Also, should I forbid the escape \u0000 (in all database encodings)?

Pros:

 * If \u0000 is forbidden, and the server encoding is UTF-8, then
every JSON-wrapped string will be convertible to TEXT.

 * It will be consistent with the way PostgreSQL already handles text,
and with the decision to use database-encoded JSON strings.

 * Some applications choke on strings with null characters.  For
example, in some web browsers but not others, if you pass
"Hello\u0000world" to document.write() or assign it to a DOM object's
innerHTML, it will be truncated to "Hello".  By banning \u0000, users
can catch such rogue strings early.

 * It's a little easier to represent internally.

Cons:

 * Means JSON type will accept a subset of the JSON described in
RFC4627.  However, the RFC does say "An implementation may set limits
on the length and character contents of strings", so we can arguably
get away with banning \u0000 while being law-abiding citizens.

 * Being able to store U+0000–U+00FF means users can use JSON strings
to hold binary data: by treating it as Latin-1.
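As an illustration of the proposed ban (a sketch only: a real implementation
would reject the escape inside the JSON lexer, and the function name here is
made up):

```c
#include <stdbool.h>
#include <string.h>

/* Returns true if the JSON text contains a genuine \u0000 escape,
 * i.e. one whose leading backslash is not itself escaped.  The JSON
 * text "a\\u0000" is a literal backslash followed by the characters
 * u0000, not an escape, so it must be accepted. */
bool
json_contains_nul_escape(const char *json)
{
    const char *p = json;

    while ((p = strstr(p, "\\u0000")) != NULL)
    {
        int preceding = 0;      /* backslashes immediately before p */
        const char *q;

        for (q = p; q > json && q[-1] == '\\'; q--)
            preceding++;
        if (preceding % 2 == 0)
            return true;        /* genuine \u0000 escape */
        p += 2;                 /* escaped backslash; keep scanning */
    }
    return false;
}
```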

- Joey



Re: Initial Review: JSON contrib modul was: Re: [HACKERS] Another swing at JSON

2011-07-23 Thread Robert Haas
On Fri, Jul 22, 2011 at 10:36 PM, Joey Adams joeyadams3.14...@gmail.com wrote:
 Interesting.  This leads to a couple more questions:

  * Should the JSON data type (eventually) have an equality operator?

+1.

  * Should the JSON input function alphabetize object members by key?

I think it would probably be better if it didn't.  I'm wary of
overcanonicalization.  It can be useful to have things come back out
in more or less the format you put them in.

 If we canonicalize strings and numbers and alphabetize object members,
 then our equality function is just texteq.  The only stumbling block
 is canonicalizing numbers.  Fortunately, JSON's definition of a
 number is its decimal syntax, so the algorithm is child's play:

  * Figure out the digits and exponent.
  * If the exponent is greater than 20 or less than -6 (arbitrary), use
 exponential notation.

 The problem is: 2.718282e-1000 won't equal 0 as may be expected.  I
 doubt this matters much.

I don't think that 2.718282e-100 SHOULD equal 0.

 If, in the future, we add the ability to manipulate large JSON trees
 efficiently (e.g. by using an auxiliary table like TOAST does), we'll
 probably want unique members, so enforcing them now may be prudent.

I doubt you're going to want to reinvent TOAST, but I do think there
are many advantages to forbidding duplicate keys.  ISTM the question
is whether to throw an error or just silently discard one of the k/v
pairs.  Keeping both should not be on the table, IMHO.
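The notation rule from the quoted proposal reduces to a one-line predicate
once the number has been normalized to digits plus a decimal exponent (the
window and the function name are illustrative; nothing here is from a
committed patch):

```c
#include <stdbool.h>

/* After normalizing a JSON number to "digits x 10^exponent", plain
 * decimal notation is kept only while the decimal exponent stays in
 * an arbitrary window -- here greater than -6 and at most 20, per the
 * proposal above; otherwise exponential notation is used.  The exact
 * cutoffs are assumptions for illustration. */
bool
use_exponential_notation(int decimal_exponent)
{
    return decimal_exponent > 20 || decimal_exponent < -6;
}
```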

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Problem with pg_upgrade's directory write check on Windows

2011-07-23 Thread Bruce Momjian
Robert Haas wrote:
 On Sat, Jul 23, 2011 at 9:14 AM, Bruce Momjian br...@momjian.us wrote:
  Andrew Dunstan wrote:
   We do use access() in a few other places in our code, but not for
   directory permission checks.
  
   Any ideas on a solution?  Will checking stat() work?  Do I have to try
   creating a dummy file and delete it?
 
  That looks like the obvious solution - it's what came to my mind even
  before reading this sentence.
 
  Well, the easy fix is to see if ALL_DUMP_FILE
  (pg_upgrade_dump_all.sql) exists, and if so remove it, and if not,
  create it and remove it.
 
  Should I fix this in pg_upgrade 9.1 for Windows or just in 9.2?  The
  check works fine on non-Windows.
 
 Seems worth back-patching to me.

Attached patch applied and backpatched to 9.1.  I was able to test both
code paths on my BSD machine by modifying the ifdefs.  I will have
EnterpriseDB do further testing.

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +
diff --git a/contrib/pg_upgrade/exec.c b/contrib/pg_upgrade/exec.c
new file mode 100644
index a76c06e..3493696
*** a/contrib/pg_upgrade/exec.c
--- b/contrib/pg_upgrade/exec.c
***
*** 16,21 
--- 16,24 
  static void check_data_dir(const char *pg_data);
  static void check_bin_dir(ClusterInfo *cluster);
  static void validate_exec(const char *dir, const char *cmdName);
+ #ifdef WIN32
+ static int win32_check_directory_write_permissions(void);
+ #endif
  
  
  /*
*** verify_directories(void)
*** 97,113 
  
  	prep_status("Checking current, bin, and data directories");
  
- 	if (access(".", R_OK | W_OK
  #ifndef WIN32
! 
! 	/*
! 	 * Do a directory execute check only on Unix because execute permission on
! 	 * NTFS means can execute scripts, which we don't care about. Also, X_OK
! 	 * is not defined in the Windows API.
! 	 */
! 			   | X_OK
  #endif
- 			   ) != 0)
  		pg_log(PG_FATAL,
  		  "You must have read and write access in the current directory.\n");
  
--- 100,110 
  
  	prep_status("Checking current, bin, and data directories");
  
  #ifndef WIN32
! 	if (access(".", R_OK | W_OK | X_OK) != 0)
! #else
! 	if (win32_check_directory_write_permissions() != 0)
  #endif
  		pg_log(PG_FATAL,
  		  "You must have read and write access in the current directory.\n");
  
*** verify_directories(void)
*** 119,124 
--- 116,147 
  }
  
  
+ #ifdef WIN32
+ /*
+  * win32_check_directory_write_permissions()
+  *
+  *	access() on WIN32 can't check directory permissions, so we have to
+  *	optionally create, then delete a file to check.
+  *		http://msdn.microsoft.com/en-us/library/1w06ktdy%28v=vs.80%29.aspx
+  */
+ static int
+ win32_check_directory_write_permissions(void)
+ {
+ 	int fd;
+ 
+ 	/*
+ 	 *	We open a file we would normally create anyway.  We do this even in
+ 	 *	'check' mode, which isn't ideal, but this is the best we can do.
+ 	 */	
+ 	if ((fd = open(GLOBALS_DUMP_FILE, O_RDWR | O_CREAT, S_IRUSR | S_IWUSR)) < 0)
+ 		return -1;
+ 	close(fd);
+ 
+ 	return unlink(GLOBALS_DUMP_FILE);
+ }
+ #endif
+ 
+ 
  /*
   * check_data_dir()
   *
