date:20090712

Re: [HACKERS] [pgsql-www] commitfest.postgresql.org

2009-07-12 Thread Robert Haas

On Thu, Jul 9, 2009 at 2:20 PM, Robert Haas wrote:
> On Jul 9, 2009, at 12:16 PM, Tom Lane  wrote:
>
>> Brendan Jurd  writes:
>>>
>>> We're now about a week away from the start of the July 2009
>>> commitfest, and we need to make a decision about whether to start
>>> using http://commitfest.postgresql.org to manage it, or punt to the
>>> next commitfest and continue to use the wiki for July.
>>
>> While reorganizing my bookmarks for this I realized that there is a
>> fairly significant bit of functionality that's entirely missing from
>> the new app.  With the wiki page, you could conveniently see what had
>> been done lately by examining the page history.  I don't see any
>> equivalent capability in the new app.  I find this fairly significant,
>> as evidenced by the fact that I'd gone so far as to set up a bookmark
>> for the history view.  I'm not particularly wedded to the wiki page
>> history in terms of what it looks like or how it functions, but I do
>> feel a need to know what other people have done recently
>
> I'll fix this. Give me a couple days; my Internet access here at the family
> vacation spot is not compatible with "git push".

OK, this is a little bit quick-and-dirty, and I'm sure I'll get some,
ah, gentle suggestions for improvement, but I've added an activity log
to the app.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Maintenance Policy?

2009-07-12 Thread Bruce Momjian

Greg Stark wrote:
> On Mon, Jul 13, 2009 at 2:07 AM, Bruce Momjian wrote:
> > This might open the larger question of:  What do we actually _promise_
> > users?
> 
> The discussion had already covered that ground. If someone wants a
> promise they'll have to fork over money to one of the companies which
> sell such things.
> 
> That's why Josh's last email where he said just that we *didn't* plan
> to support releases for longer than 5 years is much better than the
> other attempts to say how long we *did* plan to support releases.

Interesting distinction.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +


Re: [HACKERS] New types for transparent encryption

2009-07-12 Thread tomas

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On Mon, Jul 13, 2009 at 01:22:30PM +0900, Itagaki Takahiro wrote:
> 
> Sam Mason  wrote:
> 
> > As others have said, handling encryption client side would seem to offer
> > many more benefits (transparently within libpq offering easy adoption).
> 
> Libpq is not the only driver. Do we need to develop transparent decryption
> for each driver? (libpq, JDBC, npgsql, py-postgresql, ...)

Just define a protocol. Of course there is more work in that, so yes,
this is one point going against client-side.

> Also, if you disallow server-side decode, you cannot create indexes on
> encrypted values because the same values are not always encrypted to the
> same codes. (Indexes will sort keys based on order of decoded values.)

Definitely another point against client-side. *If* there is some random
element in encryption (salt, IV, whatever), you can't index on an
encrypted field. If there isn't, the encryption will be possibly weak
(being amenable at least to a rainbow-table attack).
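
A small sketch of this trade-off, in Python. It is a toy and not real
cryptography: the keystream construction, the key, the sample value and all
function names are made up for illustration. It only shows that a random
nonce makes equal plaintexts encrypt to different bytes (so an equality
lookup on the stored value finds nothing), while a fixed nonce keeps
equality and therefore also reveals which stored values are the same.

```python
import hashlib
import hmac
import os

KEY = b"demo-key-not-for-production"  # hypothetical key, illustration only
FIXED_NONCE = b"\x00" * 16            # deterministic variant: never do this for real


def _keystream(key, nonce, length):
    """Toy keystream: HMAC-SHA256 over nonce || counter. Not a vetted cipher."""
    out = b""
    counter = 0
    while len(out) < length:
        block = nonce + counter.to_bytes(4, "big")
        out += hmac.new(key, block, hashlib.sha256).digest()
        counter += 1
    return out[:length]


def encrypt(plaintext, key=KEY, nonce=None):
    """Return nonce || (plaintext XOR keystream); random nonce unless one is given."""
    if nonce is None:
        nonce = os.urandom(16)
    stream = _keystream(key, nonce, len(plaintext))
    return nonce + bytes(p ^ s for p, s in zip(plaintext, stream))


def decrypt(ciphertext, key=KEY):
    """Split off the 16-byte nonce and undo the XOR."""
    nonce, body = ciphertext[:16], ciphertext[16:]
    stream = _keystream(key, nonce, len(body))
    return bytes(c ^ s for c, s in zip(body, stream))


value = b"4111-1111"

# Randomized: two encryptions of the same value differ, so an index on the
# ciphertext cannot be used to find "the row whose value equals X".
r1, r2 = encrypt(value), encrypt(value)
randomized_equal = (r1 == r2)

# Deterministic: equal plaintexts give equal ciphertexts. Equality lookups
# work, but anyone reading the column learns which rows hold the same value.
d1 = encrypt(value, nonce=FIXED_NONCE)
d2 = encrypt(value, nonce=FIXED_NONCE)
deterministic_equal = (d1 == d2)
```

Both variants still decrypt to the original value; the only difference is
whether the stored bytes are comparable without the key.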

> I think there is no difference between client-side decryption and
> a client-supplied password, as long as client-server communication is
> encrypted (i.e., SSL login).

There definitely is a difference. If someone hijacks the running server
(trojan, privilege escalation), s/he still doesn't get at the data if
it can only be decrypted client-side. OTOH, with server-side
decryption, all bets are off in this case, since the keys are lying
around there (maybe somewhat obfuscated, but still accessible).

But this has already been hashed out in another thread AFAIR.

Regards
- -- tomás
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFKWr5FBcgs9XrR2kYRAkbdAJ9JN4mXGE1uD5EGCzWgZh4dsfCPpwCfaTew
2uD3F59+Gm1wR/jnYChvF+M=
=WWhL
-END PGP SIGNATURE-


Re: [HACKERS] New types for transparent encryption

2009-07-12 Thread Itagaki Takahiro

Sam Mason  wrote:

> As others have said, handling encryption client side would seem to offer
> many more benefits (transparently within libpq offering easy adoption).

Libpq is not the only driver. Do we need to develop transparent decryption
for each driver? (libpq, JDBC, npgsql, py-postgresql, ...)

Also, if you disallow server-side decode, you cannot create indexes on
encrypted values because the same values are not always encrypted to the
same codes. (Indexes will sort keys based on order of decoded values.)

I think there is no difference between client-side decryption and
a client-supplied password, as long as client-server communication is
encrypted (i.e., SSL login).

> Should the password be this widely shared? it would seem to make more
> sense if it was a write-only variable and never exposed outside the
> crypto module.

We can use a user-defined GUC variable as a write-only variable.
If we supply a show_hook function for the GUC variable,
SET still works but SHOW only shows '' and hides the real password.
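
The behaviour can be modelled in a few lines of Python. This is only a
sketch of the idea, not PostgreSQL's GUC API: the Settings class, its
method names and the variable name "crypto.password" are all hypothetical.

```python
class Settings:
    """Toy model of a settings registry with optional per-variable show hooks."""

    def __init__(self):
        self._values = {}
        self._show_hooks = {}

    def define(self, name, show_hook=None):
        """Register a variable; show_hook, if given, controls what show() returns."""
        self._values[name] = ""
        if show_hook is not None:
            self._show_hooks[name] = show_hook

    def set(self, name, value):
        """Counterpart of SET: always stores the real value."""
        if name not in self._values:
            raise KeyError(name)
        self._values[name] = value

    def show(self, name):
        """Counterpart of SHOW: goes through the hook when one is installed."""
        value = self._values[name]
        hook = self._show_hooks.get(name)
        return hook(value) if hook is not None else value

    def read_internal(self, name):
        """What the crypto module itself would read; not reachable via show()."""
        return self._values[name]


settings = Settings()
# Hypothetical write-only variable: the hook ignores the stored value.
settings.define("crypto.password", show_hook=lambda _value: "")
settings.set("crypto.password", "s3cret")

shown = settings.show("crypto.password")
internal = settings.read_internal("crypto.password")
```

Here shown is the empty string while internal still holds the password,
which is the write-only behaviour described above.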

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center


Re: [HACKERS] [PATCH] SE-PgSQL/lite rev.2163

2009-07-12 Thread KaiGai Kohei

Robert Haas wrote:
> 2009/7/12 KaiGai Kohei :
>> Robert, thanks for your comments.
>>
>> Robert Haas wrote:
>>> 2009/7/10 KaiGai Kohei :
>>>> The SE-PostgreSQL patches are updated as follows:
>>>>
>>>> [1/5]
>>>> http://sepgsql.googlecode.com/files/sepgsql-01-sysatt-8.5devel-r2163.patch
>>>> [2/5]
>>>> http://sepgsql.googlecode.com/files/sepgsql-02-core-8.5devel-r2163.patch
>>>> [3/5]
>>>> http://sepgsql.googlecode.com/files/sepgsql-03-gram-8.5devel-r2163.patch
>>>> [4/5]
>>>> http://sepgsql.googlecode.com/files/sepgsql-04-tests-8.5devel-r2163.patch
>>>> [5/5]
>>>> http://sepgsql.googlecode.com/files/sepgsql-05-docs-8.5devel-r2163.patch
>>>>
>>>> List of updates:
>>>> * Patch set was organized to a few ones which provides only core features.
>>>> * Code base was upgraded to the latest CVS HEAD.
>>>> * Some of features in the fullset edition were separated, to focus on
>>>>  the core feature of SE-PostgreSQL at the first commit fest.
>>> I took a look at this a little bit.  It looks as if you are still
>>> treating the Security Label as a special attribute of the tuple, which
>>> seems completely unnecessary given that this patch set is not
>>> attempting to implement row-level security.  It seems to me that all
>>> you should need is regular old columns pg_class.relseclabel,
>>> pg_attribute.attseclabel, etc; it also seems to me that would simplify
>>> the code.
>> Are you saying that whole of the pg_security mechanism also should be
>> postponed to the second commit fest, not only system column's definitions?
> 
> I'm not sure what you mean by "the whole of the pg_security
> mechanism".  But I believe there was negative feedback previously
> about the idea of sandwiching the security label into the tuple
> header instead of making it an ordinary (non-system) column in a
> system catalog table.

I mean the mechanism that translates between a security identifier and
its text representation.
The reason why I separated out the pg_security mechanism was to reduce
the size of the patch, because we agreed to postpone the row-level
features in the v8.4 development cycle. Please note that only four
relations need the ability to carry security labels (pg_database,
pg_class, pg_attribute and pg_proc).
However, once row-level security is available, the storage consumed by
security labels in raw text form cannot be ignored, even though a
massive number of database objects share a limited number of security
labels.

>>> As a general comment, I don't have much confidence that your original
>>> design for row-level security is a good one.  I think that needs to be
>>> thought about very carefully before anything is implemented.  I note
>>> that your original design called for a system catalog lookup on each
>>> row returned, which is basically saying that all security-label
>>> filtering will be implemented as a nested loop with inner index scan.
>>> It seems to me that if we ever implement row-level security (which is
>>> far from a sure thing) we're certainly going to want to allow the
>>> planner to make other decisions, such as sorting the tuples by
>>> security label and performing a merge join against the pg_security
>>> catalog, or hashing the pg_security catalog and performing a hash
>>> join, or using an index on the security label column to perform a
>>> bitmap index scan, or any other technique that the planner already
>>> knows how to consider.  I say all this not to encourage you to spend
>>> more time on row-level security now (because I think that would be
>>> premature) but to encourage you to abandon all the design decisions in
>>> this patch that are presuming a particular design for row-level
>>> security and focus on making the features that are actually in this
>>> patch set as lean and robust as possible.
>> I'm a bit confused by the comments.
>> The row-level access control stuff (which is managed in my repository
>> now) does not need to look up the pg_security system catalog every time.
>> I guess you believe SE-PgSQL looks up the system catalog to fetch the
>> security label in text form, and calls in-kernel SELinux to make its
>> decision for every tuple. However, that is incorrect.
>>
>> The userspace avc (security/sepgsql/avc.c) routines cache recently used
>> access control decisions, and return the result for the requested pair of
>> security identifier (not the text form) and type of action (like
>> db_table:{select}) if it hits a cache entry.
>> This means SE-PgSQL can return its decisions using only the security
>> identifier in most cases.
> 
> Perhaps you're not making a kernel call for each row (good!), but I
> think you are still assuming that you're going to fetch the rows
> first, without any sepgsql-specific decision making, and then perform
> an operation of some kind on each row to see whether to filter it.  If
> so, what I'm saying is that's bad.
> 
> Suppose we have two security contexts A and B, and two users Alice and
> Bob.  Alice can see only tuples in security context

Re: [HACKERS] [PATCH] SE-PgSQL/lite rev.2163

2009-07-12 Thread Robert Haas

2009/7/12 KaiGai Kohei :
> Robert, thanks for your comments.
>
> Robert Haas wrote:
>> 2009/7/10 KaiGai Kohei :
>>> The SE-PostgreSQL patches are updated as follows:
>>>
>>> [1/5] 
>>> http://sepgsql.googlecode.com/files/sepgsql-01-sysatt-8.5devel-r2163.patch
>>> [2/5] 
>>> http://sepgsql.googlecode.com/files/sepgsql-02-core-8.5devel-r2163.patch
>>> [3/5] 
>>> http://sepgsql.googlecode.com/files/sepgsql-03-gram-8.5devel-r2163.patch
>>> [4/5] 
>>> http://sepgsql.googlecode.com/files/sepgsql-04-tests-8.5devel-r2163.patch
>>> [5/5] 
>>> http://sepgsql.googlecode.com/files/sepgsql-05-docs-8.5devel-r2163.patch
>>>
>>> List of updates:
>>> * Patch set was organized to a few ones which provides only core features.
>>> * Code base was upgraded to the latest CVS HEAD.
>>> * Some of features in the fullset edition were separated, to focus on
>>>  the core feature of SE-PostgreSQL at the first commit fest.
>>
>> I took a look at this a little bit.  It looks as if you are still
>> treating the Security Label as a special attribute of the tuple, which
>> seems completely unnecessary given that this patch set is not
>> attempting to implement row-level security.  It seems to me that all
>> you should need is regular old columns pg_class.relseclabel,
>> pg_attribute.attseclabel, etc; it also seems to me that would simplify
>> the code.
>
> Are you saying that whole of the pg_security mechanism also should be
> postponed to the second commit fest, not only system column's definitions?

I'm not sure what you mean by "the whole of the pg_security
mechanism".  But I believe there was negative feedback previously
about the idea of sandwiching the security label into the tuple
header instead of making it an ordinary (non-system) column in a
system catalog table.

>> As a general comment, I don't have much confidence that your original
>> design for row-level security is a good one.  I think that needs to be
>> thought about very carefully before anything is implemented.  I note
>> that your original design called for a system catalog lookup on each
>> row returned, which is basically saying that all security-label
>> filtering will be implemented as a nested loop with inner index scan.
>> It seems to me that if we ever implement row-level security (which is
>> far from a sure thing) we're certainly going to want to allow the
>> planner to make other decisions, such as sorting the tuples by
>> security label and performing a merge join against the pg_security
>> catalog, or hashing the pg_security catalog and performing a hash
>> join, or using an index on the security label column to perform a
>> bitmap index scan, or any other technique that the planner already
>> knows how to consider.  I say all this not to encourage you to spend
>> more time on row-level security now (because I think that would be
>> premature) but to encourage you to abandon all the design decisions in
>> this patch that are presuming a particular design for row-level
>> security and focus on making the features that are actually in this
>> patch set as lean and robust as possible.
>
> I'm a bit confused by the comments.
> The row-level access control stuff (which is managed in my repository
> now) does not need to look up the pg_security system catalog every time.
> I guess you believe SE-PgSQL looks up the system catalog to fetch the
> security label in text form, and calls in-kernel SELinux to make its
> decision for every tuple. However, that is incorrect.
>
> The userspace avc (security/sepgsql/avc.c) routines cache recently used
> access control decisions, and return the result for the requested pair of
> security identifier (not the text form) and type of action (like
> db_table:{select}) if it hits a cache entry.
> This means SE-PgSQL can return its decisions using only the security
> identifier in most cases.

Perhaps you're not making a kernel call for each row (good!), but I
think you are still assuming that you're going to fetch the rows
first, without any sepgsql-specific decision making, and then perform
an operation of some kind on each row to see whether to filter it.  If
so, what I'm saying is that's bad.

Suppose we have two security contexts A and B, and two users Alice and
Bob.  Alice can see only tuples in security context A, and Bob can see
only tuples in security context B.  If I create an index on the
security ID of a table with row-level security, then (a) will it work?
and (b) if one of the users issues a command like "select * from
table", will the planner consider a bitmap-index-scan using that index
rather than a sequential scan of the entire table?
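
To make the question concrete, here is a toy model in Python. It says
nothing about how the planner or SE-PgSQL actually behaves; the table
contents, the numeric security IDs (1 for context A, 2 for context B) and
the visibility map are assumptions made up for illustration. It only
contrasts "fetch every row, then filter" with "resolve the visible
security IDs first, then touch only matching rows".

```python
from collections import defaultdict

# Hypothetical table of (security_id, payload) rows.
ROWS = [(1, "a0"), (2, "b0"), (1, "a1"), (2, "b1"), (2, "b2"), (1, "a2")]

# Assumed policy: which security IDs each user may read.
VISIBLE = {"alice": {1}, "bob": {2}}


def seq_scan_filter(user):
    """Fetch every row and check it afterwards; always examines len(ROWS) rows."""
    allowed = VISIBLE[user]
    examined = 0
    result = []
    for sec_id, payload in ROWS:
        examined += 1
        if sec_id in allowed:
            result.append(payload)
    return result, examined


def build_index(rows):
    """Map each security ID to the positions of the rows carrying it."""
    index = defaultdict(list)
    for pos, (sec_id, _payload) in enumerate(rows):
        index[sec_id].append(pos)
    return index


def index_scan(user, index):
    """Look up the visible IDs in the index and touch only those rows."""
    positions = sorted(
        pos for sec_id in VISIBLE[user] for pos in index.get(sec_id, [])
    )
    return [ROWS[pos][1] for pos in positions], len(positions)


INDEX = build_index(ROWS)
seq_rows, seq_examined = seq_scan_filter("alice")
idx_rows, idx_examined = index_scan("alice", INDEX)
```

Both paths return the same rows for Alice, but the second examines only
the rows she is allowed to see. The second path is only available if the
security decision can be expressed before the rows are fetched.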

> I think what you suggested would be useful if SE-PgSQL needed the security
> label in text form to make its decision. However, it seems to me that your
> comments are based on an incorrect assumption. Please point it out if I
> have misunderstood the intention of your comments.
>
>> Another problem that I have with this patch set is that it STILL
>> doesn't have parity with the DAC permis

Re: [HACKERS] Maintenance Policy?

2009-07-12 Thread Greg Stark

On Mon, Jul 13, 2009 at 2:07 AM, Bruce Momjian wrote:
> This might open the larger question of:  What do we actually _promise_
> users?

The discussion had already covered that ground. If someone wants a
promise they'll have to fork over money to one of the companies which
sell such things.

That's why Josh's last email where he said just that we *didn't* plan
to support releases for longer than 5 years is much better than the
other attempts to say how long we *did* plan to support releases.

-- 
greg
http://mit.edu/~gsstark/resume.pdf


Re: [HACKERS] Maintenance Policy?

2009-07-12 Thread Bruce Momjian

Greg Sabino Mullane wrote:
[ There is text before PGP section. ]
> 
> -BEGIN PGP SIGNED MESSAGE-
> Hash: RIPEMD160
> 
> 
> > For what it's worth I find it hard to believe anyone's really
> > surprised by this. Nearly all other open source projects stop
> > supporting old branches as soon as a newer branch is released.
> 
> I'm not surprised at all. Our product holds data - and that's an
> extremely valuable resource to end users (e.g. companies). Nobody wants
> to risk problems and/or suffer long downtimes. Our complete lack of an
> in-place upgrade is what is really making us do the extra effort to support
> old versions. Thankfully, it looks like we've finally started down the
> road to a serious attempt at an upgrade process.
> 
> For what it's worth, I think our release history and current necessarily
> ad-hoc and somewhat arbitrary release process makes it difficult to make
> anything but the vaguest statement on dates, and I'd rather we didn't.

This might open the larger question of:  What do we actually _promise_
users?

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +


Re: [HACKERS] Maintenance Policy?

2009-07-12 Thread Greg Sabino Mullane


-BEGIN PGP SIGNED MESSAGE-
Hash: RIPEMD160


> For what it's worth I find it hard to believe anyone's really
> surprised by this. Nearly all other open source projects stop
> supporting old branches as soon as a newer branch is released.

I'm not surprised at all. Our product holds data - and that's an
extremely valuable resource to end users (e.g. companies). Nobody wants
to risk problems and/or suffer long downtimes. Our complete lack of an
in-place upgrade is what is really making us do the extra effort to support
old versions. Thankfully, it looks like we've finally started down the
road to a serious attempt at an upgrade process.

For what it's worth, I think our release history and current necessarily
ad-hoc and somewhat arbitrary release process makes it difficult to make
anything but the vaguest statement on dates, and I'd rather we didn't.

- --
Greg Sabino Mullane g...@turnstep.com
End Point Corporation
PGP Key: 0x14964AC8 200907122044
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
-BEGIN PGP SIGNATURE-

iEYEAREDAAYFAkpahQkACgkQvJuQZxSWSsjehACg7208VOSWEoJuHWMORnhAg82t
IugAn0vSGBI9qUvAUDb3msMeyRzjjuy7
=tcmE
-END PGP SIGNATURE-




Re: [HACKERS] Maintenance Policy?

2009-07-12 Thread Greg Stark

On Sun, Jul 12, 2009 at 11:53 PM, Josh Berkus wrote:
>
> Well, what I'm really concerned about is letting people know that we expect
> to *stop* providing update versions after 5 years.

That seems like a reasonable thing to say and you've just said it more
simply than any of your previous proposals. I suggest you just go with
the above.

> Every year, this seems to take some people by surprise.

For what it's worth I find it hard to believe anyone's really
surprised by this. Nearly all other open source projects stop
supporting old branches as soon as a newer branch is released.

-- 
greg
http://mit.edu/~gsstark/resume.pdf


Re: [HACKERS] [PATCH] SE-PgSQL/lite rev.2163

2009-07-12 Thread KaiGai Kohei

Robert, thanks for your comments.

Robert Haas wrote:
> 2009/7/10 KaiGai Kohei :
>> The SE-PostgreSQL patches are updated as follows:
>>
>> [1/5] 
>> http://sepgsql.googlecode.com/files/sepgsql-01-sysatt-8.5devel-r2163.patch
>> [2/5] 
>> http://sepgsql.googlecode.com/files/sepgsql-02-core-8.5devel-r2163.patch
>> [3/5] 
>> http://sepgsql.googlecode.com/files/sepgsql-03-gram-8.5devel-r2163.patch
>> [4/5] 
>> http://sepgsql.googlecode.com/files/sepgsql-04-tests-8.5devel-r2163.patch
>> [5/5] 
>> http://sepgsql.googlecode.com/files/sepgsql-05-docs-8.5devel-r2163.patch
>>
>> List of updates:
>> * Patch set was organized to a few ones which provides only core features.
>> * Code base was upgraded to the latest CVS HEAD.
>> * Some of features in the fullset edition were separated, to focus on
>>  the core feature of SE-PostgreSQL at the first commit fest.
> 
> I took a look at this a little bit.  It looks as if you are still
> treating the Security Label as a special attribute of the tuple, which
> seems completely unnecessary given that this patch set is not
> attempting to implement row-level security.  It seems to me that all
> you should need is regular old columns pg_class.relseclabel,
> pg_attribute.attseclabel, etc; it also seems to me that would simplify
> the code.

Are you saying that whole of the pg_security mechanism also should be
postponed to the second commit fest, not only system column's definitions?

> As a general comment, I don't have much confidence that your original
> design for row-level security is a good one.  I think that needs to be
> thought about very carefully before anything is implemented.  I note
> that your original design called for a system catalog lookup on each
> row returned, which is basically saying that all security-label
> filtering will be implemented as a nested loop with inner index scan.
> It seems to me that if we ever implement row-level security (which is
> far from a sure thing) we're certainly going to want to allow the
> planner to make other decisions, such as sorting the tuples by
> security label and performing a merge join against the pg_security
> catalog, or hashing the pg_security catalog and performing a hash
> join, or using an index on the security label column to perform a
> bitmap index scan, or any other technique that the planner already
> knows how to consider.  I say all this not to encourage you to spend
> more time on row-level security now (because I think that would be
> premature) but to encourage you to abandon all the design decisions in
> this patch that are presuming a particular design for row-level
> security and focus on making the features that are actually in this
> patch set as lean and robust as possible.

I'm a bit confused by the comments.
The row-level access control stuff (which is managed in my repository
now) does not need to look up the pg_security system catalog every time.
I guess you believe SE-PgSQL looks up the system catalog to fetch the
security label in text form, and calls in-kernel SELinux to make its
decision for every tuple. However, that is incorrect.

The userspace avc (security/sepgsql/avc.c) routines cache recently used
access control decisions, and return the result for the requested pair of
security identifier (not the text form) and type of action (like
db_table:{select}) if it hits a cache entry.
This means SE-PgSQL can return its decisions using only the security
identifier in most cases.
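
A rough sketch of that caching idea, in Python rather than the C of the
patch. It is not the code in avc.c: the class and function names, the
LRU eviction and the stand-in policy are all assumptions for illustration.
The only part taken from the description above is that decisions are
cached per (security identifier, action) pair, so the expensive path runs
only on a miss.

```python
from collections import OrderedDict


class AccessVectorCache:
    """Cache of access decisions keyed by (security identifier, action)."""

    def __init__(self, policy_lookup, capacity=128):
        self._lookup = policy_lookup   # expensive fallback, called on a miss
        self._capacity = capacity
        self._cache = OrderedDict()    # insertion order doubles as LRU order
        self.misses = 0

    def check(self, sec_id, action):
        key = (sec_id, action)
        if key in self._cache:
            self._cache.move_to_end(key)     # mark as most recently used
            return self._cache[key]
        self.misses += 1
        decision = self._lookup(sec_id, action)
        self._cache[key] = decision
        if len(self._cache) > self._capacity:
            self._cache.popitem(last=False)  # evict the least recently used
        return decision


def slow_policy_lookup(sec_id, action):
    """Stand-in for the slow path (label lookup plus a query to the kernel)."""
    allowed = {(1, "db_table:select")}       # hypothetical policy
    return (sec_id, action) in allowed


avc = AccessVectorCache(slow_policy_lookup)
# Checking 1000 tuples that share one security identifier hits the slow
# path once; the remaining 999 checks are answered from the cache.
decisions = [avc.check(1, "db_table:select") for _ in range(1000)]
```
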

I think what you suggested would be useful if SE-PgSQL needed the security
label in text form to make its decision. However, it seems to me that your
comments are based on an incorrect assumption. Please point it out if I
have misunderstood the intention of your comments.

> Another problem that I have with this patch set is that it STILL
> doesn't have parity with the DAC permissions scheme (despite previous
> requests to make it have parity).  For example, you're checking
> privileges like db_column:{drop}, which have no analogue in our
> current permissions scheme.  I think this has been discussed several
> times before, and it seems that you still haven't chosen to fully take
> that advice, which I suspect is going to be an absolute bar to getting
> this committed.  (I am not a committer, of course, so take whatever I
> say with a grain of salt, but that's my opinion for what it's worth.)
> It seems to me that if you have REAL parity with the existing
> permissions scheme, it shouldn't be necessary to add additional,
> separate sepgsql checks in every place that currently has a standard
> permissions check.  Instead, you should be able to put all of the
> logic into functions like pg_class_aclmask().  This will be:

I moved several security hooks into the pg_xxx_aclmask() functions, because
these permissions are checked in the same places.
However, the two security models have different definitions.
For example, when we create a new table, DAC checks ACL_CREATE on
the namespace (it may be equivalent to db_schema:{add_object}),
but MAC ch

Re: [HACKERS] Upgrading our minimum required flex version for 8.5

2009-07-12 Thread Tom Lane

Andrew Dunstan  writes:
> Josh Berkus wrote:
>> Oh, I didn't think about the Flex version.  That's going to be a far 
>> more widespread problem.  OSX 10.5, for example, still ships with 
>> 2.5.33.  I suspect that a *lot* of OSes won't have anything up-to-date.

> That's the version Tom is proposing to make the minimum.

Yeah, 2.5.33 is fine (in fact it's what I installed on my HPUX box to
replace 2.5.4).  AFAICS OSX 10.5 is fine, but 10.4 will need a newer
flex version.  This is not anything very new to Mac users.  10.4 also
shipped with bison 1.28, which is too old to build Postgres and has been
too old for many years now.  So anyone building from CVS on that
platform has already installed at least one nondefault tool.

(For comparison's sake: flex 2.5.4 was released in 1996.  2.5.31, which
I'm proposing to make the new minimum, was released in 2003.  Bison
1.875, our current minimum for that tool, was released in 2002, and
we made it the minimum required version in 2003.)

regards, tom lane


Re: [HACKERS] Upgrading our minimum required flex version for 8.5

2009-07-12 Thread Andrew Dunstan

Josh Berkus wrote:
>> This is ready to go except for changing the minimum flex version test
>> in configure (and associated documentation).  I see that a good-sized
>> fraction of the buildfarm is still on flex 2.5.4 and will therefore go
>> red when this goes in.  Should I hold off a bit longer, or is committing
>> the best way to get their attention?
>
> Oh, I didn't think about the Flex version.  That's going to be a far
> more widespread problem.  OSX 10.5, for example, still ships with
> 2.5.33.  I suspect that a *lot* of OSes won't have anything up-to-date.

That's the version Tom is proposing to make the minimum.

cheers

andrew


Re: [HACKERS] Upgrading our minimum required flex version for 8.5

2009-07-12 Thread Josh Berkus

> This is ready to go except for changing the minimum flex version test
> in configure (and associated documentation).  I see that a good-sized
> fraction of the buildfarm is still on flex 2.5.4 and will therefore go
> red when this goes in.  Should I hold off a bit longer, or is committing
> the best way to get their attention?

Oh, I didn't think about the Flex version.  That's going to be a far
more widespread problem.  OSX 10.5, for example, still ships with
2.5.33.  I suspect that a *lot* of OSes won't have anything up-to-date.

--
Josh Berkus
PostgreSQL Experts Inc.
www.pgexperts.com


Re: [HACKERS] Maintenance Policy?

2009-07-12 Thread Josh Berkus

> For an open source project, would such a statement really mean
> anything more than "we'll provide support as long as some
> community members feel like it, and we guess that's about 5 years"?

Well, what I'm really concerned about is letting people know that we
expect to *stop* providing update versions after 5 years.  Every year,
this seems to take some people by surprise.


--
Josh Berkus
PostgreSQL Experts Inc.
www.pgexperts.com


Re: [HACKERS] Upgrading our minimum required flex version for 8.5

2009-07-12 Thread Andrew Dunstan

Tom Lane wrote:
> I see that a good-sized
> fraction of the buildfarm is still on flex 2.5.4 and will therefore go
> red when this goes in.  Should I hold off a bit longer, or is committing
> the best way to get their attention?

Probably the latter. I will update my various members. I see that 2.5.4
is the default on my FBSD install, which is the latest, so for this and
maybe some other current platforms we'll be imposing a bit of extra
burden to build, but I guess that's the price of progress.


cheers

andrew


[HACKERS] Summary: flex versions, bugs, warnings, etc

2009-07-12 Thread Tom Lane

Before I forget, here is a quick brain dump on $SUBJECT.

The latest release of flex is currently 2.5.35.  2.5.33 is pretty
widespread as well, and there's at least one 2.5.34 in the buildfarm.
There was no 2.5.32.  2.5.31 and (possibly) earlier versions contain
an extremely nasty security problem CVE-2006-0459, which results in the
generated scanner containing a buffer-overflow problem that might be
exploitable for arbitrary code execution.  However, this is possible
only if the scanner definition uses REJECT or trailing context, which
AFAIK none of the scanners in the PG sources do, so the security issue
is not directly a problem for us.  It appears that at least some Linux
distros are still shipping 2.5.31 with a patch for the security issue;
there is one such machine in the buildfarm.

I conclude that we should set the configure version cutoff at 2.5.31,
so as to avoid breaking things for people using those distros.  However,
the flex sourceforge site no longer distributes 2.5.31, so there's no
very easy way to get hold of it for testing purposes.  If it turns out
to have any problems as reported by the buildfarm, I recommend we just
bump the minimum to 2.5.33 rather than try to debug the issue.
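
For anyone scripting a similar cutoff check (for example on a buildfarm
member), a minimal sketch in Python follows. It is not the configure
test: the function names and the banner strings in the example are
assumptions, and only the 2.5.31 cutoff comes from the paragraph above.
The point is to compare versions as integer tuples, because a plain
string comparison would rank "2.5.4" above "2.5.31".

```python
import re

MIN_FLEX = (2, 5, 31)  # the cutoff proposed above


def parse_flex_version(banner):
    """Extract (major, minor, patch) from text such as 'flex 2.5.35'."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", banner)
    if match is None:
        raise ValueError(f"no version number found in {banner!r}")
    return tuple(int(part) for part in match.groups())


def flex_is_new_enough(banner, minimum=MIN_FLEX):
    """Compare numerically, component by component, against the minimum."""
    return parse_flex_version(banner) >= minimum


# Assumed banner formats, for illustration only.
results = {
    banner: flex_is_new_enough(banner)
    for banner in ("flex 2.5.35", "flex 2.5.33", "flex 2.5.31", "flex version 2.5.4")
}
```

If 2.5.31 turns out to have problems, bumping the minimum to 2.5.33 is
then a one-line change to MIN_FLEX.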

Both 2.5.33 and 2.5.35 generate scanners that cause some compilation
warnings when using our preferred gcc flags and the flex options
we'll be needing for a reentrant lexer.  In particular I get this:

In file included from gram.y:11141:
scan.c: In function 'yy_try_NUL_trans':
scan.c:15722: warning: unused variable 'yyg'

which doesn't appear to be fixable without patching flex.
I've filed a bug about it upstream
https://sourceforge.net/tracker/?func=detail&aid=2820457&group_id=97492&atid=618177
but I fear we shall just have to live with it until upstream releases
a version with a fix.

These versions also neglect to generate prototypes for two global
functions yyget_column() and yyset_column(), which apparently are
part of a feature so new it isn't even documented yet :-(.  That
results in warnings thanks to -Wmissing-prototypes.  However, these
two warnings are easily worked around by adding extern's to the
scan.l file, so I've done that rather than annoy upstream with two
cosmetic bugs filed on the same day.
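
For readers who have not hit -Wmissing-prototypes before, the workaround amounts to something like the following sketch. The extern declarations use the signatures of flex's reentrant API; the toy_scanner struct and the function bodies are hypothetical stand-ins for what flex generates, included only so the fragment compiles on its own.

```c
#include <assert.h>
#include <stddef.h>

typedef void *yyscan_t;			/* opaque scanner handle in the reentrant API */

/*
 * Prototypes the generated scanner neglects to emit.  Declaring them
 * before the definitions is what silences -Wmissing-prototypes.
 */
extern int	yyget_column(yyscan_t yyscanner);
extern void yyset_column(int column_no, yyscan_t yyscanner);

/* Hypothetical stand-in for the scanner state flex would generate. */
struct toy_scanner
{
	int			column;
};

int
yyget_column(yyscan_t yyscanner)
{
	assert(yyscanner != NULL);
	return ((struct toy_scanner *) yyscanner)->column;
}

void
yyset_column(int column_no, yyscan_t yyscanner)
{
	assert(yyscanner != NULL);
	((struct toy_scanner *) yyscanner)->column = column_no;
}
```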

regards, tom lane

Re: [HACKERS] Maintenance Policy?

2009-07-12 Thread Ron Mayer

Josh Berkus wrote:
> I'd suggest that we publish an official policy, with the following dates
> for "EOL":
> 7.4   2009-08-01  ...
> 8.4   2014-08-01

What would such forward-looking statements even mean for a
community-driven project?

I assume for a commercial product, such a statement would mean
something like "I could get my money back or sue for breach of
contract or similar if the vendor stops providing support before
such a date."

For an open source project, would such a statement really mean
anything more than "we'll provide support as long as some
community members feel like it, and we guess that's about 5 years"?

If so, what?

Re: [HACKERS] Upgrading our minimum required flex version for 8.5

2009-07-12 Thread Tom Lane

Andrew Dunstan  writes:
> If we're going to have a reentrant lexer, I think we should go the whole 
> nine yards. I agree that a couple of percent slowdown on just the lexing 
> and parsing will be lost in the noise. So +1 from me.

I found a couple of places where a few cycles could be shaved.  The
current version of the patch (attached) seems indistinguishable in
speed from 8.4 on my HPPA box, though still a percent or two slower on
my x86_64 box.

This is ready to go except for changing the minimum flex version test
in configure (and associated documentation).  I see that a good-sized
fraction of the buildfarm is still on flex 2.5.4 and will therefore go
red when this goes in.  Should I hold off a bit longer, or is committing
the best way to get their attention?

regards, tom lane

binWO56Td5sD2.bin
Description: reentrant-parser-1.patch.gz

Re: [HACKERS] Logging configuration changes

2009-07-12 Thread Robert Haas

On Sun, Jul 12, 2009 at 3:55 PM, Peter Eisentraut wrote:
> On occasion, I would have found it useful if a SIGHUP didn't only log *that*
> it reloaded the configuration files, but also logged *what* had changed
> (postgresql.conf changes in particular, not so much interested in
> pg_hba.conf).  Especially in light of the common mistake of forgetting to
> uncomment a changed value, this would appear to be useful.
>
> Comments?

Sounds neat.

...Robert

Re: [HACKERS] [PATCH] SE-PgSQL/lite rev.2163

2009-07-12 Thread Robert Haas

2009/7/10 KaiGai Kohei :
> The SE-PostgreSQL patches are updated as follows:
>
> [1/5] http://sepgsql.googlecode.com/files/sepgsql-01-sysatt-8.5devel-r2163.patch
> [2/5] http://sepgsql.googlecode.com/files/sepgsql-02-core-8.5devel-r2163.patch
> [3/5] http://sepgsql.googlecode.com/files/sepgsql-03-gram-8.5devel-r2163.patch
> [4/5] http://sepgsql.googlecode.com/files/sepgsql-04-tests-8.5devel-r2163.patch
> [5/5] http://sepgsql.googlecode.com/files/sepgsql-05-docs-8.5devel-r2163.patch
>
> List of updates:
> * The patch set was reorganized into a few patches that provide only core features.
> * The code base was updated to the latest CVS HEAD.
> * Some features of the full-set edition were split out, to focus on
>   the core features of SE-PostgreSQL in the first commitfest.

I took a look at this a little bit.  It looks as if you are still
treating the Security Label as a special attribute of the tuple, which
seems completely unnecessary given that this patch set is not
attempting to implement row-level security.  It seems to me that all
you should need is regular old columns pg_class.relseclabel,
pg_attribute.attseclabel, etc; it also seems to me that would simplify
the code.

As a general comment, I don't have much confidence that your original
design for row-level security is a good one.  I think that needs to be
thought about very carefully before anything is implemented.  I note
that your original design called for a system catalog lookup on each
row returned, which is basically saying that all security-label
filtering will be implemented as a nested loop with inner index scan.
It seems to me that if we ever implement row-level security (which is
far from a sure thing) we're certainly going to want to allow the
planner to make other decisions, such as sorting the tuples by
security label and performing a merge join against the pg_security
catalog, or hashing the pg_security catalog and performing a hash
join, or using an index on the security label column to perform a
bitmap index scan, or any other technique that the planner already
knows how to consider.  I say all this not to encourage you to spend
more time on row-level security now (because I think that would be
premature) but to encourage you to abandon all the design decisions in
this patch that are presuming a particular design for row-level
security and focus on making the features that are actually in this
patch set as lean and robust as possible.

Another problem that I have with this patch set is that it STILL
doesn't have parity with the DAC permissions scheme (despite previous
requests to make it have parity).  For example, you're checking
privileges like db_column:{drop}, which have no analogue in our
current permissions scheme.  I think this has been discussed several
times before, and it seems that you still haven't chosen to fully take
that advice, which I suspect is going to be an absolute bar to getting
this committed.  (I am not a committer, of course, so take whatever I
say with a grain of salt, but that's my opinion for what it's worth.)
It seems to me that if you have REAL parity with the existing
permissions scheme, it shouldn't be necessary to add additional,
separate sepgsql checks in every place that currently has a standard
permissions check.  Instead, you should be able to put all of the
logic into functions like pg_class_aclmask().  This will be:

- Less code.
- Easier to maintain.

With the current design, every time someone makes a change to how DAC
permissions are checked, they have to think about the proper sepgsql
treatment as well.  That seems like a recipe for security bugs.
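
To make the suggestion concrete, here is a rough sketch in C of a single choke point that applies the MAC decision inside the DAC mask function. All names (ToyAclMode, toy_class_aclmask, toy_mac_hook, toy_readonly_policy) and the bit values are hypothetical; this is not the real pg_class_aclmask() signature, only the shape of the idea.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t ToyAclMode;

/* Hypothetical privilege bits, not PostgreSQL's actual values. */
#define TOY_ACL_SELECT	(1u << 0)
#define TOY_ACL_UPDATE	(1u << 1)
#define TOY_ACL_ALL		(TOY_ACL_SELECT | TOY_ACL_UPDATE)

/* Optional MAC filter: returns the subset of mask the policy allows. */
typedef ToyAclMode (*toy_mac_hook_type) (int relid, int roleid,
										 ToyAclMode mask);

static toy_mac_hook_type toy_mac_hook = NULL;

void
toy_set_mac_hook(toy_mac_hook_type hook)
{
	toy_mac_hook = hook;
}

/* Example policy for the sketch: the MAC layer permits reads only. */
ToyAclMode
toy_readonly_policy(int relid, int roleid, ToyAclMode mask)
{
	(void) relid;
	(void) roleid;
	return mask & TOY_ACL_SELECT;
}

/*
 * The one place callers ask "which of these privileges are granted?".
 * dac_granted stands in for the result of the ordinary ACL lookup.
 */
ToyAclMode
toy_class_aclmask(ToyAclMode dac_granted, int relid, int roleid,
				  ToyAclMode mask)
{
	ToyAclMode	result;

	assert((mask & ~TOY_ACL_ALL) == 0);

	result = dac_granted & mask;	/* existing DAC decision */
	if (toy_mac_hook != NULL)
		result &= toy_mac_hook(relid, roleid, result);	/* MAC can only narrow */
	return result;
}
```

With this shape, code that already calls the mask function picks up the extra check without any new call sites, which is the maintenance argument made above.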

...Robert

[HACKERS] fix: plpgsql: return query and dropped columns problem

2009-07-12 Thread Pavel Stehule

Hello

Here is a fix for the bug discussed in Re: [BUGS] BUG #4907: stored procedures and changed tables

regards
Pavel Stehule


2009/7/10 Sergey Burladyan :
> Sergey Burladyan  writes:
>
>> Alvaro Herrera  writes:
>>
>> > Michael Tenenbaum wrote:
>> >
>> > > If I have a stored procedure that returns a set of records of a table, I 
>> > > get
>> > > an error message that the procedure's record is the wrong type after I
>> > > change some columns in the table.
>> > >
>> > > Deleting the procedure then rewriting the procedure does not help.  The 
>> > > only
>> > > thing that works is deleting both the stored procedure and the table and
>> > > starting over again.
>> >
>> > Does it work if you disconnect and connect again?
>>
>> No, example:
>
> Simpler:
>
> PostgreSQL 8.4.0 on i486-pc-linux-gnu, compiled by GCC gcc-4.3.real (Debian 
> 4.3.3-13) 4.3.3, 32-bit
>
>  create table t (i int);
>  alter table t add v text; alter table t drop i;
>  create function foo() returns setof t language plpgsql as $$begin return 
> query select * from t; end$$;
>  select foo();
> ERROR:  42804: structure of query does not match function result type
> DETAIL:  Number of returned columns (1) does not match expected column 
> count (2).
> CONTEXT:  PL/pgSQL function "foo" line 1 at RETURN QUERY
> LOCATION:  validate_tupdesc_compat, pl_exec.c:5143
>
> So a function declared RETURNS SETOF tbl does not work if it is created after
> ALTER TABLE
>
> 8.3.7 too:
>
> PostgreSQL 8.3.7 on i486-pc-linux-gnu, compiled by GCC gcc-4.3.real (Debian 
> 4.3.3-5) 4.3.3
>
>  create table t (i int);
>  alter table t add v text; alter table t drop i;
>  create function foo() returns setof t language plpgsql as $$begin return 
> query select * from t; end$$;
>  select * from foo();
> ERROR:  42804: structure of query does not match function result type
> CONTEXT:  PL/pgSQL function "foo" line 1 at RETURN QUERY
> LOCATION:  exec_stmt_return_query, pl_exec.c:2173
>
>
> --
> Sergey Burladyan
>
> --
> Sent via pgsql-bugs mailing list (pgsql-b...@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-bugs
>
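
The failure above comes from the declared row type still carrying the dropped column while the query returns only the live ones. The patch that follows widens each returned row to the declared layout; the core of that remapping can be sketched on its own as below. The function toy_expand_dropped and its plain long/bool arrays are hypothetical simplifications standing in for Datum arrays and tuple descriptors.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Spread the values returned by a query over the declared row layout.
 *
 * expected_dropped[i] is true when attribute i of the declared type is
 * a dropped column.  src/src_null hold one entry per live column, in
 * order.  dst/dst_null receive natts entries; dropped positions become
 * NULL.  Returns the number of source values consumed.
 */
int
toy_expand_dropped(const bool *expected_dropped, int natts,
				   const long *src, const bool *src_null,
				   long *dst, bool *dst_null)
{
	int			anum;
	int			j = 0;			/* next live column in the source row */

	assert(natts >= 0);

	for (anum = 0; anum < natts; anum++)
	{
		if (expected_dropped[anum])
		{
			dst[anum] = 0;
			dst_null[anum] = true;	/* dropped column: always NULL */
		}
		else
		{
			dst[anum] = src[j];
			dst_null[anum] = src_null[j];
			j++;
		}
	}
	return j;
}
```

For the table t in the report (column i dropped, column v added), expected_dropped would be {true, false} and each one-column query row would be widened to two attributes.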
*** ./src/pl/plpgsql/src/pl_exec.c.orig	2009-07-12 17:22:57.268901328 +0200
--- ./src/pl/plpgsql/src/pl_exec.c	2009-07-12 16:57:37.037896969 +0200
***************
*** 2284,2289 ****
--- 2284,2294 ----
  {
  	Portal		portal;
  	uint32		processed = 0;
+ 	int			i;
+ 	bool	dropped_columns = false;
+ 	Datum	*dvalues;
+ 	bool	*nulls;
+ 	int		natts;
  
  	if (!estate->retisset)
  		ereport(ERROR,
***************
*** 2308,2318 ****
  
  	validate_tupdesc_compat(estate->rettupdesc, portal->tupDesc,
     "structure of query does not match function result type");
  
  	while (true)
  	{
  		MemoryContext old_cxt;
- 		int			i;
  
  		SPI_cursor_fetch(portal, true, 50);
  		if (SPI_processed == 0)
--- 2313,2330 ----
  
  	validate_tupdesc_compat(estate->rettupdesc, portal->tupDesc,
     "structure of query does not match function result type");
+ 	natts = estate->rettupdesc->natts;
+ 	
+ 	if (natts > portal->tupDesc->natts)
+ 	{
+ 		dropped_columns = true;
+ 		dvalues = (Datum *) palloc0(natts * sizeof(Datum));
+ 		nulls = (bool *) palloc(natts * sizeof(bool));
+ 	}
  
  	while (true)
  	{
  		MemoryContext old_cxt;
  
  		SPI_cursor_fetch(portal, true, 50);
  		if (SPI_processed == 0)
***************
*** 2323,2335 ****
  		{
  			HeapTuple	tuple = SPI_tuptable->vals[i];
  
! 			tuplestore_puttuple(estate->tuple_store, tuple);
  			processed++;
  		}
  		MemoryContextSwitchTo(old_cxt);
  
  		SPI_freetuptable(SPI_tuptable);
  	}
  
  	SPI_freetuptable(SPI_tuptable);
  	SPI_cursor_close(portal);
--- 2335,2374 ----
  		{
  			HeapTuple	tuple = SPI_tuptable->vals[i];
  
! 			if (!dropped_columns)
! 				tuplestore_puttuple(estate->tuple_store, tuple);
! 			else
! 			{
! 				int		anum;
! 				int		j = 0;
! 				bool	isnull;
! 
! 				for (anum = 0; anum < natts; anum++)
! 				{
! 					if (estate->rettupdesc->attrs[anum]->attisdropped)
! 						nulls[anum] = true;
! 					else
! 					{
! 						dvalues[anum] = SPI_getbinval(tuple, portal->tupDesc,
! 													  ++j, &isnull);
! 						nulls[anum] = isnull;
! 					}
! 				}
! 				tuple = heap_form_tuple(estate->rettupdesc, dvalues, nulls);
! 				tuplestore_puttuple(estate->tuple_store, tuple);
! 			}
  			processed++;
  		}
  		MemoryContextSwitchTo(old_cxt);
  
  		SPI_freetuptable(SPI_tuptable);
  	}
+ 	
+ 	if (dropped_columns)
+ 	{
+ 		pfree(dvalues);
+ 		pfree(nulls);
+ 	}
  
  	SPI_freetuptable(SPI_tuptable);
  	SPI_cursor_close(portal);
***************
*** 5127,5132 ****
--- 5166,5172 ----
  validate_tupdesc_compat(TupleDesc expected, TupleDesc returned, const char *msg)
  {
  	int			i;
+ 	int		j = 0;
  	const char *dropped_column_type = gettext_noop("N/A (dropped column)");
  
  	if (!expected || !returned)
***************
*** 5134,5153 ****
  				(errcode(ERRCODE_DATATYPE_MISMATCH),
  				 errmsg("%s", _(msg))));
  
- 	if (expected->natts != returned->natts)
- 		ereport(ERRO

[HACKERS] Logging configuration changes

2009-07-12 Thread Peter Eisentraut

On occasion, I would have found it useful if a SIGHUP didn't only log *that* 
it reloaded the configuration files, but also logged *what* had changed 
(postgresql.conf changes in particular, not so much interested in 
pg_hba.conf).  Especially in light of the common mistake of forgetting to 
uncomment a changed value, this would appear to be useful.

Comments?
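
To make the proposal a little more concrete, here is a minimal sketch in C of "log what changed" on reload. It compares two snapshots of name/value pairs; the struct setting type, log_config_changes, and the message wording are all hypothetical and are not PostgreSQL's GUC code.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* One parsed "name = value" line from a configuration file. */
struct setting
{
	const char *name;
	const char *value;
};

/* Return the value stored under name, or NULL if it is not present. */
static const char *
lookup(const struct setting *s, int n, const char *name)
{
	int			i;

	for (i = 0; i < n; i++)
		if (strcmp(s[i].name, name) == 0)
			return s[i].value;
	return NULL;
}

/*
 * Write one line to out for every setting that differs between the old
 * and new snapshots, and return how many differences were found.
 * Settings present in only one snapshot count as changed.
 */
int
log_config_changes(const struct setting *old_s, int n_old,
				   const struct setting *new_s, int n_new, FILE *out)
{
	int			changed = 0;
	int			i;

	assert(out != NULL);

	for (i = 0; i < n_new; i++)
	{
		const char *old_value = lookup(old_s, n_old, new_s[i].name);

		if (old_value == NULL || strcmp(old_value, new_s[i].value) != 0)
		{
			fprintf(out, "parameter \"%s\" changed to \"%s\"\n",
					new_s[i].name, new_s[i].value);
			changed++;
		}
	}
	for (i = 0; i < n_old; i++)
	{
		if (lookup(new_s, n_new, old_s[i].name) == NULL)
		{
			fprintf(out, "parameter \"%s\" removed from configuration file\n",
					old_s[i].name);
			changed++;
		}
	}
	return changed;
}
```

A reload that reports zero changes is exactly the hint that an edited line was left commented out.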

Re: [HACKERS] concurrent index builds unneeded lock?

2009-07-12 Thread Greg Stark

On Sun, Jul 12, 2009 at 4:42 PM, Tom Lane wrote:
>
> I'm kind of wondering how big the use case for that really is.
> AFAICT the point of a concurrent build is to (re)build an index
> without incurring too much performance penalty for foreground
> query processing.  So how often are you really going to want
> to fire off several of them in parallel?  If you can afford to
> saturate your machine with indexing work, you could use plain
> index builds.

I don't really see those as comparable cases. Firing off multiple
concurrent index builds only requires lots of available I/O
throughput; using plain index builds requires a maintenance window
where all updates to the table are shut down.

Being able to run multiple concurrent index builds just means being
able to roll out a schema change more quickly. It doesn't let you do
anything that was impossible before.

Another thing that's annoyed me about our current support for
concurrent index builds is that you can't run multiple concurrent
builds on the same table. Since they all take the strangely named
ShareUpdateExclusiveLock you can only run one at a time. Fixing that
would require introducing a new, uh, ShareUpdateSharedLock(?) which
conflicts with the vacuum lock but not itself. It didn't seem worth
introducing a new lock type at the time but with syncscanning and the
evidence people are actually doing this I'm starting to wonder.
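
The difference between the two lock modes can be written down as a tiny conflict table. This is a sketch in C with only the two modes under discussion; the enum names are hypothetical, ShareUpdateSharedLock does not exist, and the sketch assumes vacuum keeps taking ShareUpdateExclusiveLock.

```c
#include <assert.h>

enum toy_lockmode
{
	TOY_SHARE_UPDATE_EXCLUSIVE = 0, /* what vacuum and concurrent builds take today */
	TOY_SHARE_UPDATE_SHARED,		/* hypothetical new mode for concurrent builds */
	TOY_NUM_LOCKMODES
};

/* Bit b set in toy_conflicts[a] means mode a conflicts with mode b. */
static const unsigned toy_conflicts[TOY_NUM_LOCKMODES] = {
	[TOY_SHARE_UPDATE_EXCLUSIVE] =
		(1u << TOY_SHARE_UPDATE_EXCLUSIVE) | (1u << TOY_SHARE_UPDATE_SHARED),
	[TOY_SHARE_UPDATE_SHARED] =
		(1u << TOY_SHARE_UPDATE_EXCLUSIVE), /* blocks vacuum, not itself */
};

/* Return 1 if a holder of mode a blocks a requester of mode b. */
int
toy_locks_conflict(enum toy_lockmode a, enum toy_lockmode b)
{
	assert(a < TOY_NUM_LOCKMODES && b < TOY_NUM_LOCKMODES);
	return (int) ((toy_conflicts[a] >> b) & 1u);
}
```

Under this table two concurrent builds on the same table could proceed together, while either one would still exclude a vacuum.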

-- 
greg
http://mit.edu/~gsstark/resume.pdf

Re: [HACKERS] concurrent index builds unneeded lock?

2009-07-12 Thread Tom Lane

Actually ... why do we have to have that third wait step at all?
Doesn't the indcheckxmin mechanism render it unnecessary, or couldn't
we adjust the comparison xmin to make it unnecessary?

regards, tom lane

Re: [HACKERS] concurrent index builds unneeded lock?

2009-07-12 Thread Tom Lane

Greg Stark  writes:
> So I think we're back to looking at a special case for concurrent
> index builds to not wait on other concurrent index builds.

I'm kind of wondering how big the use case for that really is.
AFAICT the point of a concurrent build is to (re)build an index
without incurring too much performance penalty for foreground
query processing.  So how often are you really going to want
to fire off several of them in parallel?  If you can afford to
saturate your machine with indexing work, you could use plain
index builds.

regards, tom lane

Re: [HACKERS] concurrent index builds unneeded lock?

2009-07-12 Thread Greg Stark

On Sun, Jul 12, 2009 at 4:17 PM, Tom Lane wrote:
> The requirement wasn't just on not changing SQL data though.  To make
> use of this you'd also have to forbid indexed functions from *reading*
> other tables.  Which is something we discourage because of the risk that
> the results aren't really immutable, but we don't forbid it; and there
> are obvious use-cases.

Well it's my fault but the discussion kind of mutated in the middle.
For the original use case I think only changing SQL data would
actually be a threat. The concurrent index build only has to wait out
any transactions which might update the table without updating the
index. Which, even if there are volatile functions in the index
expression, index where clause, or index operators, they aren't really
likely to do.

The other thing is that the worst case if they do is you end up with a
corrupted index which is missing entries or has duplicate entries.
That's the same risk you always have if you have volatile functions
mismarked and used in your index definition.

For the mutated discussion where I was trying to find a mechanism that
would be more generally useful that's not the case. Vacuum needs to
know whether you ever plan to *read* from a table in the future. But
that's not what concurrent index builds need to know.

So I think we're back to looking at a special case for concurrent
index builds to not wait on other concurrent index builds.
-- 
greg
http://mit.edu/~gsstark/resume.pdf

Re: [HACKERS] concurrent index builds unneeded lock?

2009-07-12 Thread Tom Lane

Josh Berkus  writes:
> On 7/11/09 3:50 AM, Greg Stark wrote:
>> Hm. Actually maybe not. What if the index is an expression index and
>> the expression includes a function which does an SQL operation? I'm
>> not sure how realistic that is since to be a danger that SQL operation
>> would have to be an insert, update, or delete which is not just
>> bending the rules.

> It's not realistic at all.  People are only supposed to use IMMUTABLE 
> functions for expression indexes; if they declare a volatile function 
> as immutable, then it's their own lookout if they corrupt their data.

The requirement wasn't just on not changing SQL data though.  To make
use of this you'd also have to forbid indexed functions from *reading*
other tables.  Which is something we discourage because of the risk that
the results aren't really immutable, but we don't forbid it; and there
are obvious use-cases.

regards, tom lane

Re: [HACKERS] Upgrading our minimum required flex version for 8.5

2009-07-12 Thread Pavel Stehule

2009/7/12 Tom Lane :
> Pavel Stehule  writes:
>> 2009/7/12 Tom Lane :
>>> If we're going to go for reentrancy
>>> I think we should fix both components.
>
>> when we don't use reentrant grammar, then we cannot use main sql parser in 
>> SQL?
>
> It wouldn't be a problem for the immediate application I have in mind,
> which is to re-use the core lexer in plpgsql.  But it does seem like
> it might be a problem down the road as plpgsql gets smarter.
>

That's bad. I think integrating the main parser into plpgsql would be
the most important plpgsql feature since exception trapping was added. I
have to ask: do we really need a reentrant grammar? We need the
integration only at compile time, for the CREATE FUNCTION statement.
Would a non-reentrant grammar be a problem (though we would have to store
some info from validation time somewhere, maybe in the probin column)?

>                        regards, tom lane
>

Re: [HACKERS] *_collapse_limit, geqo_threshold

2009-07-12 Thread Tom Lane

Andres Freund  writes:
> Has anybody tried GEQO without ERX recently?

No, I don't think so.  Anything that's ifdef'd out at the moment has
been that way for quite a few years, and so is likely broken anyhow :-(

regards, tom lane

Re: [HACKERS] *_collapse_limit, geqo_threshold

2009-07-12 Thread Andres Freund

On Sunday 12 July 2009 16:44:59 Tom Lane wrote:
> Andres Freund  writes:
> > On Saturday 11 July 2009 20:33:14 Tom Lane wrote:
> >> This ties into something I was thinking about earlier: the planner is
> >> potentially recursive (eg, it might call a user-defined function that
> >> contains a plannable query), and it'd probably be a good idea if the
> >> behavior of GEQO wasn't sensitive to that.  So the RNG's state needs to
> >> be kept in PlannerInfo or some associated structure, not be global.
> >
> > Hm. I looked into this a bit and I don't see a real problem with a global
> > random state if it gets reinitialized on every geqo() invocation.
> > If I understood the code correctly (which is not at all certain),
> > make_rel_from_joinlist is itself recursive, but the actual join search
> > code is not. Correct?
> I wouldn't count on that.  GEQO is not recursive with respect to a
> particular query, but there's still the risk of the planner deciding
> to call out to some user-defined code while it does selectivity
> estimates.  So the planner as a whole has to be re-entrant.

> Now you could probably argue that the impact of extra RNG resets on
> the overall behavior is small enough that it doesn't matter.  But if
> you really want to promise consistent GEQO behavior then I think the
> RNG state has to be local to a particular planner instantiation.
I just had not seen that it could call selectivity estimation functions. This 
will mean adding an additional parameter (PlannerInfo) to most of the geqo 
functions, but I don't see a problem there.

Has anybody tried GEQO without ERX recently?

Andres

Re: [HACKERS] *_collapse_limit, geqo_threshold

2009-07-12 Thread Tom Lane

Andres Freund  writes:
> On Saturday 11 July 2009 20:33:14 Tom Lane wrote:
>> This ties into something I was thinking about earlier: the planner is
>> potentially recursive (eg, it might call a user-defined function that
>> contains a plannable query), and it'd probably be a good idea if the
>> behavior of GEQO wasn't sensitive to that.  So the RNG's state needs to
>> be kept in PlannerInfo or some associated structure, not be global.

> Hm. I looked into this a bit and I don't see a real problem with a global 
> random state if it gets reinitialized on every geqo() invocation. If I 
> understood the code correctly (which is not at all certain), 
> make_rel_from_joinlist is itself recursive, but the actual join search code 
> is not. Correct?

I wouldn't count on that.  GEQO is not recursive with respect to a
particular query, but there's still the risk of the planner deciding
to call out to some user-defined code while it does selectivity
estimates.  So the planner as a whole has to be re-entrant.

Now you could probably argue that the impact of extra RNG resets on
the overall behavior is small enough that it doesn't matter.  But if
you really want to promise consistent GEQO behavior then I think the
RNG state has to be local to a particular planner instantiation.
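
The point about keeping the RNG state local can be shown with a small sketch in C. ToyPlannerState stands in for a field that would live in PlannerInfo or an associated structure, and the generator is a plain linear congruential one chosen for brevity; none of these names come from the GEQO code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-invocation planner state holding the RNG. */
typedef struct ToyPlannerState
{
	uint32_t	rng_state;
} ToyPlannerState;

void
toy_rng_seed(ToyPlannerState *ps, uint32_t seed)
{
	assert(ps != NULL);
	ps->rng_state = seed;
}

/* Simple LCG step; unsigned arithmetic wraps modulo 2^32. */
uint32_t
toy_rng_next(ToyPlannerState *ps)
{
	assert(ps != NULL);
	ps->rng_state = ps->rng_state * 1664525u + 1013904223u;
	return ps->rng_state;
}

/*
 * Pretend planner run: draw a number, possibly re-enter the planner
 * (as user-defined code called during estimation might), then draw
 * again.  Each invocation owns its state, so nested runs cannot
 * disturb the outer sequence.
 */
uint32_t
toy_plan(uint32_t seed, int depth)
{
	ToyPlannerState ps;
	uint32_t	first;
	uint32_t	second;

	toy_rng_seed(&ps, seed);
	first = toy_rng_next(&ps);
	if (depth > 0)
		(void) toy_plan(seed + first, depth - 1);	/* nested planner run */
	second = toy_rng_next(&ps);
	return first ^ second;
}
```

With a single global state instead, the nested call would reseed the generator and the outer run's second draw would depend on whether re-entry happened.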

regards, tom lane

Re: [HACKERS] Upgrading our minimum required flex version for 8.5

2009-07-12 Thread Tom Lane

Pavel Stehule  writes:
> 2009/7/12 Tom Lane :
>> If we're going to go for reentrancy
>> I think we should fix both components.

> when we don't use reentrant grammar, then we cannot use main sql parser in 
> SQL?

It wouldn't be a problem for the immediate application I have in mind,
which is to re-use the core lexer in plpgsql.  But it does seem like
it might be a problem down the road as plpgsql gets smarter.
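
For anyone unsure what reentrancy buys here, the following is a toy sketch in C of a scanner whose whole state is passed in by the caller, so that two scans (say, the core lexer and plpgsql's) can be interleaved. ToyScanner and toy_lex are hypothetical names; a real flex scanner with --reentrant passes a yyscan_t handle for the same purpose.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* All scanner state lives here rather than in global variables. */
typedef struct ToyScanner
{
	const char *pos;			/* next unread character */
} ToyScanner;

void
toy_scanner_init(ToyScanner *s, const char *input)
{
	assert(s != NULL && input != NULL);
	s->pos = input;
}

/*
 * Scan the next whitespace-delimited token.  Sets *start to its first
 * character and returns its length, or 0 at end of input.
 */
size_t
toy_lex(ToyScanner *s, const char **start)
{
	assert(s != NULL && start != NULL);

	while (*s->pos != '\0' && isspace((unsigned char) *s->pos))
		s->pos++;
	*start = s->pos;
	while (*s->pos != '\0' && !isspace((unsigned char) *s->pos))
		s->pos++;
	return (size_t) (s->pos - *start);
}
```

The extra pointer argument on every call is the API change that shows up as the small measured slowdown discussed in this thread.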

regards, tom lane

Re: [HACKERS] Upgrading our minimum required flex version for 8.5

2009-07-12 Thread Andrew Dunstan




Tom Lane wrote:
> As best I can tell after some casual testing on a couple of machines,
> the actual bottom line is that "raw_parser" (ie, the bison and flex
> processing) is going to be a couple of percent slower with a reentrant
> grammar and lexer, for typical queries involving a lot of short tokens.
> Now this disappears into the noise as soon as you include parse analysis
> (let alone planning and execution), but it is possible to measure the
> slowdown in a test harness that calls raw_parser only.
>
> A possible compromise that I think would avoid most or all of the
> slowdown is to make the lexer reentrant but not the grammar (so that
> yylval and yylloc remain as global variables instead of being parameters
> to yylex).  I haven't actually benchmarked that, though.  It strikes
> me as a fairly silly thing to do.  If we're going to go for reentrancy
> I think we should fix both components.
>
> I'm willing to live with the small slowdown.  Comments?

If we're going to have a reentrant lexer, I think we should go the whole 
nine yards. I agree that a couple of percent slowdown on just the lexing 
and parsing will be lost in the noise. So +1 from me.


cheers

andrew

Re: [HACKERS] *_collapse_limit, geqo_threshold

2009-07-12 Thread Andres Freund

Hi Tom,

On Saturday 11 July 2009 20:33:14 Tom Lane wrote:
> Andres Freund  writes:
> > I just realized doing it in a naive way (in geqo()) causes the state to
> > be reset multiple times during one query- at every invocation of
> > make_rel_from_joinlist.
> >
> > Does anybody see a problem with that?
>
> I think that's probably what we want.  Otherwise, you'd have a situation
> where two identical subproblems might get planned differently.
>
> This ties into something I was thinking about earlier: the planner is
> potentially recursive (eg, it might call a user-defined function that
> contains a plannable query), and it'd probably be a good idea if the
> behavior of GEQO wasn't sensitive to that.  So the RNG's state needs to
> be kept in PlannerInfo or some associated structure, not be global.
Hm. I looked into this a bit and I don't see a real problem with a global 
random state if it gets reinitialized on every geqo() invocation. If I 
understood the code correctly (which is not at all certain), 
make_rel_from_joinlist is itself recursive, but the actual join search code 
is not. Correct?
Thus it would be enough to reset the seed on every geqo() invocation...

Any counter arguments?

Andres

Re: [HACKERS] New types for transparent encryption

2009-07-12 Thread Sam Mason

On Tue, Jul 07, 2009 at 05:35:28PM +0900, Itagaki Takahiro wrote:
> Our manual says we can use pgcrypto functions or encrypted filesystems
> for data encryption.
> http://www.postgresql.org/docs/8.4/static/encryption-options.html
> 
> However, they are not always the best approaches in some cases.
> 
> For pgcrypto functions, user's SQL must contain keyword strings
> and they need to consider which column is encrypted. Users complain
> that they want to treat encrypted values as if they were not encrypted.

As others have said, handling encryption client side would seem to offer
many more benefits (transparently within libpq offering easy adoption).

> passward() and options() are SQL functions and we can re-define them
> if needed. The default implementations are to refer custom GUC variables
> (pgcrypto.password and pgcrypto.options) so that encryption is done
> only in the database server and applications don't have to know the details.

Should the password be this widely shared? It would seem to make more
sense if it were a write-only variable and never exposed outside the
crypto module.  Otherwise you wouldn't even need to be a super-user to
collect all the passwords: just create a function that has the name of
something common and have it stash the password away somewhere.

-- 
  Sam  http://samason.me.uk/

Re: [HACKERS] Upgrading our minimum required flex version for 8.5

2009-07-12 Thread Pavel Stehule

2009/7/12 Tom Lane :
> Andrew Dunstan  writes:
>> Tom Lane wrote:
>>> Andrew Dunstan  writes:
>>>> I think it would need to be benchmarked. My faint recollection is that
>>>> the re-entrant lexers are slower.
>>>
>>> The flex documentation states in so many words:
>>>    The option `--reentrant' does not affect the performance of the scanner.
>>> Do you feel a need to verify their claim?
>
>> No, I'll take their word for it. I must have been thinking of something
>> else.
>
> As I got further into this, it turned out that Andrew's instinct was
> right: it does need to be benchmarked.  Although the inner loops of the
> lexer seem to be the same with or without --reentrant, once you buy into
> the whole nine yards of --reentrant, --bison-bridge, and a "pure" bison
> parser, you find out that the lexer's API changes: there are more
> parameters to yylex() than there used to be.  It's also necessary to
> pass around a yyscanner pointer to all the subroutines in scan.l.  (But
> on the other hand, this eliminates accesses to global variables, which
> are often not that cheap.)  So the "no performance impact" claim isn't
> telling the whole truth.
>
> As best I can tell after some casual testing on a couple of machines,
> the actual bottom line is that "raw_parser" (ie, the bison and flex
> processing) is going to be a couple of percent slower with a reentrant
> grammar and lexer, for typical queries involving a lot of short tokens.
> Now this disappears into the noise as soon as you include parse analysis
> (let alone planning and execution), but it is possible to measure the
> slowdown in a test harness that calls raw_parser only.
>
> A possible compromise that I think would avoid most or all of the
> slowdown is to make the lexer reentrant but not the grammar (so that
> yylval and yylloc remain as global variables instead of being parameters
> to yylex).  I haven't actually benchmarked that, though.  It strikes
> me as a fairly silly thing to do.  If we're going to go for reentrancy
> I think we should fix both components.

If we don't use a reentrant grammar, does that mean we cannot use the main SQL parser in SQL?

Pavel

>
> I'm willing to live with the small slowdown.  Comments?
>
>                        regards, tom lane
>
