Re: [HACKERS] Syntax for partitioning
2009/10/29 Itagaki Takahiro :
> I'd like to improve partitioning feature in 8.5. Kedar-san's previous
> work is wonderful, but I cannot see any updated patch.
> [... full proposal trimmed; see the original message below ...]
> Comments welcome.

+1

Pavel

> Regards,
> ---
> ITAGAKI Takahiro
> NTT Open Source Software Center

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Syntax for partitioning
I'd like to improve the partitioning feature in 8.5. Kedar-san's previous
work is wonderful, but I cannot see any updated patch.
http://archives.postgresql.org/message-id/bd8134a40906080702s96c90a9q3bbb581b9bd0d...@mail.gmail.com

So, I'll take over the work if no one else is doing it.

I'm thinking to add syntax support first. Table partitioning has been
proposed many times, but it is still not in core. The reason is that it
is too difficult to build a complete partitioning feature all at once.
I think syntax support is a good start.

First, I will add syntax for CREATE TABLE and ALTER TABLE ADD/DROP
PARTITION. The syntax is borrowed from Oracle and MySQL; their common
characteristic is the use of "LESS THAN" in range partitioning. The
keyword "PARTITION" is added to the fully-reserved keyword list to
support ADD/DROP PARTITION.

This syntax is merely syntactic sugar for INHERITS with CHECK;
declarations are translated into CHECK constraints. I plan to adjust
pg_dump to dump partitioning definitions in the new format, but the
actual implementation will still be based on constraint exclusion. In
addition, hash partitioning is not implemented; the syntax is parsed,
but a "not implemented" error is raised for now.

Here is the syntax I propose:

ALTER TABLE table_name ADD PARTITION name ...;
ALTER TABLE table_name DROP PARTITION [IF EXISTS] name [CASCADE | RESTRICT];

Range partitioning:
CREATE TABLE table_name ( columns )
PARTITION BY RANGE ( a_expr )
(
  PARTITION name VALUES LESS THAN [(] const [)],
  PARTITION name VALUES LESS THAN [(] MAXVALUE [)]  -- overflow partition
);

List partitioning:
CREATE TABLE table_name ( columns )
PARTITION BY LIST ( a_expr )
(
  PARTITION name VALUES [IN] ( const [, ...] ),
  PARTITION name VALUES [IN] [(] DEFAULT [)]  -- overflow partition
);

Hash partitioning:
CREATE TABLE table_name ( columns )
PARTITION BY HASH ( a_expr )
PARTITIONS num_partitions;

CREATE TABLE table_name ( columns )
PARTITION BY HASH ( a_expr )
(
  PARTITION name,
  ...
);

Notes:
* Each partition can have optional WITH (...) and TABLESPACE clauses.
* '(' and ')' are optional, to support both Oracle and MySQL syntax.

Comments welcome.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center
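To make the sugar concrete: since the proposal says declarations are translated into INHERITS plus CHECK constraints, a range declaration would presumably expand to something like the following (the table and partition names are invented for illustration; this is a sketch of the intended translation, not the patch's actual output):

```sql
-- Hypothetical expansion of:
--   CREATE TABLE measurement (logdate date, value int)
--   PARTITION BY RANGE (logdate)
--   ( PARTITION m2009 VALUES LESS THAN ('2010-01-01'),
--     PARTITION mmax  VALUES LESS THAN (MAXVALUE) );  -- overflow partition
CREATE TABLE measurement (logdate date, value int);

-- first partition: everything below the first bound
CREATE TABLE m2009 (
    CHECK (logdate < DATE '2010-01-01')
) INHERITS (measurement);

-- overflow partition: everything at or above the previous bound
CREATE TABLE mmax (
    CHECK (logdate >= DATE '2010-01-01')
) INHERITS (measurement);
```

With constraint_exclusion enabled, the planner can then skip partitions whose CHECK constraint contradicts the query's WHERE clause, which is the existing mechanism the proposal builds on.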
Re: [HACKERS] WIP: push AFTER-trigger execution into ModifyTable node
On Wed, Oct 28, 2009 at 8:45 PM, Tom Lane wrote:
> Marko Tiikkaja writes:
>> Like we've discussed before, WITH (.. RETURNING ..) is probably most
>> useful for moving rows from one table to another. When you're moving a
>> lot of rows around, there's some point where I believe this execution
>> strategy will be a lot slower than the traditional approach due to
>> storing the RETURNING results on disk. I've been thinking that in some
>> cases we could inline the CTE for this to actually be a quite
>> significant performance benefit, so I'm not too fancy about the
>> approach you're suggesting.
>
> Well, this is what we need to nail down *now*. Are we going to say that
> use of WITH(RETURNING) means you forfeit all guarantees about order of
> trigger firing? Short of that, I don't believe that it is sane to think
> about pipelining such things. And if we do do that, it sounds like a
> security hole to me, because the owner of the trigger isn't the one who
> agreed to forfeit predictability.

I don't see why either behavior would be a security hole; we get to
define how the system behaves, and users have to write their triggers to
cope with that behavior. We don't want to throw random roadblocks in the
way of sanity, but users are not entitled to assume that no future major
release of PG will have semantics that are in any way different from
whichever release they're now running, especially for features that don't
even exist in the current release. If you have a specific concern here,
maybe you could provide an example.

To be honest, I'm not entirely comfortable with either behavior.
Pipelining a delete out of one table into an insert into another table
seems VERY useful to me, and I'd like us to have a way to do that. On
the other hand, in more complex cases, the fact that the effects of a
statement are normally not visible to that statement could lead to some
fairly confusing behavior, especially when triggers are involved.

So I don't really know what the right thing is. What I really want is to
provide both behaviors, but I'm not sure there's any sensible way to do
that, and even if there were, it's not clear to me that users will know
which one they want.

...Robert
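For concreteness, the delete-into-insert pipeline discussed in this thread would, under the proposed WITH (... RETURNING ...) syntax, look roughly like this (table names are invented; at the time of this thread the syntax was only a proposal, not a released feature):

```sql
-- Hypothetical: move all rows older than a cutoff from one table to
-- another in a single statement, using the proposed writable-CTE syntax.
WITH moved AS (
    DELETE FROM events
    WHERE created < DATE '2009-01-01'
    RETURNING *
)
INSERT INTO events_archive
SELECT * FROM moved;
```

The trigger-ordering question in the thread is exactly about this case: whether the DELETE's AFTER triggers must all fire before the INSERT begins, or whether the two halves may be pipelined row by row.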
Re: [HACKERS] WIP: push AFTER-trigger execution into ModifyTable node
Marko Tiikkaja writes:
> Like we've discussed before, WITH (.. RETURNING ..) is probably most
> useful for moving rows from one table to another. When you're moving a
> lot of rows around, there's some point where I believe this execution
> strategy will be a lot slower than the traditional approach due to
> storing the RETURNING results on disk. I've been thinking that in some
> cases we could inline the CTE for this to actually be a quite
> significant performance benefit, so I'm not too fancy about the
> approach you're suggesting.

Well, this is what we need to nail down *now*. Are we going to say that
use of WITH(RETURNING) means you forfeit all guarantees about order of
trigger firing? Short of that, I don't believe that it is sane to think
about pipelining such things. And if we do do that, it sounds like a
security hole to me, because the owner of the trigger isn't the one who
agreed to forfeit predictability.

			regards, tom lane
Re: [HACKERS] WIP: push AFTER-trigger execution into ModifyTable node
Tom Lane wrote:
> So, before I go off and do that work: anybody have an objection to this
> line of development? The main implication of changing to this approach
> is that we'll be nailing down the assumption that each WITH (command
> RETURNING) clause acts very much like a separate statement for trigger
> purposes: it will fire BEFORE STATEMENT triggers at start, and AFTER
> STATEMENT triggers at end, and actually execute all non-deferred AFTER
> triggers, before we move on to executing the next WITH clause or the
> main query.

Like we've discussed before, WITH (.. RETURNING ..) is probably most
useful for moving rows from one table to another. When you're moving a
lot of rows around, there's some point where I believe this execution
strategy will be a lot slower than the traditional approach due to
storing the RETURNING results on disk. I've been thinking that in some
cases we could inline the CTE for this to actually be a quite
significant performance benefit, so I'm not too fancy about the approach
you're suggesting.

Regards,
Marko Tiikkaja
[HACKERS] WIP: push AFTER-trigger execution into ModifyTable node
In http://archives.postgresql.org/message-id/26545.1255140...@sss.pgh.pa.us
I suggested that we should push the actual execution (not just queuing)
of non-deferred AFTER triggers into the new ModifyTable plan node. The
attached patch does that, and seems like a nice improvement since it
removes knowledge of trigger handling from a number of other places.

However, the original objective was to allow EXPLAIN to associate trigger
runtimes with ModifyTable nodes, and I realized that this patch doesn't
accomplish that --- the trigger stats are still accumulated in the
executor-wide EState, not in the ModifyTable node. Right at the moment we
could cheat and have EXPLAIN print the trigger stats under ModifyTable
anyway, because there can be only one ModifyTable in any plan tree. But
that will fall down as soon as we try to let INSERT RETURNING and friends
execute within WITH clauses.

After poking around a bit I think it should be possible to keep the
trigger instrumentation data in the ModifyTable node instead of in
EState, and thereby allow EXPLAIN to know which node to blame the trigger
time on. This will require passing an extra parameter down from
nodeModifyTable into the trigger code, but none of those call paths are
very long. Still, it'll be a significantly more invasive patch by the
time it's done than what you see here.

So, before I go off and do that work: anybody have an objection to this
line of development? The main implication of changing to this approach is
that we'll be nailing down the assumption that each WITH (command
RETURNING) clause acts very much like a separate statement for trigger
purposes: it will fire BEFORE STATEMENT triggers at start, and AFTER
STATEMENT triggers at end, and actually execute all non-deferred AFTER
triggers, before we move on to executing the next WITH clause or the
main query.
			regards, tom lane

Index: src/backend/commands/explain.c
===
RCS file: /cvsroot/pgsql/src/backend/commands/explain.c,v
retrieving revision 1.192
diff -c -r1.192 explain.c
*** src/backend/commands/explain.c	12 Oct 2009 18:10:41 -	1.192
--- src/backend/commands/explain.c	28 Oct 2009 22:24:42 -
***************
*** 19,25 ****
  #include "commands/defrem.h"
  #include "commands/explain.h"
  #include "commands/prepare.h"
- #include "commands/trigger.h"
  #include "executor/instrument.h"
  #include "optimizer/clauses.h"
  #include "optimizer/planner.h"
--- 19,24 ----
***************
*** 354,363 ****
  	INSTR_TIME_SET_CURRENT(starttime);

- 	/* If analyzing, we need to cope with queued triggers */
- 	if (es->analyze)
- 		AfterTriggerBeginQuery();
-
  	/* Select execution options */
  	if (es->analyze)
  		eflags = 0;		/* default run-to-completion flags */
--- 353,358 ----
***************
*** 383,402 ****
  	ExplainPrintPlan(es, queryDesc);

  	/*
! 	 * If we ran the command, run any AFTER triggers it queued.  (Note this
! 	 * will not include DEFERRED triggers; since those don't run until end of
! 	 * transaction, we can't measure them.)  Include into total runtime.
  	 */
  	if (es->analyze)
  	{
- 		INSTR_TIME_SET_CURRENT(starttime);
- 		AfterTriggerEndQuery(queryDesc->estate);
- 		totaltime += elapsed_time(&starttime);
- 	}
-
- 	/* Print info about runtime of triggers */
- 	if (es->analyze)
- 	{
  		ResultRelInfo *rInfo;
  		bool		show_relname;
  		int			numrels = queryDesc->estate->es_num_result_relations;
--- 378,389 ----
  	ExplainPrintPlan(es, queryDesc);

  	/*
! 	 * Print info about runtime of triggers.  (Note this will not include
! 	 * DEFERRED triggers; since those don't run until end of transaction, we
! 	 * can't measure them.)
  	 */
  	if (es->analyze)
  	{
  		ResultRelInfo *rInfo;
  		bool		show_relname;
  		int			numrels = queryDesc->estate->es_num_result_relations;

Index: src/backend/commands/portalcmds.c
===
RCS file: /cvsroot/pgsql/src/backend/commands/portalcmds.c,v
retrieving revision 1.81
diff -c -r1.81 portalcmds.c
*** src/backend/commands/portalcmds.c	7 Oct 2009 16:27:18 -	1.81
--- src/backend/commands/portalcmds.c	28 Oct 2009 22:24:42 -
***************
*** 264,270 ****
  	PG_TRY();
  	{
  		CurrentResourceOwner = portal->resowner;
- 		/* we do not need AfterTriggerEndQuery() here */
  		ExecutorEnd(queryDesc);
  		FreeQueryDesc(queryDesc);
  	}
--- 264,269 ----
***************
*** 371,377 ****
  	 * Now shut down the inner executor.
  	 */
  	portal->queryDesc = NULL;	/* prevent double shutdown */
- 	/* we do not need AfterTriggerEndQuery() here */
  	ExecutorEnd(queryDesc);
  	FreeQueryDesc(queryDesc);
--- 370,375 ----

Index: src/backend/commands/trigger.c
===
RCS file: /cvsroot/pgsql/src/backend/commands/trigger.c,v
retrieving revision 1.256
diff -c -r1.256 trigger.c
*** src/backend/commands/trigger.c
Re: [HACKERS] Show schema size with \dn+
On Wednesday, October 28, 2009 at 15:11:31, Anders Steinlein wrote:
> Is there any interest in expanding \dn+ to show schema size, similar
> to table sizes using \dt+ in 8.4? We use separate schemas for each
> user, so this would allow us to quickly look up the sizes of each
> user's data.
>
> I have little experience with C and none with the PostgreSQL code base
> -- where should I look to have a go at this?

I would say source file src/bin/psql/describe.c, function listSchemas.

--
Guillaume.
 http://www.postgresqlfr.org
 http://dalibo.com
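Until \dn+ grows such a column, the per-schema size can be computed with plain SQL along these lines (using the standard pg_total_relation_size and pg_size_pretty functions; this sums ordinary tables only, which is roughly what \dt+ reports):

```sql
-- Approximate on-disk size of each schema: the sum of the total relation
-- size (heap + indexes + TOAST) of every ordinary table in that schema.
SELECT n.nspname AS schema,
       pg_size_pretty(sum(pg_total_relation_size(c.oid))::bigint) AS size
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE c.relkind = 'r'
GROUP BY n.nspname
ORDER BY sum(pg_total_relation_size(c.oid)) DESC;
```

A \dn+ implementation in describe.c would presumably issue a query of this general shape.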
Re: [HACKERS] Parsing config files in a directory
On Tue, 2009-10-27 at 00:38 -0400, Greg Smith wrote:
> new feature

One additional point that would be useful is a way to match up the usage
of custom_variable_classes with this new style of .conf file processing.

At the moment, if you wish to add a custom variable class, everybody
needs to edit the *same* parameter. Finding which one to edit could be a
little difficult with a whole directory to search in.

I propose a new form of processing for that variable: each new parameter
instance is added to the last one, rather than replacing it, e.g.

custom_variable_classes = 'x'
custom_variable_classes = 'y'
custom_variable_classes = 'z'

is equivalent to

custom_variable_classes = 'x,y,z'

That allows NewFeatureX to drop in a file called "newfeaturex.conf",
which looks like this:

custom_variable_classes = 'newfeaturex'
newfeaturex.param1 = x
newfeaturex.param2 = y
newfeaturex.param3 = z

This requires no editing of any other files, just a straight drop-in.
That will make it much easier to produce real installers/deinstallers
for add-in modules.

--
Simon Riggs www.2ndQuadrant.com
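The accumulate-rather-than-replace semantics proposed above can be sketched as follows (a toy parser for illustration only -- the real server parses its files with guc-file.l, and this simplified format handling is an assumption of the sketch):

```python
def read_settings(lines):
    """Parse simple  key = 'value'  lines.

    Ordinary GUCs keep the server's last-instance-wins behavior, but
    custom_variable_classes accumulates across instances, as proposed
    in this thread.
    """
    settings = {}
    classes = []  # accumulated custom_variable_classes values
    for raw in lines:
        line = raw.split('#', 1)[0].strip()  # drop comments and blanks
        if not line:
            continue
        key, _, value = line.partition('=')
        key = key.strip()
        value = value.strip().strip("'")
        if key == 'custom_variable_classes':
            # proposed behavior: append to the previous instance
            classes.extend(v.strip() for v in value.split(',') if v.strip())
            settings[key] = ','.join(classes)
        else:
            settings[key] = value  # normal GUCs: last instance wins
    return settings
```

Feeding it three separate `custom_variable_classes` lines yields the merged `'x,y,z'` value, while a twice-set ordinary GUC keeps only its last value.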
Re: [HACKERS] [PATCHES] updated hash functions for postgresql v1
On Wed, 2009-10-28 at 15:31 -0400, Tom Lane wrote:
> Hannu Krosing writes:
>> I had never checked the docs for hash functions, but I had assumed, that
>> internal functions are prefixed by pg_ and anything else is public, free
>> to use functionality.
>
> Sure, it's free to use. It's not free to assume that we promise never
> to change it.
>
>> Changing hash functions also makes in-place upgrades a lot harder, as
>> they can't be done incrementally anymore for tables which use hash
>> indexes.
>
> Hash indexes are so far from being production-grade that this argument
> is not significant.

AFAIK in-place upgrade is also not quite production-grade, so this was
meant as a forward-looking note for the next time the hashxxx functions
change.

> regards, tom lane

--
Hannu Krosing   http://www.2ndQuadrant.com
PostgreSQL Scalability and Availability
   Services, Consulting and Training
Re: [HACKERS] Parsing config files in a directory
Greg Smith wrote:
> On Wed, 28 Oct 2009, Josh Berkus wrote:
>> It's the basic and unsolvable issue of how do you have a file which is
>> both perfectly human-readable-and-editable *and* perfectly
>> machine-readable-and-editable at the same time.
>
> Let's see...if I remember correctly from the last two rounds of this
> discussion, this is the point where someone pops up and says that
> switching to XML for the postgresql.conf will solve this problem.
> Whoever does that this time goes into the ring with Kevin and I, but
> they don't get a club. (All fight proceeds to benefit SPI of course.)

That's precisely why I didn't get into this discussion (you guys are
scary). It really does seem like deja vu all over again. I'm usually
good for a given argument once.

cheers

andrew
Re: [HACKERS] [PATCHES] updated hash functions for postgresql v1
On Wed, 2009-10-28 at 12:51 -0700, Jeff Davis wrote:
> On Wed, 2009-10-28 at 21:09 +0200, Hannu Krosing wrote:
>> Is at least the fact that they "are undocumented, have changed in the
>> past, and are likely to change again in the future" documented ?
>
> That's a little confusing to me: how do we document that something is
> undocumented? And where do we stop?

My previous e-mail message documents that the undocumentedness is
undocumented, so no need to go any further here ;)

Though undocumented, the hash functions are easily discoverable by doing
\df *hash* in psql.

>> Hashing is a quite fundamental thing in computing, so I was quite
>> surprised to find out it had silently changed.
>
> There are many reasons to use a hash, and we don't want people to use
> these functions for the wrong purpose.

I don't think that not documenting a hash function helps here at all.

> I have seen people use a
> performance hash for security purposes before, and I had to demonstrate
> some hash collisions to show why that was a bad idea.

I've seen people use CRC32 as a hash and then hit a collision in 15
tries with quite large keys.

> So, if we do provide documented functions, it should be done carefully.

Any user-visible behavior change should be done carefully, even if the
original behavior is not documented.

A careful upgrade of the hashxxx functions would have kept the old
functions and introduced the new ones with a _v2 suffix, then used those
in the appropriate places. The old ones could be kept for a few
versions, perhaps with a deprecation warning, and then moved to contrib
for a few more versions. Doing it this way could leave them
"undocumented" and still not break people's applications in mysterious
ways.

> Trying to develop and document a set of standardized, stable hash
> functions covering a wide range of possible use cases sounds like it may
> be better served by an extension.

I guess there are enough security/crypt/strong hashes in the pgcrypto
package, so that should not be a problem.

--
Hannu Krosing   http://www.2ndQuadrant.com
PostgreSQL Scalability and Availability
   Services, Consulting and Training
Re: [HACKERS] Parsing config files in a directory
On Wed, Oct 28, 2009 at 4:52 PM, Greg Smith wrote:
> On Wed, 28 Oct 2009, Robert Haas wrote:
>> It would be completely logical to break up the configuration file into
>> subfiles by TOPIC. That would complicate things for tool-writers
>> because they would need to get each setting into the proper file, and
>> we currently don't have any infrastructure for that.
>
> Already done:
>
> # select name,category from pg_settings limit 1;
>        name       |              category
> ------------------+---------------------------------------------------
>  add_missing_from | Version and Platform Compatibility / Previous PostgreSQL Versions
>
> You could make one per category, and pgtune for example already knows
> all this info. The somewhat arbitrary category assignments Josh put
> things into are what Peter was complaining about upthread. Questions
> like "is 'effective_cache_size' a memory parameter or an optimizer
> one?" show why this is not trivial to do well.

I stand corrected. I think the basic thrust of the paragraph stands -
this is not why people are asking for the feature, or if it is, that
hasn't been articulated or discussed.

...Robert
Re: [HACKERS] Parsing config files in a directory
On Wed, 28 Oct 2009, Robert Haas wrote:
> It would be completely logical to break up the configuration file into
> subfiles by TOPIC. That would complicate things for tool-writers
> because they would need to get each setting into the proper file, and
> we currently don't have any infrastructure for that.

Already done:

# select name,category from pg_settings limit 1;
       name       |              category
------------------+---------------------------------------------------
 add_missing_from | Version and Platform Compatibility / Previous PostgreSQL Versions

You could make one per category, and pgtune for example already knows
all this info. The somewhat arbitrary category assignments Josh put
things into are what Peter was complaining about upthread. Questions
like "is 'effective_cache_size' a memory parameter or an optimizer
one?" show why this is not trivial to do well.

--
* Greg Smith gsm...@gregsmith.com http://www.gregsmith.com Baltimore, MD
Re: [HACKERS] Parsing config files in a directory
On Wed, Oct 28, 2009 at 4:24 PM, Josh Berkus wrote:
>> Let's see...if I remember correctly from the last two rounds of this
>> discussion, this is the point where someone pops up and says that
>> switching to XML for the postgresql.conf will solve this problem.
>> Whoever does that this time goes into the ring with Kevin and I, but
>> they don't get a club. (All fight proceeds to benefit SPI of course.)
>
> XML is so last-week. JSON!
>
> Oh, wait, we're PostgreSQL, we're not that mainstream. YAML!

Definitely ASN.1

...Robert
Re: [HACKERS] alpha2 bundled -- please verify
On Wed, 2009-10-21 at 23:26 +0300, Peter Eisentraut wrote:
> Alpha2 has been bundled and is available at
>
> http://developer.postgresql.org/~petere/alpha/
>
> Please check that it is sane.

Peter, could you please provide an md5sum for the alpha2 tarball? I know
we missed it in alpha1. It would be better if the original packager
uploaded the md5sum.

Also, I'll be happy if you can also upload a .bz2 file (for the lazy RPM
packager who does not want to play with his spec files a lot).

Regards,
--
Devrim GÜNDÜZ, RHCE
Command Prompt - http://www.CommandPrompt.com
devrim~gunduz.org, devrim~PostgreSQL.org, devrim.gunduz~linux.org.tr
http://www.gunduz.org  Twitter: http://twitter.com/devrimgunduz
Re: [HACKERS] inefficient use of relation extension?
Tom Lane wrote:
> Alvaro Herrera writes:
>> I regularly (several times a day) see five or six processes all with
>> pg_locks locktype=extend granted=f on the same table, waiting for a
>> long time.
>
> I'm not sure what's causing that, but I *seriously* doubt that adding
> new pages to FSM right away would make it better. What it sounds like
> is someone is getting hung up while holding the lock. You should try to
> investigate who's got the lock when this happens, and what they're
> doing or waiting for.

After some more research, these facts have arisen:

- the relation extension lock in question is on a toast table
- the entries stored in that table are long enough that they need more
  than one page
- the BufMappingLock is seen as severely contended among the various
  processes trying to extend the table

So the dozen-plus processes fight over the BufMapping lwlock while
attempting to extend the toast table, to get a free buffer for the new
page; they do this one page at a time, and then go back to do the same
thing over and over. Shared_buffers is large (4.6 GB), and I'm not clear
whether this is just inefficiency in the 8.1 bufmgr code, or whether
bgwriter is not aggressive enough about clearing pages from the LRU end
of the buffer pool.

I'm not really sure what the right way to attack this problem is, but
getting off 8.1 has now become a priority.

sinval was tracked too, and we found out that it's not an issue at all.
Thanks everyone.

--
Alvaro Herrera                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
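For reference, the ungranted extension locks described in this thread can be spotted with a query along these lines (column names as in the pg_locks system view; joining to pg_class to resolve the table name is optional):

```sql
-- Sessions currently waiting on a relation extension lock,
-- with the name of the relation being extended.
SELECT l.pid, c.relname, l.mode
FROM pg_locks l
JOIN pg_class c ON c.oid = l.relation
WHERE l.locktype = 'extend'
  AND NOT l.granted;
```

Several rows naming the same relation at once is the symptom Alvaro reports: multiple backends queued behind a single extension lock holder.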
Re: [HACKERS] Parsing config files in a directory
> Let's see...if I remember correctly from the last two rounds of this
> discussion, this is the point where someone pops up and says that
> switching to XML for the postgresql.conf will solve this problem.
> Whoever does that this time goes into the ring with Kevin and I, but
> they don't get a club. (All fight proceeds to benefit SPI of course.)

XML is so last-week. JSON!

Oh, wait, we're PostgreSQL, we're not that mainstream. YAML!

--Josh
Re: [HACKERS] Parsing config files in a directory
On Wed, Oct 28, 2009 at 3:28 PM, Tom Lane wrote:
> Josh Berkus writes:
>> Kevin,
>>> Perhaps the ease of writing something like that with sed or perl has
>>> caused me to underestimate the effort required in C. I am curious
>>> whether you actually mean that, or said it for rhetorical effect.
>
>> I actually mean that. It *looks* easy in perl, and in fact *is* easy
>> for *your* postgresql.conf which you control. But writing a parser for
>> every postgresql.conf which exists in the world, no matter how someone
>> has hacked it up creatively? No matter how they've handled upgrades?
>
> The issue isn't even with writing a parser. The conf file is certainly
> machine-parsable; guc-file.l is an existence proof, and the relatively
> slow rate of change of that file indicates that it's been a reasonably
> stable format over time.

Right.

> The issue is that if you want to modify the
> file while preserving comments, commenting out superseded entries,
> putting new entries where the user would expect to find them, etc etc,
> you have a hard AI problem in front of you.

Right. In other words, it's not possible. You can drive yourself crazy
trying to approximate it, but it is a hopeless waste of time.

> This is why Robert keeps
> harping on the default commentary being a problem --- if you removed all
> comments (and didn't care about ordering etc), it would be easier.

Yes - and we even had some consensus that this was a good idea, at one
point. Maybe not complete, precise consensus on every detail, but
certainly enough to have a conversation about it.

> But short of telling people who prefer $EDITOR to get lost, that's not
> going to work.

This is where I get off the train. Preferring $EDITOR is not the same
thing as feeling that we need 500 lines of comments in the default file.
There may be some people who hold both opinions, of course.

> I think the point of the discussions here is that we want to establish
> an alternate config file (or set of config files) in which the
> expectations are different: no promise to preserve any comments, no
> intention to be human-friendly for editing, etc. In one sense this
> would be the same machine-readable format, since the backend is still
> going to parse it with guc-file.l; but changing the human expectations
> can make the editing problem much simpler.

I don't think this idea is without merit, but I don't think it's a
silver bullet, either. If you can change the human expectations for some
file that gets processed along with postgresql.conf, you can change the
expectations for postgresql.conf itself. In fact, you'll have to: adding
more files is BY DEFINITION going to change the interpretation of
postgresql.conf. It will either be the magic bullet file that overrides
the other file, or vice versa - rather than, as is the case in a default
install today, being THE configuration file.

One of the issues that we need to face is: how many new files? There
seems to be some sentiment that we can just "drop in" new files and
expect things to work. I think that's likely to lead to chaos. Having
TWO files - one for $EDITOR and one for tools - may be manageable. There
will be some user confusion as to which one is in charge, but there are
only two choices, so maybe it's not too bad. But having a whole
directory full of files emitted by different tools sounds like a
disaster, and therefore it seems to me that there is no getting around
the need to have a tool which can merge new settings into an existing
configuration file.

It would be completely logical to break up the configuration file into
subfiles by TOPIC. That would complicate things for tool-writers because
they would need to get each setting into the proper file, and we
currently don't have any infrastructure for that. But that's not why
people want this feature anyway. What they want is to be able to deposit
new settings and have them take effect without parsing a config file.

But they can do that today. Just open postgresql.conf for append, write
a newline in case the file didn't already end with one, write your
settings, and close the file. Your settings win because they are last.
The problem is - now you've left a mess for someone else to clean up.
Over time, duplicates will accumulate. The order of settings won't be
preserved. Comments won't be properly updated. But writing to another
file doesn't actually fix any of that. Merging settings (either in
postgresql.conf or in a separate persistent.conf) does, at least in
part.

...Robert
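The append-only approach described in this thread can be sketched in a few lines (a toy illustration only; a real tool would also need file locking and a reload signal afterwards, and the function name here is invented):

```python
def append_settings(conf_path, settings):
    """Append settings to a postgresql.conf-style file.

    Because the server honors the last occurrence of a duplicated GUC,
    the appended values win -- at the cost of accumulating duplicates
    over time, exactly the mess described in the thread.
    """
    with open(conf_path, 'a+') as f:
        f.seek(0)
        contents = f.read()
        # start on a fresh line in case the file doesn't end with one
        if contents and not contents.endswith('\n'):
            f.write('\n')  # 'a' mode: writes always go to end of file
        for name, value in settings.items():
            f.write("%s = '%s'\n" % (name, value))
```

The appended setting ends up last in the file and therefore takes effect, while the stale earlier copy lingers as a duplicate.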
Re: [HACKERS] Parsing config files in a directory
On Wed, 28 Oct 2009, Josh Berkus wrote:

> It's the basic and unsolvable issue of how do you have a file which is
> both perfectly human-readable-and-editable *and* perfectly
> machine-readable-and-editable at the same time.

Let's see...if I remember correctly from the last two rounds of this discussion, this is the point where someone pops up and says that switching to XML for the postgresql.conf will solve this problem. Whoever does that this time goes into the ring with Kevin and me, but they don't get a club. (All fight proceeds to benefit SPI of course).

-- * Greg Smith gsm...@gregsmith.com http://www.gregsmith.com Baltimore, MD
Re: [HACKERS] [PATCHES] updated hash functions for postgresql v1
On Wed, 2009-10-28 at 21:09 +0200, Hannu Krosing wrote: > Is at least the fact that they "are undocumented, have changed in the > past, and are likely to change again in the future" documented ? That's a little confusing to me: how do we document that something is undocumented? And where do we stop? > Hashing is a quite fundamental thing in computing, so I was quite > surprised to find out it had silently changed. There are many reasons to use a hash, and we don't want people to use these functions for the wrong purpose. I have seen people use a performance hash for security purposes before, and I had to demonstrate some hash collisions to show why that was a bad idea. So, if we do provide documented functions, it should be done carefully. Trying to develop and document a set of standardized, stable hash functions covering a wide range of possible use cases sounds like it may be better served by an extension. Regards, Jeff Davis
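The kind of demonstration Jeff mentions is easy to reproduce. Here is a sketch in Python using a made-up toy hash (nothing to do with PostgreSQL's hashtext) of why a performance hash must not be pressed into security service: anyone who controls the inputs can manufacture collisions at will.

```python
import hashlib

def toy_hash(s: str) -> int:
    # A deliberately naive performance-style hash: sum of the bytes.
    # (A hypothetical example for illustration -- NOT hashtext.)
    return sum(s.encode()) & 0xFFFFFFFF

# Any two strings that are byte-permutations of each other collide,
# so an attacker who controls inputs can force collisions trivially:
collision = toy_hash("listen") == toy_hash("silent")

# A cryptographic hash is designed to make finding such pairs infeasible:
distinct = (hashlib.sha256(b"listen").hexdigest()
            != hashlib.sha256(b"silent").hexdigest())
```

A hash like this is perfectly adequate for spreading keys across hash buckets, and completely useless as an integrity or authentication check — which is the distinction at issue in this thread.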
Re: [HACKERS] [PATCHES] updated hash functions for postgresql v1
Kenneth Marshall writes: > On Wed, Oct 28, 2009 at 03:31:17PM -0400, Tom Lane wrote: >> Hash indexes are so far from being production-grade that this argument >> is not significant. > In addition that change from 8.3 -> 8.4 to store only the hash and not > the value in the index means that a reindex would be required in any event. Indeed, and I fully expect there will be some more on-disk format changes required before we get to the point where hash indexes are actually interesting for production. If we start insisting that they be in-place-upgradable now, we will pretty much guarantee that they never become useful enough to justify the restriction :-( (As examples, the hash bucket size probably needs revisiting, and we ought to think very hard about whether we shouldn't switch to 64-bit hash values. And that's not even considering some of the more advanced suggestions that have been made.) regards, tom lane
Re: [HACKERS] [PATCHES] updated hash functions for postgresql v1
On Wed, Oct 28, 2009 at 03:31:17PM -0400, Tom Lane wrote: > Hannu Krosing writes: > > I had never checked the docs for hash functions, but I had assumed, that > > internal functions are prefixed by pg_ and anything else is public, free > > to use functionality. > > Sure, it's free to use. It's not free to assume that we promise never > to change it. > > > Changing hash functions also makes in-place upgrades a lot harder, as > > they can't be done incrementally anymore for tables which use hash > > indexes. > > Hash indexes are so far from being production-grade that this argument > is not significant. > > regards, tom lane In addition, the change from 8.3 -> 8.4 to store only the hash and not the value in the index means that a reindex would be required in any event. Cheers, Ken
Re: [HACKERS] [PATCHES] updated hash functions for postgresql v1
Hannu Krosing writes: > I had never checked the docs for hash functions, but I had assumed, that > internal functions are prefixed by pg_ and anything else is public, free > to use functionality. Sure, it's free to use. It's not free to assume that we promise never to change it. > Changing hash functions also makes in-place upgrades a lot harder, as > they can't be done incrementally anymore for tables which use hash > indexes. Hash indexes are so far from being production-grade that this argument is not significant. regards, tom lane
Re: [HACKERS] Parsing config files in a directory
Josh Berkus writes: > Kevin, >> Perhaps the ease of writing something like that with sed or perl has >> caused me to underestimate the effort required in C. I am curious >> whether you actually mean that, or said it for rhetorical effect. > I actually mean that. It *looks* easy in perl, and in fact *is* easy > for *your* postgresql.conf which you control. But writing a parser for > every postgresql.conf which exists in the world, no matter how someone > has hacked it up creatively? No matter how they've handled upgrades? The issue isn't even with writing a parser. The conf file is certainly machine-parsable; guc-file.l is an existence proof, and the relatively slow rate of change of that file indicates that it's been a reasonably stable format over time. The issue is that if you want to modify the file while preserving comments, commenting out superseded entries, putting new entries where the user would expect to find them, etc etc, you have a hard AI problem in front of you. This is why Robert keeps harping on the default commentary being a problem --- if you removed all comments (and didn't care about ordering etc), it would be easier. But short of telling people who prefer $EDITOR to get lost, that's not going to work. I think the point of the discussions here is that we want to establish an alternate config file (or set of config files) in which the expectations are different: no promise to preserve any comments, no intention to be human-friendly for editing, etc. In one sense this would be the same machine-readable format, since the backend is still going to parse it with guc-file.l; but changing the human expectations can make the editing problem much simpler. regards, tom lane
Re: [HACKERS] Parsing config files in a directory
On Wed, Oct 28, 2009 at 12:08 PM, Josh Berkus wrote: >> Perhaps the ease of writing something like that with sed or perl has >> caused me to underestimate the effort required in C. I am curious >> whether you actually mean that, or said it for rhetorical effect. > > I actually mean that. It *looks* easy in perl, and in fact *is* easy > for *your* postgresql.conf which you control. But writing a parser for > every postgresql.conf which exists in the world, no matter how someone > has hacked it up creatively? No matter how they've handled upgrades? > For every version of PostgreSQL? That requires writing a full parser > with grammar and near-Turing capabilities.

I think we're getting distracted by the basic parsing questions. These are actually solvable -- pgadmin solves them today even. I think the bigger problem is the semantic questions. If I've lowered random_page_cost and your tool decides it should raise seq_page_cost, should it raise random_page_cost proportionally from my setting or to the absolute value it calculates? When it does, will I be annoyed to see my settings overwritten? What if I set some of the cpu_* parameters based on my random_page_cost setting and now that you've overwritten my random_page_cost setting they're all out of whack?

And not all programs editing these files will be equally intelligent. Say I post a snippet of configuration and say to drop it in wholesale into your postgresql.conf.d. Then the user runs autotune which drops a configuration in after it which overrides those settings. Then later I post an updated snippet and the user replaces the original snippet. If they're in separate files and he put the snippet in before the autotune configuration then he doesn't have to worry if the new snippet contains things which autotune overrode. They'll still override the new settings.
If you keep them separate then the actual settings may not be in sync but at least I can see each group of settings and understand what they were trying to do. The precedence remains the same. It also means the database could give warnings if files are overriding earlier files if we want that. -- greg
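Greg's scenario — per-tool snippet files applied in a fixed order, with later files overriding earlier ones while each file stays legible on its own — can be sketched like this. A Python illustration under assumed rules (files applied in sorted name order, simple `name = value` lines only); the file names `10-snippet.conf` and `20-autotune.conf` are hypothetical.

```python
import glob
import os
import re
import tempfile

def merge_conf_dir(conf_dir):
    # Apply every *.conf file in sorted name order; for each setting the
    # last file to mention it wins.  Track the winning file so overrides
    # stay visible, much as pg_settings exposes a sourcefile column.
    effective = {}
    for path in sorted(glob.glob(os.path.join(conf_dir, "*.conf"))):
        with open(path) as f:
            for line in f:
                line = line.split("#", 1)[0].strip()
                m = re.match(r"(\w+)\s*=\s*(\S+)", line)
                if m:
                    effective[m.group(1)] = (m.group(2), os.path.basename(path))
    return effective

# Demo with two hypothetical snippets: a hand-installed one and a later
# autotune-generated one that overrides part of it.
d = tempfile.mkdtemp()
with open(os.path.join(d, "10-snippet.conf"), "w") as f:
    f.write("random_page_cost = 2.0\n")
with open(os.path.join(d, "20-autotune.conf"), "w") as f:
    f.write("random_page_cost = 3.5\nwork_mem = 32MB\n")

merged = merge_conf_dir(d)
```

Both files keep their own contents intact; replacing the hand-installed snippet never disturbs the autotune file, and the winning source for each setting is recorded rather than lost.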
Re: [HACKERS] [PATCHES] updated hash functions for postgresql v1
On Wed, 2009-02-11 at 11:22 -0500, Tom Lane wrote: > Asko Oja writes: > > Did this change hashtext() visible to users? We have been using it quite > > widely for partitioning our databases. If so then it should be marked quite > > visibly in release notes as there might be others who will be hit by this. > > The hash functions are undocumented, have changed in the past, and are > likely to change again in the future. If you are using them in a way > that depends on them to give the same answers across versions, you'd > better stop.

Is at least the fact that they "are undocumented, have changed in the past, and are likely to change again in the future" documented? I'm sure this is something that has hit unwary users in the past and will hit again in the future, so some words about it in the docs would be appropriate. A search for "hashtext" on http://www.postgresql.org/docs/8.4/interactive/index.html returned no results, so I guess even their "undocumented, will surprise you" status is not documented.

Hashing is a quite fundamental thing in computing, so I was quite surprised to find out it had silently changed. I had never checked the docs for hash functions, but I had assumed that internal functions are prefixed by pg_ and anything else is public, free-to-use functionality.

Changing hash functions also makes in-place upgrades a lot harder, as they can't be done incrementally anymore for tables which use hash indexes.

> regards, tom lane

-- Hannu Krosing http://www.2ndQuadrant.com PostgreSQL Scalability and Availability Services, Consulting and Training
Re: [HACKERS] Parsing config files in a directory
Kevin,

> Perhaps the ease of writing something like that with sed or perl has
> caused me to underestimate the effort required in C. I am curious
> whether you actually mean that, or said it for rhetorical effect.

I actually mean that. It *looks* easy in perl, and in fact *is* easy for *your* postgresql.conf which you control. But writing a parser for every postgresql.conf which exists in the world, no matter how someone has hacked it up creatively? No matter how they've handled upgrades? For every version of PostgreSQL? That requires writing a full parser with grammar and near-Turing capabilities.

> Well, I wouldn't vote against it since it seems to do me no harm; I
> was just confused at the repeated assertion that update-in-place was
> such a hard problem.

It's the basic and unsolvable issue of how do you have a file which is both perfectly human-readable-and-editable *and* perfectly machine-readable-and-editable at the same time.

--Josh Berkus
Re: [HACKERS] Parsing config files in a directory
On Wed, Oct 28, 2009 at 10:28 AM, Greg Smith wrote: > The postgresql.conf file being modified is generated by initdb, and it's > already being customized per install by the initdb-time rules like detection > for maximum supported shared_buffers. It isn't one of the files installed by > the package manager where the logic you're describing kicks in. The > conflict case would show up, to use a RHEL example, if I edited a > /etc/sysconfig/postgresql file and then a changed version of that file > appeared upstream. Stuff in PGDATA is all yours and not tracked as a config > file.

Well, putting configuration files in PGDATA is itself a packaging violation. I'm talking about /etc/postgresql.conf. Yes it's possible for packages to simply opt out of the configuration file management which at least means they're not actively causing problems -- but it's a cheat, it means it's giving up on providing the user with useful upgrades of configuration files.

-- greg
Re: [HACKERS] Parsing config files in a directory
Greg Smith writes: > If as you say the only right way to do this is to use the flex logic, that > just reinforced how high the bar is for someone who wants to write a tool > that modifies the file. Yup, exactly. Personally I think that trying to auto-modify postgresql.conf is insane. The whole and entire reason behind this discussion is that we want the tools modifying OTHER files, for which we will establish different and much simpler rules for what is allowed. regards, tom lane
Re: [HACKERS] Parsing config files in a directory
Josh Berkus wrote: > The precedence issues you (and Robert) are citing are no different > from what we have currently in a single file. I think that's *why* we're mentioning it. This would seem to be the juncture to look for ways to improve that, not just settle for "no worse" -- but perhaps that's not possible. > If someone here thinks writing a tool which reliably parses and > re-writes a hand-written PostgresQL.conf and runs on all the OSes we > support is *easy*, then please write it for me! I'll happly use > such a tool. But after wasting a couple dozen hours on the problem, > I won't write one. Perhaps the ease of writing something like that with sed or perl has caused me to underestimate the effort required in C. I am curious whether you actually mean that, or said it for rhetorical effect. > Otherwise, please let us have our directory so that we can > experiment with easy-to-write-and-revise autoconfig tools. Well, I wouldn't vote against it since it seems to do me no harm; I was just confused at the repeated assertion that update-in-place was such a hard problem. Some of the people saying that seem to regularly eat problems which seem much harder than that (to me, anyway) for lunch. That seemed to suggest there could be other reasons for wanting the directory approach which weren't getting proper focus. If we solve the wrong problem, the solution is likely to be suboptimal for the real issues. -Kevin
Re: [HACKERS] Parsing config files in a directory
On Wed, 28 Oct 2009, Greg Stark wrote:

> It's also a blatant violation of packaging rules for Debian if not every
> distribution. If you edit the user's configuration file then there's no
> way to install a modified default configuration file. You can't tell the
> automatic modifications apart from the user's modifications. So the user
> will get a prompt asking if he wants the new config file or to keep his
> modifications which he never remembered making.

The postgresql.conf file being modified is generated by initdb, and it's already being customized per install by the initdb-time rules like detection for maximum supported shared_buffers. It isn't one of the files installed by the package manager where the logic you're describing kicks in. The conflict case would show up, to use a RHEL example, if I edited a /etc/sysconfig/postgresql file and then a changed version of that file appeared upstream. Stuff in PGDATA is all yours and not tracked as a config file.

-- * Greg Smith gsm...@gregsmith.com http://www.gregsmith.com Baltimore, MD
Re: [HACKERS] Parsing config files in a directory
On Wed, 28 Oct 2009, Tom Lane wrote:

> Why in the world are you looking at initdb? The standard reference for
> postgresql.conf-reading code, by definition, is guc-file.l. I think the
> odds of building something that works right, without borrowing that same
> flex logic, are about nil.

initdb was the only sample around that actually makes changes to the postgresql.conf. It's also a nice simple standalone program that's easy to borrow pieces from, which guc-file.l is not. That's the reason it looks tempting at first.

If as you say the only right way to do this is to use the flex logic, that just reinforces how high the bar is for someone who wants to write a tool that modifies the file. Periodically we get people who show up saying "hey, I'd like to write a little [web|cli|gui] tool to help people update their postgresql.conf file", and when the answer they get includes "first you need to implement this grammar..." that scares off almost all of them. It didn't work on me because I used to write compilers for fun before flex existed. But even I just skimmed it and pragmatically wrote a simpler postgresql.conf parser implementation that worked well enough to get a working prototype out the door, rather than properly implementing the whole grammar.

-- * Greg Smith gsm...@gregsmith.com http://www.gregsmith.com Baltimore, MD
Re: [HACKERS] Parsing config files in a directory
Kevin,

> I'm talking about how the decision should be made as to which takes
> precedence. It's fine to document which one *was* chosen, but that
> doesn't eliminate the problem of conflicting settings making a mess.
> Someone else (Robert maybe?) gave an explicit example of how three
> files could have overlapping settings. Of course, *my* tool will name
> its configuration file "!.conf".

Hey, if a DBA wants to do that, then it's fine with me. They can check pg_settings afterwards to find out which was chosen.

The precedence issues you (and Robert) are citing are no different from what we have currently in a single file. I absolutely can't tell you the number of hacked-up postgresql.conf files I've seen with the same setting appearing in more than 3 places. And with the conf file being over 1000 lines long, it's easy to miss that someone or some tool added another instance of the variable at the bottom. Plus we already support includes of single files. Why is an include of a directory controversial? If someone doesn't want to use it, they don't have to.

If someone here thinks writing a tool which reliably parses and re-writes a hand-written postgresql.conf and runs on all the OSes we support is *easy*, then please write it for me! I'll happily use such a tool. But after wasting a couple dozen hours on the problem, I won't write one. Otherwise, please let us have our directory so that we can experiment with easy-to-write-and-revise autoconfig tools.

--Josh Berkus
Re: [HACKERS] Where's the docs?
On 10/27/09 9:25 PM, Tom Lane wrote: > Josh Berkus writes: >> OK, this is the genuine failure; the syntax is missing for column triggers: > >> CREATE TRIGGER name { BEFORE | AFTER } { event [ OR ... ] } >> ON table [ FOR [ EACH ] { ROW | STATEMENT } ] >> EXECUTE PROCEDURE function_name ( arguments ) > > It's embedded in "event", which isn't spelled out here. > Yeah, I couldn't figure it out from the docs, which means that other people won't be able to. Doc patch coming if I ever finish this server migration. --Josh Berkus
Re: [HACKERS] Parsing config files in a directory
On Wed, Oct 28, 2009 at 2:37 AM, Dimitri Fontaine wrote: > That's why I'm proposing the following API at file level: That's exactly the same as putting them all in the same file, only a different syntax. It still requires that any program understand what every other program was trying to do. >> It's much simpler and more reliable to have each program generate a >> separate file. > > On the viewpoint of the program itself only. For the DBA, that soon > becomes a nightmare because the same GUC could come from any number of > tools and the precedence rules, even explicit and as easy as > alphanumeric ordering (which locale already?), make it error prone. But the DBA *wants* to control those precedence rules. The automatic software certainly can't unless they know what other automatic software exists in the world -- or will exist in the future. > I really want to insist on having only ONE location for settings from > tools (all of them) and one location for manual/local editing. > >> time it's generated. It doesn't have to worry about anything else >> parsing or making sense of the file except the database server itself. > > But it'll never know if the settings it just generated are superseded by > some other tool's configuration file. That's precisely what makes things simpler. The less each module has to know about each other module the simpler and more reliable it will be. I actually would suggest that they check the current "source" by checking with postgres, just to give a warning. -- greg
Re: [HACKERS] Parsing config files in a directory
On Wed, Oct 28, 2009 at 7:33 AM, Alvaro Herrera wrote: > Greg Smith escribió: > >> This sounds familiar...oh, that's right, this is almost the same >> algorithm pgtune uses. And it sucks, It's also a blatant violation of packaging rules for Debian if not every distribution. If you edit the user's configuration file then there's no way to install a modified default configuration file. You can't tell the automatic modifications apart from the user's modifications. So the user will get a prompt asking if he wants the new config file or to keep his modifications which he never remembered making. -- greg
Re: [HACKERS] Parsing config files in a directory
Greg Smith writes: > The sketched out design I have for a contrib/pgtune in C presumes that I'd > start by refactoring the relevant bits from initdb into a library for both > programs to use. But the initdb code doesn't care about preserving > existing values when making changes to them; it just throws in its new > settings and moves along. So what's there already only handles about half > the annoying parts most people would expect a tuning tool that reads the > existing file and operates on it to do. > Also, I wouldn't be surprised to find that it chokes on some real-world > postgresql.conf files. The postgresql.conf.sample it's being fed is > fairly pristine. Indeed. Why in the world are you looking at initdb? The standard reference for postgresql.conf-reading code, by definition, is guc-file.l. I think the odds of building something that works right, without borrowing that same flex logic, are about nil. regards, tom lane
Re: [HACKERS] Parsing config files in a directory
On Wed, 28 Oct 2009, Alvaro Herrera wrote:

> Huh, isn't this code in initdb.c already?

The sketched out design I have for a contrib/pgtune in C presumes that I'd start by refactoring the relevant bits from initdb into a library for both programs to use. But the initdb code doesn't care about preserving existing values when making changes to them; it just throws in its new settings and moves along. So what's there already only handles about half the annoying parts most people would expect a tuning tool that reads the existing file and operates on it to do.

Also, I wouldn't be surprised to find that it chokes on some real-world postgresql.conf files. The postgresql.conf.sample it's being fed is fairly pristine. A tuning tool that intends to read any postgresql.conf it's fed can't always assume it's in exactly standard form. I've recently started collecting complicated postgresql.conf lines that crashed my Python code as people submit bug reports with those. You might be surprised at all of the places people put whitespace.

-- * Greg Smith gsm...@gregsmith.com http://www.gregsmith.com Baltimore, MD
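The whitespace surprises Greg mentions are exactly what a tolerant reader has to absorb. Here is a sketch of such a single-line parser in Python — a simplification of what guc-file.l actually accepts (it ignores include directives and other corner cases), offered only to illustrate the variety of line shapes a tool must handle.

```python
import re

# A tolerant reader for postgresql.conf-style lines: optional '=',
# single-quoted values with '' escapes, trailing comments, and
# erratic whitespace.
LINE = re.compile(r"""
    ^\s*
    (?P<name>[A-Za-z_][A-Za-z0-9_]*)     # setting name
    \s*=?\s*                             # '=' is optional in the real grammar
    (?P<value>'(?:[^']|'')*'             # quoted value; '' escapes a quote
             |[^\s\#']+)                 # or a bare token
    \s*(?:\#.*)?$                        # optional trailing comment
""", re.VERBOSE)

def parse_line(line):
    line = line.strip()
    if not line or line.startswith("#"):
        return None                      # blank line or pure comment
    m = LINE.match(line)
    if not m:
        raise ValueError(f"unparsable line: {line!r}")
    value = m.group("value")
    if value.startswith("'"):
        value = value[1:-1].replace("''", "'")
    return m.group("name"), value
```

Even this small grammar has to cope with lines like `shared_buffers   128MB  # no equals sign` and quoted values containing commas and escaped quotes — the sort of input that breaks naive split-on-`=` tooling.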
[HACKERS] FOR UPDATE versus outer joins
While I'm fooling with the FOR UPDATE code ... Currently, you can't apply FOR UPDATE to a relation that's on the inner side of an outer join, eg

regression=# select * from a left join b using(aa) for update;
ERROR: SELECT FOR UPDATE/SHARE cannot be applied to the nullable side of an outer join

It would be a trivial code change to make this work by just not locking any row in b when a null-extended join row is formed. The argument against that was that FOR UPDATE is supposed to guarantee that the same rows can be fetched again, and it's not clear that the same null-extended row would be formed if the join were repeated. In particular, if the a row's join key has been changed by a concurrent update, we would still return the null-extended row, but there might now be rows in b that it can join to; which we won't find since we aren't repeating the whole join but only rechecking particular join pairs.

Do people still find that argument convincing, or would it be better to remove the restriction and let the code do the best it can? It seems to me that allowing FOR UPDATE with outer joins may be more useful than not allowing it, even given the caveat.

regards, tom lane
[HACKERS] Show schema size with \dn+
Is there any interest in expanding \dn+ to show schema size, similar to table sizes using \dt+ in 8.4? We use separate schemas for each user, so this would allow us to quickly look up the sizes of each user's data. I have little experience with C and none with the PostgreSQL code base -- where should I look to have a go at this? -- a.
Re: [HACKERS] Parsing config files in a directory
Alvaro Herrera wrote: > Kevin Grittner wrote: > >> But I think that's where the rub is -- when you >> have more than one source for information, what's the precedence? > > This is not a problem nowadays because that info is in pg_settings. > File name and line number. I'm talking about how the decision should be made as to which takes precedence. It's fine to document which one *was* chosen, but that doesn't eliminate the problem of conflicting settings making a mess. Someone else (Robert maybe?) gave an explicit example of how three files could have overlapping settings. Of course, *my* tool will name its configuration file "!.conf". -Kevin
Re: [HACKERS] Parsing config files in a directory
Alvaro Herrera wrote: Greg Smith escribió: I was thinking that the algorithm would be something like: "Read the old postgresql.conf and write it back out to a new file line by line This sounds familiar...oh, that's right, this is almost the same algorithm pgtune uses. And it sucks, and it's a pain to convert the tool into C because of it, and the fact that you have to write this sort of boring code before you can do a single line of productive work is one reason why we don't have more tools available; way too much painful grunt work to write. Huh, isn't this code in initdb.c already? Since it's BSD-licensed (or is it MIT?) you could just have lifted it. Surely this isn't the reason the tool isn't written in C. In any case, initdb has to be in C for portability reasons (I'm more aware of this than most ;-) ), but other tools don't unless the server has to rely on them. cheers andrew
Re: [HACKERS] Parsing config files in a directory
Kevin Grittner wrote: > But I think that's where the rub is -- when you > have more than one source for information, what's the precedence? This is not a problem nowadays because that info is in pg_settings. File name and line number. -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [HACKERS] Parsing config files in a directory
Greg Smith escribió: > >I was thinking that the algorithm would be something like: "Read > >the old postgresql.conf and write it back out to a new file line > >by line > > This sounds familiar...oh, that's right, this is almost the same > algorithm pgtune uses. And it sucks, and it's a pain to convert the > tool into C because of it, and the fact that you have to write this > sort of boring code before you can do a single line of productive > work is one reason why we don't have more tools available; way too > much painful grunt work to write. Huh, isn't this code in initdb.c already? Since it's BSD-licensed (or is it MIT?) you could just have lifted it. Surely this isn't the reason the tool isn't written in C. -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [HACKERS] Parsing config files in a directory
Forgive me for jumping in again on discussion of a feature I might never use, but as an "outside observer" something doesn't make sense to me.

Josh Berkus wrote:

> If you require that a tool (or SET PERSISTENT) parse through a file
> in order to change one setting, then you've just doubled or tripled
> the code size of the tool, as well as added a host of failure
> conditions which wouldn't have existed otherwise.

Not if there is one implementation which is distributed with PostgreSQL. Give it a clean API and a command-line application (for scripting in non-C languages) and this is a non-issue. This really seems like a red herring. I know it would be more lines in C than a bash script; but really, think about how little work this would be for any script which has grep and sed available -- at least if you assume it shouldn't follow include statements.

But I think that's where the rub is -- when you have more than one source for information, what's the precedence? That question doesn't go away with the proposed feature. It seems that in reading this thread I've seen a lot of conflicting notions on how it *should* work, with a handwavy assertion that it doesn't matter because the DBA can sort it all out. But then will the tools always do what people expect?

It seems like there's a significant base of users who want their database product to self-configure; and there's clearly a significant base of professional DBAs who want to be able to hand-tune for a variety of reasons. I assume that addressing these disparate needs is one of the goals here? As well as an easy way to drop in configuration for additional features? The directory seems to make sense for the latter, but seems horrible to me for the former. It turns the risk of a spaghetti configuration file into a sorcerer's apprentice collection of competing, conflicting files which are a worse mess than the spaghetti. Perhaps there should be two separate features for the two separate use cases?
-Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] SE-PgSQL developer documentation (Re: Reworks for Access Control facilities (r2363))
KaiGai Kohei escribió:
> Alvaro Herrera wrote:
>> Now, let's assume that COPY data includes the security context for each
>> tuple in the output.
>
> When we support row-level security, it will be necessary to back up and
> restore the security context for each tuple.

Oh, right, that part is being left out. Sorry.

-- Alvaro Herrera  http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [HACKERS] SE-PgSQL developer documentation (Re: Reworks for Access Control facilities (r2363))
Alvaro Herrera wrote:
> KaiGai Kohei escribió:
>> There are two cases when we create a new object.
>>
>> 1) create a new object without any explicit security context.
>>    If we don't have statement support, it is the only case.
>>    In this case, SELinux computes a default security context to be
>>    assigned to the new object. It depends on the client's security
>>    context. Then, it checks "create" permission on the pair of the
>>    client's security context and the default security context. If not
>>    allowed, an error is raised.
> So, following this path, it is possible to write pg_dump support without
> explicit security contexts: you have to make pg_dump write out the
> different tuples under different users. So you'd have more than one data
> object in the dump output for each table, one for every existing security
> context. This seems extremely difficult and intrusive, however. It seems
> that having explicit security contexts in statements is necessary for
> this area to be reasonably simple.

Yes, it may be possible to restore the tables without statement support, if we switch the OS user's privilege for each table, but that is unreasonable and unrealistic.

> Now, let's assume that COPY data includes the security context for each
> tuple in the output.

When we support row-level security, it will be necessary to back up and restore the security context for each tuple. What I'm talking about here is how we specify the security context of new tables. If we have statement support, it will be specified as follows:

    CREATE TABLE t (
        a    int primary key,
        b    text
    ) SECURITY_CONTEXT = 'system_u:object_r:sepgsql_ro_table_t:unclassified';

> How is that data restored? Would you need to grant super-SEPostgres
> privileges to the user restoring the data?

We need to restore the backup as a user who has at least the privileges to create the dumped database objects. But there is no need to have a super-privilege.
For example, if all the dumped tables are labeled as either "unclassified" or "classified" but not "secret", the only privilege needed is to create "unclassified" and "classified" tables, not "secret" tables. However, I expect that an "unconfined" domain does the backup/restore work, except on especially secure systems. I don't think the default security policy (such as the one bundled with Fedora) should restrict the privileges of a DBA connected from a shell process.

Thanks, -- KaiGai Kohei
Re: [HACKERS] Where's the docs?
Peter Eisentraut writes:
> On Wed, 2009-10-28 at 00:25 -0400, Tom Lane wrote:
>> Josh Berkus writes:
>>> OK, this is the genuine failure; the syntax is missing for column triggers:
>>>
>>>     CREATE TRIGGER name { BEFORE | AFTER } { event [ OR ... ] }
>>>         ON table [ FOR [ EACH ] { ROW | STATEMENT } ]
>>>         EXECUTE PROCEDURE function_name ( arguments )
>> It's embedded in "event", which isn't spelled out here.
> I know this is a bit suboptimal, but I couldn't think of a better way
> without cluttering up too many things.

Expanding "event" in place definitely wouldn't be very readable. The only alternative that seems sane is to do something like what we do in SELECT:

    CREATE TRIGGER ...

    where "event" is:

        INSERT | UPDATE [ ( column [, ...] ) ] | DELETE | TRUNCATE

I am not convinced that it's worth it, but maybe. Certainly the current psql "\h CREATE TRIGGER" display is not helpful at reminding you where to put the column names.

regards, tom lane
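As a concrete illustration of where the column names go, here is a column trigger in the form PostgreSQL documents, which attaches the column list to the UPDATE event with OF; the table, column, and function names below are made up:

```sql
-- Hypothetical names; the trigger fires only when price or discount changes.
CREATE TRIGGER audit_price
    AFTER UPDATE OF price, discount ON products
    FOR EACH ROW
    EXECUTE PROCEDURE log_price_change();
```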
Re: [HACKERS] SE-PgSQL developer documentation (Re: Reworks for Access Control facilities (r2363))
KaiGai Kohei escribió:
> There are two cases when we create a new object.
>
> 1) create a new object without any explicit security context.
>    If we don't have statement support, it is the only case.
>    In this case, SELinux computes a default security context to be
>    assigned to the new object. It depends on the client's security
>    context. Then, it checks "create" permission on the pair of the
>    client's security context and the default security context. If not
>    allowed, an error is raised.

So, following this path, it is possible to write pg_dump support without explicit security contexts: you have to make pg_dump write out the different tuples under different users. So you'd have more than one data object in the dump output for each table, one for every existing security context. This seems extremely difficult and intrusive, however. It seems that having explicit security contexts in statements is necessary for this area to be reasonably simple.

Now, let's assume that COPY data includes the security context for each tuple in the output. How is that data restored? Would you need to grant super-SEPostgres privileges to the user restoring the data?

-- Alvaro Herrera  http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [HACKERS] SE-PgSQL developer documentation (Re: Reworks for Access Control facilities (r2363))
Heikki Linnakangas wrote:
> KaiGai Kohei wrote:
>> Heikki Linnakangas wrote:
>>> KaiGai Kohei wrote:
>>>> Robert Haas wrote:
>>>>> 2009/10/27 KaiGai Kohei:
>>>>>> - no statement support to specify security context.
>>>>>>   (It makes it impossible to add support in pg_dump. Is it really OK?)
>>>>> I doubt that anything without pg_dump support would be even vaguely OK...
>>>> In my previous experience, it enabled me to reduce the patch by 300-400
>>>> lines. But there is no more to it than those 300-400 lines.
>>>> Honestly, I would like to include a feature to specify an explicit
>>>> security context in the patch from the beginning.
>>>> (It also allows attaching test cases with more variations.)
>>> Can you explain why that's required for pg_dump support? I was thinking
>>> that there would be no explicit security labels on objects, and
>>> permissions would be checked based on other inherent properties of the
>>> object, like owner, name, schema etc.
>> In the SELinux model, the security context is the only property which can
>> be used for decision making based on the security policy.
>> It never uses any other properties, like owner, name, ...
> The security context doesn't necessarily need to be given explicitly.
> Things like network ports and files in filesystems that don't support
> security labels are assigned a security context based on some external
> policy.
>
> Hmm, I guess the whole feature becomes completely pointless if all
> objects always have their default labels, and can't be changed. So I
> guess we need that.
>
> I think this discussion started when I wondered why we can't put the
> SE-PgSQL check for creating an object (e.g. a table) into
> pg_namespace_aclcheck() without changing the signature. The reason you
> gave is that we need the security context of the new table being created
> to decide if creating such a table is allowed. But assuming that the new
> table inherits the security context of the schema it's created in,
> pg_namespace_aclcheck() *does* have all the necessary information: it
> knows the namespace which determines the new object's security context.
> As long as we don't provide syntax to define the security context in the
> CREATE command, we're fine, even if there's an ALTER command to change
> the security context of the object after the creation.

What I pointed out is just one of the problems we face if we try to deploy the SE-PgSQL hooks within the aclchk.c routines.

For example, pg_namespace_aclcheck() with ACL_CREATE is not invoked only just before creation of a new table. It is also called when we create a new function, type, conversion and so on.

For example, pg_namespace_aclcheck() does not take an argument to deliver the column definitions of a new table. When columns are inherited from the parent table, we have to copy the security context of each parent column, but we cannot know the column definitions inside pg_namespace_aclcheck(). (It would need to be called after MergeAttributes(), but pg_namespace_aclcheck() is called before that.)

For example, the SE-PgSQL model distinguishes "setattr" permission from "drop", but pg_class_ownercheck() is used for both ALTER and DROP statements, so we cannot know from inside pg_class_ownercheck() which permission should be applied.

For example, ...

At the first commitfest, it was suggested that I change the definitions of the default PG access control routines to deliver the information needed for both DAC and MAC, if pg_xxx_aclcheck() is not suitable for the SELinux model. I developed junk as the result. :(

> I'm not sure how much of a difference that detail makes in the big
> scheme of things, I'm just trying to find ways to make the patch
> minimally invasive..

Basically, I don't think we should change the pg_xxx_aclcheck() and pg_xxx_ownercheck() routines, because they implement the default PG model well. If we try to call DAC and MAC from common entry points, it brings us much pain, as we learned from hard experience. What I would like to suggest instead is to put MAC hooks at the strategic points.
The hooks are just invocations of sepgsql_*() functions, so they do not need much rework of the core routines. I believe this is the minimally invasive way.

The Linux kernel is one of the best examples of this practice. It deploys hooks that call MAC checks with the needed information (such as inode, task_struct, ...) at strategic points in the kernel. Basically, DAC and MAC work orthogonally, so it is quite a natural design. The specifications of the hooks are documented clearly in the source code, so folks without a special focus on security can also know what information should be given and what result will be returned.

What I would like to suggest is a similar approach. So, now I'm working on writing documentation from the developer's viewpoint, and on coding the SE-PgSQL routines with comments about their specifications.

Thanks, -- KaiGai Kohei
Re: [HACKERS] SE-PgSQL developer documentation (Re: Reworks for Access Control facilities (r2363))
KaiGai Kohei wrote:
> Heikki Linnakangas wrote:
>> KaiGai Kohei wrote:
>>> Robert Haas wrote:
>>>> 2009/10/27 KaiGai Kohei:
>>>>> - no statement support to specify security context.
>>>>>   (It makes it impossible to add support in pg_dump. Is it really OK?)
>>>> I doubt that anything without pg_dump support would be even vaguely OK...
>>> In my previous experience, it enabled me to reduce the patch by 300-400
>>> lines. But there is no more to it than those 300-400 lines.
>>>
>>> Honestly, I would like to include a feature to specify an explicit
>>> security context in the patch from the beginning.
>>> (It also allows attaching test cases with more variations.)
>> Can you explain why that's required for pg_dump support? I was thinking
>> that there would be no explicit security labels on objects, and
>> permissions would be checked based on other inherent properties of the
>> object, like owner, name, schema etc.
> In the SELinux model, the security context is the only property which can
> be used for decision making based on the security policy.
> It never uses any other properties, like owner, name, ...

The security context doesn't necessarily need to be given explicitly. Things like network ports and files in filesystems that don't support security labels are assigned a security context based on some external policy.

Hmm, I guess the whole feature becomes completely pointless if all objects always have their default labels, and can't be changed. So I guess we need that.

I think this discussion started when I wondered why we can't put the SE-PgSQL check for creating an object (e.g. a table) into pg_namespace_aclcheck() without changing the signature. The reason you gave is that we need the security context of the new table being created to decide if creating such a table is allowed.
But assuming that the new table inherits the security context of the schema it's created in, pg_namespace_aclcheck() *does* have all the necessary information: it knows the namespace, which determines the new object's security context. As long as we don't provide syntax to define the security context in the CREATE command, we're fine, even if there's an ALTER command to change the security context of the object after the creation.

I'm not sure how much of a difference that detail makes in the big scheme of things, I'm just trying to find ways to make the patch minimally invasive..

-- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] Parsing config files in a directory
On Tue, Oct 27, 2009 at 11:40 PM, Josh Berkus wrote:
> On 10/27/09 8:24 PM, Robert Haas wrote:
>> read the old postgresql.conf and
>> write it back out to a new file line by line. If, in the process of
>> doing this, you find a setting for the variable you're trying to
>> change, then write out the new line in place of the original line.
>
> You've hit the problem on the head right there. The requirement to do
> something like that is *exactly* the problem which makes writing
> config-management tools hard/impossible.
>
> If you require that a tool (or SET PERSISTENT) parse through a file in
> order to change one setting, then you've just doubled or tripled the
> code size of the tool, as well as added a host of failure conditions
> which wouldn't have existed otherwise.

I think you're just trading one set of failure conditions for another. Now instead of having one unparseable configuration file you're going to have a whole pile of them with possibly-conflicting settings.

> You're hearing from the people who are working on tools: requiring that
> any tool parse a hand-written config file is a non-starter.

Yep, and I'm baffled by that, because I understand neither why it's hard nor what the reasonable alternatives are. The algorithm I just proposed can be implemented by a very short Perl script. But my bafflement doesn't (and isn't intended to) prevent others from implementing what they like. As Tom is fond of saying (and it's 10x more true of me), I'm not the only vote here.

...Robert
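Robert's line-by-line algorithm really is short; here is a minimal sketch in shell under the same assumptions (it ignores include directives, quoting, and the optional '=' that real postgresql.conf lines allow -- the file and GUC names are illustrative):

```shell
# Replace one setting in a postgresql.conf-style file, line by line;
# if the setting (possibly commented out) is not found, append it.
set_guc() {
    name=$1; value=$2; file=$3
    if grep -Eq "^[[:space:]]*#?[[:space:]]*${name}[[:space:]]*=" "$file"; then
        sed -E -i "s|^[[:space:]]*#?[[:space:]]*${name}[[:space:]]*=.*|${name} = ${value}|" "$file"
    else
        echo "${name} = ${value}" >> "$file"
    fi
}
```

A real tool would also need to honor include directives and the last-setting-wins rule, which is exactly the complexity the thread is arguing about.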
Re: [HACKERS] Parsing config files in a directory
Greg Stark writes:
> On Tue, Oct 27, 2009 at 8:40 PM, Josh Berkus wrote:
>> You're hearing from the people who are working on tools: requiring that
>> any tool parse a hand-written config file is a non-starter.
>
> It can be done, pgadmin actually does it currently. But I totally
> agree it's a bad idea.
>
> But the difficulty of parsing the handwritten stuff is not the only
> reason it's a bad idea. Any time you have multiple pieces of software,
> to say nothing of humans, editing the same file you're going to have
> headaches. They need to agree on everything and be able to handle
> anything any other program generates. Such a file would be a kind of
> API itself.

That's why I'm proposing the following API at the file level:

- 1 file per GUC
- the file name is {class.}guc_name.conf
- the first line contains only the value of the setting
- the rest of the file contains comments

Now any tool can see the current value for itself, and changing it while keeping the old one as a comment is easy too:

    $ myguc=`cat postgresql.conf.d/my_guc.conf`
    $ (echo newvalue; echo $myguc) > postgresql.conf.d/my_guc.conf

Furthermore, extensions are required to use a custom class, so they will need to edit custom_variable_classes and then their own files. Any tool could support editing those files too; it's rather easy until you want to provide wizard-like knowledge to the user. A dedicated facility to add a new class to the custom_variable_classes GUC could be devised later, but that doesn't feel like it belongs in this patch's playground.

> It's much simpler and more reliable to have each program generate a
> separate file.

Only from the viewpoint of the program itself. For the DBA, that soon becomes a nightmare, because the same GUC could come from any number of tools, and the precedence rules, even explicit and as easy as alphanumeric ordering (in which locale?), make it error prone. I really want to insist on having only ONE location for settings from tools (all of them) and one location for manual/local editing.
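The proposed file-level API is small enough to sketch in shell; everything beyond the layout described above (helper names, error handling) is an assumption:

```shell
confdir=postgresql.conf.d   # per the proposal; override as needed

# Print the current value of a GUC: the first line of its file.
read_guc() {
    head -n 1 "$confdir/$1.conf"
}

# Set a new value, keeping the previous file content as comments.
write_guc() {
    f="$confdir/$1.conf"
    old=""
    if [ -f "$f" ]; then
        old=$(sed 's/^#* */# /' "$f")   # comment out old value and comments
    fi
    printf '%s\n' "$2" > "$f"
    if [ -n "$old" ]; then
        printf '%s\n' "$old" >> "$f"
    fi
}
```

With this shape, any tool (or SET PERSISTENT itself) can read and write one GUC without parsing a hand-written file.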
> time it's generated. It doesn't have to worry about anything else
> parsing or making sense of the file except the database server itself.

But it'll never know whether the settings it just generated are superseded by some other tool's configuration file. With my proposal, the SET PERSISTENT command can easily warn the user: as soon as the current source for the GUC is NOT postgresql.conf.d, you know you're not affecting anything; it was the DBA's choice to manually set something else. If it happens that you are the DBA, you can go edit postgresql.conf to comment out the GUC and enjoy your new tool suite.

Regards, -- dim
Re: [HACKERS] SE-PgSQL developer documentation (Re: Reworks for Access Control facilities (r2363))
Heikki Linnakangas wrote:
> KaiGai Kohei wrote:
>> Robert Haas wrote:
>>> 2009/10/27 KaiGai Kohei:
>>>> - no statement support to specify security context.
>>>>   (It makes it impossible to add support in pg_dump. Is it really OK?)
>>> I doubt that anything without pg_dump support would be even vaguely OK...
>> In my previous experience, it enabled me to reduce the patch by 300-400
>> lines. But there is no more to it than those 300-400 lines.
>>
>> Honestly, I would like to include a feature to specify an explicit
>> security context in the patch from the beginning.
>> (It also allows attaching test cases with more variations.)
> Can you explain why that's required for pg_dump support? I was thinking
> that there would be no explicit security labels on objects, and
> permissions would be checked based on other inherent properties of the
> object, like owner, name, schema etc.

In the SELinux model, the security context is the only property which can be used for decision making based on the security policy. It never uses any other properties, like owner, name, ...

There are two cases when we create a new object.

1) Create a new object without any explicit security context.
   If we don't have statement support, this is the only case.
   In this case, SELinux computes a default security context to be assigned
   to the new object. It depends on the client's security context.
   Then, it checks "create" permission on the pair of the client's security
   context and the default security context. If not allowed, an error is
   raised.

2) Create a new object with an explicit security context.
   In this case, the given explicit security context is assigned.
   SELinux checks "create" permission on the pair of the client's security
   context and the given explicit security context. If not allowed, an
   error is raised.

Please note that SELinux assigns a security context to the managed object in either case.
If we don't have any statement support, there is no way to specify an explicit security context on a new object at creation time. It also means we cannot correctly restore the security contexts of objects.

Thanks, -- OSS Platform Development Division, NEC KaiGai Kohei
Re: [HACKERS] SE-PgSQL developer documentation (Re: Reworks for Access Control facilities (r2363))
KaiGai Kohei wrote:
> Robert Haas wrote:
>> 2009/10/27 KaiGai Kohei:
>>> - no statement support to specify security context.
>>>   (It makes it impossible to add support in pg_dump. Is it really OK?)
>> I doubt that anything without pg_dump support would be even vaguely OK...
>
> In my previous experience, it enabled me to reduce the patch by 300-400
> lines. But there is no more to it than those 300-400 lines.
>
> Honestly, I would like to include a feature to specify an explicit
> security context in the patch from the beginning.
> (It also allows attaching test cases with more variations.)

Can you explain why that's required for pg_dump support? I was thinking that there would be no explicit security labels on objects, and permissions would be checked based on other inherent properties of the object, like owner, name, schema etc.

-- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com