Re: [HACKERS] PostgreSQL Developer meeting minutes up

2009-05-29 Thread Markus Wanner

Hi,

Quoting Robert Haas robertmh...@gmail.com:

That's not the best news I've had today...


Sorry :-(


To me they sound complex and inconvenient.  I guess I'm kind of
mystified by why we can't make this work reliably.  Other than the
broken tags issue we've discussed, it seems like the only real issue
should be how to group changes to different files into a single
commit.  Once you do that, you should be able to construct a
well-defined, total function f : cvs-file, cvs-revision - git
commit which is surjective on the space of git commits.  In fact it
might be a good idea to explicitly construct this mapping and drop it
into a database table somewhere so that people can sanity check it as
much as they wish.  Why is this harder than I think it is?


Well, as CVS doesn't guarantee any consistency between files, you end  
up with silly situations more often than you think. One of the  
simplest possible example is something like:


  commit 1: fileA @ 1.1, fileB @ 1.2
  commit 2: fileA @ 1.2, fileB @ 1.1

Seen from fileA, it's obvious that commit 1 (@1.1) comes before commit  
2 (@1.2), but seen from fileB it's the exact opposite. The most  
promising approach to solve these problems seems to be based on Graph  
Theory, where you work with a graph of dependencies from fileA @ 1.1  
to fileA @ 1.2.


To resolve the above situation, you'd have split a blob of  
single-file commits into two end-result commits (for monotone / git).  
In the above example, you'd have two options to resolve the conflict:


  commit 1a: fileA @ 1.1
  commit 2:  fileA @ 1.2, fileB @ 1.1
  commit 1b: fileA @ 1.2

Or:

  commit 2a: fileB @ 1.1
  commit 1: fileA @ 1.1, fileB @ 1.2
  commit 2b: fileB @ 1.2

(Note that often enough, these have actually been separate commits in  
CVS as well, there's just no way to represent that. And no, timestamps  
are simply not reliable enough).


Now add tags, branches and cyclic dependencies involving many files  
and many 100 commits to the example above and you start to get an idea  
of the complexity of the problem in general.


See my description and diagrams of the steps used for cvs_import in  
monotone at [1] or follow descriptions of how cvs2svn works internally.


A few numbers about a conversion I'm trying for testing my algorithm  
and heuristics. It's converting a pretty recent snapshot of the  
Postgres repository:


 * running at 100% CPU time since: April, 17
 * Total number of files involved: 6'847
 * total number of blobs (before splitting): 28'010
 * blobs split due to cyclic dependencies: 12'801

Admittedly, my algorithm isn't optimized at all. However, I'm focusing  
on good results rather than speed of conversion.


Also note, that monotone uses SQLite, so it actually stores the  
results of this conversion in an SQL database, as you proposed.  
Recently, a git_export command has been added, so that's definitely  
worth a try for converting CVS to git. However, I fear cvs2git is more  
mature.


Regards

Markus Wanner

[1]: a description of the various steps in conversion from CVS to monotone:
http://www.monotone.ca/wiki/CvsImport/


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Unicode string literals versus the world

2009-05-29 Thread Peter Eisentraut
On Friday 29 May 2009 06:31:23 Bruce Momjian wrote:
 Peter Eisentraut wrote:
  On Tuesday 05 May 2009 03:01:05 Tom Lane wrote:
   Peter Eisentraut pete...@gmx.net writes:
On Tuesday 14 April 2009 21:34:51 Peter Eisentraut wrote:
I think we can handle that and the cases Tom presents by erroring
out when the U syntax is used with stdstr off.
   
Proposed patch for that attached.
  
   I have not been able to think of any security hole in that proposal,
   so this patch seems acceptable to me.  I wonder though whether any
   corresponding change is needed in psql's lexer, and if so how should
   it react exactly to the rejection case.
 
  I had thought about that as well, but concluded that no additional change
  is necessary.
 
  Note that the *corresponding* change would be psql complaining I don't
  like what you entered, versus the just-committed behavior that psql is
  indifferent and the server complains I don't like what you sent me.
 
  In any case, the point of the change is to prevent confusion in client
  programs, so if we had to patch psql to make sense, then the change would
  have been pointless in the first place.

 I assume there is no TODO here.

No, it should be fine.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] search_path vs extensions

2009-05-29 Thread Dimitri Fontaine

Hi,

Le 29 mai 09 à 02:32, Robert Haas a écrit :
On Thu, May 28, 2009 at 3:32 PM, Andrew Dunstan  
and...@dunslane.net wrote:

Tom Lane wrote:

Robert Haas robertmh...@gmail.com writes:
It also seems to me that we're getting seriously sidetracked from  
the

dependency-tracking part of this project which seems to me to be a
much deeper and more fundamental issue.

I thought that part was a pretty simple problem, actually.  Have an
object representing the module, make sure each component object in  
the

module has an AUTO dependency link to that object.  Where's the
difficulty?


I think it's a simple problem too...  except for the not-so-small
detail of who is going to implement it.


I kind of said I'd do it, but it's going to be my first attempt to  
patch backend code. Fortunately, Tom Dunstan did already a big chunk  
of the work, but without user design approval first. I'm trying to  
have user design voted, then I hope to reuse as much as Tom Dunstan's  
work as possible :)

And Stephen Frost proposed to be helping too.

Maybe we could also open the road for a new way of contributing: have  
someone discuss the user design on hackers until a consensus raises,  
then have a developer happily code it without having to care about the  
politics of it. :)


Well, yes. Honestly, I think all this search_path stuff is a red  
herring. We
are once again in danger of over-designing this instead of doing  
the simple

thing first (namely, don't worry about the search_path).


Right.



My feeling is that current way of using extensions is tightly coupled  
with search_path, and I'm not sure providing a SQL visible extension  
object with dependancies will make this problem any easier.
Now I agree that we certainly can complete the extension support  
project without having a single thought about schemas and search_path,  
this problem can be postponed. I figured out it could guide some  
extension user API design, but let's pretend all of this is orthogonal.


Still, extension users will want to have a default schema where the  
extension is installed, and a way to override it, right?


Moving to extension user design per-se on Tuesday, trying to avoid  
schema discussions while doing so.


Regards,
--
dim


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] bytea vs. pg_dump

2009-05-29 Thread Peter Eisentraut
On Friday 29 May 2009 04:26:35 Bruce Momjian wrote:
 Added to TODO:
   |Improve bytea COPY format

   * http://archives.postgresql.org/pgsql-hackers/2009-05/msg00192.php

Btw., I have started to write some code for that.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Python 3.0 does not work with PL/Python

2009-05-29 Thread Peter Eisentraut
On Friday 29 May 2009 03:53:17 Alvaro Herrera wrote:
 Bruce Momjian escribió:
  Peter Eisentraut wrote:
   On Monday 06 April 2009 02:10:59 James Pye wrote:
Any thoughts on the acceptability of a complete rewrite for Python 3?
  
   http://www.joelonsoftware.com/articles/fog69.html
 
  You usually have to rewrite when you have not done refactoring as part
  of development;  PGDG does refactoring regularly.

 Except that plpython stagnates, save for minor hacks here and there.

But that doesn't mean that there is anything wrong with it.  Of course there 
is, but those are isolated problems that can be fixed when desired.  For 
example, it might just be that those who use it don't have use of INOUT 
parameters or table returns.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Python 3.0 does not work with PL/Python

2009-05-29 Thread Peter Eisentraut
On Friday 29 May 2009 04:06:14 Andrew Dunstan wrote:
 Otherwise, I'm not too keen simply to throw Python 2.x overboard until
 it's no longer common on platforms people are likely to want to install
 Postgres on, if that's what's implied by the original question.

My guess is that we will need to keep around a Python 2.x version for at least 
another three years, meaning two or three major PostgreSQL releases.

That also means that maintaining a separate, parallel code base for a Python 3 
variant can only be acceptable if it gives major advantages.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] search_path vs extensions

2009-05-29 Thread Peter Eisentraut
On Thursday 28 May 2009 02:57:00 Josh Berkus wrote:
 Personally, if we're tracking stuff through special dependancies which
 pg_dump will be aware of anyway, I don't see why extension objects
 should go into a special schema.

But they clearly have to go into *some* schema, and it would add some clarity 
to the world if we made a recommendation which one that is.  Which is what 
some of the subproposals really come down to.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] search_path vs extensions

2009-05-29 Thread Peter Eisentraut
On Thursday 28 May 2009 15:24:21 Stephen Frost wrote:
 I'm not real happy with it either.  Sure, we can track module
 dependencies seperately, but if we go down this route then we have to
 come up with some concept of an extension namespace that different
 extension use and prefix their functions/tables/etc with to avoid
 overlap with each other.  Gee, doesn't that sound familiar

I think what this comes down to is that you need nested schemas and a global 
namespace rule.  Then you can install things into 
pg_extensions.postgis.submodule.special_type, etc.  Makes sense on paper.

Note, however, that historically all the schemes that advocated the use of 
something like /usr/local/$packagename/ for each package separately have 
failed.  And this is in spite of the fact that search path handling and 
managing facilities for file systems are somewhat more powerful than 
PostgreSQL's schema search path handling.

So unless we have any new insights in this problem that OS developers haven't 
dealt with over the last few decades, I would feel more comfortable with an 
all-in-one directory/schema approach, accompanied by a package management 
system.

(One such new insight might be the Python/Java way of deeply nested package 
naming systems where you have to manually pick out and import the pieces that 
you want.  But that might significantly change the whole schema search path 
and name resolution system.)

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] search_path vs extensions

2009-05-29 Thread Peter Eisentraut
On Thursday 28 May 2009 21:38:29 Tom Lane wrote:
 Greg Stark st...@enterprisedb.com writes:
  I don't understand what storing them in different namespaces and then
  putting them all in your search_path accomplishes. You end up with the
  same mishmash of things in your namespace.

 +1 ... naming conflicts between different extensions are going to be a
 problem for people no matter what.  Sticking them in different schemas
 doesn't really fix anything, it just means that you'll hit the problems
 later instead of sooner.

Yeah, to reiterate what I posted elsewhere, perhaps it'd be a good idea to 
give up on the search path idea altogether and think more in terms of an 
import facility like Python, Java, and sometimes Perl have.

In practice, I find a search path is just a tool to cause you to find the 
wrong stuff at the wrong time, and it continues to be a cause of confusion and 
security issues both in PostgreSQL and in Unix operating systems to this day.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PostgreSQL Developer meeting minutes up

2009-05-29 Thread Peter Eisentraut
On Thursday 28 May 2009 20:03:38 Stephen Frost wrote:
 * Tom Lane (t...@sss.pgh.pa.us) wrote:
  Right.  Shall we try to spec out exactly what our conversion
  requirements are?  Here's a shot:

 [...]

  Comments?  Other considerations?

 Certainly sounds reasonable to me.  I'd be really suprised if that's
 really all that hard to accomplish.  I'd be happy to help with some
 testing too if we feel that the current git repo is in reasonable shape
 to do that testing against (or someone has another).

Sounds like writing a comprehensive test suite against Tom's spec would be the 
first step.  And then this test suite can be run against various conversion 
tools and configurations thereof.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] search_path vs extensions

2009-05-29 Thread Dimitri Fontaine

Hi,

Le 29 mai 09 à 12:18, Peter Eisentraut a écrit :
I think what this comes down to is that you need nested schemas and  
a global

namespace rule.  Then you can install things into
pg_extensions.postgis.submodule.special_type, etc.  Makes sense on  
paper.

[...]
(One such new insight might be the Python/Java way of deeply nested  
package
naming systems where you have to manually pick out and import the  
pieces that
you want.  But that might significantly change the whole schema  
search path

and name resolution system.)


We'd still need search_path in there, as Python's still using a path.  
With 'default' search_path you'd have to qualify your type as  
pg_extensions.postgis.submodule.special_type, with pg_extensions in  
search_path the following notation would find it too:  
postgis.submodule.special_type.
And if you have pg_extensions.postgis.submodule in the search_path,  
then you can use special_type without having to (nest-) schema qualify  
it.


I like this idea, which sounds compatible with what we already have  
now (meaning current semantics of search_path still apply).


Regards,
--
dim

PS: we still have to provide users with easy tools to (dynamically)  
manage search_path, don't we?
(I prefer not to start the search_path management tool ideas right  
here).


PPS: http://www.gobolinux.org/ doesn't look like it's failing. (yet?)

In GoboLinux you don't need a package database because the filesystem  
is the database: each program resides in its own directory, such as / 
Programs/Xorg-Lib/7.4 and /Programs/KDE-Libs/4.2.0.



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Python 3.0 does not work with PL/Python

2009-05-29 Thread Sam Mason
On Thu, May 28, 2009 at 09:06:14PM -0400, Andrew Dunstan wrote:
 Does Python 3 have some sort of usable sandbox that would mean we could 
 have a trusted plpython?

Not sure if people are aware of object-capability based approaches to
security.  A guy called Tav has come up with some code that constrains
python (i.e. you could build a sandbox out of it) and punch holes in
it where needed (i.e. you want to be able to execute queries in the
database but otherwise not, say, touch the filesystem).  The most recent
description I've found is:

  http://tav.espians.com/paving-the-way-to-securing-the-python-interpreter.html

-- 
  Sam  http://samason.me.uk/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Python 3.0 does not work with PL/Python

2009-05-29 Thread David Blewett
On Thu, May 28, 2009 at 9:06 PM, Andrew Dunstan and...@dunslane.net wrote:

 Does Python 3 have some sort of usable sandbox that would mean we could
 have a trusted plpython?


I brought this up last August [1]. Zope has a working sandbox that they
include in their distribution.

David Blewett

1.
http://archives.postgresql.org/message-id/9d1f8d830808041008v50104fd8p6181d5ddce85...@mail.gmail.com


Re: [HACKERS] libpq is not thread safe

2009-05-29 Thread Zdenek Kotala

Bruce Momjian píše v čt 28. 05. 2009 v 17:20 -0400:

  
  Done, patch attached and applied.
 
 I went with a warning because it seemed most appropriate, but it looks
 very large:
 
   http://developer.postgresql.org/pgdocs/postgres/libpq-connect.html
 
 Should it be a notice?

I prefer warning. It is important message and beginners usually don't
know it.

Zdenek   


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Compiler warning cleanup - unitilized const variables, pointer type mismatch

2009-05-29 Thread Zdenek Kotala

Tom Lane píše v čt 28. 05. 2009 v 11:42 -0400:
 Zdenek Kotala zdenek.kot...@sun.com writes:
  I attached another cleanup patch which fixes following warnings reported
  by Sun Studio:
 
 I'm not too impressed with any of these.  The proposed added
 initializers just increase future maintenance effort without solving
 any real problem (since the variables are required by C standard to
 initialize to zero). 

Agree.

  The proposed signature change on psql_completion
 is going to replace a warning on your system with outright failures on
 other people's.

I check readline and definition is still same at least from 5.0 version.
I'm not still understand why it should failure on other systems. I
looked on revision 1.30 of the file and there is only readline-4.2
support mentioned. Is readline 4.2 the problem?

Zdenek


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Compiler warning cleanup - unitilized const variables, pointer type mismatch

2009-05-29 Thread Zdenek Kotala

Tom Lane píše v čt 28. 05. 2009 v 11:57 -0400:
 ).
 
 AFAICS, Sun's compiler is just too stupid and shouldn't be emitting
 this warning.  Perhaps the right response is to file a bug report
 against the compiler.

I checked it and it is already know bug. It is new lint style check in
Sun Studio 12. Unfortunately, problem is that current workflow does not
allow to detect if code is dead or not in the verification phase. Next
sun studio release could fix it.

Zdenek


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_migrator and an 8.3-compatible tsvector data type

2009-05-29 Thread Stefan Kaltenbrunner

Alvaro Herrera wrote:

Bruce Momjian wrote:

Alvaro Herrera wrote:



There are so many caveats on pg_migrator (and things that need to be
done after the migration is complete) that one starts to wonder if
people is not better off just using parallel pg_restore.  From Stefan's
reported timings I'm not sure that pg_migrator is that much of a benefit
in the first place ... unless copy mode can be made much faster.  (On
link mode it is so much faster that it's worth it, but then you don't
have an escape hatch).

That is accurate.  I doubt copy mode speed can be improved.


Why not?  Right now it's single-threaded.  Would it be faster if it ran
several copies in parallel?


I guess it would be much faster on powerful hardware - we also have to 
consider that copy mode now is a no-op really.
If it had to do any actual page conversation too it seems entirely 
possible that a parallel restore might be even faster that a single 
threaded pg_migrator in copy mode.



Stefan

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_migrator and an 8.3-compatible tsvector data type

2009-05-29 Thread Zdenek Kotala

Bruce Momjian píše v čt 28. 05. 2009 v 17:42 -0400:
 Josh Berkus wrote:
  On 5/28/09 2:30 PM, Bruce Momjian wrote:
   Because no one has responded, I am going to prevent pg_migrator from
   working with a cluster that uses tsvector.  I realize this limits
   pg_migrator's usefulness, but I have to move on.
  
  I don't know how to fix the issue, certainly.
  
  Why was there a format change to tsvector anyway?  Was there an 
  important reason for this?
 
 The ordering of the lexems was changed:

The biggest problem is dictionary change. I'm not sure if it happened
but IIRC Teodor mentioned it in Ottawa. If it happened It hits down
tsvector compatibility at all.  

Zdenek


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PostgreSQL Developer meeting minutes up

2009-05-29 Thread Robert Haas
On Fri, May 29, 2009 at 2:41 AM, Markus Wanner mar...@bluegap.ch wrot Hi,
 Quoting Robert Haas robertmh...@gmail.com:
 Why is this harder than I think it is?

 One of the simplest possible example is something like:

Thanks for the explanation, I understand it better now.  I'm still
dismayed, but at least I know why I'm dismayed.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Warnings in compile

2009-05-29 Thread Michael Meskes
On Mon, May 25, 2009 at 12:10:49PM -0400, Tom Lane wrote:
 [ thinks for a bit... ]  What might be both safe and warning-free
 is to code an explicit empty statement, viz macro body as
 
   if (1) { ... } else ((void) 0)

I just tried this and yes, it quietens gcc and probably is at least as save as
the old version. Therefore I just commit this small change.

Michael
-- 
Michael Meskes
Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org)
Michael at BorussiaFan dot De, Meskes at (Debian|Postgresql) dot Org
ICQ: 179140304, AIM/Yahoo: michaelmeskes, Jabber: mes...@jabber.org
Go VfL Borussia! Go SF 49ers! Use Debian GNU/Linux! Use PostgreSQL!

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] search_path vs extensions

2009-05-29 Thread Andrew Dunstan



Peter Eisentraut wrote:

On Thursday 28 May 2009 02:57:00 Josh Berkus wrote:
  

Personally, if we're tracking stuff through special dependancies which
pg_dump will be aware of anyway, I don't see why extension objects
should go into a special schema.



But they clearly have to go into *some* schema, and it would add some clarity 
to the world if we made a recommendation which one that is.  Which is what 
some of the subproposals really come down to.
  


Even that's going to be hard, frankly. The usage pattern is likely to be 
too varied for any one-size-fits-all recommendation.


Proposals to allow a choice of schema at install time sound nice but in 
practice they are a recipe for massive headaches and maintenance 
nightmares, I think. It means no extension author will be able to 
hardcode the schema name in any view, function etc. Yuck.


I think almost all these difficulties could be overcome if we had some 
sort of aliasing support, so that arbitrary objects in schema a could be 
aliased in schema b.  If that were in place, best practice would 
undoubtedly be for each module to install in its own schema, and for the 
DBA to alias what is appropriate to their usage scenario. But unless 
someone wants to tackle that  I think we should leave schema management 
entirely alone, and leave it up to the extension author / DBA between them.


cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_migrator and an 8.3-compatible tsvector data type

2009-05-29 Thread Tom Lane
Alvaro Herrera alvhe...@commandprompt.com writes:
 Bruce Momjian wrote:
 Alvaro Herrera wrote:
 Why not?  Right now it's single-threaded.  Would it be faster if it ran
 several copies in parallel?
 
 Sure, but that assumes you have parallel I/O channels;  I assume right
 now it is I/O limited.

 But so does parallel pg_restore, no?

The point of parallel pg_restore is that COPY is frequently CPU-bound to
some extent, and so you can put multiple CPUs to work by parallelizing.
I find it much less probable that multiple cp operations can be
parallelized, unless the DB is spread across multiple tablespaces and
the code is smart enough to interleave by tablespace.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Python 3.0 does not work with PL/Python

2009-05-29 Thread Andrew Dunstan



David Blewett wrote:
On Thu, May 28, 2009 at 9:06 PM, Andrew Dunstan and...@dunslane.net 
mailto:and...@dunslane.net wrote:


Does Python 3 have some sort of usable sandbox that would mean we
could have a trusted plpython?


I brought this up last August [1]. Zope has a working sandbox that 
they include in their distribution.



1. 
http://archives.postgresql.org/message-id/9d1f8d830808041008v50104fd8p6181d5ddce85...@mail.gmail.com 



How many python installations have this gadget? If the answer is not 
many then it's not much good to us, unless someone wants to create 
PL/zope-sandbox. Really, something like this should be part of a 
standard python installation.


cheers

andrew



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] search_path vs extensions

2009-05-29 Thread Dimitri Fontaine

Hi,

Le 29 mai 09 à 16:11, Andrew Dunstan a écrit :
I think almost all these difficulties could be overcome if we had  
some sort of aliasing support, so that arbitrary objects in schema a  
could be aliased in schema b.  If that were in place, best practice  
would undoubtedly be for each module to install in its own schema,  
and for the DBA to alias what is appropriate to their usage scenario.


This coupled with Peter's idea of nested namespace seems a killer  
feature for me. That means the pg_extension namespace isn't a flat  
mess but a organized one, with private (internal) objects deeper into  
the hierarchy. It makes it easy to have a top-level schema per  
extension without rendering search_path impracticable.


Then you slice atop of if aliasing so that you can refer to  
pg_extension.a.part1.obj_x from say utils.a.x or even utils.x, DBA  
choice. Or simply alias schema pg_extension.a.part1 as a. Whatever.


It seems to offer the best of both worlds: we know where extensions  
are meant to end up getting installed (private (nested) sub schema(s)  
in pg_extension), and DBA has the option to mask this implementation  
detail by aliasing it all wherever needed, with a choice of granularity.


But unless someone wants to tackle that  I think we should leave  
schema management entirely alone, and leave it up to the extension  
author / DBA between them.


Well, we all know that proper extension/module/plugin packaging  
system, flexible for both authors and users, reliable and built for  
next 20 years... it won't be made in a breathe. If some expected it to  
be easy, I think it's time to revise the plans. Want to have it all in  
8.5? Still time to join ;)


If the way to have good extension packaging support in PostgreSQL  
means we need aliasing first (which I think is called synonym in the  
standard), let's work on this as a first step patch?
In theory, it's even possible to begin work on extensions without  
synonyms/alias, using the pg_extension forced place (but with nested  
namespace support) and have the alias be done in parallel or after, as  
soon as we know what we want the big picture to look like when  
finished. Incremental work, etc.


Regards,
--
dim
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PostgreSQL Developer meeting minutes up

2009-05-29 Thread Markus Wanner

Hi,

Quoting Aidan Van Dyk ai...@highrise.ca:

Ok, so seeing the interest in having a good conversion, I took a stab at
parsecvs this afternoon, probably what I consider the leading static
conversion tool.


Here are some results from a conversion with cvs2git.


It takes about 10 minutes to run my old xeon.


The conversion with cvs2git certainly took a bit longer, however, I  
don't think that matters at all. Everything below a day or two is good  
enough, IMO. What counts is the result.


The first step is running cvs2git itself:

cvs2svn Statistics:
--
Total CVS Files:  6873
Total CVS Revisions:140191
Total CVS Branches:  36057
Total CVS Tags: 457515
Total Unique Tags: 171
Total Unique Branches:  21
CVS Repos Size in KB:   377337
Total SVN Commits:   32889
First Revision Date:Tue Jul  9 08:21:07 1996
Last Revision Date: Thu May 28 22:02:10 2009

(number of files matches pretty well with my own algorithm, however,  
total svn commits is a bit lower, compared to the ~ 40'000 blobs I got).


The output of cvs2git can then be imported with git fast-import:

git-fast-import statistics:
-
Alloc'd objects: 35
Total objects:   349405 ( 19563 duplicates  )
  blobs  :   132672 (  3255 duplicates 119032 deltas)
  trees  :   183967 ( 16308 duplicates 165582 deltas)
  commits:32766 ( 0 duplicates  0 deltas)
  tags   :0 ( 0 duplicates  0 deltas)
Total branches: 194 (   664 loads )
  marks: 1073741824 (168693 unique)
  atoms:   5280
Memory total: 16532 KiB
   pools:  2860 KiB
 objects: 13671 KiB
-
pack_report: getpagesize()=   4096
pack_report: core.packedGitWindowSize = 1073741824
pack_report: core.packedGitLimit  = 8589934592
pack_report: pack_used_ctr= 124414
pack_report: pack_mmap_calls  =   3674
pack_report: pack_open_windows=  1 /  1
pack_report: pack_mapped  =  199500913 /  199500913
-


The resulting repository contains the following branches. The  
unlabeled ones contain only 1-2 files and seem rather irrelevant. In a  
next try, I'd disable their creation completely, just wanted to check.


  REL2_0B
  REL6_4
  REL6_5_PATCHES
  REL7_0_PATCHES
  REL7_1_STABLE
  REL7_2_STABLE
  REL7_3_STABLE
  REL7_4_STABLE
  REL8_0_0
  REL8_0_STABLE
  REL8_1_STABLE
  REL8_2_STABLE
  REL8_3_STABLE
  Release_1_0_3
  WIN32_DEV
  ecpg_big_bison
* master
  unlabeled-1.44.2   - from src/backend/commands/tablecmds.c
  unlabeled-1.51.2   - from src/test/regress/expected/alter_table.out
  unlabeled-1.59.2   - from src/backend/executor/execTuples.c
  unlabeled-1.87.2   - from src/backend/executor/nodeAgg.c
  unlabeled-1.90.2   - from src/backend/parser/parse_target.c and
 src/backend/access/common/tupdesc.c

Comparison of the head of each branch between git and CVS (modulo CVS  
keyword expansion, which I've filtered out):


ecpg_big_bison.diff:  0 files changed
master.diff:  0 files changed
REL2_0B.diff: 0 files changed
REL6_4.diff:  0 files changed
REL6_5_PATCHES.diff:  0 files changed
REL7_0_PATCHES.diff:  0 files changed
REL7_1_STABLE.diff:   0 files changed
REL7_2_STABLE.diff:   0 files changed
REL7_3_STABLE.diff:   0 files changed
REL7_4_STABLE.diff:   0 files changed
REL8_0_0.diff:0 files changed
REL8_0_STABLE.diff:   0 files changed
REL8_1_STABLE.diff:   0 files changed
REL8_2_STABLE.diff:   0 files changed
REL8_3_STABLE.diff:   0 files changed
Release_1_0_3.diff:   0 files changed
WIN32_DEV.diff:   0 files changed

I plan to compare the tags as well and test what branch they are in,  
but so far cvs2git seems to hold its promises. I'll report back again  
within the next few days.


Regards

Markus Wanner

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PostgreSQL Developer meeting minutes up

2009-05-29 Thread Aidan Van Dyk
* Markus Wanner mar...@bluegap.ch [090529 11:06]:
 Hi,

 Comparison of the head of each branch between git and CVS (modulo CVS  
 keyword expansion, which I've filtered out):

How did you filter it out, and without the filtering out, how does it
do?

 I plan to compare the tags as well and test what branch they are in, but 
 so far cvs2git seems to hold its promises. I'll report back again within 
 the next few days.

It definitely seems to have figured out the REL8_0_0 confusing that
tripped up parsecvs.  If I'm stuck on another windows project some time
in the near future, I'll try and look into why parsecvs trips up on
those 3 files from REL8_0_0 branch ;-)

a.

-- 
Aidan Van Dyk Create like a god,
ai...@highrise.ca   command like a king,
http://www.highrise.ca/   work like a slave.


signature.asc
Description: Digital signature


Re: [HACKERS] search_path vs extensions

2009-05-29 Thread Tom Lane
Dimitri Fontaine dfonta...@hi-media.com writes:
 Le 29 mai 09 à 16:11, Andrew Dunstan a écrit :
 I think almost all these difficulties could be overcome if we had  
 some sort of aliasing support, so that arbitrary objects in schema a  
 could be aliased in schema b.  If that were in place, best practice  
 would undoubtedly be for each module to install in its own schema,  
 and for the DBA to alias what is appropriate to their usage scenario.

 This coupled with Peter's idea of nested namespace seems a killer  
 feature for me.

What it sounds like to me is an amazingly complicated gadget with
absolutely no precedent of successful use anywhere.  We'll spend a year
fooling with the details of this and be no closer to actually solving
the problem at hand, namely getting a simple workable extension
packaging facility.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PostgreSQL Developer meeting minutes up

2009-05-29 Thread Markus Wanner

Hi,

Quoting Aidan Van Dyk ai...@highrise.ca:

* Markus Wanner mar...@bluegap.ch [090529 11:06]:

Comparison of the head of each branch between git and CVS (modulo CVS
keyword expansion, which I've filtered out):


How did you filter it out


With perl some regexes.


and without the filtering out, how does it do?


Uh.. why is that of interest? With content hashing, these keywords do  
more harm than good.


I'd have to check again, but there certainly are differences here and there.

Regards

Markus Wanner


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] plperl error format vs plpgsql error format vs pgTAP

2009-05-29 Thread Kevin Field
On May 28, 5:19 pm, da...@kineticode.com (David E. Wheeler) wrote:
 On May 28, 2009, at 12:53 PM, Kevin Field wrote:

  Can pgTap check for a regex instead if just a string?

  That's the other option, if the pgTAP author is willing...if the
  SQLSTATE thing doesn't work out I guess we'll have to go down that
  road.

 Patches welcome. ;-)

http://github.com/theory/pgtap/tree/master/

 I'm getting a new version ready to release as I type.

Thanks, great to know.  :)  Although, I do think changing plperl is
the more proper option, so I'm going to try there first...

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_migrator and an 8.3-compatible tsvector data type

2009-05-29 Thread Tom Lane
Zdenek Kotala zdenek.kot...@sun.com writes:
 The biggest problem is dictionary change. I'm not sure if it happened
 but IIRC Teodor mentioned it in Ottawa. If it happened It hits down
 tsvector compatibility at all.  

No more than changing dictionary behavior in an existing installation.
What was stated when the issue came up during 8.3 development is that
you don't normally need to worry about small changes in dictionary
behavior because overall text search behavior will still be close
enough.  (I seem to recall that I'd complained that any change in
dictionary behavior would invalidate indexes based on the dictionary,
and this was the answer.)

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] search_path vs extensions

2009-05-29 Thread Andrew Dunstan



Tom Lane wrote:

Dimitri Fontaine dfonta...@hi-media.com writes:
  

Le 29 mai 09 à 16:11, Andrew Dunstan a écrit :

I think almost all these difficulties could be overcome if we had  
some sort of aliasing support, so that arbitrary objects in schema a  
could be aliased in schema b.  If that were in place, best practice  
would undoubtedly be for each module to install in its own schema,  
and for the DBA to alias what is appropriate to their usage scenario.
  


  
This coupled with Peter's idea of nested namespace seems a killer  
feature for me.



What it sounds like to me is an amazingly complicated gadget with
absolutely no precedent of successful use anywhere.  We'll spend a year
fooling with the details of this and be no closer to actually solving
the problem at hand, namely getting a simple workable extension
packaging facility.
  


Well, the part about no precedent is not true. See 
http://publib.boulder.ibm.com/infocenter/db2luw/v8/index.jsp?topic=/com.ibm.db2.udb.doc/admin/r910.htm 
for example. I didn't dream up the idea out of thin air ;-) (I pretty 
much started my computing career over 20 years ago working on DB2).


However, the part about it being complex is true.

And that is why I agree completely that we should not hold up the 
extension work waiting for it.


cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PostgreSQL Developer meeting minutes up

2009-05-29 Thread Aidan Van Dyk
* Markus Wanner mar...@bluegap.ch [090529 11:18]:
 Hi,

 Quoting Aidan Van Dyk ai...@highrise.ca:
 * Markus Wanner mar...@bluegap.ch [090529 11:06]:
 Comparison of the head of each branch between git and CVS (modulo CVS
 keyword expansion, which I've filtered out):

 How did you filter it out

 With perl some regexes.

 and without the filtering out, how does it do?

 Uh.. why is that of interest? With content hashing, these keywords do  
 more harm than good.

Yes, but the point is you want an exact replica of CVS right?  You're
git repo should have $PostgreSQL$ and the cvs export/checkout (you do
use -kk right) should also have $PostgreSQL$.

The 3 parsecvs errors were that it *didn't* recognoze the strange
$PostgreSQL ... Exp $ expansion that cvs did.

But it's important, because on *some* files you *do* want expanded
keywords (like the $OpenBSD ... Exp $.  One of the reasons pg CVS went
to the $PostgreSQL$ keyword (I'm guessing) was so they could explictly
de-couple them from other keywords that they didn't want munging on.

So, I wouldn't consider any conversion good unless it had all these:
parsecvs-master:contrib/pgcrypto/crypt-des.c: * $FreeBSD: 
src/secure/lib/libcrypt/crypt-des.c,v 1.12 1999/09/20 12:39:20 markm Exp $
parsecvs-master:contrib/pgcrypto/crypt-md5.c: * $FreeBSD: 
src/lib/libcrypt/crypt-md5.c,v 1.5 1999/12/17 20:21:45 peter Exp $
parsecvs-master:contrib/pgcrypto/md5.c:/*  $KAME: md5.c,v 1.3 
2000/02/22 14:01:17 itojun Exp $ */
parsecvs-master:contrib/pgcrypto/md5.h:/*  $KAME: md5.h,v 1.3 
2000/02/22 14:01:18 itojun Exp $ */
parsecvs-master:contrib/pgcrypto/rijndael.c:/*  $OpenBSD: rijndael.c,v 
1.6 2000/12/09 18:51:34 markus Exp $ */
parsecvs-master:contrib/pgcrypto/rijndael.h: *  $OpenBSD: rijndael.h,v 
1.3 2001/05/09 23:01:32 markus Exp $ */
parsecvs-master:contrib/pgcrypto/sha1.c:/* $KAME: sha1.c,v 1.3 
2000/02/22 14:01:18 itojun Exp $*/
parsecvs-master:contrib/pgcrypto/sha1.h:/* $KAME: sha1.h,v 1.4 
2000/02/22 14:01:18 itojun Exp $*/
parsecvs-master:contrib/pgcrypto/sha2.c:/*  $OpenBSD: sha2.c,v 1.6 
2004/05/03 02:57:36 millert Exp $*/
parsecvs-master:contrib/pgcrypto/sha2.h:/*  $OpenBSD: sha2.h,v 1.2 
2004/04/28 23:11:57 millert Exp $*/
parsecvs-master:src/backend/port/darwin/system.c: * $FreeBSD: 
src/lib/libc/stdlib/system.c,v 1.6 2000/03/16 02:14:41 jasone Exp $
parsecvs-master:src/port/crypt.c:/* $NetBSD: crypt.c,v 1.18 
2001/03/01 14:37:35 wiz Exp $   */
parsecvs-master:src/port/crypt.c:__RCSID($NetBSD: crypt.c,v 1.18 
2001/03/01 14:37:35 wiz Exp $);
parsecvs-master:src/port/qsort.c:/* $NetBSD: qsort.c,v 1.13 
2003/08/07 16:43:42 agc Exp $   */
parsecvs-master:src/port/qsort_arg.c:/* $NetBSD: qsort.c,v 1.13 
2003/08/07 16:43:42 agc Exp $   */
parsecvs-master:src/port/strlcat.c: *   $OpenBSD: strlcat.c,v 1.13 
2005/08/08 08:05:37 espie Exp $  */
parsecvs-master:src/port/strlcpy.c:/*   $OpenBSD: strlcpy.c,v 1.11 
2006/05/05 15:27:38 millert Exp $*/

As well as stuff like:
parsecvs-master:src/backend/access/index/genam.c: *   $PostgreSQL$
parsecvs-master:src/backend/access/index/indexam.c: * $PostgreSQL$
parsecvs-master:src/backend/access/nbtree/Makefile:#$PostgreSQL$
parsecvs-master:src/backend/access/nbtree/README:$PostgreSQL$
parsecvs-master:src/backend/access/nbtree/nbtcompare.c: * 
$PostgreSQL$
parsecvs-master:src/backend/access/nbtree/nbtinsert.c: *  
$PostgreSQL$
parsecvs-master:src/backend/access/nbtree/nbtpage.c: *$PostgreSQL$
parsecvs-master:src/backend/access/nbtree/nbtree.c: * $PostgreSQL$
parsecvs-master:src/backend/access/nbtree/nbtsearch.c: *  
$PostgreSQL$

Basically, identical what to a cvs export/checkout/update gives you with
a -kk.

But I'm picky ;-)

a.

-- 
Aidan Van Dyk Create like a god,
ai...@highrise.ca   command like a king,
http://www.highrise.ca/   work like a slave.


signature.asc
Description: Digital signature


Re: [HACKERS] plperl error format vs plpgsql error format vs pgTAP

2009-05-29 Thread Robert Haas
On Fri, May 29, 2009 at 7:59 AM, Kevin Field kevinjamesfi...@gmail.com wrote:
 On May 28, 5:19 pm, da...@kineticode.com (David E. Wheeler) wrote:
 On May 28, 2009, at 12:53 PM, Kevin Field wrote:

  Can pgTap check for a regex instead if just a string?

  That's the other option, if the pgTAP author is willing...if the
  SQLSTATE thing doesn't work out I guess we'll have to go down that
  road.

 Patches welcome. ;-)

    http://github.com/theory/pgtap/tree/master/

 I'm getting a new version ready to release as I type.

 Thanks, great to know.  :)  Although, I do think changing plperl is
 the more proper option, so I'm going to try there first...

It seems to me that removing line numbers from PL/perl error messages
is not a good solution to anything.  Line numbers are extremely useful
for debugging purposes, and getting rid of them because one particular
testing framework doesn't know how to use regular expressions is
solving the wrong problem.

I'm also a bit confused because your original post had a line number
in the PL/pgsql output, too, just formatted slightly differently.  Why
doesn't that one cause a problem?

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] plperl error format vs plpgsql error format vs pgTAP

2009-05-29 Thread Andrew Dunstan



Kevin Field wrote:

On May 28, 5:19 pm, da...@kineticode.com (David E. Wheeler) wrote:
  

On May 28, 2009, at 12:53 PM, Kevin Field wrote:



Can pgTap check for a regex instead if just a string?


That's the other option, if the pgTAP author is willing...if the
SQLSTATE thing doesn't work out I guess we'll have to go down that
road.
  

Patches welcome. ;-)

   http://github.com/theory/pgtap/tree/master/

I'm getting a new version ready to release as I type.



Thanks, great to know.  :)  Although, I do think changing plperl is
the more proper option, so I'm going to try there first...

  


As I pointed out before, these line numbers are put there by the perl 
engine, not by the plperl glue code.


If you want to make plperl strip out the line number from every error 
message the perl engine produces, I am going to object. It might make 
things easier for pgTap but it will make life much harder in other ways.


cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_migrator and an 8.3-compatible tsvector data type

2009-05-29 Thread Zdenek Kotala

Tom Lane píše v pá 29. 05. 2009 v 11:28 -0400:
 Zdenek Kotala zdenek.kot...@sun.com writes:
  The biggest problem is dictionary change. I'm not sure if it happened
  but IIRC Teodor mentioned it in Ottawa. If it happened It hits down
  tsvector compatibility at all.  
 
 No more than changing dictionary behavior in an existing installation.
 What was stated when the issue came up during 8.3 development is that
 you don't normally need to worry about small changes in dictionary
 behavior because overall text search behavior will still be close
 enough.  (I seem to recall that I'd complained that any change in
 dictionary behavior would invalidate indexes based on the dictionary,
 and this was the answer.)

It seems to be OK then. However, pg_migrator should inform about it.

Zdenek


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Python 3.0 does not work with PL/Python

2009-05-29 Thread Tom Lane
Andrew Dunstan and...@dunslane.net writes:
 David Blewett wrote:
 I brought this up last August [1]. Zope has a working sandbox that 
 they include in their distribution.
 http://archives.postgresql.org/message-id/9d1f8d830808041008v50104fd8p6181d5ddce85...@mail.gmail.com
  

 How many python installations have this gadget? If the answer is not 
 many then it's not much good to us, unless someone wants to create 
 PL/zope-sandbox. Really, something like this should be part of a 
 standard python installation.

Yeah.  For one thing, how much trust can you put in a security mechanism
that hasn't been accepted upstream?  Given the history of this sort of
thing in Python, I'm not prepared to just assume that Zope got it right.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PostgreSQL Developer meeting minutes up

2009-05-29 Thread Alvaro Herrera
Aidan Van Dyk wrote:

 Yes, but the point is you want an exact replica of CVS right?  You're
 git repo should have $PostgreSQL$ and the cvs export/checkout (you do
 use -kk right) should also have $PostgreSQL$.
 
 The 3 parsecvs errors were that it *didn't* recognoze the strange
 $PostgreSQL ... Exp $ expansion that cvs did.

Huh, no -- I agree that $OpenBSD$ etc should remain (we don't munge them
anyway), but $PostgreSQL$, $Id$, $Revision$ etc tags are best gone
because, as Markus says, their expansion interferes with content hashing.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] plperl error format vs plpgsql error format vs pgTAP

2009-05-29 Thread David E. Wheeler

On May 29, 2009, at 4:59 AM, Kevin Field wrote:


  http://github.com/theory/pgtap/tree/master/

I'm getting a new version ready to release as I type.


Thanks, great to know.  :)  Although, I do think changing plperl is
the more proper option, so I'm going to try there first...


I added `throws_like()` to the To Do list, so if anyone wants to do  
that…fork and clone!


Best,

David
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] search_path vs extensions

2009-05-29 Thread Dimitri Fontaine

Le 29 mai 09 à 17:12, Tom Lane a écrit :

What it sounds like to me is an amazingly complicated gadget with
absolutely no precedent of successful use anywhere.  We'll spend a  
year

fooling with the details of this and be no closer to actually solving
the problem at hand, namely getting a simple workable extension
packaging facility.



What it sounds like to me is a way to all agree what the finished  
feature would look like, allowing us to commit incremental patches.  
Coarse(?) grained plan:


 A. nested namespaces

 B. packaging facility, each module have its own schema in pg_extension
sub schemas in pg_extension.myext are possible and welcomed to  
organize things


 C. synonyms, allowing DBA to organise the visibility as they see fit,
and to overcome search_path limitations

The ordering of those points would still need to be talked about, I'd  
see A as necessary to get through before B implementation begins, but  
at least this would solve the search_path and default schema  
destination points while designing the extension packaging facility.  
Then when B is done, or parallel to development of B, we can have C,  
so that everyone is happy: it works and is not a PITA to maintain.


All in all, agreeing about those steps now would open up the real  
matters of extension packaging to begin.


Regards,
--
dim

PS: I realize that my line of thoughts is tied to imagining that the  
more visible (and complex, as in agreeing on bikesched color) part of  
the packaging facility user design is its relationship with schemas  
and search_path. Even the SQL syntax of creating (altering/droping/ 
granting) the new SQL object seems like it'll be easier.
That done, the rest of it is mainly self-constrained, I don't foresee  
another such controversial part related to the existing system...


Now, that's in the archive and I'll soon really look like a fool :)


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Compiler warning cleanup - unitilized const variables, pointer type mismatch

2009-05-29 Thread Tom Lane
Zdenek Kotala zdenek.kot...@sun.com writes:
 Tom Lane píše v čt 28. 05. 2009 v 11:42 -0400:
 The proposed signature change on psql_completion
 is going to replace a warning on your system with outright failures on
 other people's.

 I check readline and definition is still same at least from 5.0 version.
 I'm not still understand why it should failure on other systems. I
 looked on revision 1.30 of the file and there is only readline-4.2
 support mentioned. Is readline 4.2 the problem?

[ pokes around... ]  Actually I think the reason it's like this is that
very old versions of readline have completion_matches() rather than
rl_completion_matches(), and the former is declared to take char * not
const char *.  So it still would compile, you'd just get cast-away-const
warnings.  Which is probably okay considering that hardly anyone is
likely to still be using such old readline libs anyway.

We could try experimenting with that after we branch for 8.5.  I'm not
eager to fool with it in late beta, as we'd be invalidating any port
testing already done.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] search_path vs extensions

2009-05-29 Thread David E. Wheeler

On May 29, 2009, at 3:24 AM, Peter Eisentraut wrote:

Yeah, to reiterate what I posted elsewhere, perhaps it'd be a good  
idea to
give up on the search path idea altogether and think more in terms  
of an

import facility like Python, Java, and sometimes Perl have.


+1

Actually, Perl's is usually a file path. `use Foo::Bar::Baz;` triggers  
a search for Foo/Bar/Baz.pm unless Foo::Bar::Baz is already loaded.


Best,

David

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] search_path vs extensions

2009-05-29 Thread David E. Wheeler

On May 29, 2009, at 3:38 AM, Dimitri Fontaine wrote:

PS: we still have to provide users with easy tools to (dynamically)  
manage search_path, don't we?
(I prefer not to start the search_path management tool ideas right  
here).


Yes, we do, and that's what at least half this thread is about.  
Whether or not such tools are put to use for extensions support is a  
separate issue, but both need addressing, I think.


Best,

Davdi

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] plperl error format vs plpgsql error format vs pgTAP

2009-05-29 Thread Kevin Field
On May 29, 11:35 am, robertmh...@gmail.com (Robert Haas) wrote:
 On Fri, May 29, 2009 at 7:59 AM, Kevin Field kevinjamesfi...@gmail.com 
 wrote:
  On May 28, 5:19 pm, da...@kineticode.com (David E. Wheeler) wrote:
  On May 28, 2009, at 12:53 PM, Kevin Field wrote:

   Can pgTap check for a regex instead if just a string?

   That's the other option, if the pgTAP author is willing...if the
   SQLSTATE thing doesn't work out I guess we'll have to go down that
   road.

  Patches welcome. ;-)

 http://github.com/theory/pgtap/tree/master/

  I'm getting a new version ready to release as I type.

  Thanks, great to know.  :)  Although, I do think changing plperl is
  the more proper option, so I'm going to try there first...

 It seems to me that removing line numbers from PL/perl error messages
 is not a good solution to anything.  Line numbers are extremely useful
 for debugging purposes, and getting rid of them because one particular
 testing framework doesn't know how to use regular expressions is
 solving the wrong problem.

You're right, but that's not what I'm proposing...

 I'm also a bit confused because your original post had a line number
 in the PL/pgsql output, too, just formatted slightly differently.  Why
 doesn't that one cause a problem?

The difference is, in PL/pgsql they're in the CONTEXT: line, whereas
in plperl they're in the error line.  This is inconsistent; if we fix
it, we don't need to add kludge to pgTAP.

But later in the thread the desired fix became not changing perl but
instead making a way to report error codes from plperl, which is what
I'm attempting to do with my rusty C skills soon.  plperl should have
ereport() *anyway*, as I believe Tom had insinuated.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] plperl error format vs plpgsql error format vs pgTAP

2009-05-29 Thread Kevin Field
On May 29, 11:48 am, Kevin Field kevinjamesfi...@gmail.com wrote:
 On May 29, 11:35 am, robertmh...@gmail.com (Robert Haas) wrote:



  On Fri, May 29, 2009 at 7:59 AM, Kevin Field kevinjamesfi...@gmail.com 
  wrote:
   On May 28, 5:19 pm, da...@kineticode.com (David E. Wheeler) wrote:
   On May 28, 2009, at 12:53 PM, Kevin Field wrote:

Can pgTap check for a regex instead if just a string?

That's the other option, if the pgTAP author is willing...if the
SQLSTATE thing doesn't work out I guess we'll have to go down that
road.

   Patches welcome. ;-)

  http://github.com/theory/pgtap/tree/master/

   I'm getting a new version ready to release as I type.

   Thanks, great to know.  :)  Although, I do think changing plperl is
   the more proper option, so I'm going to try there first...

  It seems to me that removing line numbers from PL/perl error messages
  is not a good solution to anything.  Line numbers are extremely useful
  for debugging purposes, and getting rid of them because one particular
  testing framework doesn't know how to use regular expressions is
  solving the wrong problem.

 You're right, but that's not what I'm proposing...

  I'm also a bit confused because your original post had a line number
  in the PL/pgsql output, too, just formatted slightly differently.  Why
  doesn't that one cause a problem?

 The difference is, in PL/pgsql they're in the CONTEXT: line, whereas
 in plperl they're in the error line.  This is inconsistent; if we fix
 it, we don't need to add kludge to pgTAP.

 But later in the thread the desired fix became not changing perl but
 instead making a way to report error codes from plperl, which is what
 I'm attempting to do with my rusty C skills soon.  plperl should have
 ereport() *anyway*, as I believe Tom had insinuated.

BTW, I noticed in exec_stmt_raise() in src/pl/plpgsql/src/pl_exec.c
that the comment still says throw it with elog() rather than ereport
() even though ereport() is used in all places but one in the
function:

default:
elog(ERROR, unrecognized raise option: %d, 
opt-opt_type);

Should this be changed to:

default:
ereport(ERROR, (errmsg_internal(unrecognized 
raise option: %d,
opt-opt_type)));

...along with the comment?

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PostgreSQL Developer meeting minutes up

2009-05-29 Thread Alvaro Herrera
Tom Lane escribió:
 Alvaro Herrera alvhe...@commandprompt.com writes:
  Tom Lane escribi�:
  What was in the back of my mind was that we'd go around and mass-remove
  $PostgreSQL$ (and any other lurking tags), but only from HEAD and only
  after the repo conversion.  Although just before it would be okay too.
 
  You mean we would remove them from CVS?  I don't think that's
  necessarily a good idea; it'd be massive changes for no good reason.
 
 Uh, how is it different from any other mass edit, such as our annual
 copyright-year updates, or pgindent runs?

Well, the other mass edits have a purpose.  This one would be only to
help the migration.

  My idea was to remove them from the repository that would be used for the
  conversion (I think that means editing the ,v files),
 
 Ick ... I'm willing to tolerate a few small manual ,v edits if we have
 to do it to make tags consistent or something like that.  I don't think
 we should be doing massive edits of that kind.

Yeah, that idea wasn't all that great after all.

 But anyway, that's not the interesting point.  The interesting point is
 what about the historical aspect of it, not whether we want to dispense
 with the tags going forward.  Should our repo conversion try to
 represent the historical states of the files including the tag strings?

Since we're going to lose them functionally after the conversion, it
doesn't seem that they serve any purpose.  After all, they will not
represent anything on the new repository.

The problem is that they are a problem for the conversion.  Are they
expanded before or after the commit?  Because the very expansion causes
the file to change identity, files being identified by the SHA1 sum of
their contents.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PostgreSQL Developer meeting minutes up

2009-05-29 Thread Aidan Van Dyk
* Alvaro Herrera alvhe...@commandprompt.com [090529 11:45]:
 Aidan Van Dyk wrote:
 
  Yes, but the point is you want an exact replica of CVS right?  You're
  git repo should have $PostgreSQL$ and the cvs export/checkout (you do
  use -kk right) should also have $PostgreSQL$.
  
  The 3 parsecvs errors were that it *didn't* recognoze the strange
  $PostgreSQL ... Exp $ expansion that cvs did.
 
 Huh, no -- I agree that $OpenBSD$ etc should remain (we don't munge them
 anyway), but $PostgreSQL$, $Id$, $Revision$ etc tags are best gone
 because, as Markus says, their expansion interferes with content hashing.

I *think* you're actually agreeing with me.  *Hiding* the diffs that
include munching of keywords is not what we want.  We want the
conversion to *not* munge keyword-like things (No, $OpenBSD$ is *not*
a keyword in the PostgreSQL CVS repository.  But $PostgreSQL$ *is*.

So we want the conversion to be identical to:
 cvs export -kk -r $tag

That will have *keywords* be unexpanded; namely these specific ones:
Author
Date
Header
Id
Locker
Log
Name
RCSfile
Revision
Source
State
PostgreSQL
but *not* keyword-like entries, like:
$ NetBSD ... Exp $
$ FreeBSD ... Exp $
$ OpenBSD ... Exp $
$ KAME ... Exp $
which are *not* CVS keywords in the PostgreSQL repository.  

i.e. Just like I said, identical to cvs checkout/export -kk.


Now, and intersting question, do you want the perfect conversion to
contain *other* keyword un-expansion possiblities that would have happened
on any commits on Nov 29/30 2003 when CVSROOT/options contained:
+tagexpand=iPostgreSQL
If you had checked out something on that day, even with a -kk, $Log$
would have been expanded, because for that day, $Log$ was *not* an
eligable keyword on the PostgreSQL CVS repository.

Whooee... Fun with CVS history

a.


-- 
Aidan Van Dyk Create like a god,
ai...@highrise.ca   command like a king,
http://www.highrise.ca/   work like a slave.


signature.asc
Description: Digital signature


Re: [HACKERS] plperl error format vs plpgsql error format vs pgTAP

2009-05-29 Thread Tom Lane
Kevin Field kevinjamesfi...@gmail.com writes:
 default:
   elog(ERROR, unrecognized raise option: %d, 
 opt-opt_type);

 Should this be changed to:

 default:
   ereport(ERROR, (errmsg_internal(unrecognized 
 raise option: %d,
 opt-opt_type)));

No, we generally don't bother with that.  The above two are exactly
equivalent and the first is easier to write, so why complicate the code?
ereport is needed if you want to specify a SQLSTATE, provide a
translatable error message, etc, but for internal shouldn't-happen cases
we customarily just use elog.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] plperl error format vs plpgsql error format vs pgTAP

2009-05-29 Thread Kevin Field
On May 29, 11:48 am, Kevin Field kevinjamesfi...@gmail.com wrote:
 On May 29, 11:35 am, robertmh...@gmail.com (Robert Haas) wrote:



  On Fri, May 29, 2009 at 7:59 AM, Kevin Field kevinjamesfi...@gmail.com 
  wrote:
   On May 28, 5:19 pm, da...@kineticode.com (David E. Wheeler) wrote:
   On May 28, 2009, at 12:53 PM, Kevin Field wrote:

Can pgTap check for a regex instead if just a string?

That's the other option, if the pgTAP author is willing...if the
SQLSTATE thing doesn't work out I guess we'll have to go down that
road.

   Patches welcome. ;-)

  http://github.com/theory/pgtap/tree/master/

   I'm getting a new version ready to release as I type.

   Thanks, great to know.  :)  Although, I do think changing plperl is
   the more proper option, so I'm going to try there first...

  It seems to me that removing line numbers from PL/perl error messages
  is not a good solution to anything.  Line numbers are extremely useful
  for debugging purposes, and getting rid of them because one particular
  testing framework doesn't know how to use regular expressions is
  solving the wrong problem.

 You're right, but that's not what I'm proposing...

  I'm also a bit confused because your original post had a line number
  in the PL/pgsql output, too, just formatted slightly differently.  Why
  doesn't that one cause a problem?

 The difference is, in PL/pgsql they're in the CONTEXT: line, whereas
 in plperl they're in the error line.  This is inconsistent; if we fix
 it, we don't need to add kludge to pgTAP.

 But later in the thread the desired fix became not changing perl but
 instead making a way to report error codes from plperl, which is what
 I'm attempting to do with my rusty C skills soon.  plperl should have
 ereport() *anyway*, as I believe Tom had insinuated.

Hmm, I'm rustier than I thought.  I might need some help with this
later.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] explain analyze rows=%.0f

2009-05-29 Thread Tom Lane
Joshua Tolley eggyk...@gmail.com writes:
 On Thu, May 28, 2009 at 11:12:42PM -0400, Robert Haas wrote:
 On Thu, May 28, 2009 at 11:00 PM, Euler Taveira de Oliveira
 Don't you think is too strange having, for example, 6.67 rows?
 
 No stranger than having it say 7 when it's really not.  Actually mine
 mostly come out 1 when the real value is somewhere between 0.5 and
 1.49.  :-(

 +1. It would help users realize more quickly that some of the values in the
 EXPLAIN output are, for instance, *average* number of rows *per iteration* of 
 a
 nested loop, say, rather than total rows found in all loops.

I think it would only be sensible to show fractional digits if nloops is
greater than 1.  Otherwise the value must in fact be an integer, and
you're just going to confuse people more by suggesting that it might not
be.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] search_path vs extensions

2009-05-29 Thread Josh Berkus

Dimitri,


We'd still need search_path in there, as Python's still using a path.
With 'default' search_path you'd have to qualify your type as
pg_extensions.postgis.submodule.special_type, with pg_extensions in
search_path the following notation would find it too:
postgis.submodule.special_type.
And if you have pg_extensions.postgis.submodule in the search_path, then
you can use special_type without having to (nest-) schema qualify it.


But *incompatible* with SQL truncation of qualified names.  Remember 
that you can refer to something by any portion of its qualified name, 
such as:


pg_extensions.postgis.submodule.special_type
postgis.submodule.special_type
submodule.special_type
special_type

... are all valid.  Which is fine until you think that we could have a:

pg_extensions.pg_tap.submodule.special_type

or even a:

schema submodule.special_type

which would confuse both the search path and the user.  What this means 
is that all schema names would have to be unique, whether they are 
nested or not.  Which makes subschema *within* an extension rather useless.


--
Josh Berkus
PostgreSQL Experts Inc.
www.pgexperts.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] information_schema.columns changes needed for OLEDB

2009-05-29 Thread Konstantin Izmailov
Hi,
I'm not sure that it is related to information_schema but I wanted to let
you know that some Postgres functions are listed in pg_proc while others are
not. For example, all Data Type Formatting function are in pg_proc (to_char,
to_hex, ...). While several of the Date/Time Functions are not there
(extract, localtime, ...).

Why different Postgres function are not equal???

This causes issues to Windows integration as well.
Konstantin


Re: [HACKERS] pg_migrator and an 8.3-compatible tsvector data type

2009-05-29 Thread Bruce Momjian
Tom Lane wrote:
 Josh Berkus j...@agliodbs.com writes:
  Bruce,
  The ordering of the lexems was changed:
 
  What does that get us in terms of performance etc.?
 
 It was changed to support partial-match tsvector queries.  Without it,
 a partial match query would have to scan entire tsvectors instead
 of applying binary search.  I don't know if Oleg and Teodor did any
 actual performance tests on the size of the hit, but it seems like
 it could be pretty awful for large documents.

I started thinking about the performance issues of the tsvector changes.
Teodor gave me this code for conversion that basically does:

qsort_arg((void *) ARRPTR(t), t-size, sizeof(WordEntry), cmpLexeme, 
(void*) t);

So, basically, every time there is a cast we have to do a sort, which
for a large document would yield poor performance, and because we are
not storing the sorted result, it happens for every access;  this might
be an unacceptable performance burden.

So, one idea would be, instead of a cast, have pg_migrator rebuild the
tsvector columns with ALTER TABLE, so then the 8.4 index code could be
used.  But then we might as well just tell the users to migrate the
tsvector tables themselves, which is how pg_migrator behaves now.

Obviously we are still trying to figure out the best way to handle data
type changes;  I think as soon as we figure out a plan for tsvector we
can use that method for future changes.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GENERAL] trouble with to_char('L')

2009-05-29 Thread Tom Lane
Hiroshi Inoue in...@tpf.co.jp writes:
 Tom Lane wrote:
 I think what this suggests is that there probably needs to be some
 encoding conversion logic near the places we examine localeconv()
 output.

 Attached is a patch to the current CVS.
 It uses a similar way like LC_TIME stuff does.

I'm not really in a position to test/commit this, since I don't have a
Windows machine.  However, since no one else is stepping up to deal with
it, here's a quick review:

* This seems to be assuming that the user has set LC_MONETARY and
LC_NUMERIC the same.  What if they're different?

* What if the selected locale corresponds to Unicode (ie UTF16)
encoding?

* #define'ing strdup() to do something rather different from strdup
seems pretty horrid from the standpoint of code readability and
maintainability, especially with nary a comment explaining it.

* Code will dump core on malloc failure.

* Since this code is surely not performance critical, I wouldn't bother
with trying to optimize it; hence drop the special case for all-ASCII.

* Surely we already have a symbol somewhere that can be used in
place of this:
 #defineMAX_BYTES_PER_CHARACTER 4


regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Clean shutdown and warm standby

2009-05-29 Thread Simon Riggs

On Thu, 2009-05-28 at 18:09 -0400, Tom Lane wrote:

 What's your point? Surely the applied patch is a *necessary* component
 of any attempt to try to ensure archiving is complete at shutdown.
 I agree that it doesn't cover every risk factor, and there are some
 risk factors that cannot be covered by Postgres itself.  But isn't it
 a step in a desirable direction?

Well, in one way, yes. I certainly encourage Guillaume to submit more
patches and for everybody to review them, as has been done. I turned up
late to the party on this, I know.

Regrettably, the patch doesn't remove the problem it was supposed to
remove and I'm highlighting there is still risk of data loss. I suggest
that we don't change any docs, and carefully word or even avoid any
release note inclusion to avoid lulling people into stopping safety
measures. 

The patch doesn't cause any problems though so we don't need to remove
it either.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_migrator and an 8.3-compatible tsvector data type

2009-05-29 Thread Josh Berkus

Bruce,


So, one idea would be, instead of a cast, have pg_migrator rebuild the
tsvector columns with ALTER TABLE, so then the 8.4 index code could be
used.  But then we might as well just tell the users to migrate the
tsvector tables themselves, which is how pg_migrator behaves now.


It would be nice to have pg_migrator handle this, especially if we could 
do it in parallel.  Then we just have to warn users that migrating a 
database with tsvector columns takes significantly longer.  That is,


1) do rest of catalog swap and link/copy of objects.
2) mark all tsvector columns as 83_tsvector and add new tsvector type
   (these columns will be unusable for queries)
3) bring up database
4) search for all 83_tsvector columns
5) do ALTER TABLE on each of these columns, in parallel, up to a 
configuration setting (default 3).


However, I can also understand not having time to implement the above 
before 8.4 release.



--
Josh Berkus
PostgreSQL Experts Inc.
www.pgexperts.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_migrator and an 8.3-compatible tsvector data type

2009-05-29 Thread Tom Lane
Josh Berkus j...@agliodbs.com writes:
 It would be nice to have pg_migrator handle this, especially if we could 
 do it in parallel.  Then we just have to warn users that migrating a 
 database with tsvector columns takes significantly longer.  That is,

 1) do rest of catalog swap and link/copy of objects.
 2) mark all tsvector columns as 83_tsvector and add new tsvector type
 (these columns will be unusable for queries)
 3) bring up database
 4) search for all 83_tsvector columns
 5) do ALTER TABLE on each of these columns, in parallel, up to a 
 configuration setting (default 3).

pg_migrator is already emitting a script that is intended to be run
after conversion, to handle REINDEXing of incompatible indexes.  That
could easily be made to do ALTER TYPE on old tsvector columns too, no?

The parallel bit is pie in the sky and should not be considered even
for a millisecond during this release cycle.  Save it for 8.5, or
suggest to people that they manually cut the script apart if they're
desperate to have that.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Clean shutdown and warm standby

2009-05-29 Thread Heikki Linnakangas

Simon Riggs wrote:

Regrettably, the patch doesn't remove the problem it was supposed to
remove and I'm highlighting there is still risk of data loss. 


I feel that you're moving the goalposts. What exactly is the problem it 
was supposed to remove in your opinion?


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] explain analyze rows=%.0f

2009-05-29 Thread Robert Haas
On Fri, May 29, 2009 at 1:30 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Joshua Tolley eggyk...@gmail.com writes:
 On Thu, May 28, 2009 at 11:12:42PM -0400, Robert Haas wrote:
 On Thu, May 28, 2009 at 11:00 PM, Euler Taveira de Oliveira
 Don't you think is too strange having, for example, 6.67 rows?

 No stranger than having it say 7 when it's really not.  Actually mine
 mostly come out 1 when the real value is somewhere between 0.5 and
 1.49.  :-(

 +1. It would help users realize more quickly that some of the values in the
 EXPLAIN output are, for instance, *average* number of rows *per iteration* 
 of a
 nested loop, say, rather than total rows found in all loops.

 I think it would only be sensible to show fractional digits if nloops is
 greater than 1.  Otherwise the value must in fact be an integer, and
 you're just going to confuse people more by suggesting that it might not
 be.

That might be over-engineering, but I'll take it.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Clean shutdown and warm standby

2009-05-29 Thread Robert Haas
On Fri, May 29, 2009 at 2:23 PM, Simon Riggs si...@2ndquadrant.com wrote:
 Regrettably, the patch doesn't remove the problem it was supposed to
 remove and I'm highlighting there is still risk of data loss. I suggest
 that we don't change any docs, and carefully word or even avoid any
 release note inclusion to avoid lulling people into stopping safety
 measures.

I think it's pretty clear that you and the OP are talking about two
different problems.  To quote Guillaume:

Yes, the problem is that before this change, even with a working
replication and a clean shutdown, you still had to replicate the last
WAL file by hand.

I think that's a pretty legitimate complaint.  You seem to that this
wasn't worth fixing at this point in the development cycle, because it
was always possible to write a script to copy that last WAL file by
hand.  That's a judgment call, of course, and you are entitled to your
own opinion on the topic, but that doesn't mean that the complaint, as
defined by the person complaining, hasn't been fixed.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_migrator and an 8.3-compatible tsvector data type

2009-05-29 Thread Bruce Momjian
Tom Lane wrote:
 Josh Berkus j...@agliodbs.com writes:
  It would be nice to have pg_migrator handle this, especially if we could 
  do it in parallel.  Then we just have to warn users that migrating a 
  database with tsvector columns takes significantly longer.  That is,
 
  1) do rest of catalog swap and link/copy of objects.
  2) mark all tsvector columns as 83_tsvector and add new tsvector type
  (these columns will be unusable for queries)
  3) bring up database
  4) search for all 83_tsvector columns
  5) do ALTER TABLE on each of these columns, in parallel, up to a 
  configuration setting (default 3).
 
 pg_migrator is already emitting a script that is intended to be run
 after conversion, to handle REINDEXing of incompatible indexes.  That
 could easily be made to do ALTER TYPE on old tsvector columns too, no?

Hmmm.  Well, the problem right now with v8_3_tsvector is that it fails
on index create, even when the index is empty, because I can't figure
out how to simply set up the proper index catalog entries.  Once that is
fixed and I can bind tsvector to v8_3_tsvector on schema creation, I can
easily emit ALTER TABLE to fix the issue.  And, at that point the
tsvector GIN indexes would be automatically created so I can skip that
part.

 The parallel bit is pie in the sky and should not be considered even
 for a millisecond during this release cycle.  Save it for 8.5, or
 suggest to people that they manually cut the script apart if they're
 desperate to have that.

Agreed.

FYI, this is 1% as hard as the Win32 port, so I am not discouraged.  ;-)

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Clean shutdown and warm standby

2009-05-29 Thread Simon Riggs

On Fri, 2009-05-29 at 21:46 +0300, Heikki Linnakangas wrote:
 Simon Riggs wrote:
  Regrettably, the patch doesn't remove the problem it was supposed to
  remove and I'm highlighting there is still risk of data loss. 
 
 I feel that you're moving the goalposts. What exactly is the problem it 
 was supposed to remove in your opinion?

I feel that you wish to argue this endlessly so that my point is lost
and the threat of data loss that was left exposed is forgotten.

I'm happy that I understand those threats and will advise my clients
accordingly. I've tried to help the community, but in the end, there is
a point where I stop trying to do so. Now, in fact.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] search_path vs extensions

2009-05-29 Thread Greg Stark
On Fri, May 29, 2009 at 5:23 PM, David E. Wheeler da...@kineticode.com wrote:
 PS: we still have to provide users with easy tools to (dynamically) manage
 search_path, don't we?
 (I prefer not to start the search_path management tool ideas right here).

 Yes, we do, and that's what at least half this thread is about. Whether or
 not such tools are put to use for extensions support is a separate issue,
 but both need addressing, I think.

Do we really? The only reason people are having trouble managing their
search_path is because they're not using it as intended and putting
things in lots of different schemas that they intend to all be
visible. If they put everything they intend to be visible to users in
one schema they wouldn't have this problem.

That said, I don't mind the idea of having a way to push things onto
search path like you often do in sh using PATH=/foo/bar:$PATH.

But I think the only reason to install something into a separate
schema is precisely if you *want* that schema to not be visible to
users automatically. So having more and more complex ways to include
schemas in the search path automatically is fixing a problem created
by setting things up wrong in the first place.

I'm actually not sure if we should allow extensions to be installed
into separate schemas. If you do then it means we can't detect
conflicts. A module might refer to an object intending to get its
local object but end up getting some object from some other module
depending on how the user set up his search_path.

To make installing into separate schemas work we would have to have
each extension have some magic way to refer to its own schema and
enforce that all objects are referred to this way. We don't have any
way to do that currently and I think that would also limit our ability
to have extensions which depend on other extensions.

In short I think people who want to put things in different schemas
are being misled by their intuition. By installing everything into one
schema you end up with a *more* organized system where everything has
a well defined meaning. If you install everything in different schemas
then that's where you end up with a mishmash where you're not sure
what objects are being used when depending on a global run-time
parameter which might need to be set differently for each module.

-- 
greg

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Clean shutdown and warm standby

2009-05-29 Thread Simon Riggs

On Fri, 2009-05-29 at 14:54 -0400, Robert Haas wrote:
 On Fri, May 29, 2009 at 2:23 PM, Simon Riggs si...@2ndquadrant.com wrote:
  Regrettably, the patch doesn't remove the problem it was supposed to
  remove and I'm highlighting there is still risk of data loss. I suggest
  that we don't change any docs, and carefully word or even avoid any
  release note inclusion to avoid lulling people into stopping safety
  measures.
 
 I think it's pretty clear that you and the OP are talking about two
 different problems.  To quote Guillaume:
 
 Yes, the problem is that before this change, even with a working
 replication and a clean shutdown, you still had to replicate the last
 WAL file by hand.
 
 I think that's a pretty legitimate complaint.  

It's valid complaint, yes, but only for people that do this manually,
which is nobody I ever met, in *production*. (ymmv etc)

 You seem to think that this wasn't worth fixing...

And for them, it hasn't been completely fixed. That point was not made
by patch author or committer, leaving the impression it was now
completely safe, which, I truly regret to say, is not correct.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] search_path vs extensions

2009-05-29 Thread Tom Lane
Greg Stark st...@enterprisedb.com writes:
 I'm actually not sure if we should allow extensions to be installed
 into separate schemas.

It's starting to seem that best practice is to install public
functions/etc into a common schema and private objects into an
extension-specific schema.  The main problem with that from an extension
author's point of view is the need to explicitly qualify all references
to private objects, since they won't be in the search path.  Which is
tedious, but doable.

Another issue is that doing that pretty much hard-wires what the
extension's private schema name is.  Dunno how much we care, though.

You could certainly do this without any new search-path-related
features, but I wonder whether the system could provide any extra
support for it.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] search_path vs extensions

2009-05-29 Thread Andrew Dunstan



Tom Lane wrote:

Greg Stark st...@enterprisedb.com writes:
  

I'm actually not sure if we should allow extensions to be installed
into separate schemas.



It's starting to seem that best practice is to install public
functions/etc into a common schema and private objects into an
extension-specific schema.  The main problem with that from an extension
author's point of view is the need to explicitly qualify all references
to private objects, since they won't be in the search path.  Which is
tedious, but doable.
  


The main problem as I see it is that you are abandoning one of the two 
uses of schemas, namely namespace separation.  With this pattern an 
extension author has no guarantee that there won't be a name collision 
with some other extension. Pace Greg, schemas are not just about privacy.


cheers

andrew



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Python 3.0 does not work with PL/Python

2009-05-29 Thread Hannu Krosing
On Fri, 2009-05-29 at 11:12 +0300, Peter Eisentraut wrote:
 On Friday 29 May 2009 03:53:17 Alvaro Herrera wrote:
  Bruce Momjian escribió:
   Peter Eisentraut wrote:
On Monday 06 April 2009 02:10:59 James Pye wrote:
 Any thoughts on the acceptability of a complete rewrite for Python 3?
   
http://www.joelonsoftware.com/articles/fog69.html
  
   You usually have to rewrite when you have not done refactoring as part
   of development;  PGDG does refactoring regularly.
 
  Except that plpython stagnates, save for minor hacks here and there.
 
 But that doesn't mean that there is anything wrong with it.  Of course there 
 is, but those are isolated problems that can be fixed when desired.  For 
 example, it might just be that those who use it don't have use of INOUT 
 parameters or table returns.

Yeah. And I _almost_ fixed those. Just did not have time to make the
final push to clean up things enough to be acceptable for patching back
into core.

And fixing those actually meant refactoring those parts into using newer
structures and functions :)

And I also think that pl/python, even for python 2.x does need lots of
refactoring in most places in order to be maintainable. 

-- 
Hannu Krosing   http://www.2ndQuadrant.com
PostgreSQL Scalability and Availability 
   Services, Consulting and Training


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] search_path vs extensions

2009-05-29 Thread Robert Haas
On Fri, May 29, 2009 at 4:49 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Greg Stark st...@enterprisedb.com writes:
 I'm actually not sure if we should allow extensions to be installed
 into separate schemas.

 It's starting to seem that best practice is to install public
 functions/etc into a common schema and private objects into an
 extension-specific schema.  The main problem with that from an extension
 author's point of view is the need to explicitly qualify all references
 to private objects, since they won't be in the search path.  Which is
 tedious, but doable.

This sounds quite horrid to me.  The way programming languages solve
this problem is they have a flag that either makes certain names not
visible from other namespaces, or they provide explicit control over
which names get exported.  Requiring the extension author to split up
the objects between two different hard-coded namespaces with schema
qualifications on every reference sounds like an unmanageable mess.

Of course we have no notion of exporting or importing names at all.
Maybe we should.  But I'm still of the opinion that this entire
discussion is a tangent.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Testing of parallel restore with current snapshot

2009-05-29 Thread Tom Lane
Josh Berkus j...@agliodbs.com writes:
 Tom,
 Is anyone interested enough to try it if I code it?

 If you're patient for results, sure.  I seem to be doing a customer 
 migration or upgrade every week now, so it wouldn't take me long to have 
 a test subject with a fairly complex database.

Here's a draft patch that does ordering using two lists, as I proposed.
Please test to see if it's any faster or slower than the original logic.

Note: since this changes struct TocEntry, be sure to recompile all files
in src/bin/pg_dump/ after patching.

regards, tom lane



binN1yOunKQWh.bin
Description: alternate-parallel-restore-1.patch.gz

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] information_schema.columns changes needed for OLEDB

2009-05-29 Thread Tom Lane
Konstantin Izmailov pgf...@gmail.com writes:
 you know that some Postgres functions are listed in pg_proc while others are
 not. For example, all Data Type Formatting function are in pg_proc (to_char,
 to_hex, ...). While several of the Date/Time Functions are not there
 (extract, localtime, ...).

The ones that appear not to be there are ones that the SQL standard
demands special weird syntax for.  The grammar translates such calls
to standard function calls to underlying functions, which usually are
named a bit differently to avoid confusion.  For instance
extract(field from some_expr) becomes date_part('field', some_expr).

If you want to know what all of these are, see the func_expr production
in parser/gram.y.

 This causes issues to Windows integration as well.

Complain to the SQL standards committee, especially to those members
who seem to think COBOL represented the apex of programming language
syntax design :-(

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] search_path vs extensions

2009-05-29 Thread Greg Stark
On Fri, May 29, 2009 at 10:26 PM, Robert Haas robertmh...@gmail.com wrote:
 This sounds quite horrid to me.  The way programming languages solve
 this problem is they have a flag that either makes certain names not
 visible from other namespaces, or they provide explicit control over
 which names get exported.

There are two factors which distinguish this situation from most
programming languages:

1) Hopefully these languages you're thinking of are lexically scoped.
So the search path in effect when the objects are defined decide which
other objects they reference. In Postgres in many cases we're
effectively dynamically scoped. If someone calls us with another
search path we'll pick up other objects we weren't expecting.

2) Normally programming languages do early binding so as soon as the
code is parsed references are resolved. You can't later define a new
function earlier in the search path and have it take over references
that have were previously referring to some other function.

 Requiring the extension author to split up
 the objects between two different hard-coded namespaces with schema
 qualifications on every reference sounds like an unmanageable mess.

Well I think the thinking is that if the extension author wants to
hide some objects from the public he creates a schema for them and
references them explicitly.

If he pushes that private schema onto the search path he'll find any
functions he calls -- admittedly not that common since we don't have
any way to do callbacks, i suppose triggers on tables his code
modifies counts though -- will have this private schema in its search
path...

If we do want special handling it does seem to me that it would make
sense to have some token like _private_ which the extension loading
mechanism would automatically substitute for a unique schema name.
Otherwise we're relying on extension authors to come up with unique
names.



-- 
greg

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] bytea vs. pg_dump

2009-05-29 Thread Hannu Krosing
On Wed, 2009-05-06 at 18:33 +0300, Peter Eisentraut wrote:
 On Tuesday 05 May 2009 17:38:33 Tom Lane wrote:
  Kevin Grittner kevin.gritt...@wicourts.gov writes:
   Bernd Helmle maili...@oopsware.de wrote:
   Another approach would be to just dump bytea columns in binary
   format only (not sure how doable that is, though).
  
   If that's not doable, perhaps a base64 option for bytea COPY?
 
  I'm thinking plain old pairs-of-hex-digits might be the best
  tradeoff if conversion speed is the criterion.  The main problem
  in any case would be to decide how to control the format option.
 
 The output format can be controlled by a GUC parameter.  And while we are at 
 it, we can also make bytea understand the new output format on input, so we 
 can offer an end-to-end alternative to the amazingly confusing current bytea 
 format and also make byteain() equally faster at the same time.
 
 For distinguishing various input formats, we could use the backslash to 
 escape 
 the format specification without breaking backward compatibilty, e.g.,
 
 '\hexd41d8cd98f00b204e9800998ecf8427e'
 
 With a bit of extra work we can wrap this up to be a more or less SQL-
 conforming blob type, which would also make a lot of people very happy.

And we can also escape the need to uncompress TOAST'ed fields - just
markup the compression as another \c at the beginning of data.


-- 
Hannu Krosing   http://www.2ndQuadrant.com
PostgreSQL Scalability and Availability 
   Services, Consulting and Training


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] bytea vs. pg_dump

2009-05-29 Thread Hannu Krosing
On Fri, 2009-05-29 at 11:06 +0300, Peter Eisentraut wrote:
 On Friday 29 May 2009 04:26:35 Bruce Momjian wrote:
  Added to TODO:
  |Improve bytea COPY format
 
  * http://archives.postgresql.org/pgsql-hackers/2009-05/msg00192.php
 
 Btw., I have started to write some code for that.

why not copy bytea always in base64 encoded or similar format - this
will both save at least 2x the space on average random bytea data _and_
is probably faster, as it can be more easily done by table lookups in
bigger chunks

an alternative is to just escape minimal amount of characters, probably
just \0 , \n and \\

-- 
Hannu Krosing   http://www.2ndQuadrant.com
PostgreSQL Scalability and Availability 
   Services, Consulting and Training


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] search_path improvements WAS: search_path vs extensions

2009-05-29 Thread Josh Berkus

Greg,


Do we really? The only reason people are having trouble managing their
search_path is because they're not using it as intended and putting
things in lots of different schemas that they intend to all be
visible.


Apparently you've never adminned a database with hundreds (or thousands) 
of stored procedures.


Sometimes one needs to use schemas just for namespacing (they are called 
namespaces after all), and not for security or visibility.


In fact, I'd argue that that is one of the problems with the whole 
schema concept: it's three things at once.



I'm actually not sure if we should allow extensions to be installed
into separate schemas. If you do then it means we can't detect
conflicts. A module might refer to an object intending to get its
local object but end up getting some object from some other module
depending on how the user set up his search_path.


I agree with this.  Eliminating module naming conflicts is a good in itself.

From a DBA and database designer perspective, the missing functionality 
from being able to do everything with schema that I want are listed 
below.  It's been my experience that the awkwardness of managing 
search_path has caused a *lot* of our users to ignore schema as a 
feature and not use schema when they otherwise should.


a) the ability to push a schema onto the current search path
b) the ability to pull a schema off the current search path
c) the ability as superuser to have my own special schema which are 
always in the search path, as pg_catalog and $user_temp are.*
d) the ability as superuser to lock specific role so that they can't 
change their search path**

e) having roles somehow inherit search_path on a SET ROLE***

* if you're not sure why someone would want this, consider 
information_schema.  If your application depends on I_S to work, how do 
you make sure it's always in every user's search_path?


** think about the number of security exploits around search_path we 
could protect against if we had this.


*** this is the same issue as it is with resource management (i.e. 
work_mem).  However, it's particularly apt for search_path; imagine a 
database with an accounting schema and a user who belongs to both the 
accounting and the HR roles.


--
Josh Berkus
PostgreSQL Experts Inc.
www.pgexperts.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] search_path vs extensions

2009-05-29 Thread Josh Berkus

Robert,


Of course we have no notion of exporting or importing names at all.
Maybe we should.  But I'm still of the opinion that this entire
discussion is a tangent.


As far as Extensions are concerned?  Yes, it is.

Dimitri:  I vote for you to get on with assuming everything goes into 
pg_extensions.  We can always change that later if there's any kind of 
consensus.


--
Josh Berkus
PostgreSQL Experts Inc.
www.pgexperts.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] search_path vs extensions

2009-05-29 Thread David E. Wheeler

On May 29, 2009, at 12:41 PM, Greg Stark wrote:


That said, I don't mind the idea of having a way to push things onto
search path like you often do in sh using PATH=/foo/bar:$PATH.


Yes, +1.


But I think the only reason to install something into a separate
schema is precisely if you *want* that schema to not be visible to
users automatically. So having more and more complex ways to include
schemas in the search path automatically is fixing a problem created
by setting things up wrong in the first place.


A reason I've run into is to make database maintenance and migration  
easier. For a recent client, all contrib modules were installed in a  
single, separate schema, named contrib. This makes it easy to dump all  
of the database code but not dump the contrib stuff, and that's useful  
for two reasons:


1. The client was dumping the schema into svn every night, and the  
contrib stuff just cluttered it up (I'm not saying checking a schema  
in like this is a good idea, just that I've seen it).


2. Migrating to a new version of PostgreSQL, the server can be pre- 
build with the contrib schema, with new versions with the new release,  
and then the dump from the old server doesn't have the contrib crap in  
it to cause conflicts.


So, yeah, there may be collisions that a given DBA has to deal with,  
and then will want more than one schema. But for the vast majority of  
uses, I think that a pg_extensions schema will serve nicely to keep  
third-party extensions separate from in-house database objects.


Best,

David

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] search_path vs extensions

2009-05-29 Thread David E. Wheeler

On May 29, 2009, at 2:45 PM, Greg Stark wrote:


2) Normally programming languages do early binding so as soon as the
code is parsed references are resolved. You can't later define a new
function earlier in the search path and have it take over references
that have were previously referring to some other function.


Not functions, but see method dispatch.


Well I think the thinking is that if the extension author wants to
hide some objects from the public he creates a schema for them and
references them explicitly.


Agreed.


If he pushes that private schema onto the search path he'll find any
functions he calls -- admittedly not that common since we don't have
any way to do callbacks, i suppose triggers on tables his code
modifies counts though -- will have this private schema in its search
path...


Yeah, it'd be nice to lexically scope such search_path modifications,  
such as for the duration of a function call.



If we do want special handling it does seem to me that it would make
sense to have some token like _private_ which the extension loading
mechanism would automatically substitute for a unique schema name.
Otherwise we're relying on extension authors to come up with unique
names.


Agreed.

Best,

David


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] search_path improvements WAS: search_path vs extensions

2009-05-29 Thread David E. Wheeler

On May 29, 2009, at 2:52 PM, Josh Berkus wrote:


a) the ability to push a schema onto the current search path
b) the ability to pull a schema off the current search path


push, pop, shift, unshift. :-)

Come to think of it, I want these for arrays, too. ;-)

Best,

David

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Testing of parallel restore with current snapshot

2009-05-29 Thread Josh Berkus

Tom,


Here's a draft patch that does ordering using two lists, as I proposed.
Please test to see if it's any faster or slower than the original logic.


Great.  I'll need to get permission from a client; I can't host large 
enough/complex enough databases on my own system.  :-(


--
Josh Berkus
PostgreSQL Experts Inc.
www.pgexperts.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] search_path vs extensions

2009-05-29 Thread Robert Haas
On Fri, May 29, 2009 at 5:45 PM, Greg Stark st...@enterprisedb.com wrote:
 On Fri, May 29, 2009 at 10:26 PM, Robert Haas robertmh...@gmail.com wrote:
 This sounds quite horrid to me.  The way programming languages solve
 this problem is they have a flag that either makes certain names not
 visible from other namespaces, or they provide explicit control over
 which names get exported.

 There are two factors which distinguish this situation from most
 programming languages:

 1) Hopefully these languages you're thinking of are lexically scoped.
 So the search path in effect when the objects are defined decide which
 other objects they reference. In Postgres in many cases we're
 effectively dynamically scoped. If someone calls us with another
 search path we'll pick up other objects we weren't expecting.

 2) Normally programming languages do early binding so as soon as the
 code is parsed references are resolved. You can't later define a new
 function earlier in the search path and have it take over references
 that have were previously referring to some other function.

Good point.  But maybe there's some way of getting some kind of
behavior that is closer to lexical scoping/early binding?  Because the
way it works right now has lousy security implications, beyond being
difficult for search_path management.  Assign a search path to a
schema, that applies to views and functions defined therein?
*brainstorming*

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] search_path vs extensions

2009-05-29 Thread Greg Smith

On Fri, 29 May 2009, Greg Stark wrote:

The only reason people are having trouble managing their search_path is 
because they're not using it as intended and putting things in lots of 
different schemas that they intend to all be visible. If they put 
everything they intend to be visible to users in one schema they 
wouldn't have this problem.


Every PostgreSQL installation I've ever seen that heavily uses schemas 
aggressively uses them to partition up the various applications into 
components that can easily be reinstalled, the goal being to make 
deploying new versions easier.  Put component A into schema A, component B 
into schema B, and then if you need to make a change just to the workings 
of B you can easily dump the data from B, DROP SCHEMA s CASCADE,. apply 
new DDL change, and then reinstall things associated with that component 
without touching anything in A.  The nice thing about this approach, 
compared with applying DDL deltas, is that afterwards you know you've got 
a complete chunk of code each time that will also install somewhere else 
identically into that schema.


That I run into all the time, usually with every schema in the default 
search_path.  Using schemas primarly as a security mechanism isn't nearly 
as popular as far as I've seen.


Anyway, I think the answer to all the extension related questions should 
be to pick whatever lets a prototype that handles the dependency and 
dump/reload problems get solved most easily.  You really need to use the 
simplest possible schema standard that works for extensions and decouple 
the problems from one another if any progress is going to get made here.


--
* Greg Smith gsm...@gregsmith.com http://www.gregsmith.com Baltimore, MD

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] information_schema.columns changes needed for OLEDB

2009-05-29 Thread Konstantin Izmailov
Tom,
this is very helpful - thank you so much!

I had to discover those 'missing' functions one by one, usually after users'
complaints.

Konstantin

On Fri, May 29, 2009 at 11:35 AM, Tom Lane t...@sss.pgh.pa.us wrote:

 Konstantin Izmailov pgf...@gmail.com writes:
  you know that some Postgres functions are listed in pg_proc while others
 are
  not. For example, all Data Type Formatting function are in pg_proc
 (to_char,
  to_hex, ...). While several of the Date/Time Functions are not there
  (extract, localtime, ...).

 The ones that appear not to be there are ones that the SQL standard
 demands special weird syntax for.  The grammar translates such calls
 to standard function calls to underlying functions, which usually are
 named a bit differently to avoid confusion.  For instance
 extract(field from some_expr) becomes date_part('field', some_expr).

 If you want to know what all of these are, see the func_expr production
 in parser/gram.y.

  This causes issues to Windows integration as well.

 Complain to the SQL standards committee, especially to those members
 who seem to think COBOL represented the apex of programming language
 syntax design :-(

regards, tom lane



Re: [HACKERS] Python 3.0 does not work with PL/Python

2009-05-29 Thread James Pye

On May 29, 2009, at 1:17 AM, Peter Eisentraut wrote:

On Friday 29 May 2009 04:06:14 Andrew Dunstan wrote:
Otherwise, I'm not too keen simply to throw Python 2.x overboard  
until
it's no longer common on platforms people are likely to want to  
install

Postgres on, if that's what's implied by the original question.


My guess is that we will need to keep around a Python 2.x version  
for at least

another three years, meaning two or three major PostgreSQL releases.


Yeah, I wasn't meaning to imply tossing 2.x out...

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] search_path vs extensions

2009-05-29 Thread Greg Stark
On Fri, May 29, 2009 at 11:18 PM, Robert Haas robertmh...@gmail.com wrote:

 Good point.  But maybe there's some way of getting some kind of
 behavior that is closer to lexical scoping/early binding?  Because the
 way it works right now has lousy security implications, beyond being
 difficult for search_path management.  Assign a search path to a
 schema, that applies to views and functions defined therein?
 *brainstorming*

Well we already set search_path locally in SECURITY DEFINER functions.
Normal functions run with the credentials of the caller so that's not
an issue.

But if a SECURITY DEFINER function calls another function that other
function will inherit the credentials of the caller so it must inherit
the search path of the caller as well. So that has to be dynamically
scoped.

I'm beginning to understand why Oracle programmers are accustomed to
setting SECURITY DEFINER everywhere. I think Oracle also knows to
treat such code as lexically scoped and can bind references when
loading such code.

-- 
greg

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] search_path improvements WAS: search_path vs extensions

2009-05-29 Thread Greg Stark
On Fri, May 29, 2009 at 11:03 PM, David E. Wheeler da...@kineticode.com wrote:
 On May 29, 2009, at 2:52 PM, Josh Berkus wrote:

 a) the ability to push a schema onto the current search path
 b) the ability to pull a schema off the current search path

 push, pop, shift, unshift. :-)

 Come to think of it, I want these for arrays, too. ;-)

push and unshift sure -- and you do have those for arrays, it's spelled ||.

I'm not so sure about pop/shift though. How would you know the element
you want is at the beginning/end unless you just put it there?

I think what you really want is to use SET LOCAL or RESET to restore
it to whatever it was before you started futzing with it. We might
need a more flexible way to do that that isn't tied to transactions
though.


-- 
greg

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] search_path improvements WAS: search_path vs extensions

2009-05-29 Thread Greg Stark
On Fri, May 29, 2009 at 10:52 PM, Josh Berkus j...@agliodbs.com wrote:

 Sometimes one needs to use schemas just for namespacing (they are called
 namespaces after all), and not for security or visibility.

What's the point of namespaces if not to implement visibility? The
interesting thing to do would be to hide all the internal foo
functions in a foo.* schema and only put the external api in public.
That way you can't accidentally call an internal foo function or have
a name conflict between two internal functions. The external api could
even just be a bunch of thin wrappers around the implementation
functions in foo.* (what Oracle calls public synonyms).

If you just put them all in search path you haven't bought anything,
all your functions are in the same namespace and one module can
override another's objects. Actually it's worse than just putting them
all in one schema since you won't even be warned when a conflict
happens. It will just silently start doing something different.

 c) the ability as superuser to have my own special schema which are always
 in the search path, as pg_catalog and $user_temp are.*

 * if you're not sure why someone would want this, consider
 information_schema.  If your application depends on I_S to work, how do you
 make sure it's always in every user's search_path?

Uhm, wouldn't you just refer to information_schema.foo? What if some
other part of your application depends on information_schema *not*
being in your path? Using global state for this seems destined to
leave you with something broken that can't be fixed without breaking
something else.

 d) the ability as superuser to lock specific role so that they can't
 change their search path**

 ** think about the number of security exploits around search_path we could
 protect against if we had this.

Actually I'm thinking of how many security exploits this would
*create*. So if I call a security_definer function which has a
search_path set on it which search_path would it use?

This seems like it would make it impossible to code any extension to
work reliably. You would never know when some object in your extension
was being hidden by some public object which the locked search_path
overrode.

Hm, I'm beginning to think extensions need to have search_path set on
every function or have every object reference everywhere be explicitly
pg_extension.* (and/or _private_.* like my earlier suggestion).


 e) having roles somehow inherit search_path on a SET ROLE***

Grr. I'm still bitter about su doing that on some systems without
su -. I think I've lost that battle though and I'm forever doomed to
never know what su will do on a new system.



-- 
greg

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] search_path vs extensions

2009-05-29 Thread Robert Haas
On Fri, May 29, 2009 at 7:53 PM, Greg Stark st...@enterprisedb.com wrote:
 On Fri, May 29, 2009 at 11:18 PM, Robert Haas robertmh...@gmail.com wrote:

 Good point.  But maybe there's some way of getting some kind of
 behavior that is closer to lexical scoping/early binding?  Because the
 way it works right now has lousy security implications, beyond being
 difficult for search_path management.  Assign a search path to a
 schema, that applies to views and functions defined therein?
 *brainstorming*

 Well we already set search_path locally in SECURITY DEFINER functions.
 Normal functions run with the credentials of the caller so that's not
 an issue.

Maybe not for security, but certainly it is for correctness.

 But if a SECURITY DEFINER function calls another function that other
 function will inherit the credentials of the caller so it must inherit
 the search path of the caller as well. So that has to be dynamically
 scoped.

 I'm beginning to understand why Oracle programmers are accustomed to
 setting SECURITY DEFINER everywhere. I think Oracle also knows to
 treat such code as lexically scoped and can bind references when
 loading such code.

Uh... if I'm understanding you correctly, then I'm really hoping we
engineer a better solution for PostgreSQL.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] libpq is not thread safe

2009-05-29 Thread Bruce Momjian
Zdenek Kotala wrote:
 
 Bruce Momjian p??e v ?t 28. 05. 2009 v 17:20 -0400:
 
   
   Done, patch attached and applied.
  
  I went with a warning because it seemed most appropriate, but it looks
  very large:
  
  http://developer.postgresql.org/pgdocs/postgres/libpq-connect.html
  
  Should it be a notice?
 
 I prefer warning. It is important message and beginners usually don't
 know it.

OK.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] explan refactoring

2009-05-29 Thread Robert Haas
As I was trying to figure out the least invasive way to make
explain_outNode() support machine-readable output, I noticed that
there is a whole pile of duplicated code for dealing with scan
targets.  The attached refactoring may be worth applying independently
of what happens with the rest of the project, so it's attached here
for review and comment.

This also removes a redundant branch of a switch further down, folding
T_SubqueryScan into the preceding chunk.

It's tempting to simplify this further by ripping some of the asserts
out of ExplainScanTarget() on the theory that, while it may be good
sanity checking, it's not really EXPLAIN's job to do this type of
validation.  But I've left them alone for now.

...Robert
*** a/src/backend/commands/explain.c
--- b/src/backend/commands/explain.c
***
*** 73,78  static void show_sort_keys(Plan *sortplan, int nkeys, AttrNumber *keycols,
--- 73,79 
  static void show_sort_info(SortState *sortstate,
  			   StringInfo str, int indent, ExplainState *es);
  static const char *explain_get_index_name(Oid indexId);
+ static void ExplainScanTarget(StringInfo str, Scan *plan, ExplainState *es);
  
  
  /*
***
*** 668,790  explain_outNode(StringInfo str,
  		case T_SeqScan:
  		case T_BitmapHeapScan:
  		case T_TidScan:
- 			if (((Scan *) plan)-scanrelid  0)
- 			{
- RangeTblEntry *rte = rt_fetch(((Scan *) plan)-scanrelid,
- 			  es-rtable);
- char	   *relname;
- 
- /* Assume it's on a real relation */
- Assert(rte-rtekind == RTE_RELATION);
- 
- /* We only show the rel name, not schema name */
- relname = get_rel_name(rte-relid);
- 
- appendStringInfo(str,  on %s,
-  quote_identifier(relname));
- if (strcmp(rte-eref-aliasname, relname) != 0)
- 	appendStringInfo(str,  %s,
- 	 quote_identifier(rte-eref-aliasname));
- 			}
- 			break;
- 		case T_BitmapIndexScan:
- 			appendStringInfo(str,  on %s,
- explain_get_index_name(((BitmapIndexScan *) plan)-indexid));
- 			break;
  		case T_SubqueryScan:
- 			if (((Scan *) plan)-scanrelid  0)
- 			{
- RangeTblEntry *rte = rt_fetch(((Scan *) plan)-scanrelid,
- 			  es-rtable);
- 
- appendStringInfo(str,  %s,
-  quote_identifier(rte-eref-aliasname));
- 			}
- 			break;
  		case T_FunctionScan:
- 			if (((Scan *) plan)-scanrelid  0)
- 			{
- RangeTblEntry *rte = rt_fetch(((Scan *) plan)-scanrelid,
- 			  es-rtable);
- Node	   *funcexpr;
- char	   *proname;
- 
- /* Assert it's on a RangeFunction */
- Assert(rte-rtekind == RTE_FUNCTION);
- 
- /*
-  * If the expression is still a function call, we can get the
-  * real name of the function.  Otherwise, punt (this can
-  * happen if the optimizer simplified away the function call,
-  * for example).
-  */
- funcexpr = ((FunctionScan *) plan)-funcexpr;
- if (funcexpr  IsA(funcexpr, FuncExpr))
- {
- 	Oid			funcid = ((FuncExpr *) funcexpr)-funcid;
- 
- 	/* We only show the func name, not schema name */
- 	proname = get_func_name(funcid);
- }
- else
- 	proname = rte-eref-aliasname;
- 
- appendStringInfo(str,  on %s,
-  quote_identifier(proname));
- if (strcmp(rte-eref-aliasname, proname) != 0)
- 	appendStringInfo(str,  %s,
- 	 quote_identifier(rte-eref-aliasname));
- 			}
- 			break;
  		case T_ValuesScan:
- 			if (((Scan *) plan)-scanrelid  0)
- 			{
- RangeTblEntry *rte = rt_fetch(((Scan *) plan)-scanrelid,
- 			  es-rtable);
- char	   *valsname;
- 
- /* Assert it's on a values rte */
- Assert(rte-rtekind == RTE_VALUES);
- 
- valsname = rte-eref-aliasname;
- 
- appendStringInfo(str,  on %s,
-  quote_identifier(valsname));
- 			}
- 			break;
  		case T_CteScan:
- 			if (((Scan *) plan)-scanrelid  0)
- 			{
- RangeTblEntry *rte = rt_fetch(((Scan *) plan)-scanrelid,
- 			  es-rtable);
- 
- /* Assert it's on a non-self-reference CTE */
- Assert(rte-rtekind == RTE_CTE);
- Assert(!rte-self_reference);
- 
- appendStringInfo(str,  on %s,
-  quote_identifier(rte-ctename));
- if (strcmp(rte-eref-aliasname, rte-ctename) != 0)
- 	appendStringInfo(str,  %s,
- 	 quote_identifier(rte-eref-aliasname));
- 			}
- 			break;
  		case T_WorkTableScan:
! 			if (((Scan *) plan)-scanrelid  0)
! 			{
! RangeTblEntry *rte = rt_fetch(((Scan *) plan)-scanrelid,
! 			  es-rtable);
! 
! /* Assert it's on a self-reference CTE */
! Assert(rte-rtekind == RTE_CTE);
! Assert(rte-self_reference);
! 
! appendStringInfo(str,  on %s,
!  quote_identifier(rte-ctename));
! if (strcmp(rte-eref-aliasname, rte-ctename) != 0)
! 	appendStringInfo(str,  %s,
! 	 quote_identifier(rte-eref-aliasname));
! 			}
  			break;
  		default:
  			break;
--- 669,684 
  		case T_SeqScan:
  		case T_BitmapHeapScan:
  		case T_TidScan:
  		case T_SubqueryScan:
  		case 

Re: [HACKERS] Allow vacuumdb to only analyze

2009-05-29 Thread decibel

On May 27, 2009, at 11:31 AM, decibel wrote:

It does seem somewhat useful to be able to analyze all databases
easily from the command-line, but putting it into vacuumdb is
certainly a hack.
So... do we want a completely separate analyzedb command? That  
seems like far overkill.


Arguably there are yet other things you'd want to do across an  
entire cluster, so perhaps what we really want is a 'clusterrun' or  
'clustercmd' command?



No one else has commented, so I'm guessing that means no one is  
opposed to allowing for vacuumdb to just analyze. If anyone else  
objects to this please speak up before I put the final touches on the  
patch...

--
Decibel!, aka Jim C. Nasby, Database Architect  deci...@decibel.org
Give your computer some brain candy! www.distributed.net Team #1828



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers