Re: [HACKERS] ALTER TYPE 2: skip already-provable no-work rewrites

2011-02-12 Thread Robert Haas
On Sat, Feb 12, 2011 at 10:45 AM, Noah Misch  wrote:
> That said, I've tried both constructions, and I marginally prefer the
> end result with AlteredTableInfo.verify.  I've inlined
> ATColumnChangeRequiresRewrite into ATPrepAlterColumnType; it would need
> to either pass back two bools or take an AlteredTableInfo arg to mutate,
> so this seemed cleaner.

I think I like the idea of passing it the AlteredTableInfo.

> I've omitted the assertion that my previous version added to
> ATRewriteTable; it was helpful for other scan-only type changes, but
> it's excessive for domains alone.  Otherwise, the differences are
> cosmetic.

So in the case of a constrained domain, it looks like we're going to
evaluate the changed columns, but if no error is thrown, we're going
to assume they match the original ones and throw out the data?  Yikes.
I didn't like that Assert much, but maybe we need it, because this is
scary.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Debian readline/libedit breakage

2011-02-12 Thread Greg Smith

Dimitri Fontaine wrote:
> Now, what I think I would do about the core package is a quite simple
> backport of them, using Martin's excellent work.  Do we want our own QA
> on them?  If yes, I think I would need some help here, maybe with some
> build farm support for running from our debian packages rather than from
> either CVS or git.
What the RPM packaging does is run this (approximately):

   pushd src/test/regress
   make all
   make MAX_CONNECTIONS=5 check

Something similar might be sufficient for QA on the Debian packaging 
too.  The overhead of the buildfarm makes sense if you want to rebuild 
after every single commit.  It may be overkill to go through that just 
for testing .deb packaging, which realistically you're only going to 
want to do after each point release.


--
Greg Smith   2ndQuadrant USg...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us





Re: [HACKERS] Debian readline/libedit breakage

2011-02-12 Thread Dimitri Fontaine
Magnus Hagander  writes:
> On Sat, Feb 12, 2011 at 22:46, Dimitri Fontaine wrote:
>> If we really believe that the debian interpretation of the licence issue
>> here is moot, surely the easiest action is to offer a debian package
>> repository hosted in the postgresql.org infrastructure.
>>
> Are you volunteering? ;)

I would, yes.  I would benefit from that in more than one place, and of
course we would have to ship extensions packages for all supported major
versions too, which is something I've been working on for debian. See

  http://packages.debian.org/squeeze/postgresql-server-dev-all

Now, what I think I would do about the core package is a quite simple
backport of them, using Martin's excellent work.  Do we want our own QA
on them?  If yes, I think I would need some help here, maybe with some
build farm support for running from our debian packages rather than from
either CVS or git.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support



Re: [HACKERS] Extensions vs PGXS' MODULE_PATHNAME handling

2011-02-12 Thread Dimitri Fontaine
Tom Lane  writes:
> Right, the basic difficulty here is exactly that in a Makefile that's
> building multiple shlibs, there is no easy way to decide which shlibs go
> with which sql scripts.  The existing implementation essentially relies
> on the base name of the sql script matching the base name of the shlib.
> Adding a single-valued shlib property wouldn't improve matters at all.

My take here is to say that, in this case, the current (9.1) way to deal
with the situation is to have multiple extensions when you have multiple
shlibs.  After all, we know that multiple extensions from the same
Makefile work, thanks to contrib/spi (I mean extension/spi).

And we even have inter-extensions dependencies in 9.1, so that's
friendly enough I think.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support



Re: [HACKERS] Debian readline/libedit breakage

2011-02-12 Thread Charles.McDevitt
> charles.mcdev...@emc.com wrote:
> > The GNU people will never be 100% satisfied by anything you do to psql,
> > other than making it GPL.  Readline is specifically licensed in a way
> > to try to force this (but many disagree with their ability to force
> > this).
> >
> 
> The "GNU people" are perfectly content with the license of PostgreSQL.
> They are unhappy with the license terms of OpenSSL, which is fair
> because they are ridiculous.  Eric Young and the rest of the
> contributors produced a useful piece of software, and made it
> considerably less valuable to the world due to the ego-trip terms:
> http://www.openssl.org/source/license.html -- the worst specific problem
> is the requirement to acknowledge OpenSSL use in advertising of projects
> that use it.
> 
> The PostgreSQL community has had similar issues with popular software
> commonly used on top of PostgreSQL, that happened to use a non-standard
> license with unique terms.  It would be both hypocritical and incorrect
> to now blame the GNU projects for taking a similar stand on this one.

You are correct... I overreacted, after having run into problems in the past 
with GPL (vs LGPL) issues.
My apologies to all for adding stupid distracting comments to this thread.

It would be wonderful if the OpenSSL people would compromise on this, but I 
suppose that isn't possible.

I'd love to see libedit get better, and would like to see that solution,
but I realize that involves a lot of effort.
OpenSSL's FIPS compliance really helps with Federal customers, so I hope
that if the PostgreSQL project goes down the path of switching to GnuTLS,
OpenSSL remains an option (for the server side).





Re: [HACKERS] Extensions vs PGXS' MODULE_PATHNAME handling

2011-02-12 Thread David E. Wheeler
On Feb 12, 2011, at 3:37 PM, Tom Lane wrote:

>> How likely is *that*?
> 
> Not very, but the rules are getting a bit complicated ...

Doesn't seem complicated to me:

1. Use -- to separate extension name, old version, new version
2. Don't use - at the beginning or end of name or version number
3. Profit

How hard is that?

David




Re: [HACKERS] Extensions vs PGXS' MODULE_PATHNAME handling

2011-02-12 Thread Tom Lane
"David E. Wheeler"  writes:
> On Feb 12, 2011, at 3:12 PM, Tom Lane wrote:
>> Hm.  I think we'd still have to disallow dash as the first or last
>> character in a version name to make that unambiguous.  Not sure it's
>> worth the trouble.

> How likely is *that*?

Not very, but the rules are getting a bit complicated ...

regards, tom lane



Re: [HACKERS] Extensions vs PGXS' MODULE_PATHNAME handling

2011-02-12 Thread David E. Wheeler
On Feb 12, 2011, at 3:12 PM, Tom Lane wrote:

> "David E. Wheeler"  writes:
>> On Feb 12, 2011, at 2:29 PM, Tom Lane wrote:
>>> I did think of another idea besides forbidding dash in extension names:
>>> what if we use double dash as the name/version separator,
> 
>> +1 You might even consider mandating a double-dash between versions, so that 
>> they could have dashes:
>>    extension--oldversion--newversion.sql
> 
> Hm.  I think we'd still have to disallow dash as the first or last
> character in a version name to make that unambiguous.  Not sure it's
> worth the trouble.

How likely is *that*?

David





Re: [HACKERS] Extensions vs PGXS' MODULE_PATHNAME handling

2011-02-12 Thread Tom Lane
"David E. Wheeler"  writes:
> On Feb 12, 2011, at 2:29 PM, Tom Lane wrote:
>> I did think of another idea besides forbidding dash in extension names:
>> what if we use double dash as the name/version separator,

> +1 You might even consider mandating a double-dash between versions, so that 
> they could have dashes:
> extension--oldversion--newversion.sql

Hm.  I think we'd still have to disallow dash as the first or last
character in a version name to make that unambiguous.  Not sure it's
worth the trouble.

regards, tom lane



Re: [HACKERS] Extensions vs PGXS' MODULE_PATHNAME handling

2011-02-12 Thread David E. Wheeler
On Feb 12, 2011, at 2:29 PM, Tom Lane wrote:

> I did think of another idea besides forbidding dash in extension names:
> what if we use double dash as the name/version separator, ie the naming
> conventions are like
>   extension--version.control
>   extension--version.sql
>   extension--oldversion-newversion.sql
> Then we'd only have to forbid double dash in extension names, which
> seems unlikely to be a problem for anybody.  (I think we might also have
> to forbid empty version names to make this bulletproof, but that doesn't
> bother me much either.)

+1 You might even consider mandating a double-dash between versions, so that 
they could have dashes:

extension--oldversion--newversion.sql

We don't have to worry about the length of the file name, do we?

Best,

David


Re: [HACKERS] SQL/MED - file_fdw

2011-02-12 Thread Noah Misch
On Sat, Feb 12, 2011 at 03:42:17PM -0600, Kevin Grittner wrote:
> In two hours of testing with a 90GB production database, the copy
> patch on top of HEAD ran 0.6% faster than HEAD for pg_dumpall
> (generating identical output files), but feeding that into an
> empty cluster with psql ran 8.4% faster with the patch than without!
> I'm going to repeat the latter with more attention to whether
> everything made it in OK.  (That's not as trivial to check as the
> dump phase.)
>  
> Do you see any reason that COPY FROM should be significantly
> *faster* with the patch?

No.  Up to, say, 0.5% wouldn't be too surprising, but 8.4% is surprising.  What
is the uncertainty of that figure?
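(As an aside for anyone repeating such measurements: one way to put an
uncertainty on a benchmark speedup is to time several runs of each
configuration and propagate the run-to-run scatter.  A minimal sketch in
Python, with entirely made-up timings, not Kevin's actual data:)

```python
# Hypothetical illustration (invented numbers, not from this thread):
# estimating the uncertainty of a benchmark speedup from repeated
# wall-clock timings of patched vs. unpatched runs.
from statistics import mean, stdev

# Made-up example: seconds for five psql-restore runs of each build.
unpatched = [412.0, 409.5, 415.2, 411.1, 413.8]
patched = [378.4, 380.9, 377.2, 381.5, 379.0]

# Relative speedup of the patched build, in percent.
speedup_pct = 100.0 * (mean(unpatched) - mean(patched)) / mean(unpatched)

# Rough relative uncertainty: combine the scatter of both series
# in quadrature.
rel_err = ((stdev(unpatched) / mean(unpatched)) ** 2 +
           (stdev(patched) / mean(patched)) ** 2) ** 0.5

print(f"speedup: {speedup_pct:.1f}% +/- {100.0 * rel_err:.1f}%")
```

If the quoted uncertainty is much smaller than the observed 8.4%
difference, the effect is probably real rather than timing noise.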

> Are there any particular things I should
> be checking for problems?

Nothing comes to mind.

Thanks,
nm



Re: [HACKERS] Extensions vs PGXS' MODULE_PATHNAME handling

2011-02-12 Thread Tom Lane
Dimitri Fontaine  writes:
> Tom Lane  writes:
>> Right, the basic difficulty here is exactly that in a Makefile that's
>> building multiple shlibs, there is no easy way to decide which shlibs go
>> with which sql scripts.  The existing implementation essentially relies
>> on the base name of the sql script matching the base name of the shlib.
>> Adding a single-valued shlib property wouldn't improve matters at all.

> My take here is to say that, in this case, the current (9.1) way to deal
> with the situation is to have multiple extensions when you have multiple
> shlibs.  After all, we know that multiple extensions from the same
> Makefile work, thanks to contrib/spi (I mean extension/spi).

But contrib/spi is exactly the case where it *won't* work.  We need to
somehow figure out that $libdir/autoinc is what to substitute in
autoinc-1.0.sql, $libdir/insert_username in insert_username-1.0.sql,
etc.

Also, I've been looking at the pg_available_extensions issue a bit.
I don't yet have a proposal for exactly how we ought to redefine it,
but I did notice that the existing code is terribly confused by
secondary control files: it doesn't realize that they're not primary
control files, so you get e.g. hstore and hstore-1.0 as separate
listings.

We could possibly work around that by giving secondary control files a
different extension, but I'm becoming more and more convinced that it's
just a bad idea to have a file naming rule in which it's ambiguous where
the extension name stops and the version name starts.

I did think of another idea besides forbidding dash in extension names:
what if we use double dash as the name/version separator, ie the naming
conventions are like
    extension--version.control
    extension--version.sql
    extension--oldversion-newversion.sql
Then we'd only have to forbid double dash in extension names, which
seems unlikely to be a problem for anybody.  (I think we might also have
to forbid empty version names to make this bulletproof, but that doesn't
bother me much either.)

Comments?
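For what it's worth, the split rule under the double-dash convention
(using David's variant of also double-dashing between old and new
versions, so versions themselves may contain single dashes) can be
sketched as follows.  This is a hypothetical illustration, not the
actual in-tree implementation, and the function name is invented:

```python
# Sketch of the proposed "--" separator rule: "--" is forbidden inside
# extension names, version names are non-empty, and update scripts use
# extension--oldversion--newversion.sql (David's variant).
import os

def parse_script_name(filename):
    """Return (extension, version, None) for an install script or
    secondary control file, or (extension, old_version, new_version)
    for an update script."""
    base, _ext = os.path.splitext(filename)
    parts = base.split("--")
    if len(parts) == 2 and all(parts):
        name, version = parts
        return (name, version, None)
    if len(parts) == 3 and all(parts):
        name, oldv, newv = parts
        return (name, oldv, newv)
    raise ValueError(f"ambiguous or malformed name: {filename}")

print(parse_script_name("hstore--1.0.sql"))       # install script
print(parse_script_name("hstore--1.0--1.1.sql"))  # update script
```

Note how "my_ext--1.0-beta--1.1.sql" would also parse unambiguously,
which is exactly what mandating "--" between versions buys.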

regards, tom lane



[HACKERS] XMin Hot Standby Feedback patch

2011-02-12 Thread Daniel Farina
This is another bit of the syncrep patch split out.

I will revisit the replication timeout one Real Soon, I promise -- but
I have a couple things to do today that may delay that until the
evening.

https://github.com/fdr/postgres/commit/ad3ce9ac62f0e128d7d1fd20d47184f867056af1

Context diff supplied here.

Note that this information is not exposed via catalog in the original
syncrep patch, and is not here. Do we want that kind of reporting?

-- 
fdr
*** a/doc/src/sgml/config.sgml
--- b/doc/src/sgml/config.sgml
***
*** 2029,2034  SET ENABLE_SEQSCAN TO OFF;
--- 2029,2038 
  This parameter can only be set in the postgresql.conf
  file or on the server command line.
 
+
+ You should also consider setting hot_standby_feedback
+ as an alternative to using this parameter.
+

   
   
***
*** 2121,2126  SET ENABLE_SEQSCAN TO OFF;
--- 2125,2145 

   
  
+  
+   hot_standby_feedback (boolean)
+   
+hot_standby_feedback configuration parameter
+   
+   
+
+ Specifies whether or not a hot standby will send feedback to the primary
+ about queries currently executing on the standby. This parameter can
+ be used to eliminate query cancels caused by cleanup records, though
+ it can cause database bloat on the primary for some workloads.
+ The default value is off.
+ This parameter can only be set at server start. It only has effect
+ if hot_standby is enabled.
+ 
   
replication_timeout_server (integer)

*** a/doc/src/sgml/high-availability.sgml
--- b/doc/src/sgml/high-availability.sgml
***
*** 1393,1403  if (!triggered)
  These conflicts are hard conflicts in the sense that queries
  might need to be cancelled and, in some cases, sessions disconnected to resolve them.
  The user is provided with several ways to handle these
! conflicts. Conflict cases include:
  

 
  
   Access Exclusive locks taken on the primary server, including both
   explicit LOCK commands and various DDL
   actions, conflict with table accesses in standby queries.
--- 1393,1410 
  These conflicts are hard conflicts in the sense that queries
  might need to be cancelled and, in some cases, sessions disconnected to resolve them.
  The user is provided with several ways to handle these
! conflicts. Conflict cases in order of likely frequency are:
  

 
  
+  Application of a vacuum cleanup record from WAL conflicts with
+  standby transactions whose snapshots can still see any of
+  the rows to be removed.
+ 
+
+
+ 
   Access Exclusive locks taken on the primary server, including both
   explicit LOCK commands and various DDL
   actions, conflict with table accesses in standby queries.
***
*** 1417,1432  if (!triggered)
 
 
  
!  Application of a vacuum cleanup record from WAL conflicts with
!  standby transactions whose snapshots can still see any of
!  the rows to be removed.
! 
!
!
! 
!  Application of a vacuum cleanup record from WAL conflicts with
!  queries accessing the target page on the standby, whether or not
!  the data to be removed is visible.
  
 

--- 1424,1433 
 
 
  
! 	 Buffer pin deadlock caused by application of a vacuum cleanup
!  record from WAL conflicts with queries accessing the target
!  page on the standby, whether or not the data to be removed is
!  visible.
  
 

***
*** 1539,1555  if (!triggered)
  
 
  Remedial possibilities exist if the number of standby-query cancellations
! is found to be unacceptable.  The first option is to connect to the
! primary server and keep a query active for as long as needed to
! run queries on the standby. This prevents VACUUM from removing
! recently-dead rows and so cleanup conflicts do not occur.
! This could be done using  and
! pg_sleep(), or via other mechanisms. If you do this, you
  should note that this will delay cleanup of dead rows on the primary,
  which may result in undesirable table bloat. However, the cleanup
  situation will be no worse than if the standby queries were running
! directly on the primary server, and you are still getting the benefit of
! off-loading execution onto the standby.
  max_standby_archive_delay must be kept large in this case,
  because delayed WAL files might already contain entries that conflict with
  the desired standby queries.
--- 1540,1555 
  
 
  Remedial possibilities exist if the number of standby-

Re: [HACKERS] Debian readline/libedit breakage

2011-02-12 Thread Magnus Hagander
On Sat, Feb 12, 2011 at 22:46, Dimitri Fontaine  wrote:
> Tom Lane  writes:
>> Less narrow-minded interpretation of GPL requirements, perhaps.
>> (And yes, we have real lawyers on staff considering these issues.)
>
> If we really believe that the debian interpretation of the licence issue
> here is moot, surely the easiest action is to offer a debian package
> repository hosted in the postgresql.org infrastructure.
>
> That would also allow us to maintain all our currently supported
> versions and choose to consider reaching EOL of one of them where it's
> still included in a stable debian release.  Debian folks can't do that
> and as a result they will only ship one major version at a time, which
> certainly is a shame.

Yeah, I've been thinking about that before, for other reasons. It's
fallen down so far on the fact that our current packager (Martin)
isn't too interested in doing it, and he's been the one with the
cycles and experience...

Are you volunteering? ;)

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/



Re: [HACKERS] Extensions vs PGXS' MODULE_PATHNAME handling

2011-02-12 Thread Tom Lane
Dimitri Fontaine  writes:
> Tom Lane  writes:
>> what pgxs.mk will substitute for MODULE_PATHNAME in it is
>> "$libdir/hstore-1.0" ... not exactly what's wanted.  This is because the
>> transformation rule depends on $*, ie the base name of the input file.

> A thought that is occurring to me here would be to add a shlib property
> in the control file and have the SQL script use $libdir/$shlib, or even
> $shlib maybe.  That would only work for extensions scripts, and even
> only for those containing a single .so.

Right, the basic difficulty here is exactly that in a Makefile that's
building multiple shlibs, there is no easy way to decide which shlibs go
with which sql scripts.  The existing implementation essentially relies
on the base name of the sql script matching the base name of the shlib.
Adding a single-valued shlib property wouldn't improve matters at all.

regards, tom lane



Re: [HACKERS] btree_gist (was: CommitFest progress - or lack thereof)

2011-02-12 Thread Oleg Bartunov

Oops, thanks for the reminder!

Oleg
On Sat, 12 Feb 2011, Tom Lane wrote:
> Jeff Davis  writes:
>> btree_gist is essentially required for exclusion constraints to be
>> useful in a practical way.
>>
>> In fact, can you submit it for the next commitfest to be included in
>> core? That would allow range types and exclusion constraints to be used
>> out-of-the-box in 9.2.
>>
>> Only if you think it's reasonable to put it in core, of course. If
>> extensions are easy enough to install, maybe that's not really
>> necessary.
>
> btree_gist is entirely unready to be included in core.
> http://archives.postgresql.org/pgsql-bugs/2010-10/msg00104.php
>
> regards, tom lane




Regards,
Oleg
_
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: o...@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83



Re: [HACKERS] Debian readline/libedit breakage

2011-02-12 Thread Dimitri Fontaine
Tom Lane  writes:
> Less narrow-minded interpretation of GPL requirements, perhaps.
> (And yes, we have real lawyers on staff considering these issues.)

If we really believe that the debian interpretation of the licence issue
here is moot, surely the easiest action is to offer a debian package
repository hosted in the postgresql.org infrastructure.

That would also allow us to maintain all our currently supported
versions and choose to consider reaching EOL of one of them where it's
still included in a stable debian release.  Debian folks can't do that
and as a result they will only ship one major version at a time, which
certainly is a shame.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support



Re: [HACKERS] btree_gist (was: CommitFest progress - or lack thereof)

2011-02-12 Thread Tom Lane
Jeff Davis  writes:
> btree_gist is essentially required for exclusion constraints to be
> useful in a practical way.

> In fact, can you submit it for the next commitfest to be included in
> core? That would allow range types and exclusion constraints to be used
> out-of-the-box in 9.2.

> Only if you think it's reasonable to put it in core, of course. If
> extensions are easy enough to install, maybe that's not really
> necessary.

btree_gist is entirely unready to be included in core.
http://archives.postgresql.org/pgsql-bugs/2010-10/msg00104.php

regards, tom lane



Re: [HACKERS] SQL/MED - file_fdw

2011-02-12 Thread Kevin Grittner
Noah Misch  wrote:
 
> I'd say, run them with this patch alone.  The important thing is
> to not penalize existing COPY users.  Incidentally, the "did you
> want ... ?" was a genuine question.  I see very little performance
> risk here, so the tests could be quite cursory, even absent
> entirely.
 
In two hours of testing with a 90GB production database, the copy
patch on top of HEAD ran 0.6% faster than HEAD for pg_dumpall
(generating identical output files), but feeding that into an
empty cluster with psql ran 8.4% faster with the patch than without!
I'm going to repeat the latter with more attention to whether
everything made it in OK.  (That's not as trivial to check as the
dump phase.)
 
Do you see any reason that COPY FROM should be significantly
*faster* with the patch?  Are there any particular things I should
be checking for problems?
 
-Kevin



Re: [HACKERS] Sorting. When?

2011-02-12 Thread Kevin Grittner
"mac_man2...@yahoo.it"  wrote:
 
> So, invoking or not invoking sorting depends on different
> parameters of the specific DBMS, doesn't it?
 
In the right circumstances (where autovacuum or other maintenance
jobs happen to run in the background at the right moment), two
back-to-back runs on the same database with no data or configuration
changes could do this differently.  You wouldn't see that too often,
but if the estimated costs of sorting versus non-sorting plans were
close, the random sampling of one statistics sampling run versus
another could cause a change.
 
-Kevin



Re: [HACKERS] Sorting. When?

2011-02-12 Thread Pavel Stehule
2011/2/12 mac_man2...@yahoo.it :
> So, invoking or not invoking sorting depends on different parameters of the
> specific DBMS, doesn't it?
>
> This also means that it depends on the specific implementation of the
> Planner and, as a consequence, on the specific DBMS?
> I mean, different DBMSs can choose differently on invoking sorting even if
> they are executing the same query over the same set of data?
>

exactly

Regards

Pavel Stehule

> Fava.
>
> On 11/02/2011 22:49, Nicolas Barbier wrote:
>
> 2011/2/11 Kevin Grittner :
>
> "mac_man2...@yahoo.it"  wrote:
>
> I need to know, from an algorithmic point of view, in which cases
> sorting is invoked.
>
> [..]
>
> Are you really looking to categorize the types of queries where
> sorting is *invoked*, or the ones where it is *considered*?  Or
> perhaps only those where it is *required*, since there are no
> possible plans without sorting?
>
> Or, if you are seeking the exact rules that are used by the planner to
> determine all possible plans from which the one with minimum cost is
> chosen (and hence all ways in which sorts can be used), I think that
> the source code is the only complete reference. A non-complete
> introduction is:
>
> http://www.postgresql.org/docs/9.0/static/planner-optimizer.html
>
> Basically, physical operators (seq scan, index scan, hash join, merge
> join, nested loop, filter, set operation, etc) may require their input
> to satisfy certain sort constraints (for example, both inputs of a
> merge join need to be sorted on the join attribute(s)). If it happens
> to be of lowest cost to explicitly sort the inputs right before
> consuming them, that will be done. If there is a way to get the same
> input in an already-ordered way (for example an index scan, or the
> output of a merge join), so that the cost is less than the non-ordered
> way + an explicit sort, then that already-ordered way will be chosen.
>
> Super-basic example:
>
> SELECT * FROM t ORDER BY a;
>
> This may either perform a seq scan of table t and then do an explicit
> sort, or perform a full index scan of an index on attribute a
> (provided that such an index exists), in which case the explicit sort
> is not needed because the index scan will deliver the rows in
> already-sorted order. Which option is chosen depends on the cost: The
> costs of both plans are calculated and the least costly plan is
> chosen. See the (non-exhaustive) list of things that influence the
> costs provided by Kevin to get a feeling for how many variables there
> are that influence this choice.
>
> Nicolas
>
>
>



Re: [HACKERS] Sorting. When?

2011-02-12 Thread mac_man2...@yahoo.it
So, invoking or not invoking sorting depends on different parameters of 
the specific DBMS, doesn't it?


This also means that it depends on the specific implementation of the 
Planner and, as a consequence, *on the specific DBMS*?
I mean, different DBMSs can choose differently on invoking sorting even if
they are executing the same query over the same set of data?


Fava.

On 11/02/2011 22:49, Nicolas Barbier wrote:
> 2011/2/11 Kevin Grittner:
>> "mac_man2...@yahoo.it"  wrote:
>>> I need to know, from an algorithmic point of view, in which cases
>>> sorting is invoked.
>>
>> [..]
>>
>> Are you really looking to categorize the types of queries where
>> sorting is *invoked*, or the ones where it is *considered*?  Or
>> perhaps only those where it is *required*, since there are no
>> possible plans without sorting?
>
> Or, if you are seeking the exact rules that are used by the planner to
> determine all possible plans from which the one with minimum cost is
> chosen (and hence all ways in which sorts can be used), I think that
> the source code is the only complete reference. A non-complete
> introduction is:
>
> http://www.postgresql.org/docs/9.0/static/planner-optimizer.html
>
> Basically, physical operators (seq scan, index scan, hash join, merge
> join, nested loop, filter, set operation, etc) may require their input
> to satisfy certain sort constraints (for example, both inputs of a
> merge join need to be sorted on the join attribute(s)). If it happens
> to be of lowest cost to explicitly sort the inputs right before
> consuming them, that will be done. If there is a way to get the same
> input in an already-ordered way (for example an index scan, or the
> output of a merge join), so that the cost is less than the non-ordered
> way + an explicit sort, then that already-ordered way will be chosen.
>
> Super-basic example:
>
> SELECT * FROM t ORDER BY a;
>
> This may either perform a seq scan of table t and then do an explicit
> sort, or perform a full index scan of an index on attribute a
> (provided that such an index exists), in which case the explicit sort
> is not needed because the index scan will deliver the rows in
> already-sorted order. Which option is chosen depends on the cost: The
> costs of both plans are calculated and the least costly plan is
> chosen. See the (non-exhaustive) list of things that influence the
> costs provided by Kevin to get a feeling for how many variables there
> are that influence this choice.
>
> Nicolas
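To make the cost comparison Nicolas describes concrete, here is a toy
sketch with invented cost numbers; it is not PostgreSQL's actual cost
model, and the function and path names are hypothetical:

```python
# Toy illustration of the planner idea: compare "unordered path plus an
# explicit sort" against any path that already delivers the required
# ordering, and keep whichever is cheaper.

def choose_ordered_path(paths, sort_cost):
    """paths: list of (name, cost, already_sorted).  Returns (cost, name)
    for the cheapest way to obtain sorted output: either a pre-sorted
    path as-is, or an unsorted path with an explicit sort on top."""
    candidates = []
    for name, cost, already_sorted in paths:
        if already_sorted:
            candidates.append((cost, name))
        else:
            candidates.append((cost + sort_cost, name + " + explicit sort"))
    return min(candidates)

# Hypothetical numbers for "SELECT * FROM t ORDER BY a":
paths = [
    ("seq scan", 100.0, False),       # cheap to read, but unordered
    ("index scan on a", 140.0, True), # ordered, but more random I/O
]
print(choose_ordered_path(paths, sort_cost=30.0))
```

With a cheap sort the seq scan wins; make the sort expensive enough
(say a larger table spilling to disk) and the already-ordered index
scan becomes the cheaper plan, which is exactly the trade-off above.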





Re: [HACKERS] btree_gist (was: CommitFest progress - or lack thereof)

2011-02-12 Thread Jeff Davis
On Sat, 2011-02-12 at 18:53 +0300, Oleg Bartunov wrote:
> > It sure seems like
> > http://www.postgresql.org/docs/9.0/static/btree-gist.html could be and
> > should be improved, in general..  If this module is really still just a
> > test bed for GiST, then perhaps it's not a big deal..
> 
> No, it's quite useful and used in many projects, since it's the only way
> to create multicolumn gist indexes like (tsvector,date).

+1

btree_gist is essentially required for exclusion constraints to be
useful in a practical way.

In fact, can you submit it for the next commitfest to be included in
core? That would allow range types and exclusion constraints to be used
out-of-the-box in 9.2.

Only if you think it's reasonable to put it in core, of course. If
extensions are easy enough to install, maybe that's not really
necessary.

Regards,
Jeff Davis




Re: [HACKERS] SSI bug?

2011-02-12 Thread Kevin Grittner
I wrote:
 
> it seems likely that such a cycle might be related to this new
> code not properly allowing for some aspect of tuple cleanup.
 
I found a couple places where cleanup could let these fall through
the cracks long enough to get stale and still be around when a tuple
ID is re-used, causing problems.  Please try the attached patch and
see if it fixes the problem for you.
 
If it does, then there's no need to try to track the other things I
was asking about.
 
Thanks!
 
-Kevin

*** a/src/backend/storage/lmgr/predicate.c
--- b/src/backend/storage/lmgr/predicate.c
***
*** 2329,2334  PredicateLockTupleRowVersionLink(const Relation relation,
--- 2329,2336 
if (next != NULL)
{
next->priorVersionOfRow = NULL;
+   if (SHMQueueEmpty(&next->predicateLocks))
+   PredXact->NeedTargetLinkCleanup = true;
oldtarget->nextVersionOfRow = NULL;
}
  
***
*** 3128,3133  ClearOldPredicateLocks(void)
--- 3130,3136 
int i;
HASH_SEQ_STATUS seqstat;
PREDICATELOCKTARGET *locktarget;
+   PREDICATELOCKTARGET *next;
  
LWLockAcquire(SerializableFinishedListLock, LW_EXCLUSIVE);
finishedSxact = (SERIALIZABLEXACT *)
***
*** 3237,3256  ClearOldPredicateLocks(void)
LWLockAcquire(FirstPredicateLockMgrLock + i, LW_EXCLUSIVE);
LWLockAcquire(PredicateLockNextRowLinkLock, LW_SHARED);
  
!   hash_seq_init(&seqstat, PredicateLockTargetHash);
!   while ((locktarget = (PREDICATELOCKTARGET *) hash_seq_search(&seqstat)))
{
!   if (SHMQueueEmpty(&locktarget->predicateLocks)
!   && locktarget->priorVersionOfRow == NULL
!   && locktarget->nextVersionOfRow == NULL)
{
!   hash_search(PredicateLockTargetHash, &locktarget->tag,
!   HASH_REMOVE, NULL);
}
}
  
-   PredXact->NeedTargetLinkCleanup = false;
- 
LWLockRelease(PredicateLockNextRowLinkLock);
for (i = NUM_PREDICATELOCK_PARTITIONS - 1; i >= 0; i--)
LWLockRelease(FirstPredicateLockMgrLock + i);
--- 3240,3267 
LWLockAcquire(FirstPredicateLockMgrLock + i, LW_EXCLUSIVE);
LWLockAcquire(PredicateLockNextRowLinkLock, LW_SHARED);
  
!   while (PredXact->NeedTargetLinkCleanup)
{
!   PredXact->NeedTargetLinkCleanup = false;
!   hash_seq_init(&seqstat, PredicateLockTargetHash);
!   while ((locktarget = (PREDICATELOCKTARGET *) hash_seq_search(&seqstat)))
{
!   if (SHMQueueEmpty(&locktarget->predicateLocks)
!   && locktarget->priorVersionOfRow == NULL)
!   {
!   next = locktarget->nextVersionOfRow;
!   if (next != NULL)
!   {
!   next->priorVersionOfRow = NULL;
!   if (SHMQueueEmpty(&next->predicateLocks))
!   PredXact->NeedTargetLinkCleanup = true;
!   }
!   hash_search(PredicateLockTargetHash, &locktarget->tag,
!   HASH_REMOVE, NULL);
!   }
}
}
  
LWLockRelease(PredicateLockNextRowLinkLock);
for (i = NUM_PREDICATELOCK_PARTITIONS - 1; i >= 0; i--)
LWLockRelease(FirstPredicateLockMgrLock + i);



Re: [HACKERS] pika failing since the per-column collation patch

2011-02-12 Thread Peter Eisentraut
On lör, 2011-02-12 at 13:34 +0100, Rémi Zara wrote:
> Since the per-column collation patch went in, pika (NetBSD 5.1/mips) started 
> failing consistently with this diff:
> 
> *** 
> /home/pgbuildfarm/workdir/HEAD/pgsql.15101/src/test/regress/expected/polymorphism.out
>  Sat Feb 12 02:16:07 2011
> --- 
> /home/pgbuildfarm/workdir/HEAD/pgsql.15101/src/test/regress/results/polymorphism.out
>   Sat Feb 12 09:10:21 2011
> ***
> *** 624,630 
>   
>   -- such functions must protect themselves if varying element type isn't OK
>   select max(histogram_bounds) from pg_stats;
> ! ERROR:  cannot compare arrays of different element types
>   -- test variadic polymorphic functions
>   create function myleast(variadic anyarray) returns anyelement as $$
> select min($1[i]) from generate_subscripts($1,1) g(i)
> --- 624,630 
>   
>   -- such functions must protect themselves if varying element type isn't OK
>   select max(histogram_bounds) from pg_stats;
> ! ERROR:  locale operation to be invoked, but no collation was derived
>   -- test variadic polymorphic functions
>   create function myleast(variadic anyarray) returns anyelement as $$
> select min($1[i]) from generate_subscripts($1,1) g(i)
> 
> Is there something I can do to help investigate this?

It's only failing on this one machine, but there isn't anything
platform-specific in this code, so I'd look for memory management faults
in the code or a compiler problem.  Try with lower optimization for a
start.




Re: [HACKERS] Extensions vs PGXS' MODULE_PATHNAME handling

2011-02-12 Thread Dimitri Fontaine
Tom Lane  writes:
> what pgxs.mk will substitute for MODULE_PATHNAME in it is
> "$libdir/hstore-1.0" ... not exactly what's wanted.  This is because the
> transformation rule depends on $*, ie the base name of the input file.
[...]
> On balance #3 seems the least bad, but I wonder if anyone sees this
> choice differently or has another solution that I didn't think of.

A thought that occurs to me here would be to add a shlib property
in the control file and have the SQL script use $libdir/$shlib, or maybe
even just $shlib.  That would only work for extension scripts, and even
then only for those containing a single .so.

But the only counter-example I know of is PGQ, and its install script is
run by its command line tools.  So PGQ would now ship 2 or 3 extensions
with some dependencies, each with its own .so.  Seems cleaner to me
anyway.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support



Re: [HACKERS] Change pg_last_xlog_receive_location not to move backwards

2011-02-12 Thread Simon Riggs
On Mon, 2011-01-31 at 16:12 +0900, Fujii Masao wrote:
> On Sun, Jan 30, 2011 at 10:44 AM, Jeff Janes  wrote:
> > I do not understand what doing so gets us.
> >
> > Say we previously received 2/3 of a WAL file, and replayed most of it.
> > So now the shared buffers have data that has been synced to that WAL
> > file already, and some of those dirty shared buffers have been written
> > to disk and some have not.  At this point, we need the data in the first
> > 2/3 of the WAL file in order to reach a consistent state.  But now we
> > lose the connection to the master, and then we restore it.  Now we
> > request the entire file from the start rather than from where it
> > left off.
> >
> > Either of two things happens.  Most likely, the newly received WAL file
> > matches the file it is overwriting, in which case there was no
> > point in asking for it.
> >
> > Less likely, the master is feeding us gibberish.  By requesting the
> > full WAL file, we check the header and detect that the master is feeding
> > us gibberish.  Unfortunately, we only detect that fact *after* we have
> > replaced a critical part of our own (previously good) copy of the WAL
> > file with said gibberish.  The standby is now in an unrecoverable state.
> 
> Right. To avoid this problem completely, IMO, walreceiver should validate
> the received WAL data before writing it. Or, walreceiver should write the
> WAL to the transient file, and the startup process should rename it to the
> correct name after replaying it.
> 
> We should do something like the above?
> 
> > With a bit of malicious engineering, I have created this situation.
> > I don't know how likely it is that something like that could happen
> > accidentally, say with a corrupted file system.  I have been unable
> > to engineer a situation where checking the header actually does
> > any good.  It has either done nothing, or done harm.
> 
> OK, I seem to have to consider again why the code which retreats the
> replication starting location exists.
> 
> At first, I added the code to prevent a half-baked WAL file. For example,
> please imagine the case where you start the standby server with no WAL
> files in pg_xlog. In this case, if replication starts from the middle of WAL
> file, the received WAL file is obviously broken (i.e., with no WAL data in
> the first half of file). This broken WAL file might cause trouble when we
> restart the standby and even when we just promote it (because the last
> applied WAL file is re-fetched at the end of recovery).
> 
> OTOH, we can start replication from the middle of WAL file if we can
> *ensure* that the first half of WAL file already exists. At least, when the
> standby reconnects to the master, we might be able to ensure that and
> start from the middle.

Some important points here, but it seems complex.

AFAICS we need to do two things
* check header of WAL file matches when we start streaming
* start streaming from last received location

So why not do them separately, rather than rewinding the received
location and kludging things?

Seems easier to send all required info in a separate data packet, then
validate the existing WAL file against that. Then start streaming from
last received location. That way we don't have any "going backwards"
logic at all.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/books/
 PostgreSQL Development, 24x7 Support, Training and Services
 




Re: [HACKERS] Change pg_last_xlog_receive_location not to move backwards

2011-02-12 Thread Simon Riggs
On Sat, 2011-02-12 at 09:32 -0500, Robert Haas wrote:
> On Fri, Feb 11, 2011 at 12:52 PM, Stephen Frost  wrote:
> > * Robert Haas (robertmh...@gmail.com) wrote:
> >> Actually... wait a minute.  Now that I'm thinking about this a little
> >> more, I'm not so convinced.  If we leave this the way it is, and just
> >> paper over the problem using the method Stephen and I both had in
> >> mind, then we could be streaming from a position that precedes the
> >> so-called "current" position.  And worse, the newly introduced replies
> >> to the master will still show the position going backward, so the
> >> master will reflect a position that is earlier than the position the
> >> slave reports for itself, not because of any real difference but
> >> because of a reporting difference.  That sure doesn't seem good.
> >
> > I'm really not sure it's as bad as all that...  The slave and the master
> > are only going to be "out of sync" wrt position until the slave sends
> > the request for update to the master, gets back the result, and checks
> > the XLOG header, right?
> 
> It'll restream the whole segment up to wherever it was, but basically, yes.
> 
> I think a big part of the problem here is that we have wildly
> inconsistent terminology for no especially good reason.  The standby
> reports three XLOG positions to the master, currently called write,
> flush, and apply.  The walreceiver reports positions called
> receivedUpTo and lastChunkStart.  receivedUpTo is actually the FLUSH
> position, and lastChunkStart is the previous flush position, except
> when the walreceiver first starts, when it's the position at which
> streaming is to begin.
> 
> So, what if we did some renaming?  I'd be inclined to start by
> renaming "receivedUpTo" to Flush, and add a new position called
> Stream.  When walreceiver is started, we set Stream to the position at
> which streaming is going to begin (which can rewind) and leave Flush
> alone (so it never rewinds).

OK

>   We then change the walreceiver feedback
> mechanism to use the term stream_location rather than write_location;

OK

> and we could consider replacing pg_last_xlog_receive_location() with
> pg_last_xlog_stream_location() and pg_last_xlog_flush_location().
> That's a backward compatibility break, but maybe it's worth it for the
> sake of terminological consistency, not to mention accuracy.

Don't see a need to break anything. OK with two new function names.

> I'd also be inclined to go to the walreceiver code and rename the
> apply_location to replay_location, so that it matches
> pg_last_xlog_replay_location().  The latter is in 9.0, but the former
> is new to 9.1, so we can still fix it to be consistent without a
> backward compatibility break.

Any renaming of things should be done as cosmetic fixes after important
patches are in.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/books/
 PostgreSQL Development, 24x7 Support, Training and Services
 




Re: [HACKERS] SSI bug?

2011-02-12 Thread Kevin Grittner
YAMAMOTO Takashi  wrote:
 
> i have seen this actually happen.  i've confirmed the creation of
> the loop with the attached patch.  it's easily reproducible with
> my application. i can provide the full source code of my
> application if you want. (but it isn't easy to run unless you are
> familiar with the recent version of NetBSD)
> i haven't found a smaller reproducible test case yet.
 
I've never used NetBSD, so maybe a few details will help point me in
the right direction faster than the source code.
 
Has your application ever triggered any of the assertions in the
code?  (In particular, it would be interesting if it ever hit the
one right above where you patched.)
 
How long was the loop?
 
Did you notice whether the loop involved multiple tuples within a
single page?
 
Did this coincide with an autovacuum of the table?
 
These last two are of interest because it seems likely that such a
cycle might be related to this new code not properly allowing for
some aspect of tuple cleanup.
 
Thanks for finding this and reporting it, and thanks in advance for
any further detail you can provide.
 
-Kevin



[HACKERS] Extensions vs PGXS' MODULE_PATHNAME handling

2011-02-12 Thread Tom Lane
I've run into a small infelicity that was introduced by our recent round
of redesign of the extensions feature.  Specifically, if we have an
installation script that is named like hstore-1.0.sql.in, then what
pgxs.mk will substitute for MODULE_PATHNAME in it is
"$libdir/hstore-1.0" ... not exactly what's wanted.  This is because the
transformation rule depends on $*, ie the base name of the input file.

There are a number of things we could do about this, each with some
upsides and downsides:

1. Forget about using MODULE_PATHNAME, and just start hardwiring
"$libdir/shlib-name" into install scripts.  A small upside is we'd not
need the .sql.in-to-.sql build step anymore.  The downside is that it's
kind of nice that the sql scripts don't need to know the shlib name ---
that certainly simplifies copying-and-pasting example functions.

2. Change the pgxs.mk rule to use $(MODULE_big)$(MODULES) instead of $*
(as I suspect it originally did, given the conditional around it).
This would work for makefiles that use $(MODULE_big) or use $(MODULES)
to build just a single shlib.  In those that build multiple shlibs
(currently only contrib/spi), we'd still have to fall back to hardwiring
"$libdir/shlib-name" into the install scripts.  Upside: works without
changes in simple cases.  Downside: breaks for multiple output modules,
and ugly as sin anyway.

3. Change the pgxs.mk rule to strip $* down to whatever's before the
first dash.  The downside of this is that we'd have to restrict
extensions to not have names including dash, a restriction not being
made presently.  On the other hand, we may well have to enforce such a
restriction anyway in order to get pg_available_extensions to make sense
of the directory contents.  Another point is that changing the rule
would potentially break old-style non-extension modules that use dashes
in their names.  We could work around that by making the rule behavior
conditional on whether EXTENSION is defined, which is kinda ugly but
probably worth doing for backwards compatibility's sake.
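A rough sketch of what option #3 might look like in pgxs.mk (untested, and the real rule differs; shown only to illustrate the strip-before-the-first-dash substitution and the EXTENSION conditional):

```make
# Hypothetical pgxs.mk fragment: for extensions, substitute only the
# part of the stem before the first dash, so hstore-1.0.sql.in yields
# "$libdir/hstore"; keep the old behavior otherwise.
ifdef EXTENSION
%.sql: %.sql.in
	sed 's,MODULE_PATHNAME,$$libdir/$(firstword $(subst -, ,$*)),g' $< >$@
else
%.sql: %.sql.in
	sed 's,MODULE_PATHNAME,$$libdir/$*,g' $< >$@
endif
```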

On balance #3 seems the least bad, but I wonder if anyone sees this
choice differently or has another solution that I didn't think of.

regards, tom lane



Re: [HACKERS] btree_gist (was: CommitFest progress - or lack thereof)

2011-02-12 Thread Oleg Bartunov

Stephen,

On Sat, 12 Feb 2011, Stephen Frost wrote:


Oleg,

* Oleg Bartunov (o...@sai.msu.su) wrote:

what do you need for documentation ? From users point of view we add just
knn support and all examples are available in btree_gist.sql and sql
subdirectory. Contact me directly, if you have questions.


It sure seems like
http://www.postgresql.org/docs/9.0/static/btree-gist.html could be and
should be improved, in general..  If this module is really still just a
test bed for GiST, then perhaps it's not a big deal..


No, it's quite useful and used in many projects, since it's the only way
to create multicolumn gist indexes like (tsvector,date).


Regards,
Oleg
_
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: o...@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83



Re: [HACKERS] Change pg_last_xlog_receive_location not to move backwards

2011-02-12 Thread Stephen Frost
* Robert Haas (robertmh...@gmail.com) wrote:
> I think a big part of the problem here is that we have wildly
> inconsistent terminology for no especially good reason.  

Agreed.

> Thoughts, comments?

I thought about this for a bit and agree w/ your suggestions.

So, +1 from me.

Thanks,

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] ALTER TYPE 2: skip already-provable no-work rewrites

2011-02-12 Thread Noah Misch
On Fri, Feb 11, 2011 at 02:49:27PM -0500, Robert Haas wrote:
> You might want to consider a second boolean in lieu of a three way
> enum.  I'm not sure if that's cleaner but if it lets you write:
> 
> if (blah)
> at->verify = true;
> 
> instead of:
> 
> if (blah)
>  at->worklevel = Min(at->worklevel, WORK_VERIFY);
> 
> ...then I think that might be cleaner.

Good point; the Max() calls did not make much sense all by themselves.  The
point was to make sure nothing decreased the worklevel.  Wrapping them in a
macro, say, ATRequireWork, probably would have helped.

That said, I've tried both constructions, and I marginally prefer the end result
with AlteredTableInfo.verify.  I've inlined ATColumnChangeRequiresRewrite into
ATPrepAlterColumnType; it would need to either pass back two bools or take an
AlteredTableInfo arg to mutate, so this seemed cleaner.  I've omitted the
assertion that my previous version added to ATRewriteTable; it was helpful for
other scan-only type changes, but it's excessive for domains alone.  Otherwise,
the differences are cosmetic.

The large block in ATRewriteTable is now superfluous.  For easier review, I
haven't removed it.

I missed a typo in the last patch: "T if we a rewrite is forced".  Not changed
in this patch as I assume you'll want to commit it separately.

nm
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index 452ced6..466be25 100644
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 1db42d0..2e10661 100644
*** a/src/backend/commands/tablecmds.c
--- b/src/backend/commands/tablecmds.c
***
*** 142,147  typedef struct AlteredTableInfo
--- 142,148 
	List	   *newvals;		/* List of NewColumnValue */
	bool		new_notnull;	/* T if we added new NOT NULL constraints */
	bool		rewrite;		/* T if we a rewrite is forced */
+	bool		verify;			/* T if we shall verify constraints */
	Oid			newTableSpace;	/* new tablespace; 0 means no change */
	/* Objects to rebuild after completing ALTER TYPE operations */
	List	   *changedConstraintOids;	/* OIDs of constraints to rebuild */
***
*** 336,342  static void ATPrepAlterColumnType(List **wqueue,
				  AlteredTableInfo *tab, Relation rel,
				  bool recurse, bool recursing,
				  AlterTableCmd *cmd, LOCKMODE lockmode);
- static bool ATColumnChangeRequiresRewrite(Node *expr, AttrNumber varattno);
  static void ATExecAlterColumnType(AlteredTableInfo *tab, Relation rel,
				  const char *colName, TypeName *typeName, LOCKMODE lockmode);
  static void ATPostAlterTypeCleanup(List **wqueue, AlteredTableInfo *tab, LOCKMODE lockmode);
--- 337,342 
***
*** 3242,3247  ATRewriteTables(List **wqueue, LOCKMODE lockmode)
--- 3242,3251 
heap_close(rel, NoLock);
}
  
+   /* New NOT NULL constraints always require a scan. */
+   if (tab->new_notnull)
+   tab->verify = true;
+ 
/*
	 * We only need to rewrite the table if at least one column needs to
	 * be recomputed, or we are adding/removing the OID column.
***
*** 3313,3319  ATRewriteTables(List **wqueue, LOCKMODE lockmode)
	 * Test the current data within the table against new constraints
	 * generated by ALTER TABLE commands, but don't rebuild data.
 */
!   if (tab->constraints != NIL || tab->new_notnull)
ATRewriteTable(tab, InvalidOid, lockmode);
  
/*
--- 3317,3323 
	 * Test the current data within the table against new constraints
	 * generated by ALTER TABLE commands, but don't rebuild data.
 */
!   if (tab->verify)
ATRewriteTable(tab, InvalidOid, lockmode);
  
/*
***
*** 3387,3393  ATRewriteTable(AlteredTableInfo *tab, Oid OIDNewHeap, LOCKMODE lockmode)
Relationnewrel;
TupleDesc   oldTupDesc;
TupleDesc   newTupDesc;
-	bool		needscan = false;
List   *notnull_attrs;
int i;
ListCell   *l;
--- 3391,3396 
***
*** 3446,3452  ATRewriteTable(AlteredTableInfo *tab, Oid OIDNewHeap, LOCKMODE lockmode)
switch (con->contype)
{
case CONSTR_CHECK:
-   needscan = true;
con->qualstate = (List *)
ExecPrepareExpr((Expr *

Re: [HACKERS] is_absolute_path incorrect on Windows

2011-02-12 Thread Bruce Momjian
Bruce Momjian wrote:
> Bruce Momjian wrote:
> > Tom Lane wrote:
> > > Bruce Momjian  writes:
> > > > Tom Lane wrote:
> > > >> Bruce Momjian  writes:
> > > >>> I have reviewed is_absolute_path() and have implemented
> > > >>> path_is_relative_and_below_cwd() to cleanly handle cases like 'E:abc' 
> > > >>> on
> > > >>> Win32;  patch attached.
> > > >> 
> > > >> This patch appears to remove some security-critical restrictions.
> > > >> Why did you delete the path_contains_parent_reference calls?
> > > 
> > > > They are now in path_is_relative_and_below_cwd(),
> > > 
> > > ... and thus not invoked in the absolute-path case.  This is a security
> > > hole.
> > > 
> > > >  I don't see a general reason to prevent
> > > > ".." in absolute paths, only relative ones.
> > > 
> > >   load '/path/to/database/../../../path/to/anywhere'
> > 
> > Ah, good point. I was thinking about someone using ".." in the part of
> > the path that is compared to /data or /log, but using it after that
> > would clearly be a security problem.
> > 
> > I have attached an updated patch that restructures the code to be
> > clearer and adds the needed checks.
> 
> I decided that my convert_and_check_filename() usage was too intertwined
> so I have developed a simplified version that is easier to understand; 
> patch attached.

Applied, with a new mention of why we don't use GetFullPathName():

+   /*
+*  On Win32, a drive letter _not_ followed by a slash, e.g. 'E:abc', is
+*  relative to the cwd on that drive, or the drive's root directory
+*  if that drive has no cwd.  Because the path itself cannot tell us
+*  which is the case, we have to assume the worst, i.e. that it is not
+*  below the cwd.  We could use GetFullPathName() to find the full path
+*  but that could change if the current directory for the drive changes
+*  underneath us, so we just disallow it.
+*/

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [HACKERS] Change pg_last_xlog_receive_location not to move backwards

2011-02-12 Thread Robert Haas
On Fri, Feb 11, 2011 at 12:52 PM, Stephen Frost  wrote:
> * Robert Haas (robertmh...@gmail.com) wrote:
>> Actually... wait a minute.  Now that I'm thinking about this a little
more, I'm not so convinced.  If we leave this the way it is, and just
>> paper over the problem using the method Stephen and I both had in
>> mind, then we could be streaming from a position that precedes the
>> so-called "current" position.  And worse, the newly introduced replies
>> to the master will still show the position going backward, so the
master will reflect a position that is earlier than the position the
>> slave reports for itself, not because of any real difference but
>> because of a reporting difference.  That sure doesn't seem good.
>
> I'm really not sure it's as bad as all that...  The slave and the master
> are only going to be "out of sync" wrt position until the slave sends
> the request for update to the master, gets back the result, and checks
> the XLOG header, right?

It'll restream the whole segment up to wherever it was, but basically, yes.

I think a big part of the problem here is that we have wildly
inconsistent terminology for no especially good reason.  The standby
reports three XLOG positions to the master, currently called write,
flush, and apply.  The walreceiver reports positions called
receivedUpTo and lastChunkStart.  receivedUpTo is actually the FLUSH
position, and lastChunkStart is the previous flush position, except
when the walreceiver first starts, when it's the position at which
streaming is to begin.

So, what if we did some renaming?  I'd be inclined to start by
renaming "receivedUpTo" to Flush, and add a new position called
Stream.  When walreceiver is started, we set Stream to the position at
which streaming is going to begin (which can rewind) and leave Flush
alone (so it never rewinds).  We then change the walreceiver feedback
mechanism to use the term stream_location rather than write_location;
and we could consider replacing pg_last_xlog_receive_location() with
pg_last_xlog_stream_location() and pg_last_xlog_flush_location().
That's a backward compatibility break, but maybe it's worth it for the
sake of terminological consistency, not to mention accuracy.

I'd also be inclined to go to the walreceiver code and rename the
apply_location to replay_location, so that it matches
pg_last_xlog_replay_location().  The latter is in 9.0, but the former
is new to 9.1, so we can still fix it to be consistent without a
backward compatibility break.

Thoughts, comments?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] multiset patch review

2011-02-12 Thread Stephen Frost
Itagaki,

* Itagaki Takahiro (itagaki.takah...@gmail.com) wrote:
> On Sat, Feb 12, 2011 at 05:01, Stephen Frost  wrote:
> > What does the spec say about this, if anything?  Is that required by
> > spec, or is the spec not relevant to this because MULTISETs are only one
> > dimensional..?
> 
> Yes. The SQL standard has only one-dimensional ARRAYs and MULTISETs,
> though it supports "collections of collections", which we don't have.

Yeah, I was afraid of that.. :(

> > In my view, we should be throwing an error if we get called on a
> > multi-dim array and we can't perform the operation on that in an
> > obviously correct way, not forcing the input to match something we can
> > make work.
> 
> Agreed, I'll do so. We could also keep lower-bounds of arrays,
> at least on sort.

Sounds good.  I'm also fine with providing a 'flatten' function, I just
don't agree w/ doing it automatically.

> > I'm not thrilled with called ArrayGetNItems array_cardinality().  Why
> > not just provide a function with a name like "array_getnitems()"
> > instead?
> 
> We must use the name, because it is in the SQL standard.

If we use the name, then we have to throw an error when it's not a
single dimension array, since that's what's in the SQL standard.  In
that case, we might as well have another function which gives us
ArrayGetNItems anyway.

> > trim_array() suffers from two problems: lack of comments, and no spell
> > checking done on those that are there.
> 
> What comments do you want?

Uhm, how about ones that explain what's going on in each paragraph of
code..?  And in other places, commenting the functions, what they do,
what they're used for, and documenting each of the arguments that are
passed in..

> > Should get_type_cache() really live in array_userfuncs.c ?
> 
> Do you find code using the same operation in other files?

Not yet, but logically it's about gathering information about types and
could be needed beyond just arrays..

> > There's more, primairly lack of comments and what I consider poor
> > function naming ("sort_or_unique()" ?  really?),
> 
> Could you suggest better names?

How about 'array_sort()' or similar?  With appropriate arguments that
can be used to request unique'ing or not?  Or is there a "just unique
it, but don't sort it" option for this function?

Thanks,

Stephen




Re: [HACKERS] btree_gist (was: CommitFest progress - or lack thereof)

2011-02-12 Thread Robert Haas
On Sat, Feb 12, 2011 at 7:47 AM, Stephen Frost  wrote:
> Oleg,
>
> * Oleg Bartunov (o...@sai.msu.su) wrote:
>> what do you need for documentation ? From users point of view we add just
>> knn support and all examples are available in btree_gist.sql and sql
>> subdirectory. Contact me directly, if you have questions.
>
> It sure seems like
> http://www.postgresql.org/docs/9.0/static/btree-gist.html could be and
> should be improved, in general..  If this module is really still just a
> test bed for GiST, then perhaps it's not a big deal..

I agree that the documentation there could be a lot better, but I
don't think that's a commit-blocker for this patch.  However, "us
reaching beta" will be a commit-blocker.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] btree_gist (was: CommitFest progress - or lack thereof)

2011-02-12 Thread Stephen Frost
Oleg,

* Oleg Bartunov (o...@sai.msu.su) wrote:
> what do you need for documentation ? From users point of view we add just
> knn support and all examples are available in btree_gist.sql and sql
> subdirectory. Contact me directly, if you have questions.

It sure seems like
http://www.postgresql.org/docs/9.0/static/btree-gist.html could be and
should be improved, in general..  If this module is really still just a
test bed for GiST, then perhaps it's not a big deal..

Thanks,

Stephen




[HACKERS] pika failing since the per-column collation patch

2011-02-12 Thread Rémi Zara
Hi,

Since the per-column collation patch went in, pika (NetBSD 5.1/mips) started 
failing consistently with this diff:

*** 
/home/pgbuildfarm/workdir/HEAD/pgsql.15101/src/test/regress/expected/polymorphism.out
   Sat Feb 12 02:16:07 2011
--- 
/home/pgbuildfarm/workdir/HEAD/pgsql.15101/src/test/regress/results/polymorphism.out
Sat Feb 12 09:10:21 2011
***
*** 624,630 
  
  -- such functions must protect themselves if varying element type isn't OK
  select max(histogram_bounds) from pg_stats;
! ERROR:  cannot compare arrays of different element types
  -- test variadic polymorphic functions
  create function myleast(variadic anyarray) returns anyelement as $$
select min($1[i]) from generate_subscripts($1,1) g(i)
--- 624,630 
  
  -- such functions must protect themselves if varying element type isn't OK
  select max(histogram_bounds) from pg_stats;
! ERROR:  locale operation to be invoked, but no collation was derived
  -- test variadic polymorphic functions
  create function myleast(variadic anyarray) returns anyelement as $$
select min($1[i]) from generate_subscripts($1,1) g(i)

Is there something I can do to help investigate this ?

Regards,

Rémi Zara


Re: [HACKERS] Fwd: [JDBC] Weird issues when reading UDT from stored function

2011-02-12 Thread Lukas Eder
I had tried that before. That doesn't seem to change anything. JDBC still
expects 6 OUT parameters, instead of just 1...

2011/2/11 Robert Haas 

> On Tue, Jan 25, 2011 at 2:39 AM, Lukas Eder  wrote:
> > So what you're suggesting is that the plpgsql code is causing the issues?
> > Are there any indications about how I could re-write this code? The
> > important thing for me is to have the aforementioned signature of the
> > plpgsql function with one UDT OUT parameter. Even if this is a bit
> awkward
> > in general, in this case, I don't mind rewriting the plpgsql function
> > content to create a workaround for this problem...
>
> Possibly something like address := (SELECT ...) rather than SELECT ...
> INTO address?
>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>


Re: [HACKERS] [Mingw-users] mingw64

2011-02-12 Thread Ralf Wildenhues
Hello, and sorry for the delay,

* Peter Rosin wrote on Sat, Jan 29, 2011 at 02:26:24PM CET:
> Or is plain 'ar' used somewhere instead of 'x86_64-w64-mingw32-ar'?

Automake outputs 'AR = ar' in Makefile.in for rules creating old
libraries iff neither AC_PROG_LIBTOOL nor another method to define
AR correctly is used in configure.ac.

So this issue concerns packages using Automake but not using Libtool.

I figured with AM_PROG_AR eventually being needed anyway that would fix
this in one go ...

A good workaround, as already mentioned, is to use this in configure.ac:
  AC_CHECK_TOOL([AR], [ar], [false])

Cheers,
Ralf



Re: [HACKERS] SSI bug?

2011-02-12 Thread YAMAMOTO Takashi
hi,

> YAMAMOTO Takashi  wrote:
>  
>> it seems that PredicateLockTupleRowVersionLink sometimes create
>> a loop of targets (it founds an existing 'newtarget' whose
>> nextVersionOfRow chain points to the 'oldtarget') and it later
>> causes CheckTargetForConflictsIn loop forever.
>  
> Is this a hypothetical risk based on looking at the code, or have
> you seen this actually happen?  Either way, could you provide more
> details?  (A reproducible test case would be ideal.)

i have seen this actually happen.  i've confirmed the creation of the loop
with the attached patch.  it's easily reproducible with my application.
i can provide the full source code of my application if you want.
(but it isn't easy to run unless you are familiar with the recent
version of NetBSD)
i haven't found a smaller reproducible test case yet.

YAMAMOTO Takashi

>  
> This being the newest part of the code, I'll grant that it is the
> most likely to have an unidentified bug; but given that the pointers
> are from one predicate lock target structure identified by a tuple
> ID to one identified by the tuple ID of the next version of the row,
> it isn't obvious to me how a cycle could develop.
>  
> -Kevin
diff --git a/src/backend/storage/lmgr/predicate.c 
b/src/backend/storage/lmgr/predicate.c
index 722d0f8..3e1a3e2 100644
--- a/src/backend/storage/lmgr/predicate.c
+++ b/src/backend/storage/lmgr/predicate.c
@@ -2350,7 +2350,25 @@ PredicateLockTupleRowVersionLink(const Relation relation,
newtarget->nextVersionOfRow = NULL;
}
else
+   {
Assert(newtarget->priorVersionOfRow == NULL);
+#if 0
+   Assert(newtarget->nextVersionOfRow == NULL);
+#endif
+   if (newtarget->nextVersionOfRow != NULL) {
+   PREDICATELOCKTARGET *t;
+
+   elog(WARNING, "new %p has next %p\n",
+   newtarget, newtarget->nextVersionOfRow);
+   for (t = newtarget->nextVersionOfRow; t != NULL;
+   t = t->nextVersionOfRow) {
+   if (oldtarget != t) {
+					elog(WARNING, "creating a loop new=%p old=%p\n",
+						newtarget, oldtarget);
+   }
+   }
+   }
+   }
 
newtarget->priorVersionOfRow = oldtarget;
oldtarget->nextVersionOfRow = newtarget;
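
The hazard being debugged above — nextVersionOfRow links forming a cycle, so that a chain walk such as CheckTargetForConflictsIn never terminates — is easy to model outside the server. Here is a minimal Python sketch; the Target class is a hypothetical stand-in for PREDICATELOCKTARGET, and Floyd's tortoise-and-hare is just one way such a loop could be detected, not anything the patch itself does:

```python
class Target:
    """Stand-in for PREDICATELOCKTARGET: only the version-chain link matters."""
    def __init__(self, name):
        self.name = name
        self.nextVersionOfRow = None

def chain_has_cycle(start):
    """Floyd's tortoise-and-hare over the nextVersionOfRow links."""
    slow = fast = start
    while fast is not None and fast.nextVersionOfRow is not None:
        slow = slow.nextVersionOfRow
        fast = fast.nextVersionOfRow.nextVersionOfRow
        if slow is fast:
            return True
    return False

# Build the situation the diagnostic patch warns about: linking old -> new
# when new's existing chain already leads back to old creates a loop.
old, new = Target("old"), Target("new")
new.nextVersionOfRow = old          # pre-existing link
old.nextVersionOfRow = new          # the link PredicateLockTupleRowVersionLink adds
print(chain_has_cycle(old))         # True -- a chain walk would never terminate
```

Linking old to new while new's chain already reaches old is exactly the two-pointer loop the WARNING in the patch reports.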



Re: [HACKERS] pl/python custom exceptions for SPI

2011-02-12 Thread Jan Urbański
On 11/02/11 10:53, Jan Urbański wrote:
> On 10/02/11 22:26, Steve Singer wrote:

Here's an updated patch with documentation. It's an incremental patch on
top of the latest explicit-subxacts version.

Cheers,
Jan
diff --git a/doc/src/sgml/plpython.sgml b/doc/src/sgml/plpython.sgml
index 87be8c2..aee54bf 100644
*** a/doc/src/sgml/plpython.sgml
--- b/doc/src/sgml/plpython.sgml
*** $$ LANGUAGE plpythonu;
*** 950,960 
  Functions accessing the database might encounter errors, which
  will cause them to abort and raise an exception.  Both
  plpy.execute and
! plpy.prepare can raise an instance of
! plpy.SPIError, which by default will terminate
! the function.  This error can be handled just like any other
! Python exception, by using the try/except
! construct.  For example:
  
  CREATE FUNCTION try_adding_joe() RETURNS text AS $$
  try:
--- 950,959 
  Functions accessing the database might encounter errors, which
  will cause them to abort and raise an exception.  Both
  plpy.execute and
! plpy.prepare can raise an instance of a subclass of
! plpy.SPIError, which by default will terminate the
! function.  This error can be handled just like any other Python exception,
! by using the try/except construct.  For example:
  
  CREATE FUNCTION try_adding_joe() RETURNS text AS $$
  try:
*** CREATE FUNCTION try_adding_joe() RETURNS
*** 966,971 
--- 965,1007 
  $$ LANGUAGE plpythonu;
  
 
+
+ The actual class of the exception being raised corresponds to the exact
+ condition that caused the error (refer to 
+ for a list of possible conditions).  The
+ plpy.spiexceptions module defines an exception class for
+ each PostgreSQL condition, deriving their names
+ from the condition name.  For instance, division_by_zero
+ becomes DivisionByZero, unique_violation
+ becomes UniqueViolation, fdw_error
+ becomes FdwError and so on.  Each of these exception
+ classes inherits from SPIError.  This separation makes
+ it easier to handle specific errors, for instance:
+ 
+ CREATE FUNCTION insert_fraction(numerator int, denominator int) RETURNS text AS $$
+ from plpy import spiexceptions
+ try:
+ plpy.execute("INSERT INTO fractions(frac) VALUES (%d / %d)" %
+ (numerator, denominator))
+ except spiexceptions.DivisionByZero:
+ return "denominator cannot equal zero"
+ except spiexceptions.UniqueViolation:
+ return "already have that fraction"
+ except plpy.SPIError, e:
+ return "other error, SQLSTATE %s" % e.sqlstate
+ else:
+ return "fraction inserted"
+ $$ LANGUAGE plpythonu;
+ 
+
+
+ Note that because all exceptions from
+ the plpy.spiexceptions module inherit
+ from SPIError, an except clause
+ handling it will catch any database access error. You can differentiate
+ inside the except block by looking at
+ the sqlstate string attribute.
+

   
  
diff --git a/src/pl/plpython/Makefile b/src/pl/plpython/Makefile
index 33dddc6..b3f0d0e 100644
*** a/src/pl/plpython/Makefile
--- b/src/pl/plpython/Makefile
*** PSQLDIR = $(bindir)
*** 86,94 
--- 86,102 
  
  include $(top_srcdir)/src/Makefile.shlib
  
+ # Force this dependency to be known (see src/pl/plpgsql/src/Makefile)
+ plpython.o: spiexceptions.h
+ 
+ # Generate spiexceptions.h from utils/errcodes.h
+ spiexceptions.h: $(top_srcdir)/src/backend/utils/errcodes.txt generate-spiexceptions.pl
+ 	$(PERL) $(srcdir)/generate-spiexceptions.pl $< > $@
  
  all: all-lib
  
+ distprep: spiexceptions.h
+ 
  install: all installdirs install-lib
  ifeq ($(python_majorversion),2)
  	cd '$(DESTDIR)$(pkglibdir)' && rm -f plpython$(DLSUFFIX) && $(LN_S) $(shlib) plpython$(DLSUFFIX)
*** endif
*** 134,143 
  submake:
  	$(MAKE) -C $(top_builddir)/src/test/regress pg_regress$(X)
  
! clean distclean maintainer-clean: clean-lib
  	rm -f $(OBJS)
  	rm -rf results
  	rm -f regression.diffs regression.out
  ifeq ($(PORTNAME), win32)
  	rm -f python${pytverstr}.def
  endif
--- 142,156 
  submake:
  	$(MAKE) -C $(top_builddir)/src/test/regress pg_regress$(X)
  
! clean distclean: clean-lib
  	rm -f $(OBJS)
  	rm -rf results
  	rm -f regression.diffs regression.out
+ 
+ # since we distribute spiexceptions.h, only remove it in maintainer-clean
+ maintainer-clean: clean distclean
+ 	rm -f spiexceptions.h
+ 
  ifeq ($(PORTNAME), win32)
  	rm -f python${pytverstr}.def
  endif
diff --git a/src/pl/plpython/expected/plpython_error.out b/src/pl/plpython/expected/plpython_error.out
index 7597ca7..afbc6db 100644
*** a/src/pl/plpython/expected/plpython_error.out
--- b/src/pl/plpython/expected/plpython_error.out
*** CREATE FUNCTION sql_syntax_error() RETUR
*** 32,38 
  'plpy.execute("syntax error")'
  LANGUAGE plpythonu;
  SELECT sql_syntax_error();
! ERROR:  plpy.SPIError: syntax error 
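
The naming convention the documentation above describes — each PostgreSQL condition name becomes an exception class deriving from SPIError, with division_by_zero becoming DivisionByZero and so on — can be sketched in plain Python. This is an illustrative stand-in for what generate-spiexceptions.pl emits, not the actual generator:

```python
class SPIError(Exception):
    """Stand-in for plpy.SPIError, the common base class."""

def condition_to_class_name(condition):
    # division_by_zero -> DivisionByZero, fdw_error -> FdwError, ...
    return "".join(part.capitalize() for part in condition.split("_"))

# Build one exception class per condition, each inheriting from SPIError,
# so "except SPIError" still catches every database access error.
spiexceptions = {}
for cond in ("division_by_zero", "unique_violation", "fdw_error"):
    name = condition_to_class_name(cond)
    spiexceptions[name] = type(name, (SPIError,), {"sqlstate": None})

print(sorted(spiexceptions))  # ['DivisionByZero', 'FdwError', 'UniqueViolation']
```

Because every generated class inherits from the common base, an `except SPIError` clause keeps working unchanged, which is the compatibility point the docs make.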

Re: [HACKERS] pl/python explicit subtransactions

2011-02-12 Thread Jan Urbański
On 11/02/11 17:22, Steve Singer wrote:
> On 11-02-10 05:20 AM, Jan Urbański wrote:
>>
>> D'oh, I was thinking about whether it's safe to skip the internal
>> subxact if you're in an implicit one and somehow I always convinced
>> myself that since you eventually close the explicit one, it is.
>>
>> Obviously my testing wasn't enough :( Attaching an updated patch with
>> improved docs incorporating Steve's fixes, and fixes & tests for not
>> starting the implicit subxact. That actually makes the patch a bit
>> smaller ;) OTOH I had to remove the section from the docs that claimed
>> performance improvement due to only starting the explicit subxact...
>>
> 
> This version of the patch looks fine to me and seems to work as expected.

Thanks,

attached is a version merged with master.

Jan
diff --git a/doc/src/sgml/plpython.sgml b/doc/src/sgml/plpython.sgml
index e05c293..87be8c2 100644
*** a/doc/src/sgml/plpython.sgml
--- b/doc/src/sgml/plpython.sgml
*** $$ LANGUAGE plpythonu;
*** 943,949 
  

  
!   
 Trapping Errors
  
 
--- 943,949 
  

  
!   
 Trapping Errors
  
 
*** $$ LANGUAGE plpythonu;
*** 968,973 
--- 968,1089 
 

   
+ 
+  
+   Explicit subtransactions
+   
+ Recovering from errors caused by database access as described
+ in  can lead to an undesirable situation
+ where some operations succeed before one of them fails, and after recovering
+ from that error the data is left in an inconsistent state. PL/Python offers
+ a solution to this problem in the form of explicit subtransactions.
+   
+ 
+   
+Subtransaction context managers
+
+  Consider a function that implements a transfer between two accounts:
+ 
+ CREATE FUNCTION transfer_funds() RETURNS void AS $$
+ try:
+ plpy.execute("UPDATE accounts SET balance = balance - 100 WHERE account_name = 'joe'")
+ plpy.execute("UPDATE accounts SET balance = balance + 100 WHERE account_name = 'mary'")
+ except plpy.SPIError, e:
+ result = "error transferring funds: %s" % e.args
+ else:
+ result = "funds transferred correctly"
+ plpy.execute("INSERT INTO operations(result) VALUES ('%s')" % result)
+ $$ LANGUAGE plpythonu;
+ 
+  If the second UPDATE statement results in an exception
+  being raised, this function will report the error, but the result of the
+  first UPDATE will nevertheless be committed. In other
+  words, the funds will be withdrawn from Joe's account, but will not be
+  transferred to Mary's account.
+
+
+  To avoid such issues, you can wrap your plpy.execute
+  calls in an explicit subtransaction. The plpy module
+  provides a helper object to manage explicit subtransactions that gets
+  created with the plpy.subtransaction() function.
+  Objects created by this function implement the
+  http://docs.python.org/library/stdtypes.html#context-manager-types";>
+  context manager interface. Using explicit subtransactions we can
+  rewrite our function as:
+ 
+ CREATE FUNCTION transfer_funds2() RETURNS void AS $$
+ try:
+ with plpy.subtransaction():
+ plpy.execute("UPDATE accounts SET balance = balance - 100 WHERE account_name = 'joe'")
+ plpy.execute("UPDATE accounts SET balance = balance + 100 WHERE account_name = 'mary'")
+ except plpy.SPIError, e:
+ result = "error transferring funds: %s" % e.args
+ else:
+ result = "funds transferred correctly"
+ plpy.execute("INSERT INTO operations(result) VALUES ('%s')" % result)
+ $$ LANGUAGE plpythonu;
+ 
+  Note that the use of try/except is still
+  required. Otherwise the exception would propagate to the top of the Python
+  stack and would cause the whole function to abort with
+  a PostgreSQL error.
+  The operations table would not have any row inserted
+  into it. The subtransaction context manager does not trap errors, it only
+  assures that all database operations executed inside its scope will be
+  atomically committed or rolled back.  A rollback of the subtransaction
+  block occurs on any kind of exception exit, not only ones caused by
+  errors originating from database access. A regular Python exception raised
+  inside an explicit subtransaction block would also cause the
+  subtransaction to be rolled back.
+
+   
+ 
+   
+Older Python versions
+ 
+
+  Context managers syntax using the with keyword is
+  available by default in Python 2.6. If using PL/Python with an older
+  Python version, it is still possible to use explicit subtransactions,
+  although not as transparently. You can call the subtransaction
+  manager's __enter__ and __exit__
+  functions using the enter and exit
+  convenience aliases, and the example function that transfers funds could
+  be written as:
+ 
+ CREATE FUNCTION transfer_funds_old() RETURNS void 
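
The behavior the documentation above describes — everything inside the with block either takes effect or is rolled back as a unit, while the exception itself still propagates — can be modeled with an ordinary Python context manager. This is a toy sketch over a dict standing in for a table, not plpy's implementation; the names Subtransaction and accounts are illustrative only:

```python
class Subtransaction:
    """Toy model of plpy.subtransaction(): changes made inside the block
    survive only if the block exits cleanly; any exception rolls them back."""
    def __init__(self, db):
        self.db = db
    def __enter__(self):
        self.snapshot = dict(self.db)   # "begin subtransaction"
        return self
    def __exit__(self, exc_type, exc, tb):
        if exc_type is not None:        # any exception exit -> roll back
            self.db.clear()
            self.db.update(self.snapshot)
        return False                    # never swallow the exception

accounts = {"joe": 100, "mary": 0}
try:
    with Subtransaction(accounts):
        accounts["joe"] -= 100
        raise RuntimeError("second UPDATE failed")
except RuntimeError:
    pass
print(accounts)   # {'joe': 100, 'mary': 0} -- joe's debit was rolled back
```

Note that `__exit__` returns False, matching the docs' point that the manager does not trap errors; and as the "Older Python versions" section notes, the real manager also exposes enter/exit aliases for Pythons predating the with statement.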

[HACKERS] Re: [COMMITTERS] pgsql: Force strings passed to and from plperl to be in UTF8 encoding.

2011-02-12 Thread Alex Hunsaker
On Sun, Feb 6, 2011 at 15:31, Andrew Dunstan  wrote:
> Force strings passed to and from plperl to be in UTF8 encoding.
>
> String are converted to UTF8 on the way into perl and to the
> database encoding on the way back. This avoids a number of
> observed anomalies, and ensures Perl a consistent view of the
> world.

So I noticed a problem while playing with this in my discussion with
David Wheeler. pg_do_encoding() does nothing when the src encoding ==
the dest encoding. That means on a UTF-8 database we fail make sure
our strings are valid utf8.

An easy way to see this is to embed a null in the middle of a string:
=> create or replace function zerob() returns text as $$ return
"abcd\0efg"; $$ language plperl;
=> SELECT zerob();
abcd

Also it seems bogus to do any encoding conversion when we are
SQL_ASCII, and it's really trivial to fix.

With the attached:
- when we are on a utf8 database make sure to verify our output string
in sv2cstr (we assume database strings coming in are already valid)

- Do no string conversion when we are SQL_ASCII in or out

- add plperl_helpers.h as a dep to plperl.o in our makefile

- remove some redundant calls to pg_verify_mbstr()

- as utf_e2u only has one caller, don't pstrdup(); instead have the caller
check (saves some cycles and memory)
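
The pitfall in the bullet points above can be sketched outside of C. In this Python model, the "conversion" function returns its input untouched when source and destination encodings match — as pg_do_encoding_conversion does — so the caller must verify validity itself, and must reject embedded nulls separately, since a NUL byte is perfectly valid UTF-8. The function names mirror the patch, but the logic is illustrative only:

```python
def do_encoding_conversion(data: bytes, src: str, dest: str) -> bytes:
    """Toy model of pg_do_encoding_conversion(): when the encodings match
    it returns the input untouched -- so no validity check happens."""
    if src == dest:
        return data
    return data.decode(src).encode(dest)  # real conversion validates as a side effect

def sv2cstr(data: bytes, db_encoding: str) -> bytes:
    """Sketch of the patched sv2cstr logic: when the database encoding is
    already UTF-8, verify explicitly, since no conversion will do it for us."""
    if db_encoding == "utf-8":
        data.decode("utf-8")              # raises on invalid UTF-8, like pg_verifymbstr
        if b"\x00" in data:               # NUL decodes fine, so check it separately
            raise ValueError("embedded null byte")
    return do_encoding_conversion(data, "utf-8", db_encoding)

try:
    sv2cstr(b"abcd\x00efg", "utf-8")      # the zerob() example from the message
except ValueError as e:
    print("rejected:", e)                 # rejected: embedded null byte
```

With the unpatched behavior, the embedded-null string would sail through unverified and silently truncate at the NUL, which is exactly the `abcd` result shown above.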
*** a/doc/src/sgml/plperl.sgml
--- b/doc/src/sgml/plperl.sgml
***
*** 127,135  $$ LANGUAGE plperl;
  

  
!   Arguments will be converted from the database's encoding to UTF-8 
!   for use inside plperl, and then converted from UTF-8 back to the 
!   database encoding upon return. 
  

  
--- 127,136 
  

  
!   Arguments will be converted from the database's encoding to
!   UTF-8 for use inside plperl, and then converted from UTF-8 back
!   to the database encoding upon return.  If the database
!   encoding is SQL_ASCII, no conversion is done.
  

  
*** a/src/pl/plperl/GNUmakefile
--- b/src/pl/plperl/GNUmakefile
***
*** 54,60  PSQLDIR = $(bindir)
  
  include $(top_srcdir)/src/Makefile.shlib
  
! plperl.o: perlchunks.h plperl_opmask.h
  
  plperl_opmask.h: plperl_opmask.pl
  	$(PERL) $< $@
--- 54,60 
  
  include $(top_srcdir)/src/Makefile.shlib
  
! plperl.o: perlchunks.h plperl_opmask.h plperl_helpers.h
  
  plperl_opmask.h: plperl_opmask.pl
  	$(PERL) $< $@
*** a/src/pl/plperl/plperl.c
--- b/src/pl/plperl/plperl.c
***
*** 2308,2315  plperl_spi_exec(char *query, int limit)
  	{
  		int			spi_rv;
  
- 		pg_verifymbstr(query, strlen(query), false);
- 
  		spi_rv = SPI_execute(query, current_call_data->prodesc->fn_readonly,
  			 limit);
  		ret_hv = plperl_spi_execute_fetch_result(SPI_tuptable, SPI_processed,
--- 2308,2313 
***
*** 2555,2563  plperl_spi_query(char *query)
  		void	   *plan;
  		Portal		portal;
  
- 		/* Make sure the query is validly encoded */
- 		pg_verifymbstr(query, strlen(query), false);
- 
  		/* Create a cursor for the query */
  		plan = SPI_prepare(query, 0, NULL);
  		if (plan == NULL)
--- 2553,2558 
***
*** 2767,2775  plperl_spi_prepare(char *query, int argc, SV **argv)
  			qdesc->argtypioparams[i] = typIOParam;
  		}
  
- 		/* Make sure the query is validly encoded */
- 		pg_verifymbstr(query, strlen(query), false);
- 
  		/
  		 * Prepare the plan and check for errors
  		 /
--- 2762,2767 
*** a/src/pl/plperl/plperl_helpers.h
--- b/src/pl/plperl/plperl_helpers.h
***
*** 20,27  static inline char *
  utf_e2u(const char *str)
  {
  	char *ret = (char*)pg_do_encoding_conversion((unsigned char*)str, strlen(str), GetDatabaseEncoding(), PG_UTF8);
- 	if (ret == str)
- 		ret = pstrdup(ret);
  	return ret;
  }
  
--- 20,25 
***
*** 34,46  sv2cstr(SV *sv)
  {
  	char *val;
  	STRLEN len;
  
  	/*
! 	 * get a utf8 encoded char * out of perl. *note* it may not be valid utf8!
  	 */
  	val = SvPVutf8(sv, len);
  
  	/*
  	 * we use perls length in the event we had an embedded null byte to ensure
  	 * we error out properly
  	 */
--- 32,59 
  {
  	char *val;
  	STRLEN len;
+ 	int enc = GetDatabaseEncoding();
  
  	/*
! 	 * we don't do any conversion when SQL_ASCII
  	 */
+ 	if (enc == PG_SQL_ASCII)
+ 	{
+ 		val = SvPV(sv, len);
+ 		return pstrdup(val);
+ 	}
+ 
  	val = SvPVutf8(sv, len);
  
  	/*
+ 	 * when the src encoding matches the dest encoding pg_do_encoding will not
+ 	 * do any verification. SvPVutf8() may return invalid utf8 so we need to do
+ 	 * that ourselves
+ 	 */
+ 	if (enc == PG_UTF8)
+ 		pg_verifymbstr(val, len, false);
+ 
+ 	/*
  	 * we use perls length in the event we had an embedded null byte to ensure
  	 * we error out properly
  	 */
***
*** 56,67  static inline SV *
  cstr2sv(const char *str)
  {
  	SV *sv;
! 	char *utf8_str = utf_e2u(str);
  
  	

Re: [HACKERS] pl/python tracebacks

2011-02-12 Thread Jan Urbański
On 12/02/11 10:00, Alex Hunsaker wrote:
> On Sat, Feb 12, 2011 at 01:50, Jan Urbański  wrote:
>> On 12/02/11 04:12, Alex Hunsaker wrote:
>>> In PLy_traceback fname and prname look like they will leak (well as
>>> much as a palloc() in an error path can leak I suppose).
>>
>> But they're not palloc'd, no? fname is either a static string,
>> or PyString_AsString, which also doesn't allocate memory, AFAIK.
> 
> Yeah, I was flat out wrong about proname :-(.
> 
> As for fname, I must be missing some magic. We have:
> 
> #if PY_MAJOR_VERSION >= 3
> ...
> #define PyString_AsString(x) PLyUnicode_AsString(x)
> 
> PLyUnicode_AsString(PyObject *unicode)
> {
> PyObject   *o = PLyUnicode_Bytes(unicode);
> char   *rv = pstrdup(PyBytes_AsString(o));
> 
> Py_XDECREF(o);
> return rv;
> }
> 
> PyString_AsString is used all over the place without any pfrees. But I
> have no Idea how that pstrdup() is getting freed if at all.
> 
> Care to enlighten me ?

Oops, it seems that this hack that's meant to improve compatibility with
Python 3 makes it leak. I wonder if the pstrdup is necessary here, but
OTOH the leak should not be overly significant, given that no-one
complained about it before... and PyString_AsString is being used in
several other places.

Jan



Re: [HACKERS] pl/python tracebacks

2011-02-12 Thread Alex Hunsaker
On Sat, Feb 12, 2011 at 01:50, Jan Urbański  wrote:
> On 12/02/11 04:12, Alex Hunsaker wrote:
>> In PLy_traceback fname and prname look like they will leak (well as
>> much as a palloc() in an error path can leak I suppose).
>
> But they're not palloc'd, no? fname is either a static string,
> or PyString_AsString, which also doesn't allocate memory, AFAIK.

Yeah, I was flat out wrong about proname :-(.

As for fname, I must be missing some magic. We have:

#if PY_MAJOR_VERSION >= 3
...
#define PyString_AsString(x) PLyUnicode_AsString(x)

PLyUnicode_AsString(PyObject *unicode)
{
PyObject   *o = PLyUnicode_Bytes(unicode);
char   *rv = pstrdup(PyBytes_AsString(o));

Py_XDECREF(o);
return rv;
}

PyString_AsString is used all over the place without any pfrees. But I
have no idea how that pstrdup() is getting freed, if at all.

Care to enlighten me ?



Re: [HACKERS] pl/python tracebacks

2011-02-12 Thread Jan Urbański
On 12/02/11 04:12, Alex Hunsaker wrote:
> On Wed, Feb 9, 2011 at 02:10, Jan Urbański  wrote:
>> On 06/02/11 20:12, Jan Urbański wrote:
>>> On 27/01/11 22:58, Jan Urbański wrote:
 On 23/12/10 14:56, Jan Urbański wrote:
> Here's a patch implementing traceback support for PL/Python mentioned in
> http://archives.postgresql.org/pgsql-hackers/2010-12/msg01991.php. It's
> an incremental patch on top of the plpython-refactor patch sent eariler.

 Updated to master.
>>>
>>> Updated to master again.
>>
>> Once more.
> 
> In PLy_traceback fname and prname look like they will leak (well as
> much as a palloc() in an error path can leak I suppose). 

But they're not palloc'd, no? fname is either a static string,
or PyString_AsString, which also doesn't allocate memory, AFAIK. proname
is also a static string. They're transferred to heap-allocated memory in
appendStringInfo, which gets pfreed after emitting the error message.

> Marking as Ready.

Thanks!

Jan



Re: [HACKERS] btree_gist (was: CommitFest progress - or lack thereof)

2011-02-12 Thread Oleg Bartunov

Stephen,

what do you need for documentation? From the users' point of view we just add
knn support, and all examples are available in btree_gist.sql and the sql
subdirectory. Contact me directly if you have questions.


Oleg
On Fri, 11 Feb 2011, Stephen Frost wrote:


* Robert Haas (robertmh...@gmail.com) wrote:

Teodor sent it to the list Dec 28, see
http://archives.postgresql.org/message-id/4D1A1677.80300%40sigaev.ru

[...]

That having been said, this looks like a fairly mechanical change to a
contrib module that you and Teodor wrote.  So I'd say if you guys are
confident that it's correct, go ahead and commit.  If you need it
reviewed, or if you can't commit it in the next week or so, I think
it's going to have to wait for 9.2.


Alright, I've gone through this patch and the main thing it's missing is
documentation, as far as I can tell.  It passes all the regression tests
(and adds a number of them which are then tested with, which is always
nice) and while there are quite a few changes, they're all pretty
mechanical and simple.  There are some really minor whitespace issues
too, but overall I think this is ready to be committed, so long as we
have a promise that someone will write up the documentation for it.

I'd write the docs, but I'm not 100% sure that I know what's going on
enough to really write them correctly. :)  I'm also hoping that someone
else is already working on them.  If not, feel free to ping me and I'll
work on writing up *something*, at least.

Thanks,

Stephen



Regards,
Oleg
_
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: o...@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83
