date:20100208

Re: [HACKERS] Confusion over Python drivers

2010-02-08 Thread Massa, Harald Armin

The pg8000 / bpgsql seem to be toy projects, and anyway you dont
want to use pure-Python drivers in high-performance environments.

I agree that there are some performance-challenges with pure-Python drivers.

And we should not forget to look for the reasons for the incubation of that
many pure-Python drivers:

a) Python is no longer one-language, one-implementation. There are (at
least) cPython (the original), Jython (on JVM), IronPython (from Microsoft
on CLR), PyPy (Python on Python), Unladen Swallow (from Google on LLVM). In
addition the nearly-Pythons as in Cython, RPython and ShedSkin

Everything apart from cPython (and possible UnladenSwallow) has its
challenges dealing with non-Python extension modules. From a developer
standpoint it can be tempting to be able to rely on the same database
adapter across more then one implementation

b) the stabilization of an Python Application Binary Interface is in early
discussion stage; meaning: it will take some time untill a non-Python
extension can be usable across Python versions. c-Extensions are allways a
major stumbling block on Python-n to Python-(n+1) versions

c) Stability. Python code is same-on-same more robust then C-Code, as some
of the crash-friendly-problems are eliminated (you cannot allocate memory
wrongly within Python, you cannot errorly access memory cross array
boundaries...)

especially a) is a point to consider when standardizing on a PostgreSQL
blessed Python-Postgresql-driver. How will the blessing extend to Jython /
Ironpython / PyPy?

Harald
-- 
GHUM Harald Massa
persuadere et programmare
Harald Armin Massa
Spielberger Straße 49
70435 Stuttgart
0173/9409607
no fx, no carrier pigeon
-
%s is too gigantic of an industry to bend to the whims of reality

Re: [HACKERS] Confusion over Python drivers

2010-02-08 Thread Greg Smith


Massa, Harald Armin wrote:
I agree that there are some performance-challenges with pure-Python 
drivers.
And we should not forget to look for the reasons for the incubation of 
that many pure-Python drivers:
a) Python is no longer one-language, one-implementation. There are (at 
least) cPython (the original), Jython (on JVM), IronPython (from 
Microsoft on CLR), PyPy (Python on Python), Unladen Swallow (from 
Google on LLVM). In addition the nearly-Pythons as in Cython, RPython 
and ShedSkin...
especially a) is a point to consider when standard, it's getting one 
driver that satisfies the needs of the people most like
izing on a PostgreSQL blessed Python-Postgresql-driver. How will the 
blessing extend to Jython / Ironpython / PyPy?


The point isn't so much standardizing.  Having a low performance 
Python driver turns into a PostgreSQL PR issue.  Last thing we need is 
the old PostgreSQL is slow meme to crop back up again via the Python 
community, if the driver suggested by the community isn't written with 
performance as a goal so that, say, PostgreSQL+Python looks really slow 
compared to MySQL+Python.  And if you're writing a database driver with 
performance as a goal, native Python is simply not an option.


Now, once *that* problem is under control, and there's a nicely 
licensed, well documented, major feature complete, and good performing 
driver, at that point it would be completely appropriate to ask what 
about people who want support for other Python platforms and don't care 
if it's slower?.  And as you say, nurturing the incubation of such 
drivers is completely worthwhile.  I just fear that losing focus by 
wandering too far in that direction, before resolving the main problem 
here, is just going to extend resolving the parts of the Python driver 
situation I feel people are most displeased with.


--
Greg Smith2ndQuadrant   Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com  www.2ndQuadrant.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Confusion over Python drivers

2010-02-08 Thread Gabriele Bartolini


Hi there,

Greg Smith ha scritto:
Looks like the first action item is to talk with the Psycopg people 
about their license.


Oh:  and I'm going to take care of this.  License changes can be a 
very sensitive topic and I'm told that discussion probably needs to 
happy in Italian too; I can arrange that.


I can try and help with this issue, given my role with the Italian 
PostgreSQL community and PostgreSQL business with 2ndQuadrant Italia. I 
have met Psycopg's developer a couple of times at open-source 
conferences. I have great respect for his work and his contribution in 
the open-source community, and I will be very happy to try and explain 
the situation to him.


I will keep you posted.

Ciao,
Gabriele

--
Gabriele Bartolini - 2ndQuadrant Italia
PostgreSQL Training, Services and Support
gabriele.bartol...@2ndquadrant.it | www.2ndQuadrant.it


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Backup history file should be replicated in Streaming Replication?

2010-02-08 Thread Heikki Linnakangas

Fujii Masao wrote:
 On Sun, Feb 7, 2010 at 1:02 AM, Bruce Momjian br...@momjian.us wrote:
 src/backend/access/transam/xlog.c
 else
 {
 XLogRecPtr  InvalidXLogRecPtr = {0, 0};
 ControlFile-minRecoveryPoint = InvalidXLogRecPtr;
 }
 In my original patch, the above is for the problem discussed in
 http://archives.postgresql.org/pgsql-hackers/2009-12/msg02039.php

 Since you've already fixed the problem, that code is useless.
 How about getting rid of that code?
 Has this been addressed?
 
 No. We need to obtain the comment about that from Heikki.

I removed that.

It only makes a difference if you stop archive recovery, remove
recovery.conf, and start up again, causing the server to do normal crash
recovery. That's a don't do that scenario, but it seems better to not
clear minRecoveryPoint, even though we don't check it during crash
recovery. It might be useful debug information, and also if you then put
recovery.conf back, we will enforce that you reach the minRecoveryPoint
again.

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Streaming Replication on win32

2010-02-08 Thread Fujii Masao

On Mon, Jan 18, 2010 at 11:46 PM, Magnus Hagander mag...@hagander.net wrote:
 From what I can tell, this indicates that pq_getbyte_if_available() is
 not working - because it's supposed to never block, right?

 Right, it's not supposed to block.

 This could be because the win32 socket emulation layer simply wasn't
 designed to deal with non-blocking sockets. Specifically, it actually
 *always* sets the socket to non-blocking mode, and then uses that to
 properly emulate how sockets work under unix.

 I presume the win32 emulation layer can be taught about non-blocking
 sockets? Or maybe pq_getbyte_if_available() can be implemented using
 some other simpler method on Windows.

 It could be taught that, but it would probably be a lot easier to put
 platform specific code in pq_getbyte_if_available().

Umm.. in this case, for SSL on win32 case, we also need to create
new function like my_sock_read_if_available() that receives data
from non-blocking socket, and reassign it to the SSL BIO function.
Right? If so, it seems easier for me to tell the win32 layer about
non-blocking.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Backup history file should be replicated in Streaming Replication?

2010-02-08 Thread Fujii Masao

On Mon, Feb 8, 2010 at 6:11 PM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
 I removed that.

Thanks!

 It only makes a difference if you stop archive recovery, remove
 recovery.conf, and start up again, causing the server to do normal crash
 recovery. That's a don't do that scenario, but it seems better to not
 clear minRecoveryPoint, even though we don't check it during crash
 recovery. It might be useful debug information, and also if you then put
 recovery.conf back, we will enforce that you reach the minRecoveryPoint
 again.

This makes sense.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] damage control mode

2010-02-08 Thread Magnus Hagander

2010/2/7 Josh Berkus j...@agliodbs.com:
 As between the two, I get the feeling that there is more interest in
 writeable CTEs.  But that impression might be wrong, since it's an
 unscientific recollection of discussions on -hackers; which are
 themselves not representative of anything.

 Writeable CTE is definitely the bigger feature.  Effectively, it allows
 people to do in a single query data-transformation operations which
 would have taken a stored procedure before.  Think of it as comparable
 to the introduction of callbacks in Perl for coolness.

Yes, it's bigger. It's certainly a bigger marketing checkbox item.
That doesn't necessarily make it more useful.

As a comparison point, I've come across a number of cases with clients
where being able to do RANGE BETWEEN on windowing queries would've
been extremely helpful, and where there's no reasonable way to do that
at all today other than to dump all the data off into the application.
Neither of which are exactly pretty or fast.

The similar case for Writable CTEs, I've always been able to wrap it
in a function. Which is nowhere near as nice as having writable CTEs
of course, but the workaround for not having it is less severe.

I certainly wish we could have both, of course... :S

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [PATCH] Provide rowcount for utility SELECTs

2010-02-08 Thread Boszormenyi Zoltan

Robert Haas írta:
 ...
 OK, please change it.
   

New patch is attached with the discussed changes.

 Someone who knows the overall structure of the code better than I do
 will have to comment on whether there are any code paths to worry
 about that do not go through PortalRun().

 A general concern I have is that this what we're basically doing here
 is handling the most common case in ProcessQuery() and then installing
 fallback mechanisms to pick up any stragglers: but the fallback
 mechanisms only guarantee that we'll add a number to the command tag,
 not that it will be meaningful.  Unfortunately, my limited imagination
 can't quite figure out in what cases we'll get a SELECT command tag
 back other than (1) plain old SELECT, (2) SELECT INTO, and (3) CTAS,
 so I'm not sure what to go test.
   

 Any thoughts on these issues, anyone?

 ...Robert
   

Best regards,
Zoltán Böszörményi

-- 
Bible has answers for everything. Proof:
But let your communication be, Yea, yea; Nay, nay: for whatsoever is more
than these cometh of evil. (Matthew 5:37) - basics of digital technology.
May your kingdom come - superficial description of plate tectonics

--
Zoltán Böszörményi
Cybertec Schönig  Schönig GmbH
http://www.postgresql.at/

diff -dcrpN pgsql.orig/src/backend/tcop/pquery.c pgsql/src/backend/tcop/pquery.c
*** pgsql.orig/src/backend/tcop/pquery.c	2010-01-03 12:54:25.0 +0100
--- pgsql/src/backend/tcop/pquery.c	2010-02-08 11:46:33.0 +0100
*** ProcessQuery(PlannedStmt *plan,
*** 205,211 
  		switch (queryDesc-operation)
  		{
  			case CMD_SELECT:
! strcpy(completionTag, SELECT);
  break;
  			case CMD_INSERT:
  if (queryDesc-estate-es_processed == 1)
--- 205,212 
  		switch (queryDesc-operation)
  		{
  			case CMD_SELECT:
! snprintf(completionTag, COMPLETION_TAG_BUFSIZE,
! 		 SELECT %u, queryDesc-estate-es_processed);
  break;
  			case CMD_INSERT:
  if (queryDesc-estate-es_processed == 1)
*** PortalRun(Portal portal, long count, boo
*** 714,719 
--- 715,721 
  		  char *completionTag)
  {
  	bool		result;
+ 	uint32		nprocessed;
  	ResourceOwner saveTopTransactionResourceOwner;
  	MemoryContext saveTopTransactionContext;
  	Portal		saveActivePortal;
*** PortalRun(Portal portal, long count, boo
*** 776,814 
  		switch (portal-strategy)
  		{
  			case PORTAL_ONE_SELECT:
- (void) PortalRunSelect(portal, true, count, dest);
- 
- /* we know the query is supposed to set the tag */
- if (completionTag  portal-commandTag)
- 	strcpy(completionTag, portal-commandTag);
- 
- /* Mark portal not active */
- portal-status = PORTAL_READY;
- 
- /*
-  * Since it's a forward fetch, say DONE iff atEnd is now true.
-  */
- result = portal-atEnd;
- break;
- 
  			case PORTAL_ONE_RETURNING:
  			case PORTAL_UTIL_SELECT:
  
  /*
   * If we have not yet run the command, do so, storing its
!  * results in the portal's tuplestore.
   */
! if (!portal-holdStore)
  	FillPortalStore(portal, isTopLevel);
  
  /*
   * Now fetch desired portion of results.
   */
! (void) PortalRunSelect(portal, true, count, dest);
  
! /* we know the query is supposed to set the tag */
  if (completionTag  portal-commandTag)
! 	strcpy(completionTag, portal-commandTag);
  
  /* Mark portal not active */
  portal-status = PORTAL_READY;
--- 778,812 
  		switch (portal-strategy)
  		{
  			case PORTAL_ONE_SELECT:
  			case PORTAL_ONE_RETURNING:
  			case PORTAL_UTIL_SELECT:
  
  /*
   * If we have not yet run the command, do so, storing its
!  * results in the portal's tuplestore. Do this only for the
!  * PORTAL_ONE_RETURNING and PORTAL_UTIL_SELECT cases.
   */
! if ((portal-strategy != PORTAL_ONE_SELECT)  (!portal-holdStore))
  	FillPortalStore(portal, isTopLevel);
  
  /*
   * Now fetch desired portion of results.
   */
! nprocessed = PortalRunSelect(portal, true, count, dest);
  
! /*
!  * If the portal result contains a command tag and the caller
!  * gave us a pointer to store it, copy it. Patch the SELECT
!  * tag to also provide the rowcount.
!  */
  if (completionTag  portal-commandTag)
! {
! 	if (strcmp(portal-commandTag, SELECT) == 0)
! 		snprintf(completionTag, COMPLETION_TAG_BUFSIZE,
! 		SELECT %u, nprocessed);
! 	else
! 		strcpy(completionTag, portal-commandTag);
! }
  
  /* Mark portal not active */
  portal-status = PORTAL_READY;
*** PortalRunMulti(Portal portal, bool isTop
*** 1318,1337 
  	 * If a command completion tag was supplied, use it.  Otherwise use the
  	 * portal's commandTag as the default completion tag.
  	 *
! 	 * Exception: clients will expect INSERT/UPDATE/DELETE tags to have
! 	 * counts, so fake something up if necessary.  (This could happen if the
  	 * original

Re: [HACKERS] Confusion over Python drivers

2010-02-08 Thread Massa, Harald Armin

Greg,


 The point isn't so much standardizing.  Having a low performance Python
 driver turns into a PostgreSQL PR issue.


I totally agree.

And if you're writing a database driver with performance as a goal, native
Python is simply not an option.

yes. Additionally: performance is not the only challenge. A native Python
implementation, without using libpq, will have to reimplement much of libpq
- just let me isolate proper escaping, and will have its own bugs.

Now, once *that* problem is under control, and there's a nicely licensed,
 well documented, major feature complete, and good performing driver, at that
 point it would be completely appropriate to ask what about people who want
 support for other Python platforms and don't care if it's slower?.


Pure Pythondrivers do exist now; and they are allready discussed in the
summaries - which is a good thing. With my remarks I just want to recommend
that we at least should document a position for them; and a way ahead. And I
need a place to point out that Python grew a FFI with ctypes. Maybe someone
will think of a DBAPI2.0 compatible ctypes libpq wrapper ...

Harald



-- 
GHUM Harald Massa
persuadere et programmare
Harald Armin Massa
Spielberger Straße 49
70435 Stuttgart
0173/9409607
no fx, no carrier pigeon
-
%s is too gigantic of an industry to bend to the whims of reality

Re: [HACKERS] Bugs in b-tree dead page removal

2010-02-08 Thread Heikki Linnakangas

Tom Lane wrote:
 Whilst looking around for stuff that could be deleted thanks to removing
 old-style VACUUM FULL, I came across some code in btree that seems
 rather seriously buggy.  For reasons explained in nbtree/README, we
 can't physically recycle a deleted btree index page until all
 transactions open at the time of deletion are gone --- otherwise we
 might re-use a page that an existing scan is about to land on, and
 confuse that scan.  (This condition is overly strong, of course, but
 it's what's designed in at the moment.)  The way this is implemented
 is to label a freshly-deleted page with the current value of
 ReadNewTransactionId().  Once that value is older than RecentXmin,
 the page is presumed recyclable.
 
 I think this was all right when it was designed, but isn't it rather
 badly broken by our subsequent changes to have transactions not take
 out an XID until/unless they write something?  A read-only transaction
 could easily be much older than RecentXmin, no?

Yeah :-(.

 The odds of an actual problem seem not very high, since to be affected
 a scan would have to be already in flight to the problem page when
 the deletion occurs.  By the time RecentXmin advances and we feed the
 page to the FSM and get it back, the scan's almost surely going to have
 arrived.  And I think the logic is such that this would not happen
 before the next VACUUM in any case.  Still, it seems pretty bogus.

One idea is to change the traversal logic slightly, so that whenever you
follow a pointer from page A to B, you keep A pinned until you've pinned
B. Then we could just take cleanup lock on the left, right and parent of
the empty page, one at a time, to ensure that no-one is in-flight to it,
and recycle it immediately. However, forward scans screw that up. When a
forward scan reads a page, it saves its right pointer at the same time,
so that if the page is subsequently split, it doesn't process tuples on
the new right half again. Even if we take cleanup lock on the left
sibling of an empty page, there can be scans further left that have a
direct pointer to the empty page. It could be salvaged if we require a
forward scan to pin the next page too when it saves the right pointer,
but that seems inefficient.

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Pathological regexp match

2010-02-08 Thread Magnus Hagander

2010/2/1 Michael Glaesemann michael.glaesem...@myyearbook.com:

 On Jan 31, 2010, at 22:14 , Tom Lane wrote:

 The Tcl folk accepted that patch, so I went ahead and applied it to
 our code.  It would still be a good idea for us to do any testing we
 can on it, though.

 I applied the patch and ran both the test query I submitted as well as 
 original problematic query that triggered the report, and it runs much 
 faster. Thanks for the fix!

I did the same, and it does not help in my case. FWIW, the regexp I'm
matching is:
pre .*?(.*?)/pre

(yes, the production system has already been fixed to use a smarter
regexp that solves the same problem)

The text is about 180Kb. PostgreSQL takes ~40 seconds without the
patch, ~36 seconds with it, to extract the match from it. Perl takes
0.016 seconds.

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Largeobject Access Controls (r2460)

2010-02-08 Thread Alvaro Herrera

Takahiro Itagaki escribió:
 
 KaiGai Kohei kai...@kaigai.gr.jp wrote:
 
 default:both contents and metadata
 --data-only:same
 --schema-only:  neither
  
  However, it means only large object performs an exceptional object class
  that dumps its owner, acl and comment even if --data-only is given.
  Is it really what you suggested, isn't it?
 
 I wonder we still need to have both BLOB ITEM and BLOB DATA
 even if we will take the all-or-nothing behavior. Can we handle
 BLOB's owner, acl, comment and data with one entry kind?

I don't think this is necessarily a good idea.  We might decide to treat
both things separately in the future and it having them represented
separately in the dump would prove useful.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Hot standby documentation

2010-02-08 Thread Bruce Momjian

Markus Wanner wrote:
 Bruce,
 
 Bruce Momjian wrote:
  Ah, I now realize it only mentions warm standby, not hot, so I just
  updated the documentation to reflect that;  you can see it here:
 
 Maybe the table below also needs an update, because unlike Warm Standby 
 using PITR, a hot standby accepts read-only queries and can be 
 configured to not loose data on master failure.

Ahh, good point.  I had not considered the table would change.  What I
did was to mark Slaves accept read-only queries as Hot only.  You
can see the result here:

http://momjian.us/tmp/pgsql/high-availability.html

I did not change Master failure will never lose data because the 9.0
streaming implementation is not sychronous (see wal_sender_delay in
postgresql.conf), and I don't think even setting that to zero makes the
operation synchronous.  I think we will have to wait for PG 9.1 for
_synchronous_ streaming replication.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Add on_plperl_init and on_plperlu_init to plperl UPDATE 3 [PATCH]

2010-02-08 Thread Tim Bunce

On Sun, Feb 07, 2010 at 08:25:33PM -0500, Andrew Dunstan wrote:
 Tim Bunce wrote:
 This is the third update to the fourth of the patches to be split out
 from the former 'plperl feature patch 1'.
 
 Changes in this patch:
 
 - Added plperl.on_plperl_init and plperl.on_plperlu_init GUCs
 Both are PGC_SUSET
 SPI functions are not available when the code is run.
 Errors are detected and reported as ereport(ERROR, ...)
 Corresponding documentation and tests for both.
 
 - Renamed plperl.on_perl_init to plperl.on_init
 
 - Improved state management of select_perl_context()
 An error during interpreter initialization will leave
 the state (interp_state etc) unchanged.
 
 - The utf8fix code has been greatly simplified.
 
 - More code comments re PGC_SUSET and no access to SPI functions.
 
 
 The docs on this patch need some cleaning up and expanding:
 
* The possible insecurity of %_SHARED needs expanding.

I tried. I couldn't decide how to expand what Tom Lane suggested
(http://archives.postgresql.org/message-id/1344.1265223...@sss.pgh.pa.us)
without it turning into a sprawling security tutorial.

So, since his suggestion seemed complete enough (albeit fairly vague),
I just used it almost verbatim.

Also, the PL/Tcl docs don't mention the issue at all and the PL/Python
docs say only The global dictionary GD is public data, available to all
Python functions within a session. Use with care.

The wording in the PL/Python docs seems better (available to all ...
use with care), so I've adopted the same kind of wording.

* The docs still refer to plperl.on_untrusted_init

Oops. Thanks. Fixed.

* plperl.on_plperl_init and plperl.on_plperlu_init can be documented
  together rather than repeating the same stuff almost word for word.

Ok. Done.

* extra examples for these two, and an explanation of why one might
  want to use each of the three on*init settings would be good.

plperl.on_init has an example that seems fairly self-explantory.
I've added one to the on_plperl_init/on_plperlu_init section that
also explains how a superuser can use ALTER USER ... SET  to set
a value for a non-superuser.

 I'll continue reviewing the patch, but these things at least need fixing.

I've an updated patch ready to go. I'll hold on to it for now.
Let me know if you have any more issues, or not.
Thanks.

Tim.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] Re: [COMMITTERS] pgsql: Create a relation mapping infrastructure to support changing

2010-02-08 Thread Alvaro Herrera


I just noticed that this patch

 Create a relation mapping infrastructure to support changing the 
 relfilenodes
 of shared or nailed system catalogs.  This has two key benefits:

creates a new function pg_relation_filenode() that only uses the
syscache to fetch the relation's filenode, without locking it.  I wonder
if we could do the same in the pg_relation_size() function and friends,
to avoid having to grab a lock on the relation.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Confusion over Python drivers

2010-02-08 Thread Greg Sabino Mullane


-BEGIN PGP SIGNED MESSAGE-
Hash: RIPEMD160


 I have written up a set of guidelines for driver development 
 based on what I learned working on ruby-pg:

 http://wiki.postgresql.org/wiki/Driver_development
...
 I would appreciate comments by anyone (Greg Sabino Mullane: I included
 you in the CC because I thought you may have some input).

Good page. I looked it over but have nothing to add at the moment. I may 
do so later once my head is in dbdpg mode (working on other project 
at the moment :)

- -- 
Greg Sabino Mullane g...@turnstep.com
End Point Corporation http://www.endpoint.com/
PGP Key: 0x14964AC8 201002080931
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
-BEGIN PGP SIGNATURE-

iEYEAREDAAYFAktwIEYACgkQvJuQZxSWSsjczQCgkU5b6iHPREJYMtAdWlFRDkYI
cS4An3AMyc+O06HzN8MYkfq8HG62371y
=+WCV
-END PGP SIGNATURE-



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [COMMITTERS] pgsql: Create a relation mapping infrastructure to support changing

2010-02-08 Thread Tom Lane

Alvaro Herrera alvhe...@commandprompt.com writes:
 I just noticed that this patch

 Create a relation mapping infrastructure to support changing the 
 relfilenodes
 of shared or nailed system catalogs.  This has two key benefits:

 creates a new function pg_relation_filenode() that only uses the
 syscache to fetch the relation's filenode, without locking it.  I wonder
 if we could do the same in the pg_relation_size() function and friends,
 to avoid having to grab a lock on the relation.

I don't think it's a good idea to try to do physical access to the
relation without any lock.  The filenode function is a bit special
because it doesn't need anything except the pg_class row itself.
(Except in the case of a mapped relationn, but the underlying
mapping entry is unlikely to disappear, too.)

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] damage control mode

2010-02-08 Thread Alvaro Herrera

Dimitri Fontaine wrote:
 Marko Tiikkaja marko.tiikk...@cs.helsinki.fi writes:
  The documentation has definitely improved from the last time Robert
  looked at it, but I fear it still needs some more work.  I'm willing to
  do that work, but I need something concrete.
 
 It seems to me documentation is required to get into the source tree
 before beta, and as we see with some other patches it's definitely the
 case even with our newer procedures that some code gets in without its
 documentation properly finished. I guess this amounts to the commiter
 willing to fill up the docs later on.

Eh?  Previously we allowed code to go in with documentation to be
written after feature freeze.  Is this no longer acceptable?

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] damage control mode

2010-02-08 Thread Robert Haas

On Mon, Feb 8, 2010 at 10:25 AM, Alvaro Herrera
alvhe...@commandprompt.com wrote:
 Dimitri Fontaine wrote:
 Marko Tiikkaja marko.tiikk...@cs.helsinki.fi writes:
  The documentation has definitely improved from the last time Robert
  looked at it, but I fear it still needs some more work.  I'm willing to
  do that work, but I need something concrete.

 It seems to me documentation is required to get into the source tree
 before beta, and as we see with some other patches it's definitely the
 case even with our newer procedures that some code gets in without its
 documentation properly finished. I guess this amounts to the commiter
 willing to fill up the docs later on.

 Eh?  Previously we allowed code to go in with documentation to be
 written after feature freeze.  Is this no longer acceptable?

I don't think we usually allow that for minor features.  For big
things, it's probably more reasonable, but I would think that at least
some effort should be put in before commit.  I'm new here, though, so
I might be all wet.  But I wouldn't want to commit ten patches without
documentation and then have someone come back and say, OK, you
committed 'em, you write the docs.  Or else no one comes back, and I
forget, and it never gets done.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Re: [GENERAL] FM format modifier does not remove leading zero from year

2010-02-08 Thread Alvaro Herrera

Bruce Momjian wrote:
 Guy Rouillier wrote:
  On 1/6/2010 3:29 PM, Tom Lane wrote:
   Guy Rouillierguyr-...@burntmail.com  writes:
   Oracle states clearly in the SQL Reference manual:
  
   A modifier can appear in a format model more than once. In such a case,
   each subsequent occurrence toggles the effects of the modifier.
  
   *Toggles* the effect of the modifier?  Egad, what drunken idiot chose
   that specification?
  
  Eh, tomato, tomahto.  If you assume that someone will strip leading 
  zeroes consistently, the Oracle approach makes sense.  That would be a 
  reasonable assumption to make; why would I strip the zero off the month 
  but leave it on the day?  So, in the unusual case that you want to do 
  such a thing, you are asked to use a second occurrence of FM to turn 
  zero suppression back off.
 
 I have developed the attached patch which implements FM control of YYY,
 YY, and Y specifications.  I also documented that we do not match
 Oracle's toggle behavior.  There are a few effects on regression test
 output which are part of this patch.

What's the point of not following Oracle here, since this is solely an
Oracle compatibility function?


-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Listen / Notify - what to do when the queue is full

2010-02-08 Thread Alvaro Herrera

Joachim Wieland wrote:

 + typedef struct AsyncQueueEntry
 + {
 + /*
 +  * this record has the maximal length, but usually we limit it to
 +  * AsyncQueueEntryEmptySize + strlen(payload).
 +  */
 + Sizelength;
 + Oid dboid;
 + TransactionId   xid;
 + int32   srcPid;
 + charchannel[NAMEDATALEN];
 + charpayload[NOTIFY_PAYLOAD_MAX_LENGTH];
 + } AsyncQueueEntry;
 + #define AsyncQueueEntryEmptySize \
 +  (sizeof(AsyncQueueEntry) - NOTIFY_PAYLOAD_MAX_LENGTH + 1)

These are the on-disk notifications, right?  It seems to me a bit
wasteful to store channel name always as NAMEDATALEN bytes.  Can we
truncate it at its strlen?  I realize that this would cause the struct
definition to be uglier (you will no longer be able to have both channel
and payload pointers, only a char[1] pointer to a data area to which you
write both).  Typical channel names should be short, so IMHO this is
worthwhile.  Besides, I think the uglification of code this causes
should be fairly contained ...

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Re: [GENERAL] FM format modifier does not remove leading zero from year

2010-02-08 Thread Tom Lane

Alvaro Herrera alvhe...@commandprompt.com writes:
 What's the point of not following Oracle here, since this is solely an
 Oracle compatibility function?

Changing FM's behavior like that will break approximately every user
of to_char() ...

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] IndexBuildHeapScan and RIDs order

2010-02-08 Thread Leonardo F

Hi,

I was looking at the code for bitmap index:

http://archives.postgresql.org/pgsql-hackers/2008-10/msg01691.php

and I couldn't understand why during bmbuild (the ambuild call for
bitmap indexes) the code checks for not-ordered ItemPointerData(s).

In which cases the ItemPointerData(s) given by IndexBuildHeapScan

are not in order (when allow_sync=false)?



Leonardo




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Writeable CTEs patch

2010-02-08 Thread Robert Haas

On Thu, Feb 4, 2010 at 11:57 AM, Marko Tiikkaja
marko.tiikk...@cs.helsinki.fi wrote:
 On 2010-02-04 18:04 UTC+2, I wrote:
 While working on the docs, I noticed one problem with the patch itself:
 it doesn't handle multi-statement DO INSTEAD rules correctly.  I'm going
 to submit a fix for that later.

 Here's an updated patch.  Only changes from the previous patch are
 fixing the above issue and a regression test for it.

The comments on the parts I asked about before are much better in this
version.  A few other things that I notice:

- I'm not sure that canSetTag is the right name for the additional
argument to ExecInsert/ExecUpdate/ExecDelete.  OTOH, I'm not sure it's
the wrong name either.  But should we use something like
isTopLevelQuery?

- It appears that we pull out all of the DML statements first and run
them in order, but I'm not sure that's the right thing to do.
Consider:

WITH x AS (INSERT ...), y AS (SELECT ...), z AS (INSERT ...) SELECT ...

I would assume we would do x, CCI, do y, do z, CCI, do main query, but
I don't think that's what this implements.  The user might be
surprised to find out that y sees the effects of z.

- I think that the comment in analyzeCTE that says /* Check that we
got something reasonable */ could be fleshed out a bit.  You could
still reference transformRangeSubselect, for example, but then explain
why the checks here are different (viz, CTEs can contain DML).

- The comment for RegisterSnapshotCopy identifies the function name as
RegisterSnapshot; I think this is a copy-and-pasteo.

- It seems like the gram.y changes for common_table_expr might benefit
from some factoring; that is, create a production (or find a suitable
existing one) for statements of the sort that can appear within
CTEs, and then use that in common_table_expr.  Or maybe this doesn't
work; I haven't tried it.

- I still don't much like the idea of using DML WITH in error
messages.  One idea I had (which might suck, but I'm just throwing it
out there) is to change hasDmlWith to an integer bitmap with a bit for
each of insert, update, and delete.  But it may be better still to
just rephrase the error messages.  Could we just write, e.g.
non-SELECT statements are not allowed within a cursor declaration?
Or we could say INSERT, UPDATE, and DELETE statements are not allowed
within a cursor declaration, but I'm thinking we may want to allow
things like COPY and EXPLAIN inside CTEs in the future, too, and
they'll presumably be treated similarly to DML.

For the record, Tom or whoever should feel to swoop in here at any
time, or add to any of this.  I'm just making suggestions until the
big guns show up.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] damage control mode

2010-02-08 Thread Dimitri Fontaine

Robert Haas robertmh...@gmail.com writes:
 On Mon, Feb 8, 2010 at 10:25 AM, Alvaro Herrera
 alvhe...@commandprompt.com wrote:
 Eh?  Previously we allowed code to go in with documentation to be
 written after feature freeze.  Is this no longer acceptable?

 I don't think we usually allow that for minor features.  For big
 things, it's probably more reasonable, but I would think that at least
 some effort should be put in before commit.  I'm new here, though, so
 I might be all wet.  But I wouldn't want to commit ten patches without
 documentation and then have someone come back and say, OK, you
 committed 'em, you write the docs.  Or else no one comes back, and I
 forget, and it never gets done.

Well, traditionnaly, we had Bruce to sort those things out. But in this
case the problem is not so much about writing documentation than
deciding where to put it and what to explain exactly. I think.

Anyway saying the patch can not be considered by a commiter for only
lack of complete documentation is not a policy here, IME. It can happen,
but I would consider it bad news if it were to become a way to force the
release timeframe. What is hard is doing *good* compromises.

Regards,
-- 
dim

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] damage control mode

2010-02-08 Thread Robert Haas

On Mon, Feb 8, 2010 at 11:47 AM, Dimitri Fontaine
dfonta...@hi-media.com wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Mon, Feb 8, 2010 at 10:25 AM, Alvaro Herrera
 alvhe...@commandprompt.com wrote:
 Eh?  Previously we allowed code to go in with documentation to be
 written after feature freeze.  Is this no longer acceptable?

 I don't think we usually allow that for minor features.  For big
 things, it's probably more reasonable, but I would think that at least
 some effort should be put in before commit.  I'm new here, though, so
 I might be all wet.  But I wouldn't want to commit ten patches without
 documentation and then have someone come back and say, OK, you
 committed 'em, you write the docs.  Or else no one comes back, and I
 forget, and it never gets done.

 Well, traditionnaly, we had Bruce to sort those things out. But in this
 case the problem is not so much about writing documentation than
 deciding where to put it and what to explain exactly. I think.

 Anyway saying the patch can not be considered by a commiter for only
 lack of complete documentation is not a policy here, IME. It can happen,
 but I would consider it bad news if it were to become a way to force the
 release timeframe. What is hard is doing *good* compromises.

Of course any committer can consider any patch whenever they like,
regardless of how it is marked on commitfest.postgresql.org, right?
And there has been no shortage of committers doing just that; 80%+ of
the reviews for this CommitFest were done by committers.  But I'm not
going to spend the time to write the docs for somebody else's patch
unless I really care about seeing it go in; other committers are free
to do as they like, of course.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Pathological regexp match

2010-02-08 Thread David E. Wheeler

On Feb 8, 2010, at 5:15 AM, Magnus Hagander wrote:

 The text is about 180Kb. PostgreSQL takes ~40 seconds without the
 patch, ~36 seconds with it, to extract the match from it. Perl takes
 0.016 seconds.

Obviously we need to support Perl regular expressions in core. Not PCRE, but 
Perl. ;-P

Best,

David


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] Order of operations in lazy_vacuum_rel

2010-02-08 Thread Tom Lane

I see that lazy_vacuum_rel() truncates the heap before it does vacuuming
of the free space map.  Once upon a time this wouldn't have mattered,
but now it means that cancel interrupts are likely to be ignored for
the duration of FreeSpaceMapVacuum().  Is that really necessary?
Would it be okay to swap the two steps?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] damage control mode

2010-02-08 Thread Josh Berkus

On 2/8/10 7:31 AM, Robert Haas wrote:
 Eh?  Previously we allowed code to go in with documentation to be
 written after feature freeze.  Is this no longer acceptable?

My $0.0201115:

Depends on the feature, I'd say.  If it's sufficiently obvious to test
the feature without full documentation, then sure.  If, however,
reviewers can't adequately test the patch because they don't know all of
the syntax being implemented, then docs are a requirement.

--Josh Berkus

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Order of operations in lazy_vacuum_rel

2010-02-08 Thread Alvaro Herrera

Tom Lane wrote:
 I see that lazy_vacuum_rel() truncates the heap before it does vacuuming
 of the free space map.  Once upon a time this wouldn't have mattered,
 but now it means that cancel interrupts are likely to be ignored for
 the duration of FreeSpaceMapVacuum().  Is that really necessary?
 Would it be okay to swap the two steps?

How would it matter?  Interrupts are not enabled until the transaction
is committed anyway, which must happen after both things have finished ..

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] damage control mode

2010-02-08 Thread David E. Wheeler

On Feb 8, 2010, at 9:34 AM, Josh Berkus wrote:

 Eh?  Previously we allowed code to go in with documentation to be
 written after feature freeze.  Is this no longer acceptable?
 
 My $0.0201115:

I think you need to use a NUMERIC type there, as some calculation has lost 
precision in the float.

Best,

David
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Order of operations in lazy_vacuum_rel

2010-02-08 Thread Tom Lane

Alvaro Herrera alvhe...@commandprompt.com writes:
 Tom Lane wrote:
 I see that lazy_vacuum_rel() truncates the heap before it does vacuuming
 of the free space map.  Once upon a time this wouldn't have mattered,
 but now it means that cancel interrupts are likely to be ignored for
 the duration of FreeSpaceMapVacuum().  Is that really necessary?
 Would it be okay to swap the two steps?

 How would it matter?  Interrupts are not enabled until the transaction
 is committed anyway, which must happen after both things have finished ..

The point would be to not disable interrupts till after doing the FSM
vacuuming.  Ignoring CANCEL for longer than we must is bad.

I'm also looking at not disabling the interrupt until lazy_truncate_heap
determines that it's actually going to do RelationTruncate.  The current
coding disables interrupts without any need in a large fraction of cases.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Writeable CTEs patch

2010-02-08 Thread Marko Tiikkaja

On 2010-02-08 18:42 +0200, Robert Haas wrote:
 On Thu, Feb 4, 2010 at 11:57 AM, Marko Tiikkaja
 Here's an updated patch.  Only changes from the previous patch are
 fixing the above issue and a regression test for it.
 
 - I'm not sure that canSetTag is the right name for the additional
 argument to ExecInsert/ExecUpdate/ExecDelete.  OTOH, I'm not sure it's
 the wrong name either.  But should we use something like
 isTopLevelQuery?

No objection to changing that.

 - It appears that we pull out all of the DML statements first and run
 them in order, but I'm not sure that's the right thing to do.
 Consider:
 
 WITH x AS (INSERT ...), y AS (SELECT ...), z AS (INSERT ...) SELECT ...
 
 I would assume we would do x, CCI, do y, do z, CCI, do main query, but
 I don't think that's what this implements.  The user might be
 surprised to find out that y sees the effects of z.

Hmm.  Right.  That sounds like the right thing to do.  Another option
(which I seem to recall we've discussed before) is to not allow any
SELECT statements between DML WITHs, but I think this is what we should
go for.

 - I think that the comment in analyzeCTE that says /* Check that we
 got something reasonable */ could be fleshed out a bit.  You could
 still reference transformRangeSubselect, for example, but then explain
 why the checks here are different (viz, CTEs can contain DML).

Ok, I'll look into that.

 - The comment for RegisterSnapshotCopy identifies the function name as
 RegisterSnapshot; I think this is a copy-and-pasteo.

You're right.  Will fix.

 - It seems like the gram.y changes for common_table_expr might benefit
 from some factoring; that is, create a production (or find a suitable
 existing one) for statements of the sort that can appear within
 CTEs, and then use that in common_table_expr.  Or maybe this doesn't
 work; I haven't tried it.

My bison-fu is not exactly strong, but I can look at the feasibility of
that.

 - I still don't much like the idea of using DML WITH in error
 messages.  One idea I had (which might suck, but I'm just throwing it
 out there) is to change hasDmlWith to an integer bitmap with a bit for
 each of insert, update, and delete.  But it may be better still to
 just rephrase the error messages.

I don't see how that would work.  We'd still potentially have many
different types of DML operations to deal with and that wouldn't help at
all at distinguishing which operation actually caused the error.  Or did
I misunderstand?

 Could we just write, e.g.
 non-SELECT statements are not allowed within a cursor declaration?
 Or we could say INSERT, UPDATE, and DELETE statements are not allowed
 within a cursor declaration, but I'm thinking we may want to allow
 things like COPY and EXPLAIN inside CTEs in the future, too, and
 they'll presumably be treated similarly to DML.

INSERT, UPDATE and DELETE is quite long and non-SELECT is a bit
clumsy IMO.  But I don't really have anything better to offer, either.


Regards,
Marko Tiikkaja

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] damage control mode

2010-02-08 Thread Merlin Moncure

On Sun, Feb 7, 2010 at 11:23 PM, Robert Haas robertmh...@gmail.com wrote:
 On Sun, Feb 7, 2010 at 3:37 PM, Josh Berkus j...@agliodbs.com wrote:
 As between the two, I get the feeling that there is more interest in
 writeable CTEs.  But that impression might be wrong, since it's an
 unscientific recollection of discussions on -hackers; which are
 themselves not representative of anything.

 Writeable CTE is definitely the bigger feature.  Effectively, it allows
 people to do in a single query data-transformation operations which
 would have taken a stored procedure before.  Think of it as comparable
 to the introduction of callbacks in Perl for coolness.

 Now if I knew what callbacks in Perl were, I'd probably be impressed.
 You mean closures?

 I have not looked at the window functions patch at all, and I haven't
 looked at the latest version of writeable CTEs, either.  I will try to
 spend some time on it in the next couple of days.  My feeling about
 the last version is that it lacked a lot in the documentation
 department, and also in the comments department.  Since I don't know
 that code very well, that made it hard for me to assess technical
 correctness.

 Hmmm, that's potentially lethal.  David Fetter has been doing a lot of
 presentations on the feature; surely he could turn them into some
 documentation?  David?

 I would be 100% in favor of some more help on the documentation.  I do
 plan to reread this patch, but I don't know that I can cover the
 amount of work that needs to be done myself, and as you say, lack of
 adequate documentation could very well kill this patch.  In fact, I'll
 go so far as to say it's one of the most likely reasons why this patch
 might not get in.  So any resources we can bring to bear on that issue
 would be well spent.

I'm on board to work on the documentation.  I think with a few hours
of work it should be in a reasonable state.

merlin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Strange heuristic in analyze.c

2010-02-08 Thread Greg Stark

On Fri, Feb 5, 2010 at 8:53 PM, Bruce Momjian br...@momjian.us wrote:
 Do you want a C comment to document this problem?

Well I would rather a better heuristic :)

We really need some statistics nerds in this group who can pipe up
when these kinds of issues come up. There must be a good way to
estimate the probability that we've seen all distinct values.

-- 
greg

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [CFReview] Red-Black Tree

2010-02-08 Thread Teodor Sigaev


That looks pretty good.  I confess I don't fully understand why it
works.  If we're inserting a bunch of equal-key entries, why does it
matter what order we insert them in?  Is there some code in here
(where?) that breaks ties on the basis of where they are in the input
data?


Entries to insert into GIN are unique by extractEntriesSU() call. So, instead of 
'{50,50,50}' array only one element 50 will be inserted.





I think that the code in ginInsertRecordBA() is needlessly complex.
As far as I can see, nNodesOnCurrentLevel is always exactly one more
than nNodesOnPreviousLevel, and I think step is also basically
redundant with both of these although the relationship is a little
more complex.  What I would suggest is something like:

- initialize step to the largest power of 2 s.t. step  nentry
- while step  0
-- for (i = step; true; i += 2 * step)
--- insert entry #i-1
--- if i  nentry - (2 * step)  /* must test before incrementing i, to
guard against overflow */
 break
-- step = step / 2

Good idea, implemented.



Typos:

bunary -  binary
This insertion order decreases number of rebalancing for tree -
should be number of rebalancings
castomized -  customized

Fixed

--
Teodor Sigaev   E-mail: teo...@sigaev.ru
   WWW: http://www.sigaev.ru/


rbtree-0.12.gz
Description: Unix tar archive

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Re: Faster CREATE DATABASE by delaying fsync (was 8.4.1 ubuntu karmic slow createdb)

2010-02-08 Thread Greg Stark

On Mon, Feb 8, 2010 at 4:53 AM, Robert Haas robertmh...@gmail.com wrote:
 On Sun, Feb 7, 2010 at 10:09 PM, Alvaro Herrera
 Yeah, it seems there are two patches here -- one is the addition of
 fsync_fname() and the other is the fsync_prepare stuff.

Sorry, I'm just catching up on my mail from FOSDEM this past weekend.

I had come to the same conclusion as Greg that I might as well just
commit it with Tom's pg_flush_data() name and we can decide later if
and when we have pg_fsync_start()/pg_fsync_finish() whether it's worth
keeping two apis or not.

So I was just going to commit it like that but I discovered last week
that I don't have cvs write access set up yet. I'll commit it as soon
as I generate a new ssh key and Dave installs it, etc. I intentionally
picked a small simple patch that nobody was waiting on because I knew
there was a risk of delays like this and the paperwork. I'm nearly
there.

-- 
greg

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] damage control mode

2010-02-08 Thread Oleg Bartunov


On Mon, 8 Feb 2010, Josh Berkus wrote:


On 2/8/10 7:31 AM, Robert Haas wrote:

Eh?  Previously we allowed code to go in with documentation to be
written after feature freeze.  Is this no longer acceptable?


My $0.0201115:

Depends on the feature, I'd say.  If it's sufficiently obvious to test
the feature without full documentation, then sure.  If, however,
reviewers can't adequately test the patch because they don't know all of
the syntax being implemented, then docs are a requirement.


+1

Regards,
Oleg
_
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: o...@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] review: More frame options in window functions

2010-02-08 Thread Tom Lane

Hitoshi Harada umi.tan...@gmail.com writes:
 2010/1/23 Robert Haas robertmh...@gmail.com:
 Would it make sense to pull some of the infrastructure bits out of
 this patch and commit those bits separately, so as to reduce the size
 of the main patch?  In particular, the AggGetMemoryContext() stuff
 looks like a good candidate for that treatment.

 Fair enough. Attached contains that part only.

I started looking at this patch.  I believe that we should commit the 
AggGetMemoryContext API function --- *not* the window context
management changes that you included here, but only the API abstraction
--- to be sure that that gets into 9.0 so that third-party aggregate
functions can start relying on it instead of looking directly at the
AggState or WindowAggState node.  The rest of the patch might or might
not make it, but we can at least help people future-proof their code.

I'm fairly desperately unhappy with the RANGE PRECEDING/FOLLOWING parts
of the patch.  We have expended a great deal of sweat over the years
to avoid hard-wiring assumptions about particular operator names into
the code, but this patch is blithely ignoring that history and assuming
that + and - will do the right thing.  Also LookupOperName is
probably not the right thing, since it insists on exact or
binary-compatible match.  To the extent that there is any justification
at all for assuming that +/- are the right operators, it is that the
spec speaks in terms of the range bounds being VSK+V2F etc --- but if
someone were to actually write out such an expression, the parser would
allow for implicit casts to happen, so this code is not implementing
what that expression would produce.  Plus the results are dependent on
the current search path, which for example means it'll fail if the
window sort column is a user-defined type whose operators happen to be
out of path at the moment --- even though we would have found its
default sort opclass despite that.  And lastly, I'm totally unconvinced
that it's a good idea to accept an operator that returns a type other
than the type of the window sort column.  That seems to eliminate
whatever little protection we had against picking up an unsuitable
operator; and it complicates the code as well.

Given the lack of time remaining in this CF, I'm tempted to propose
ripping out the RANGE support and just trying to get ROWS committed.
That should be substantially less controversial from a semantic
standpoint, and it still seems like a considerable improvement in
functionality.

Thoughts?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Order of operations in lazy_vacuum_rel

2010-02-08 Thread Alvaro Herrera

Tom Lane wrote:
 Alvaro Herrera alvhe...@commandprompt.com writes:
  Tom Lane wrote:
  I see that lazy_vacuum_rel() truncates the heap before it does vacuuming
  of the free space map.  Once upon a time this wouldn't have mattered,
  but now it means that cancel interrupts are likely to be ignored for
  the duration of FreeSpaceMapVacuum().  Is that really necessary?
  Would it be okay to swap the two steps?
 
  How would it matter?  Interrupts are not enabled until the transaction
  is committed anyway, which must happen after both things have finished ..
 
 The point would be to not disable interrupts till after doing the FSM
 vacuuming.  Ignoring CANCEL for longer than we must is bad.

Oh, I see.  I guess the answer is that it depends on what happens if you
interrupt after vacuuming the FSM.  I have no idea what that is supposed
to accomplish so I'm of no help here.  FreeSpaceMapVacuum says it's
about fixing inconsistencies in the FSM, so I'm guessing that it's not
critical.

 I'm also looking at not disabling the interrupt until lazy_truncate_heap
 determines that it's actually going to do RelationTruncate.  The current
 coding disables interrupts without any need in a large fraction of cases.

Hmm, yeah ... that moves the code to the innards of lazy_truncate_heap.
Seems reasonable.

FWIW I notice that RelationTruncate contains an outdated comment at the
top about the 'fsm' function argument which is nowadays no longer an
argument.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Confusion over Python drivers

2010-02-08 Thread Florian Weimer

* Jeff Davis:

 I have written up a set of guidelines for driver development based on
 what I learned working on ruby-pg:

 http://wiki.postgresql.org/wiki/Driver_development

Interesting, thanks.

I'm contemplating to create a new language binding for libpq (or, to
be more precise, turn an existing language binding into something that
can be published).  I've been agonizing a bit over how to create a
bridge between the host language type system and the PostgreSQL type
system.  If I understand you correctly, you suggest to leave
everything as strings.  This solution has the appeal of being
implemented easily.  It also sidesteps a lot of issues revolving
around different representation choices for numbers.

Do you really suggest to preserve the PQexecParams API verbatim, that
is, passing in three arrays containing type, value, and format?  That
seems to be a bit problematic.  I suspect the common case will be to
use unknown types, text format, and the default conversion from values
to strings.  Only for BYTEA values, something else is required, and
I'm wondering how to encode that (the host language doesn't offer a
distinction between text and binary strings).

Conversely, for result sets, I'm tempted to transparently decode
escaped BYTEA columns.

 Note that the ruby-pg driver doesn't 100% adhere to those standards
 (encoding is the primary problem, and that will be fixed).

Lack of Unicode support means that I can punt that to application
authors, I guess.

By the way, the downside of using strings everywhere is that your
binding API will most likely not work with SQLite (or any other
SQL-like database which lacks column type information).

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Re: Faster CREATE DATABASE by delaying fsync (was 8.4.1 ubuntu karmic slow createdb)

2010-02-08 Thread Andres Freund

On Monday 08 February 2010 19:34:01 Greg Stark wrote:
 On Mon, Feb 8, 2010 at 4:53 AM, Robert Haas robertmh...@gmail.com wrote:
  On Sun, Feb 7, 2010 at 10:09 PM, Alvaro Herrera
  
  Yeah, it seems there are two patches here -- one is the addition of
  fsync_fname() and the other is the fsync_prepare stuff.
 
 Sorry, I'm just catching up on my mail from FOSDEM this past weekend.
 
 I had come to the same conclusion as Greg that I might as well just
 commit it with Tom's pg_flush_data() name and we can decide later if
 and when we have pg_fsync_start()/pg_fsync_finish() whether it's worth
 keeping two apis or not.
 
 So I was just going to commit it like that but I discovered last week
 that I don't have cvs write access set up yet. I'll commit it as soon
 as I generate a new ssh key and Dave installs it, etc. I intentionally
 picked a small simple patch that nobody was waiting on because I knew
 there was a risk of delays like this and the paperwork. I'm nearly
 there.
Do you still want me to split the patches into two or do you want to do it 
yourself?
One in multiple versions for the directory fsync and another one for 9.0?

Andres

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] Writeable CTEs and empty relations

2010-02-08 Thread Marko Tiikkaja

Hi,

While playing around with another issue with the patch, I came across
the following:

= create table foo(a int);
CREATE TABLE
= with t as (insert into foo values(0)) select * from foo;
 a
---
(0 rows)

I traced this down to heapam.c, which has this:

/*
 * return null immediately if relation is empty
 */
 if (scan-rs_nblocks == 0)
 {
 Assert(!BufferIsValid(scan-rs_cbuf));
 tuple-t_data = NULL;
 return;
 }

and

/*
* Determine the number of blocks we have to scan.
*
* It is sufficient to do this once at scan start, since any tuples added
* while the scan is in progress will be invisible to my snapshot anyway.
* (That is not true when using a non-MVCC snapshot.  However, we couldn't
* guarantee to return tuples added after scan start anyway, since they
* might go into pages we already scanned.  To guarantee consistent
* results for a non-MVCC snapshot, the caller must hold some higher-level
* lock that ensures the interesting tuple(s) won't change.)
*/
scan-rs_nblocks = RelationGetNumberOfBlocks(scan-rs_rd);


This doesn't exactly work anymore since we modify the snapshot after
calling ExecInitScan().  I'm not really familiar with this part of the
code, so I'm asking: is there a simple enough way around this?  Would
updating scan-rs_nblocks before scanning the first tuple be OK?


Regards,
Marko Tiikkaja

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] review: More frame options in window functions

2010-02-08 Thread Josh Berkus

On 2/8/10 11:17 AM, Tom Lane wrote:
 Given the lack of time remaining in this CF, I'm tempted to propose
 ripping out the RANGE support and just trying to get ROWS committed.
 That should be substantially less controversial from a semantic
 standpoint, and it still seems like a considerable improvement in
 functionality.

+1

--Josh Berkus

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Confusion over Python drivers

2010-02-08 Thread Florian Weimer

* Andrew McNamara:

Any other suggestions before I turn the above into a roadmap page on the 
wiki?

 I got sick of the constant stream of escaping bugs impacting on psycopg
 and pyPgSQL, and wrote my own DB-API driver, using the more modern
 libpq/binary/protocol 3 APIs where ever possible. The result is BSD
 licensed:

 http://code.google.com/p/ocpgdb/

I saw your note that you have to specify the types for date values
etc.  Is this really desirable or even necessary?  Can't you specify
the type as unknown (OID 705, I believe)?

At work, we recently used to typelessness of Perl's DBD::Pg with great
effect, introducing a more compact, type-safe representation for a few
columns, without having to change all the existing Perl scripts
accessing the database.  That's why I'm wondering...

(And we might be using Python instead of Perl today.  Lack of a decent
PostgreSQL module for Python meant it was very hard to argue against
using Perl ...)

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Order of operations in lazy_vacuum_rel

2010-02-08 Thread Heikki Linnakangas

Alvaro Herrera wrote:
 Tom Lane wrote:
 The point would be to not disable interrupts till after doing the FSM
 vacuuming.  Ignoring CANCEL for longer than we must is bad.
 
 Oh, I see.  I guess the answer is that it depends on what happens if you
 interrupt after vacuuming the FSM.  I have no idea what that is supposed
 to accomplish so I'm of no help here.  FreeSpaceMapVacuum says it's
 about fixing inconsistencies in the FSM, so I'm guessing that it's not
 critical.

Yeah, interrupting FreeSpaceMapVacuum (or right after it) is harmless.

 FWIW I notice that RelationTruncate contains an outdated comment at the
 top about the 'fsm' function argument which is nowadays no longer an
 argument.

Thanks, fixed.

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [CFReview] Red-Black Tree

2010-02-08 Thread Alvaro Herrera


It seems a bit strange to have all the rb_free_recursive support and not
use it anywhere ... and a freefunc callback even, whose only caller
seems to set as null currently.  Hmm, even in the knngist patch the
rb_freefunc stuff is unused.  How do we now that it works?

(What, for example, if we were to allocate multiple nodes in a single
palloc chunk?  I'm not familiar with this stuff but that seems
plausible)

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Knngist for 8.5

2010-02-08 Thread Oleg Bartunov


List of holidays by country
http://en.wikipedia.org/wiki/List_of_holidays_by_country

I'm not sure how it's valid, though. In Russia, for example,
russian goverment decreed holidays 1-10 January, 2010. I think next time we
should consider december-january as a half.

Oleg
On Sun, 7 Feb 2010, Robert Haas wrote:


2010/2/7 Oleg Bartunov o...@sai.msu.su:

I understand your complaints. I think, the real problem is that some of us
live in the part of word with long holidays in December, while we in Russia
have very long holidays in January. So, about a month we couldn't
synchronize developers and reviewers.  I'm not sure if we took this into
account.


Yeah, that definitely made things harder.  I had the feeling when I
started looking at this stuff over Christmas that it was going to take
a really determined and non-stop effort to get it all done, and we
haven't quite had that, either on the reviewing end or on your end.
Your holidays slowed things down, but we also had a quite small pool
of round-robin reviewers for this CF, and I couldn't get anyone to
sign on for knngist.  Mark Cave-Ayland eventually volunteered but that
was relatively late, and then he hasn't posted anything yet because he
got involved in helping with rbtree (which by the way isn't quite
done; we should really try to finish that up).  So I think it was a
combination of things.

By the way, I wish I had your holiday schedule!  Can you send me a few of those?


In regard to the knngist patch I want to claim, that I and Teodor are here
and willing to answer any questions.


I really hope that Mark (or someone else) will post a review before
this CommitFest is over.  I believe it is out of reach to get this
committed for this CF, but it would sure be nice to see it get at
least some review.  I would like to review it myself at some point,
but I think right now I need to focus on things that are a little
further along and have a better chance of getting in.

...Robert



Regards,
Oleg
_
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: o...@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Writeable CTEs patch

2010-02-08 Thread Robert Haas

On Mon, Feb 8, 2010 at 1:01 PM, Marko Tiikkaja
marko.tiikk...@cs.helsinki.fi wrote:
 Could we just write, e.g.
 non-SELECT statements are not allowed within a cursor declaration?
 Or we could say INSERT, UPDATE, and DELETE statements are not allowed
 within a cursor declaration, but I'm thinking we may want to allow
 things like COPY and EXPLAIN inside CTEs in the future, too, and
 they'll presumably be treated similarly to DML.

 INSERT, UPDATE and DELETE is quite long and non-SELECT is a bit
 clumsy IMO.  But I don't really have anything better to offer, either.

Yeah, I don't feel good about INSERT, UPDATE, and DELETE because in
most of the relevant contexts the list might get longer if in the
future we allow things like EXPLAIN and COPY within CTEs.  I think
Non-SELECT statement is reasonably clear, though; people might not
know which things are statements, but the message implies that SELECT
is one such thing, and not the one that's the problem, which should
get them pointed in the right direction.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Knngist for 8.5

2010-02-08 Thread Robert Haas

2010/2/8 Oleg Bartunov o...@sai.msu.su:
 List of holidays by country
 http://en.wikipedia.org/wiki/List_of_holidays_by_country

 I'm not sure how it's valid, though. In Russia, for example,
 russian goverment decreed holidays 1-10 January, 2010. I think next time we
 should consider december-january as a half.

Oh, I wasn't asking for a list of your holidays - I was just wishing
that I had as many as it sounds like you do.  :-)

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Writeable CTEs patch

2010-02-08 Thread Alvaro Herrera

Robert Haas escribió:

 Yeah, I don't feel good about INSERT, UPDATE, and DELETE because in
 most of the relevant contexts the list might get longer if in the
 future we allow things like EXPLAIN and COPY within CTEs.  I think
 Non-SELECT statement is reasonably clear, though; people might not
 know which things are statements, but the message implies that SELECT
 is one such thing, and not the one that's the problem, which should
 get them pointed in the right direction.

DML statements other than SELECT perhaps?

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Writeable CTEs patch

2010-02-08 Thread Alvaro Herrera

Robert Haas escribió:
 On Mon, Feb 8, 2010 at 1:01 PM, Marko Tiikkaja
 marko.tiikk...@cs.helsinki.fi wrote:
  Could we just write, e.g.
  non-SELECT statements are not allowed within a cursor declaration?
  Or we could say INSERT, UPDATE, and DELETE statements are not allowed
  within a cursor declaration, but I'm thinking we may want to allow
  things like COPY and EXPLAIN inside CTEs in the future, too, and
  they'll presumably be treated similarly to DML.
 
  INSERT, UPDATE and DELETE is quite long and non-SELECT is a bit
  clumsy IMO.  But I don't really have anything better to offer, either.
 
 Yeah, I don't feel good about INSERT, UPDATE, and DELETE because in
 most of the relevant contexts the list might get longer if in the
 future we allow things like EXPLAIN and COPY within CTEs.  I think
 Non-SELECT statement is reasonably clear, though; people might not
 know which things are statements, but the message implies that SELECT
 is one such thing, and not the one that's the problem, which should
 get them pointed in the right direction.

Hmm, how about VALUES?  Isn't that a statement on its own right, that
would similarly unaffected?

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Confusion over Python drivers

2010-02-08 Thread Jeff Davis

On Mon, 2010-02-08 at 20:29 +0100, Florian Weimer wrote:
 I'm contemplating to create a new language binding for libpq (or, to
 be more precise, turn an existing language binding into something that
 can be published).  I've been agonizing a bit over how to create a
 bridge between the host language type system and the PostgreSQL type
 system.  If I understand you correctly, you suggest to leave
 everything as strings.  This solution has the appeal of being
 implemented easily.  It also sidesteps a lot of issues revolving
 around different representation choices for numbers.

Agreed. Ultimately, the conversion has to be done somewhere, but I don't
believe the driver is the place for it. Type conversions are always
going to be imperfect, and this has some important consequences:
 * The type conversion system will be endlessly tweaked to improve it
 * Developers will always run into problems with it in any complex
application, so we need to allow them to circumvent the system and do it
themselves when necessary.

Both of these things point to another layer on top of the driver itself.
It could be some extra convenience functions that come with the driver,
or an entirely separate layer (like ActiveRecord). But if we always let
the developer have access to the full power of libpq, it limits the
damage that can be done by a slightly-too-creative API on top of it.

 Do you really suggest to preserve the PQexecParams API verbatim, that
 is, passing in three arrays containing type, value, and format?  That
 seems to be a bit problematic.  I suspect the common case will be to
 use unknown types, text format, and the default conversion from values
 to strings.

I tried to address this specifically in the document:

For example: it should be easy to pass parameters so that PQexecParams
(and others) can be used, avoiding SQL injection risks. The important
thing is to maintain close to a one-to-one mapping between libpq and the
driver's API, and to provide all of the functionality of libpq.

In ruby-pg, you can just do:

  conn.exec(INSERT INTO foo VALUES($1), [Jeff])

And I think that's appropriate. What I'm saying is that there should
still exist some way to pass explicit types or formats (although that
should still be easier than it is in C ;). Here's the long form:

  conn.exec(INSERT INTO foo VALUES($1),
[{:value = Jeff, :format = 0, :type = 0}])

The nice thing about that format is that you can do the easy thing for
most of the parameters in a query, but then choose binary format for
that one BYTEA parameter. That's because, in ruby, you can mix strings
and hashes in the same array.

So I'm not saying we should make everyone code ruby that looks like C.
I'm saying that the job of the driver is to provide full access to
libpq, and anything beyond that should be an optional convenience
routine, and should be free of magic and cleverness (that's the job of a
higher layer).

 Conversely, for result sets, I'm tempted to transparently decode
 escaped BYTEA columns.

Consider the following ruby-pg program, where you have two empty tables
foo and bar, each with a single BYTEA column b:

  conn = PGconn.connect(...)
  conn.exec(INSERT INTO foo VALUES($1), [000])

  # copy the single value in foo into bar
  val = conn.exec(SELECT b FROM foo LIMIT 1)[0][b]
  conn.exec(INSERT INTO bar VALUES($1), [val])

That copies value so that foo and bar have the same contents: a 4 byte
value \000. What would happen though, if val was transparently
decoded? It would decode it once in ruby, and again inside of postgres
(in byteain), leaving you with a one byte value in bar, even though foo
has a four-byte value.

I really think that only higher layers should implement that kind of
magic, no matter how obvious it may seem that the user wants something
extra.

  Note that the ruby-pg driver doesn't 100% adhere to those standards
  (encoding is the primary problem, and that will be fixed).
 
 Lack of Unicode support means that I can punt that to application
 authors, I guess.

Ruby 1.9+ and Python 3.0+ both have string encoding models that can't
just be ignored. We could punt it by always returning byte sequences
rather than strings, but I think that's a particularly extreme version
of my philosophy of not trying to convert between types.

 By the way, the downside of using strings everywhere is that your
 binding API will most likely not work with SQLite (or any other
 SQL-like database which lacks column type information).

I am trying to develop standards suitable for PostgreSQL drivers based
on libpq. These are not meant to be standards for a database-agnostic
API, standards for a high-level database adapter, or even standards for
a driver written against something other than libpq (like the JDBC
driver).

Thank you for your comments. I will try to integrate these thoughts into
the document.

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:

Re: [HACKERS] Add on_plperl_init and on_plperlu_init to plperl UPDATE 3 [PATCH]

2010-02-08 Thread Tim Bunce

On Mon, Feb 08, 2010 at 01:44:16PM +, Tim Bunce wrote:
 
  I'll continue reviewing the patch, but these things at least need fixing.

Here's an updated patch. The only changes relative to the previous
version are in the docs, as I outlined in the previous message.

I'll add it to the commitfest.

Tim.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Add on_plperl_init and on_plperlu_init to plperl UPDATE 3 [PATCH]

2010-02-08 Thread Tim Bunce

[Sigh. This email, unlike the previous, actually includes the patch.]

On Mon, Feb 08, 2010 at 01:44:16PM +, Tim Bunce wrote:
 
  I'll continue reviewing the patch, but these things at least need fixing.

Here's an updated patch. The only changes relative to the previous
version are in the docs, as I outlined in the previous message.

I'll add it to the commitfest.

Tim.
diff --git a/doc/src/sgml/plperl.sgml b/doc/src/sgml/plperl.sgml
index 7018624..f742f0b 100644
*** a/doc/src/sgml/plperl.sgml
--- b/doc/src/sgml/plperl.sgml
*** $$ LANGUAGE plperl;
*** 748,753 
--- 748,759 
 literalreturn $_SHARED{myquote}-gt;($_[0]);/literal
 at the expense of readability.)
/para
+ 
+   para
+ The varname%_SHARED/varname variable and other global state within
+ the language is public data, available to all PL/Perl functions within a
+ session. Use with care.
+   /para
   /sect1
  
   sect1 id=plperl-trusted
*** CREATE TRIGGER test_valid_id_trig
*** 1044,1069 
  
variablelist
  
!  varlistentry id=guc-plperl-on-perl-init xreflabel=plperl.on_perl_init
!   termvarnameplperl.on_perl_init/varname (typestring/type)/term
indexterm
!primaryvarnameplperl.on_perl_init/ configuration parameter/primary
/indexterm
listitem
 para
!Specifies perl code to be executed when a perl interpreter is first initialized.
 The SPI functions are not available when this code is executed.
 If the code fails with an error it will abort the initialization of the interpreter
 and propagate out to the calling query, causing the current transaction
 or subtransaction to be aborted.
 /para
 para
!The perl code is limited to a single string. Longer code can be placed
!into a module and loaded by the literalon_perl_init/ string.
 Examples:
  programlisting
! plplerl.on_perl_init = '$ENV{NYTPROF}=start=no; require Devel::NYTProf::PgPLPerl'
! plplerl.on_perl_init = 'use lib /my/app; use MyApp::PgInit;'
  /programlisting
 /para
 para
--- 1050,1076 
  
variablelist
  
!  varlistentry id=guc-plperl-on-init xreflabel=plperl.on_init
!   termvarnameplperl.on_init/varname (typestring/type)/term
indexterm
!primaryvarnameplperl.on_init/ configuration parameter/primary
/indexterm
listitem
 para
!Specifies Perl code to be executed when a Perl interpreter is first initialized
!and before it is specialized for use by literalplperl/ or literalplperlu/.
 The SPI functions are not available when this code is executed.
 If the code fails with an error it will abort the initialization of the interpreter
 and propagate out to the calling query, causing the current transaction
 or subtransaction to be aborted.
 /para
 para
!The Perl code is limited to a single string. Longer code can be placed
!into a module and loaded by the literalon_init/ string.
 Examples:
  programlisting
! plplerl.on_init = '$ENV{NYTPROF}=start=no; require Devel::NYTProf::PgPLPerl'
! plplerl.on_init = 'use lib /my/app; use MyApp::PgInit;'
  /programlisting
 /para
 para
*** plplerl.on_perl_init = 'use lib /my/app
*** 1077,1082 
--- 1084,1129 
/listitem
   /varlistentry
  
+  varlistentry id=guc-plperl-on-plperl-init xreflabel=plperl.on_plperl_init
+   termvarnameplperl.on_plperl_init/varname (typestring/type)/term
+   termvarnameplperl.on_plperlu_init/varname (typestring/type)/term
+   indexterm
+primaryvarnameplperl.on_plperl_init/ configuration parameter/primary
+   /indexterm
+   indexterm
+primaryvarnameplperl.on_plperlu_init/ configuration parameter/primary
+   /indexterm
+   listitem
+para
+These parameters specify Perl code to be executed when the
+literalplperl/, or literalplperlu/ language is first used in a
+session.  Changes to these parameters after the corresponding language
+has been used will have no effect.
+The SPI functions are not available when this code is executed.
+Only superusers can change these settings.
+The Perl code in literalplperl.on_plperl_init/ can only perform trusted operations.
+/para
+para
+The effect of setting these parameters is very similar to executing a
+literalDO/ command with the Perl code before any other use of the
+language.  The parameters are useful when you want to execute the Perl
+code automatically on every connection, or when a connection is not
+interactive. The parameters can be used by non-superusers by having a
+superuser execute an literalALTER USER ... SET .../ command.
+For example:
+ programlisting
+ ALTER USER joe SET plplerl.on_plperl_init = '$_SHARED{debug} = 1';
+

Re: [HACKERS] review: More frame options in window functions

2010-02-08 Thread Tom Lane

I wrote:
 I started looking at this patch.  I believe that we should commit the 
 AggGetMemoryContext API function --- *not* the window context
 management changes that you included here, but only the API abstraction
 --- to be sure that that gets into 9.0 so that third-party aggregate
 functions can start relying on it instead of looking directly at the
 AggState or WindowAggState node.  The rest of the patch might or might
 not make it, but we can at least help people future-proof their code.

I have committed that little part.  I revised the function API to be

/* AggCheckCallContext can return one of the following codes, or 0: */
#define AGG_CONTEXT_AGGREGATE1/* regular aggregate */
#define AGG_CONTEXT_WINDOW   2/* window function */

extern int  AggCheckCallContext(FunctionCallInfo fcinfo,
MemoryContext *aggcontext);

so that it would be conveniently usable in places that just want to
check aggregate-ness and don't need to fetch a memory context; and
with the thought that maybe someday there would be more than two
possible call contexts.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Writeable CTEs patch

2010-02-08 Thread Robert Haas

On Mon, Feb 8, 2010 at 3:30 PM, Alvaro Herrera
alvhe...@commandprompt.com wrote:
 Robert Haas escribió:
 On Mon, Feb 8, 2010 at 1:01 PM, Marko Tiikkaja
 marko.tiikk...@cs.helsinki.fi wrote:
  Could we just write, e.g.
  non-SELECT statements are not allowed within a cursor declaration?
  Or we could say INSERT, UPDATE, and DELETE statements are not allowed
  within a cursor declaration, but I'm thinking we may want to allow
  things like COPY and EXPLAIN inside CTEs in the future, too, and
  they'll presumably be treated similarly to DML.
 
  INSERT, UPDATE and DELETE is quite long and non-SELECT is a bit
  clumsy IMO.  But I don't really have anything better to offer, either.

 Yeah, I don't feel good about INSERT, UPDATE, and DELETE because in
 most of the relevant contexts the list might get longer if in the
 future we allow things like EXPLAIN and COPY within CTEs.  I think
 Non-SELECT statement is reasonably clear, though; people might not
 know which things are statements, but the message implies that SELECT
 is one such thing, and not the one that's the problem, which should
 get them pointed in the right direction.

 Hmm, how about VALUES?  Isn't that a statement on its own right, that
 would similarly unaffected?

Ouch.  You're right, that's a problem.  :-(

TABLE is a similar case.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Confusion over Python drivers

2010-02-08 Thread Andrew McNamara

 http://code.google.com/p/ocpgdb/

I saw your note that you have to specify the types for date values
etc.  Is this really desirable or even necessary?  Can't you specify
the type as unknown (OID 705, I believe)?

At work, we recently used to typelessness of Perl's DBD::Pg with great
effect, introducing a more compact, type-safe representation for a few
columns, without having to change all the existing Perl scripts
accessing the database.  That's why I'm wondering...

I can't see how this would work with binary query parameters - the server
will see a blob of binary data and have no way to know what it represents.

I presume DBD::Pg is using text parameters, rather than binary.

-- 
Andrew McNamara, Senior Developer, Object Craft
http://www.object-craft.com.au/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [PATCH] could not reattach to shared memory on Windows

2010-02-08 Thread Etienne Dube


Hi,

We've come across this issue on 8.2.15 on a Windows Server 2008 
instance. I noticed the patch hasn't been applied to the 8.2 branch yet. 
Any chances that this will be part of an eventual 8.2.16 release? Do you 
need more testing and feedback before commiting the patch?


Thanks,

Etienne Dube



* *From*: Magnus Hagander mag...@hagander.net
* *To*: Tom Lane t...@sss.pgh.pa.us
* *Cc*: Tsutomu Yamada tsut...@sraoss.co.jp, Alvaro Herrera
  alvhe...@commandprompt.com, pgsql-hackers@postgresql.org, Dave
  Page dp...@pgadmin.org
* *Subject*: Re: [PATCH] could not reattach to shared memory on
  Windows
* *Date*: Tue, 11 Aug 2009 17:14:08 +0200
* *Message-id*:
  9837222c0908110814n414b2fcbxcaf7c0e1fcc05...@mail.gmail.com
  http://archives.postgresql.org/pgsql-hackers/2009-08/msg00894.php


On Tue, Aug 11, 2009 at 16:30, Magnus Hagandermag...@hagander.net wrote:
 On Mon, Aug 10, 2009 at 19:33, Magnus Hagandermag...@hagander.net wrote:
 On Mon, Aug 10, 2009 at 16:58, Tom Lanet...@sss.pgh.pa.us wrote:
 Magnus Hagander mag...@hagander.net writes:
 On Mon, Aug 10, 2009 at 16:10, Tom Lanet...@sss.pgh.pa.us wrote:
 8.2 as well, no?

 8.2 has a different shmem implementation - the one that emulates sysv
 shmem. The patch will need to be changed around for that, and I
 haven't looked at that. It may be worthwhile to do that, but it's a
 separate patch, so let's get it out in 8.3 and 8.4 first.

 If it's at all hard to do, I could see deprecating 8.2 for Windows
 instead.

 I haven't looked at how much work it would be at all yet. So let's do
 that before we decide to deprecate anything. As mentioned downthread,
 8.2 is a very widespread release, and we really want to avoid
 deprecating it.

 Here's an attempt at a backport to 8.2. I haven't examined it  in
 detail, but it passes make check on mingw.

 Comments?

I've also built a binary that should be copy:able on top of an 8.2.13
installation made from the standard installer, to test this feature.
Anybody on 8.2 on Windows, please give it a shot and let us know how
it works.

http://www.hagander.net/pgsql/postgres_exe_virtualalloc_8_2.zip


--
 Magnus Hagander
  



 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

  


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [PATCH] could not reattach to shared memory on Windows

2010-02-08 Thread Magnus Hagander

IIRC, we've had zero reports on whether the patch worked at all on 8.2
in an environment where the problem actually existed. So yes, some
testing and feedback would be much apprecaited.

//Magnus

2010/2/8 Etienne Dube etd...@gmail.com:
 Hi,

 We've come across this issue on 8.2.15 on a Windows Server 2008 instance. I 
 noticed the patch hasn't been applied to the 8.2 branch yet. Any chances that 
 this will be part of an eventual 8.2.16 release? Do you need more testing and 
 feedback before commiting the patch?

 Thanks,

 Etienne Dube


    * *From*: Magnus Hagander mag...@hagander.net
    * *To*: Tom Lane t...@sss.pgh.pa.us
    * *Cc*: Tsutomu Yamada tsut...@sraoss.co.jp, Alvaro Herrera
      alvhe...@commandprompt.com, pgsql-hackers@postgresql.org, Dave
      Page dp...@pgadmin.org
    * *Subject*: Re: [PATCH] could not reattach to shared memory on
      Windows
    * *Date*: Tue, 11 Aug 2009 17:14:08 +0200
    * *Message-id*:
      9837222c0908110814n414b2fcbxcaf7c0e1fcc05...@mail.gmail.com
      http://archives.postgresql.org/pgsql-hackers/2009-08/msg00894.php

 
 On Tue, Aug 11, 2009 at 16:30, Magnus Hagandermag...@hagander.net wrote:
  On Mon, Aug 10, 2009 at 19:33, Magnus Hagandermag...@hagander.net wrote:
  On Mon, Aug 10, 2009 at 16:58, Tom Lanet...@sss.pgh.pa.us wrote:
  Magnus Hagander mag...@hagander.net writes:
  On Mon, Aug 10, 2009 at 16:10, Tom Lanet...@sss.pgh.pa.us wrote:
  8.2 as well, no?
 
  8.2 has a different shmem implementation - the one that emulates sysv
  shmem. The patch will need to be changed around for that, and I
  haven't looked at that. It may be worthwhile to do that, but it's a
  separate patch, so let's get it out in 8.3 and 8.4 first.
 
  If it's at all hard to do, I could see deprecating 8.2 for Windows
  instead.
 
  I haven't looked at how much work it would be at all yet. So let's do
  that before we decide to deprecate anything. As mentioned downthread,
  8.2 is a very widespread release, and we really want to avoid
  deprecating it.
 
  Here's an attempt at a backport to 8.2. I haven't examined it  in
  detail, but it passes make check on mingw.
 
  Comments?

 I've also built a binary that should be copy:able on top of an 8.2.13
 installation made from the standard installer, to test this feature.
 Anybody on 8.2 on Windows, please give it a shot and let us know how
 it works.

 http://www.hagander.net/pgsql/postgres_exe_virtualalloc_8_2.zip


 --
  Magnus Hagander


  Me: http://www.hagander.net/
  Work: http://www.redpill-linpro.com/



 --
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers




-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Re: [GENERAL] FM format modifier does not remove leading zero from year

2010-02-08 Thread Bruce Momjian

Tom Lane wrote:
 Alvaro Herrera alvhe...@commandprompt.com writes:
  What's the point of not following Oracle here, since this is solely an
  Oracle compatibility function?
 
 Changing FM's behavior like that will break approximately every user
 of to_char() ...

I would love to know why we are finding out about this incompatibility
only in 2010, years after we implemented this.  Is no one porting
multi-specification FM to_char() strings from Oracle to PostgreSQL?

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Writeable CTEs and empty relations

2010-02-08 Thread Tom Lane

Marko Tiikkaja marko.tiikk...@cs.helsinki.fi writes:
 I traced this down to heapam.c, which has this:
 ...
 This doesn't exactly work anymore since we modify the snapshot after
 calling ExecInitScan().

So don't do that.  The executor is generally entitled to assume that
parameters given to ExecutorStart are correct.  In particular, changing
the snapshot after the query has started to run seems to me to ensure
all sorts of inconsistent and undesirable behavior.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Confusion over Python drivers

2010-02-08 Thread Jeff Davis

On Tue, 2010-02-09 at 09:15 +1100, Andrew McNamara wrote:
 I can't see how this would work with binary query parameters - the server
 will see a blob of binary data and have no way to know what it represents.

Unknown is unknown, whether in binary or text format. As far as I know,
PostgreSQL never looks inside a literal of unknown type to try to
determine its type -- it only looks at the context (to what function is
it an argument?).

For instance:

  SELECT '5'; -- has no idea what type it is

  SELECT '5' + 1; -- it's an int

  SELECT 'a' + 1; -- it's still an int
  ERROR:  invalid input syntax for integer: a
  LINE 1: SELECT 'a' + 1;

  SELECT '5.0' + 1; -- still an int, bad input format
  ERROR:  invalid input syntax for integer: 5.0
  LINE 1: SELECT '5.0' + 1;

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Confusion over Python drivers

2010-02-08 Thread Andrew McNamara

On Tue, 2010-02-09 at 09:15 +1100, Andrew McNamara wrote:
 I can't see how this would work with binary query parameters - the server
 will see a blob of binary data and have no way to know what it represents.

Unknown is unknown, whether in binary or text format. As far as I know,
PostgreSQL never looks inside a literal of unknown type to try to
determine its type -- it only looks at the context (to what function is
it an argument?).

For instance:

  SELECT '5'; -- has no idea what type it is

  SELECT '5' + 1; -- it's an int

  SELECT 'a' + 1; -- it's still an int
  ERROR:  invalid input syntax for integer: a
  LINE 1: SELECT 'a' + 1;

  SELECT '5.0' + 1; -- still an int, bad input format
  ERROR:  invalid input syntax for integer: 5.0
  LINE 1: SELECT '5.0' + 1;

The problem is deeper than that - when query parameters use the binary
option, the server has no way to decode the binary parameter without an
appropriate type OID.

As you say, postgres will cast types depending on context, however this
is stricter when binary parameters are used (because they only have one
valid interpretation, whereas a text parameter may have several).

-- 
Andrew McNamara, Senior Developer, Object Craft
http://www.object-craft.com.au/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Confusion over Python drivers

2010-02-08 Thread Jeff Davis

On Mon, 2010-02-08 at 09:14 +0100, Massa, Harald Armin wrote:
 And we should not forget to look for the reasons for the incubation of
 that many pure-Python drivers:

All very good points. That's why the doc I wrote:

http://wiki.postgresql.org/wiki/Driver_development

is specifically targeted at libpq-based drivers (which is repeated
several times).

I think it would be valuable to have a complete, pure-python driver
available (like the JDBC driver). That's a large project, however.

People who use a different python implementation understand that
libraries might not be as plentiful. It will be a while before there are
as many pure-python libraries as there are pure-java libraries.

Right now what we need is a driver toward which we can confidently
direct cPython users. It's faster to wrap libpq than to write a complete
driver. And if we don't have such a driver, we risk alienating an
important community for postgresql growth. So, the cost is lower and the
benefits are higher for wrapping libpq for the cPython users. At least,
that seems to be the case right now; things may change in the future.

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Confusion over Python drivers

2010-02-08 Thread Jeff Davis

On Tue, 2010-02-09 at 10:46 +1100, Andrew McNamara wrote:
 The problem is deeper than that - when query parameters use the binary
 option, the server has no way to decode the binary parameter without an
 appropriate type OID.

Postgres does not attempt to decode anything (text or binary format)
until it figures out what type it is.

 As you say, postgres will cast types depending on context, however this
 is stricter when binary parameters are used (because they only have one
 valid interpretation, whereas a text parameter may have several).

Type casts are a different matter; they are only done after the unknown
literals' types have been determined:

  create table n(i int);

  -- insert numeric literal, which is cast to int (assignment cast)
  insert into n values(5.0); -- succeeds

  -- insert unknown literal, which is inferred to be of type int
  insert into n values('5.0'); -- fails on integer type input function
  ERROR:  invalid input syntax for integer: 5.0
  LINE 1: insert into n values('5.0');


Can you provide a concrete example in which the text versus binary
format changes the type inference behavior?

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Order of operations in lazy_vacuum_rel

2010-02-08 Thread Tom Lane

Alvaro Herrera alvhe...@commandprompt.com writes:
 Tom Lane wrote:
 The point would be to not disable interrupts till after doing the FSM
 vacuuming.  Ignoring CANCEL for longer than we must is bad.

 Oh, I see.  I guess the answer is that it depends on what happens if you
 interrupt after vacuuming the FSM.  I have no idea what that is supposed
 to accomplish so I'm of no help here.  FreeSpaceMapVacuum says it's
 about fixing inconsistencies in the FSM, so I'm guessing that it's not
 critical.

Actually, after thinking about this some more, I realize that this code
has got a significantly bigger problem than just whether it will respond
to CANCEL promptly.  If we truncate the table, and then get an error
sometime before commit, the relcache inval message will not be sent,
leaving other backends at significant risk of strange errors due to
having rd_targblock pointing somewhere past the end of the table.
So we should reorder these operations just to reduce the risk window,
and I've done so.

It might be a good idea to develop a nontransactional signaling method
for causing other backends to reset rd_targblock --- perhaps we could
tie it to the smgr inval signaling that already happens on a truncate?
That would probably require moving rd_targblock down to the smgr
level, but offhand I see no reason that that'd be a bad thing to do.

I'm not going to panic about it right now, since the code has been like
this for a long time and we've had few if any complaints.  But it seems
like something to fix sometime.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Hot standby documentation

2010-02-08 Thread Fujii Masao

On Mon, Feb 8, 2010 at 10:34 PM, Bruce Momjian br...@momjian.us wrote:
 Ahh, good point.  I had not considered the table would change.  What I
 did was to mark Slaves accept read-only queries as Hot only.

Can the warm standby still reside in v9.0? If not, the mark of
Hot only seems odd for me.

 I did not change Master failure will never lose data because the 9.0
 streaming implementation is not sychronous (see wal_sender_delay in
 postgresql.conf), and I don't think even setting that to zero makes the
 operation synchronous.  I think we will have to wait for PG 9.1 for
 _synchronous_ streaming replication.

You are right.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] buildfarm breakage

2010-02-08 Thread Andrew Dunstan



It looks like some recent patches have broken a couple of things on the 
buildfarm.


Mingw builds are missing in6addr_any in backend/libpq/auth.c, added by a 
recent RADIUS support fix. Looks like we might need to include win32.h 
in there.


MSVC builds are broken from a missing _isnan function on the ECPG tests. 
Do we need to link in a math lib or something there?


Our Solaris *moth members seem to have stopped building. Have we lost them?

cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Order of operations in lazy_vacuum_rel

2010-02-08 Thread Alvaro Herrera

Tom Lane wrote:

 Actually, after thinking about this some more, I realize that this code
 has got a significantly bigger problem than just whether it will respond
 to CANCEL promptly.  If we truncate the table, and then get an error
 sometime before commit, the relcache inval message will not be sent,
 leaving other backends at significant risk of strange errors due to
 having rd_targblock pointing somewhere past the end of the table.
 So we should reorder these operations just to reduce the risk window,
 and I've done so.

Err, that problem was exactly why I added the interrupt holdoff in
there, so if you've got a better/more invasive solution, it's very
welcome.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] buildfarm breakage

2010-02-08 Thread Tom Lane

Andrew Dunstan and...@dunslane.net writes:
 Mingw builds are missing in6addr_any in backend/libpq/auth.c, added by a 
 recent RADIUS support fix. Looks like we might need to include win32.h 
 in there.

That was discussed already.  I assume Magnus is going to address it
as soon as he gets back from FOSDEM.

 MSVC builds are broken from a missing _isnan function on the ECPG tests. 
 Do we need to link in a math lib or something there?

It looks to me like the problem is that that test is being compiled
without benefit of any platform-dependent code whatsoever.  In the rest
of the system, isnan and isinf work on WIN32 because the compiles can
see the macro definitions in port/win32.h.  nan_test is apparently not
including that.  I'm not sure of Michael's plan for portability of
these test cases --- if he doesn't want to include c.h or something
close to that, I think the nan test has to go away.

 Our Solaris *moth members seem to have stopped building. Have we lost them?

They're not *all* dead, but it sure looks like Oracle scaled that lab
way back the moment they owned it.  I'm surprised any of them are still
alive :-(

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] buildfarm breakage

2010-02-08 Thread Mark Wong

On Mon, Feb 8, 2010 at 5:20 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Our Solaris *moth members seem to have stopped building. Have we lost them?

 They're not *all* dead, but it sure looks like Oracle scaled that lab
 way back the moment they owned it.  I'm surprised any of them are still
 alive :-(

We still have a T2000 system with solaris on it.  It was not in the
buildfarm because it was felt this configuration was already covered.
Let me know if we want to set it up for the buildfarm.

Regards,
Mark

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Order of operations in lazy_vacuum_rel

2010-02-08 Thread Tom Lane

Alvaro Herrera alvhe...@commandprompt.com writes:
 Tom Lane wrote:
 Actually, after thinking about this some more, I realize that this code
 has got a significantly bigger problem than just whether it will respond
 to CANCEL promptly.

 Err, that problem was exactly why I added the interrupt holdoff in
 there, so if you've got a better/more invasive solution, it's very
 welcome.

Well, that's a pretty incomplete solution :-(.  Maybe we should do
something about this.  There wasn't any obvious solution before,
but now that we have the nontransactional smgr-level sinval messages
being sent on drops and truncates, it seems like tying rd_targblock
clearing to those would fix the problem.  The easiest way to do that
would involve moving rd_targblock down to the SMgrRelation struct.
Probably rd_fsm_nblocks and rd_vm_nblocks too.  Comments?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Order of operations in lazy_vacuum_rel

2010-02-08 Thread Alvaro Herrera

Tom Lane wrote:
 Alvaro Herrera alvhe...@commandprompt.com writes:
  Tom Lane wrote:
  Actually, after thinking about this some more, I realize that this code
  has got a significantly bigger problem than just whether it will respond
  to CANCEL promptly.
 
  Err, that problem was exactly why I added the interrupt holdoff in
  there, so if you've got a better/more invasive solution, it's very
  welcome.
 
 Well, that's a pretty incomplete solution :-(.

Yeah, we were well aware of that :-)  It solved our problem (which was
related to interrupts from autovac)

 Maybe we should do
 something about this.  There wasn't any obvious solution before,
 but now that we have the nontransactional smgr-level sinval messages
 being sent on drops and truncates, it seems like tying rd_targblock
 clearing to those would fix the problem.

Hmm, sounds good, though I confess not having heard about
nontransactional sinval messages before.

 The easiest way to do that
 would involve moving rd_targblock down to the SMgrRelation struct.
 Probably rd_fsm_nblocks and rd_vm_nblocks too.  Comments?

Can't say it doesn't look like a modularity violation from here --
insertion target block doesn't really belong into smgr, does it?

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Confusion over Python drivers

2010-02-08 Thread Andrew McNamara

On Tue, 2010-02-09 at 10:46 +1100, Andrew McNamara wrote:
 The problem is deeper than that - when query parameters use the binary
 option, the server has no way to decode the binary parameter without an
 appropriate type OID.

Postgres does not attempt to decode anything (text or binary format)
until it figures out what type it is.

How does it figure out what type it is? Either by the type oid passed by
the caller, or by the context if the type oid is unknown.  

Now, with the text format parameters, the parser usually does the right
thing, since text formats have plenty of hints for us humans.

However, with the binary format, unless the caller tells us, there's no way
to tell whether we're correctly parsing the data. If the context implies
one type, but the user passes another, we'll either get an ugly error or,
worse, silently misparse their data.

Generally this isn't a big problem with python, as we have good type
information available. It's only an issue because people have gotten
used to the text parameter parsing being so forgiving.

Using my ocpgdb module, and interacting directly with the libpq wrapping
code, you can see how postgres reacts to various inputs:

 from oclibpq import *
 from ocpgdb import pgoid
 db=PgConnection('')

No parameters:

 r=db.execute('select 1', ())
 r.status
PGRES_TUPLES_OK
 list(r)
[(PyPgCell name '?column?', type 23, modifier -1, value '\x00\x00\x00\x01' 
at 0xb7514200,)]

Int4 parameter, type specified:

 r=db.execute('select $1', [(pgoid.int4, '\x00\x00\x00\x01')])
 r.status
PGRES_TUPLES_OK
 list(r)
[(PyPgCell name '?column?', type 23, modifier -1, value '\x00\x00\x00\x01' 
at 0xb75141c0,)]

Int4 parameter, type unknown, can't be determined from context:

 r=db.execute('select $1', [(pgoid.unknown, '\x00\x00\x00\x01')])
 r.status
PGRES_FATAL_ERROR
 r.errorMessage
'ERROR:  could not determine data type of parameter $1\n'

Int4 parameter, type unknown, can be determined from context:

 r=db.execute('select $1 + 1', [(pgoid.unknown, '\x00\x00\x00\x01')])
 r.status
PGRES_TUPLES_OK
 list(r)
[(PyPgCell name '?column?', type 23, modifier -1, value '\x00\x00\x00\x02' 
at 0xb7514200,)]

Text parameter, type unknown, mismatching context - surprising:

 r=db.execute('select $1 + 1', [(pgoid.unknown, '')])
 r.status
PGRES_TUPLES_OK
 list(r)
[(PyPgCell name '?column?', type 23, modifier -1, value '1112' at 
0xb7514360,)]

Date parameter, type unknown, int context, the value gets misinterpreted:

 r=db.execute('select $1 + 1', [(pgoid.unknown, '\x00\x00\x01n')])
 r.status
PGRES_TUPLES_OK
 list(r)
[(PyPgCell name '?column?', type 23, modifier -1, value '\x00\x00\x01o' at 
0xb75144a0,)]

-- 
Andrew McNamara, Senior Developer, Object Craft
http://www.object-craft.com.au/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Order of operations in lazy_vacuum_rel

2010-02-08 Thread Tom Lane

Alvaro Herrera alvhe...@commandprompt.com writes:
 Tom Lane wrote:
 Maybe we should do
 something about this.  There wasn't any obvious solution before,
 but now that we have the nontransactional smgr-level sinval messages
 being sent on drops and truncates, it seems like tying rd_targblock
 clearing to those would fix the problem.

 Hmm, sounds good, though I confess not having heard about
 nontransactional sinval messages before.

Hey, they've been in there almost a week ;-)
http://archives.postgresql.org/pgsql-committers/2010-02/msg00026.php

 The easiest way to do that
 would involve moving rd_targblock down to the SMgrRelation struct.
 Probably rd_fsm_nblocks and rd_vm_nblocks too.  Comments?

 Can't say it doesn't look like a modularity violation from here --
 insertion target block doesn't really belong into smgr, does it?

It never belonged in relcache, either.  Trying to keep dynamic state
(not backed by a catalog entry) in the relcache has always been a
pretty klugy thing.  smgr at least has a reasonable excuse for trying
to keep track of physical truncation events, which is the thing we need
here.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Confusion over Python drivers

2010-02-08 Thread Jeff Davis

On Tue, 2010-02-09 at 12:51 +1100, Andrew McNamara wrote:
 Now, with the text format parameters, the parser usually does the right
 thing, since text formats have plenty of hints for us humans.

The parser doesn't care whether it's text format or binary format when
determining the type.

 However, with the binary format, unless the caller tells us, there's no way
 to tell whether we're correctly parsing the data. If the context implies
 one type, but the user passes another, we'll either get an ugly error or,
 worse, silently misparse their data.

The difference between text and binary format is this: after it has
already determined the type of the parameter,
  (a) if the format is text, it passes it to the type input function
  to construct the value; or
  (b) if the format is binary, it passes it to the type recv function
  to construct the value.

The argument to the input or recv functions may:
  (a) be valid input; or
  (b) be invalid input, and be detected as an error by the input or
  recv function; or
  (c) be invalid input, and not be detected as an error by the input
  or recv function.

For a given type, the input function may be more likely to catch an
input error than the recv function; or the reverse. Either way, it is
very type-specific, and the only difference is the whether the input is
misinterpreted (type error not caught; bad) or an error is thrown (type
error caught; better). 

 Using my ocpgdb module, and interacting directly with the libpq wrapping
 code, you can see how postgres reacts to various inputs:

None of the examples show a difference in the inferred type of a text
versus binary parameter for the same query.

 No parameters:
 
  r=db.execute('select 1', ())
  r.status
 PGRES_TUPLES_OK
  list(r)
 [(PyPgCell name '?column?', type 23, modifier -1, value 
 '\x00\x00\x00\x01' at 0xb7514200,)]

Expected, because the literal 1 (without quotes) is an integer literal,
not an unknown literal.

 Int4 parameter, type specified:
 
  r=db.execute('select $1', [(pgoid.int4, '\x00\x00\x00\x01')])
  r.status
 PGRES_TUPLES_OK
  list(r)
 [(PyPgCell name '?column?', type 23, modifier -1, value 
 '\x00\x00\x00\x01' at 0xb75141c0,)]

Expected, because you specified the type, and sent the binary data to
the integer recv function, and it was valid input.

 Int4 parameter, type unknown, can't be determined from context:
 
  r=db.execute('select $1', [(pgoid.unknown, '\x00\x00\x00\x01')])
  r.status
 PGRES_FATAL_ERROR
  r.errorMessage
 'ERROR:  could not determine data type of parameter $1\n'

Expected -- there is no context to determine the type.

Why do you call it an int4 parameter? It's just bytes, and you never
told postgres what they are (as you did in the previous example).

 Int4 parameter, type unknown, can be determined from context:
 
  r=db.execute('select $1 + 1', [(pgoid.unknown, '\x00\x00\x00\x01')])
  r.status
 PGRES_TUPLES_OK
  list(r)
 [(PyPgCell name '?column?', type 23, modifier -1, value 
 '\x00\x00\x00\x02' at 0xb7514200,)]

Expected: the function + provides the context that allows the server to
interpret the left argument as an integer.

(Again, not an int4 parameter, it's unknown)

 Text parameter, type unknown, mismatching context - surprising:
 
  r=db.execute('select $1 + 1', [(pgoid.unknown, '')])
  r.status
 PGRES_TUPLES_OK
  list(r)
 [(PyPgCell name '?column?', type 23, modifier -1, value '1112' at 
 0xb7514360,)]

Expected, because this is exactly the same as the previous one except
for the data you pass in. Notice that the same type is inferred (23).

Why do you call this mismatching context when the context is exactly
the same as above? The only difference is which 4 bytes you provide. You
never told postgres that the bytes were text bytes anywhere.

You may think that it's doing  + 1, but it's actually doing addition
on the bytes. That is apparent in the next example:

 Date parameter, type unknown, int context, the value gets misinterpreted:
 
  r=db.execute('select $1 + 1', [(pgoid.unknown, '\x00\x00\x01n')])
  r.status
 PGRES_TUPLES_OK
  list(r)
 [(PyPgCell name '?column?', type 23, modifier -1, value '\x00\x00\x01o' 
 at 0xb75144a0,)]
 

Expected, because the only thing that could possibly detect the error is
the int4recv function, which happens to accept any 4-byte input (so it
will never fail on any 4 bytes of data).

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Confusion over Python drivers

2010-02-08 Thread Andrew McNamara

For a given type, the input function may be more likely to catch an
input error than the recv function; or the reverse. Either way, it is
very type-specific, and the only difference is the whether the input is
misinterpreted (type error not caught; bad) or an error is thrown (type
error caught; better). 

This is the crux of the matter: the type input functions are universally
more forgiving since, by their nature, text formats are designed for us
fuzzy humans, and users of adapters have come to expect this. 

-- 
Andrew McNamara, Senior Developer, Object Craft
http://www.object-craft.com.au/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] CVS checkout source code for different branches

2010-02-08 Thread M Z

Hi,

I am trying to checkout code from different branches (such as 8.3, 8.4).

I found a few ways to checkout code from CVS:
1. webCVS: http://anoncvs.postgresql.org/cvsweb.cgi/

2. cvs -z3 -d :pserver:anoncvs:passw...@anoncvs.postgresql.org:/projects/cvsroot
co -P pgsql

3. $ rsync --progress -avzCH --delete anoncvs.postgresql.org::pgsql-cvs
$CVSROOT

However, how can I checkout code from different branches (such as 8.3, 8.4)?

Thank you.
M Z

Re: [HACKERS] [CFReview] Red-Black Tree

2010-02-08 Thread Robert Haas

On Mon, Feb 8, 2010 at 3:05 PM, Alvaro Herrera
alvhe...@commandprompt.com wrote:
 It seems a bit strange to have all the rb_free_recursive support and not
 use it anywhere ... and a freefunc callback even, whose only caller
 seems to set as null currently.  Hmm, even in the knngist patch the
 rb_freefunc stuff is unused.

I don't think it's inappropriate;  it doesn't seem implausible that
someone might want to free an rbtree someday.  I suppose we could
comment it out but I guess I don't see the point.

 How do we now that it works?

Visual inspection?  It's not very complicated.

 (What, for example, if we were to allocate multiple nodes in a single
 palloc chunk?  I'm not familiar with this stuff but that seems
 plausible)

Well, then you could have the freefunc do something ((MyStruct *)
a)-is_allocated = false.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] review: More frame options in window functions

2010-02-08 Thread Hitoshi Harada

2010/2/9 Tom Lane t...@sss.pgh.pa.us:
 Hitoshi Harada umi.tan...@gmail.com writes:
 2010/1/23 Robert Haas robertmh...@gmail.com:
 Would it make sense to pull some of the infrastructure bits out of
 this patch and commit those bits separately, so as to reduce the size
 of the main patch?  In particular, the AggGetMemoryContext() stuff
 looks like a good candidate for that treatment.

 Fair enough. Attached contains that part only.

 I started looking at this patch.  I believe that we should commit the
 AggGetMemoryContext API function --- *not* the window context
 management changes that you included here, but only the API abstraction
 --- to be sure that that gets into 9.0 so that third-party aggregate
 functions can start relying on it instead of looking directly at the
 AggState or WindowAggState node.  The rest of the patch might or might
 not make it, but we can at least help people future-proof their code.

 I'm fairly desperately unhappy with the RANGE PRECEDING/FOLLOWING parts
 of the patch.  We have expended a great deal of sweat over the years
 to avoid hard-wiring assumptions about particular operator names into
 the code, but this patch is blithely ignoring that history and assuming
 that + and - will do the right thing.  Also LookupOperName is
 probably not the right thing, since it insists on exact or
 binary-compatible match.  To the extent that there is any justification
 at all for assuming that +/- are the right operators, it is that the
 spec speaks in terms of the range bounds being VSK+V2F etc --- but if
 someone were to actually write out such an expression, the parser would
 allow for implicit casts to happen, so this code is not implementing
 what that expression would produce.  Plus the results are dependent on
 the current search path, which for example means it'll fail if the
 window sort column is a user-defined type whose operators happen to be
 out of path at the moment --- even though we would have found its
 default sort opclass despite that.  And lastly, I'm totally unconvinced
 that it's a good idea to accept an operator that returns a type other
 than the type of the window sort column.  That seems to eliminate
 whatever little protection we had against picking up an unsuitable
 operator; and it complicates the code as well.

I know +/- part is an issue. But as far as I know there's been no
infrastructure to handle such situation. My ideal plan is to introduce
some mechanism to make +/- operation abstract enough such like
sort opclass/opfamily, although I wasn't sure if that is to be
introduced for this (ie RANGE frame) purpose only.

Now that specialized hard-coding +/- in source seems unacceptable,
I would like to hear how to handle this. Is there any better than
introducing new mechanism such like opclass?

 Given the lack of time remaining in this CF, I'm tempted to propose
 ripping out the RANGE support and just trying to get ROWS committed.
 That should be substantially less controversial from a semantic
 standpoint, and it still seems like a considerable improvement in
 functionality.

 Thoughts?

As expected. I don't mind splitting patch to be committable if users
who expected this feature don't mind.

Regards,

-- 
Hitoshi Harada

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] CVS checkout source code for different branches

2010-02-08 Thread Andrew Dunstan




M Z wrote:

Hi,

I am trying to checkout code from different branches (such as 8.3, 8.4).

I found a few ways to checkout code from CVS:
1. webCVS: http://anoncvs.postgresql.org/cvsweb.cgi/

2. cvs -z3 -d 
:pserver:anoncvs:passw...@anoncvs.postgresql.org:/projects/cvsroot co 
-P pgsql


3. $ rsync --progress -avzCH --delete 
anoncvs.postgresql.org::pgsql-cvs $CVSROOT


However, how can I checkout code from different branches (such as 8.3, 
8.4)?



CVS is documented here, among other places: 
http://ximbiot.com/cvs/manual/cvs-1.11.20/cvs.html


To check out a particular branch such as REL8_4_STABLE, use the -r 
option on the checkout command:


   cvs checkout -r branchname

cheers

andrew





--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [CFReview] Red-Black Tree

2010-02-08 Thread Tom Lane

Robert Haas robertmh...@gmail.com writes:
 On Mon, Feb 8, 2010 at 3:05 PM, Alvaro Herrera
 alvhe...@commandprompt.com wrote:
 It seems a bit strange to have all the rb_free_recursive support and not
 use it anywhere ... and a freefunc callback even, whose only caller
 seems to set as null currently.  Hmm, even in the knngist patch the
 rb_freefunc stuff is unused.

 I don't think it's inappropriate;  it doesn't seem implausible that
 someone might want to free an rbtree someday.  I suppose we could
 comment it out but I guess I don't see the point.

I think the suggestion was to *remove* it not comment it out.  I'm
skeptical of carrying dead code.  If the functionality is not used
in the proposed gist patches then it's very fair to question whether
it ever will be used.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] set the cost of an aggregate function

2010-02-08 Thread Jaime Casanova

On Thu, Dec 3, 2009 at 7:19 AM, Simon Riggs si...@2ndquadrant.com wrote:

 The answer is nobody got round to enhancing this yet and well considered
 proposals and subsequent patches would be welcome.


what is a well considered proposal in this area? the transition
function cost should be applied to every row cost (maybe multiplied to
cpu_operator_cost or cpu_tuple_cost) and the final function cost
should be applied just once, no?

-- 
Atentamente,
Jaime Casanova
Soporte y capacitación de PostgreSQL
Asesoría y desarrollo de sistemas
Guayaquil - Ecuador
Cel. +59387171157

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] CVS checkout source code for different branches

2010-02-08 Thread M Z

Hi Andrew,

Could you please give a little more detail how I can find different CVS
branches?

Thanks,
M Z


On Mon, Feb 8, 2010 at 10:38 PM, Andrew Dunstan and...@dunslane.net wrote:



 M Z wrote:

 Hi,

 I am trying to checkout code from different branches (such as 8.3, 8.4).

 I found a few ways to checkout code from CVS:
 1. webCVS: http://anoncvs.postgresql.org/cvsweb.cgi/

 2. cvs -z3 -d 
 :pserver:anoncvs:passw...@anoncvs.postgresql.org:/projects/cvsroot
 co -P pgsql

 3. $ rsync --progress -avzCH --delete anoncvs.postgresql.org::pgsql-cvs
 $CVSROOT

 However, how can I checkout code from different branches (such as 8.3,
 8.4)?



 CVS is documented here, among other places: 
 http://ximbiot.com/cvs/manual/cvs-1.11.20/cvs.html

 To check out a particular branch such as REL8_4_STABLE, use the -r option
 on the checkout command:

   cvs checkout -r branchname

 cheers

 andrew

Re: [HACKERS] Confusion over Python drivers

2010-02-08 Thread Jeff Davis

On Tue, 2010-02-09 at 13:56 +1100, Andrew McNamara wrote:
 For a given type, the input function may be more likely to catch an
 input error than the recv function; or the reverse. Either way, it is
 very type-specific, and the only difference is the whether the input is
 misinterpreted (type error not caught; bad) or an error is thrown (type
 error caught; better). 
 
 This is the crux of the matter: the type input functions are universally
 more forgiving since, by their nature, text formats are designed for us
 fuzzy humans, and users of adapters have come to expect this. 

Except that it's exactly the opposite with integers. Pass any 4 bytes to
in4recv(), and it will accept it. However, try passing '4.0' to
int4in(), and you get an error.

If I had to make an educated guess about the forgiveness of various type
input and type recv functions, I would say that the recv functions are
more forgiving. After all, you would expect the binary format to be less
redundant, and therefore less likely to catch inconsistencies. I don't
see much of a universal truth there, however.

This is getting pretty far off-topic, so let's just leave it at that.
The drivers should support both formats; the type inference logic
doesn't care at all about the contents of the unknown literals (text or
binary); and queries should be written in such a way that the types are
unambiguous and unsurprising.

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] CVS checkout source code for different branches

2010-02-08 Thread M Z

For example, how can I list all the branches for postgresql 8.3 (and 8.4)?
Now I can checkout code using:
cvs -z3 -d :pserver:anoncvs:passw...@anoncvs.postgresql.org:/projects/cvsroot
co -P pgsql

But I don't know when version it is, and I want get code from some
postgresql 8.3 and 8.4 branches but I don't know the their cvsroot

Thanks
M Z

On Mon, Feb 8, 2010 at 11:04 PM, M Z jm80...@gmail.com wrote:

 Hi Andrew,

 Could you please give a little more detail how I can find different CVS
 branches?

 Thanks,
 M Z



 On Mon, Feb 8, 2010 at 10:38 PM, Andrew Dunstan and...@dunslane.netwrote:



 M Z wrote:

 Hi,

 I am trying to checkout code from different branches (such as 8.3, 8.4).

 I found a few ways to checkout code from CVS:
 1. webCVS: http://anoncvs.postgresql.org/cvsweb.cgi/

 2. cvs -z3 -d 
 :pserver:anoncvs:passw...@anoncvs.postgresql.org:/projects/cvsroot
 co -P pgsql

 3. $ rsync --progress -avzCH --delete anoncvs.postgresql.org::pgsql-cvs
 $CVSROOT

 However, how can I checkout code from different branches (such as 8.3,
 8.4)?



 CVS is documented here, among other places: 
 http://ximbiot.com/cvs/manual/cvs-1.11.20/cvs.html

 To check out a particular branch such as REL8_4_STABLE, use the -r option
 on the checkout command:

   cvs checkout -r branchname

 cheers

 andrew

Re: [HACKERS] Hot standby documentation

2010-02-08 Thread Bruce Momjian

Fujii Masao wrote:
 On Mon, Feb 8, 2010 at 10:34 PM, Bruce Momjian br...@momjian.us wrote:
  Ahh, good point. ?I had not considered the table would change. ?What I
  did was to mark Slaves accept read-only queries as Hot only.
 
 Can the warm standby still reside in v9.0? If not, the mark of
 Hot only seems odd for me.

Yes, both hot and warm standby is supported in 9.0.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Confusion over Python drivers

2010-02-08 Thread Jeff Davis

On Mon, 2010-02-08 at 20:50 +0100, Florian Weimer wrote:
 I saw your note that you have to specify the types for date values
 etc.  Is this really desirable or even necessary?  Can't you specify
 the type as unknown (OID 705, I believe)?

I believe the problem that Andrew is describing is that:

  SELECT $1 + 1;

will infer that $1 is of type int4. But if you really intended $1 to be
a date (which is also valid), it will cause a problem.

If the date is passed in text format, it will cause an error in
int4in(), because the text representation of a date isn't a valid text
representation for an integer.

If the date is passed in binary format, it will pass it to int4recv() --
but because the date is 4 bytes, and int4recv is defined for any 4-byte
input, it won't cause an error; it will produce a wrong result. In other
words, the binary representation for a date _is_ a valid binary
representation for an integer. The type inference has found the wrong
type, but the recv function still accepts it, which causes a problem.

The solution is to write the query in an unambiguous way:

  SELECT $1::date + 1;

which is good practice, anyway. If it's not obvious to the type
inference system, it's probably not obvious to you, and will probably
surprise you ;)

Or, as Andrew suggests, you can pass the type oid along with the
parameter so that postgresql knows the right type.

Either way, relying on a type input or a recv function to cause a type
error is much more fragile.

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [CFReview] Red-Black Tree

2010-02-08 Thread Robert Haas

On Mon, Feb 8, 2010 at 10:43 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Mon, Feb 8, 2010 at 3:05 PM, Alvaro Herrera
 alvhe...@commandprompt.com wrote:
 It seems a bit strange to have all the rb_free_recursive support and not
 use it anywhere ... and a freefunc callback even, whose only caller
 seems to set as null currently.  Hmm, even in the knngist patch the
 rb_freefunc stuff is unused.

 I don't think it's inappropriate;  it doesn't seem implausible that
 someone might want to free an rbtree someday.  I suppose we could
 comment it out but I guess I don't see the point.

 I think the suggestion was to *remove* it not comment it out.  I'm
 skeptical of carrying dead code.  If the functionality is not used
 in the proposed gist patches then it's very fair to question whether
 it ever will be used.

I don't think the question is unfair; I just don't happen to agree
with the conclusion.  But I don't care enough to argue about it
either...

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [CFReview] Red-Black Tree

2010-02-08 Thread Robert Haas

2010/2/8 Teodor Sigaev teo...@sigaev.ru:
 I think that the code in ginInsertRecordBA() is needlessly complex.
 As far as I can see, nNodesOnCurrentLevel is always exactly one more
 than nNodesOnPreviousLevel, and I think step is also basically
 redundant with both of these although the relationship is a little
 more complex.  What I would suggest is something like:

 - initialize step to the largest power of 2 s.t. step  nentry
 - while step  0
 -- for (i = step; true; i += 2 * step)
 --- insert entry #i-1
 --- if i  nentry - (2 * step)  /* must test before incrementing i, to
 guard against overflow */
  break
 -- step = step / 2

 Good idea, implemented.

Hmm.  I think your implementation is prone to overflow in two places -
both when computing step, and also when stepping through the array.
Proposed revision attached, with also some rewriting of the comment
for that function.

...Robert


rbtree-0.12-rmh
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] review: More frame options in window functions

2010-02-08 Thread Robert Haas

On Mon, Feb 8, 2010 at 10:37 PM, Hitoshi Harada umi.tan...@gmail.com wrote:
 2010/2/9 Tom Lane t...@sss.pgh.pa.us:
 Hitoshi Harada umi.tan...@gmail.com writes:
 2010/1/23 Robert Haas robertmh...@gmail.com:
 Would it make sense to pull some of the infrastructure bits out of
 this patch and commit those bits separately, so as to reduce the size
 of the main patch?  In particular, the AggGetMemoryContext() stuff
 looks like a good candidate for that treatment.

 Fair enough. Attached contains that part only.

 I started looking at this patch.  I believe that we should commit the
 AggGetMemoryContext API function --- *not* the window context
 management changes that you included here, but only the API abstraction
 --- to be sure that that gets into 9.0 so that third-party aggregate
 functions can start relying on it instead of looking directly at the
 AggState or WindowAggState node.  The rest of the patch might or might
 not make it, but we can at least help people future-proof their code.

 I'm fairly desperately unhappy with the RANGE PRECEDING/FOLLOWING parts
 of the patch.  We have expended a great deal of sweat over the years
 to avoid hard-wiring assumptions about particular operator names into
 the code, but this patch is blithely ignoring that history and assuming
 that + and - will do the right thing.  Also LookupOperName is
 probably not the right thing, since it insists on exact or
 binary-compatible match.  To the extent that there is any justification
 at all for assuming that +/- are the right operators, it is that the
 spec speaks in terms of the range bounds being VSK+V2F etc --- but if
 someone were to actually write out such an expression, the parser would
 allow for implicit casts to happen, so this code is not implementing
 what that expression would produce.  Plus the results are dependent on
 the current search path, which for example means it'll fail if the
 window sort column is a user-defined type whose operators happen to be
 out of path at the moment --- even though we would have found its
 default sort opclass despite that.  And lastly, I'm totally unconvinced
 that it's a good idea to accept an operator that returns a type other
 than the type of the window sort column.  That seems to eliminate
 whatever little protection we had against picking up an unsuitable
 operator; and it complicates the code as well.

 I know +/- part is an issue. But as far as I know there's been no
 infrastructure to handle such situation. My ideal plan is to introduce
 some mechanism to make +/- operation abstract enough such like
 sort opclass/opfamily, although I wasn't sure if that is to be
 introduced for this (ie RANGE frame) purpose only.

 Now that specialized hard-coding +/- in source seems unacceptable,
 I would like to hear how to handle this. Is there any better than
 introducing new mechanism such like opclass?

 Given the lack of time remaining in this CF, I'm tempted to propose
 ripping out the RANGE support and just trying to get ROWS committed.
 That should be substantially less controversial from a semantic
 standpoint, and it still seems like a considerable improvement in
 functionality.

 Thoughts?

 As expected. I don't mind splitting patch to be committable if users
 who expected this feature don't mind.

Well, they'll likely be happier with a partial feature than no feature
at all...  I agree with Tom that there's no time time now to resolve
the issue of how + and - should be handled.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] CVS checkout source code for different branches

2010-02-08 Thread Andrew Dunstan



The only sane things to check out apart from HEAD are normally the 
STABLE branches. For release m.n those are always called RELm_n_STABLE. 
You can also get the tag set for a specific release. Those are called 
RELm_n_o for m.n.o releases.


If you look at the output for cvs log configure.in you can see near 
the top a list of tag sets under the heading symbolic names.


HTH.



M Z wrote:
For example, how can I list all the branches for postgresql 8.3 (and 
8.4)? Now I can checkout code using:
cvs -z3 -d 
:pserver:anoncvs:passw...@anoncvs.postgresql.org:/projects/cvsroot co 
-P pgsql


But I don't know when version it is, and I want get code from some 
postgresql 8.3 and 8.4 branches but I don't know the their cvsroot


Thanks
M Z

On Mon, Feb 8, 2010 at 11:04 PM, M Z jm80...@gmail.com 
mailto:jm80...@gmail.com wrote:


Hi Andrew,

Could you please give a little more detail how I can find
different CVS branches?

Thanks,
M Z



On Mon, Feb 8, 2010 at 10:38 PM, Andrew Dunstan
and...@dunslane.net mailto:and...@dunslane.net wrote:



M Z wrote:

Hi,

I am trying to checkout code from different branches (such
as 8.3, 8.4).

I found a few ways to checkout code from CVS:
1. webCVS: http://anoncvs.postgresql.org/cvsweb.cgi/

2. cvs -z3 -d
:pserver:anoncvs:passw...@anoncvs.postgresql.org:/projects/cvsroot
co -P pgsql

3. $ rsync --progress -avzCH --delete
anoncvs.postgresql.org::pgsql-cvs $CVSROOT

However, how can I checkout code from different branches
(such as 8.3, 8.4)?



CVS is documented here, among other places:
http://ximbiot.com/cvs/manual/cvs-1.11.20/cvs.html

To check out a particular branch such as REL8_4_STABLE, use
the -r option on the checkout command:

  cvs checkout -r branchname

cheers

andrew








--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Largeobject Access Controls (r2460)

2010-02-08 Thread KaiGai Kohei


(2010/02/08 22:23), Alvaro Herrera wrote:

Takahiro Itagaki escribió:


KaiGai Koheikai...@kaigai.gr.jp  wrote:


default:both contents and metadata
--data-only:same
--schema-only:  neither


However, it means only large object performs an exceptional object class
that dumps its owner, acl and comment even if --data-only is given.
Is it really what you suggested, isn't it?


I wonder we still need to have both BLOB ITEM and BLOB DATA
even if we will take the all-or-nothing behavior. Can we handle
BLOB's owner, acl, comment and data with one entry kind?


I don't think this is necessarily a good idea.  We might decide to treat
both things separately in the future and it having them represented
separately in the dump would prove useful.


I agree. From design perspective, the single section approach is more
simple than dual section, but its change set is larger than the dual.

The attached patch revised the previous implementation which have
two types of sections, to handle options correctly, as follows:

* default:  both contents and metadata
* --data-only:  same
* --schema-only:neither

Below is the points to be introduced.

The _tocEntryRequired() makes its decision whether the given TocEntry
to be dumped here, or not, based on the given context.
Previously, all the sections which needs cleanups and access privileges
were not belonging to SECTION_DATA, so, data sections were ignored, even
if it needs to restore cleanup code and access privileges.

At the pg_backup_archiver.c:329 chunk, it checks whether we need to clean
up the existing object specified by the TocEntry. If the entry is BLOB
ITEM, _tocEntryRequired() returns REQ_DATA (if --data-only given), then
it does not write out the cleanup code.
(We have to unlink the existing large objects to be restored prior to
creation of them, so it got unavailable to clean up at _StartBlob().)

At the pg_backup_archiver.c:381 chunk, it checks whether we need to restore
access privileges, or not, using the given ACL TocEntry. In similar way,
the caller does not expect access privileges being restored when --data-only
is given.

The _tocEntryRequired() was also modified to handle large objects correctly.
Previously, when TocEntry does not have its own dumper (except for SEQUENCE
SET section), it was handled as a SECTION_SCHEMA.
If the 16th argument of ArchiveEntry() was NULL, it does not have its own
dumper function, even if the section is SECTION_DATA. Also, the dumpBlobItem()
calls ArchiveEntry() without its dumper, so it is misidentified as a schema
section. The ACL section of large objects are also misidentified.
So, I had to add these special treatments.

The dumpACL() is a utility function to write out GRANT/REVOKE statement for
the given acl string. When --data-only is given, it returns immediately
without any works. It prevented to dump access privileges of large objects.
However, all the caller already checks if (dataOnly) cases prior to its
invocation. So, I removed this check from the dumpACL().

Thanks,
--
OSS Platform Development Division, NEC
KaiGai Kohei kai...@ak.jp.nec.com


pgsql-fix-pg_dump-blob-privs.6.patch
Description: application/octect-stream

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Writeable CTEs and empty relations

2010-02-08 Thread Marko Tiikkaja

On 2010-02-09 01:09 +0200, Tom Lane wrote:
 Marko Tiikkaja marko.tiikk...@cs.helsinki.fi writes:
 I traced this down to heapam.c, which has this:
 ...
 This doesn't exactly work anymore since we modify the snapshot after
 calling ExecInitScan().
 
 So don't do that.  The executor is generally entitled to assume that
 parameters given to ExecutorStart are correct.  In particular, changing
 the snapshot after the query has started to run seems to me to ensure
 all sorts of inconsistent and undesirable behavior.

You do remember that the whole patch in its current form depends on
modifying the snapshot?


Regards,
Marko Tiikkaja

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Listen / Notify - what to do when the queue is full

2010-02-08 Thread Jeff Davis

In this version of the patch, there is a compiler warning:

  async.c: In function ‘AtPrepare_Notify’:
  async.c:593: warning: unused variable ‘p’

and also two trivial merge conflicts: an OID conflict for the functions
you added, and a trivial code conflict.

On Sun, 2010-02-07 at 17:32 +0100, Joachim Wieland wrote:
 On Wed, Feb 3, 2010 at 2:05 AM, Jeff Davis pg...@j-davis.com wrote:
  The original comment was a part of the NotifyStmt case, and I don't
  think we can support NOTIFY issued on a standby system -- surely there's
  no way for the standby to communicate the notification to the master.
  Anyway, this is getting a little sidetracked; I don't think we need to
  worry about HS right now.
 
 True but I was not talking about moving any notifications to different
 servers. Clients listening on one server should receive the
 notifications from NOTIFYs executed on this server, no matter if it is
 a standby or the master server.

I'm not sure I agree with that philosophy. If the driving use-case for
LISTEN/NOTIFY is cache invalidation, then a NOTIFY on the master should
make it to all listening backends on the slaves.

 This is still kind of an open item but it's an slru issue and should
 also be true for other functionality that uses slru queues.

I haven't found out anything new here.

 This I don't understand... If power goes out and we restart, we'd
 first put all notifications from the prepared transactions into the
 queue. We know that they fit because they have fit earlier as well (we
 wouldn't allow user connections until we have worked through all 2PC
 state files).
 

Ok, it appears you've thought the 2PC interaction through more than I
have. Even if we don't include it this time, I'm glad to hear that there
is a plan to do so. Feel free to include support if you have it ready,
but it's late in the CF so I don't want you to get sidetracked on that
issue.

 There was another problem that the slru files did not all get deleted
 at server restart, which is fixed now.

Good catch.

 Regarding the famous ASCII-restriction open item I have now realized
 what I haven't thought of previously: notifications are not
 transferred between databases, they always stay in one database. Since
 libpq does the conversion between server and client encoding, it is
 questionable if we really need to restrict this at all... But I am not
 an encoding expert so whoever feels like he can confirm or refute
 this, please speak up.

I would like to support encoded text, but I think there are other
problems. For instance, what if one server has a client_encoding that
doesn't support some of the glyphs being sent by the notifying backend?
Then we have a mess, because we can't deliver it.

I think we just have to say ASCII only here, because there's no
reasonable way to handle this, regardless of implementation. If the user
issues SELECT and the client_encoding can't support some of the glyphs,
we can throw an error. But for a notification? We just have no mechanism
for that.

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Confusion over Python drivers

2010-02-08 Thread Andrew McNamara

If the date is passed in binary format, it will pass it to int4recv() --
but because the date is 4 bytes, and int4recv is defined for any 4-byte
input, it won't cause an error; it will produce a wrong result. In other
words, the binary representation for a date _is_ a valid binary
representation for an integer. The type inference has found the wrong
type, but the recv function still accepts it, which causes a problem.

Yes - of the worst kind: silent data corruption.

The solution is to write the query in an unambiguous way:

  SELECT $1::date + 1;

which is good practice, anyway. If it's not obvious to the type
inference system, it's probably not obvious to you, and will probably
surprise you ;)

That address this specific case, but it's ugly and not general. The right
thing is to set the correct type when you're marshalling the parameters...

Or, as Andrew suggests, you can pass the type oid along with the
parameter so that postgresql knows the right type.

That's right - if using the binary parameters, you *must* pass an
appropriate type oid for the data you send to the server. If you use the
unknown oid, bad things will happen (sooner or later).

While this is strictly true of both binary and text parameters, text
parameters have enough redundancy built into the format that it's rarely
a problem. Users have come to expect this leniency.

-- 
Andrew McNamara, Senior Developer, Object Craft
http://www.object-craft.com.au/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

97 matches

Mail list logo