Re: Plugging fd leaks (was Re: [HACKERS] Switching timeline over streaming replication)

2012-11-27 Thread Heikki Linnakangas

On 26.11.2012 14:53, Amit Kapila wrote:

On Friday, November 23, 2012 7:03 PM Heikki Linnakangas

This is what I came up with. It adds a new function, OpenFile, that
returns a raw file descriptor like BasicOpenFile, but the file
descriptor is associated with the current subtransaction and
automatically closed at end-of-xact, in the same way that AllocateFile
and AllocateDir work. In other words, OpenFile is to open() what
AllocateFile is to fopen(). BasicOpenFile is unchanged, it returns a raw
fd and it's solely the caller's responsibility to close it, but many of
the places that used to call BasicOpenFile now use the safer OpenFile
function instead.

This patch plugs three existing fd (or virtual fd) leaks:

1. copy_file() - fixed by using OpenFile instead of BasicOpenFile.
2. XLogFileInit() - fixed by adding close() calls to the error cases.
Can't use OpenFile here because the fd is supposed to persist over
transaction boundaries.
3. lo_import/lo_export - fixed by using OpenFile instead of
PathNameOpenFile.


I have gone through the patch and find it okay, except for one minor
suggestion:
1. Can we put the below log in OpenFile as well?
+DO_DB(elog(LOG, "CloseFile: Allocated %d", numAllocatedDescs));


Thanks. Added that and committed.

I didn't dare to backpatch this, even though it could be fairly easily 
backpatched. The leaks exist in older versions too, but since they're 
extremely rare (zero complaints from the field and it's been like that 
forever), I didn't want to take the risk. Maybe later, after this has 
had more testing in master.



One thing I'm not too fond of is the naming. I'm calling the new
functions OpenFile and CloseFile. There's some danger of confusion
there, as the function to close a virtual file opened with
PathNameOpenFile is called FileClose. OpenFile is really the same kind
of operation as AllocateFile and AllocateDir, but returns an unbuffered
fd. So it would be nice if it was called AllocateSomething, too. But
AllocateFile is already taken. And I don't much like the Allocate*
naming for these anyway, you really would expect the name to contain
"open".


OpenFileInTrans
OpenTransactionAwareFile

In any case OpenFile is also okay.


I ended up calling the functions OpenTransientFile and 
CloseTransientFile. Windows has a library function called OpenFile, so 
that was a pretty bad choice after all.


- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] MySQL search query is not executing in Postgres DB

2012-11-27 Thread Jeff Davis
On Sun, 2012-11-25 at 21:08 -0500, Robert Haas wrote:
 That, however, is a separate question from what's under discussion
 here, because the case at issue for the proposed patch is the one in
 which only one possible candidate exists, and the question is whether
 we ought to allow the use of assignment casts to allow the call to
 work rather than fail, NOT which of several overloaded functions we
 ought to pick.  In any situation in which overloading is in use, the
 patch as proposed changes nothing.  I'm not generally very good at
 interpreting the SQL standard text, but if it says that you ought to
 use assignment casts to match actual argument types to the chosen
 candidate function, then that seems like it's advocating for
 essentially the same position that you arrived at independently and
 that the patch also takes, which furthermore happens to be compatible
 with what other RDBMS systems do, at least in the no-overloading case.

Let's say you have only one function foo. All your queries are coded
into your application, and everything works fine, using assignment casts
where necessary.

Then the user is foolish enough to CREATE FUNCTION foo... and now their
queries start failing left and right.

In other words, "only one possible candidate exists" should be followed
by "right now" to be more precise.

That's a major violation of the principle of least astonishment, that
CREATE FUNCTION could cause such a disaster. I know that it can now, but
what you're proposing will come into play much more frequently because
most people start off with just one function by a particular name and
define more as needed.

If we do something like this, I think we should explicitly opt out of
the overloading feature at DDL time (somewhat like what Simon suggested
in another reply). E.g. CREATE {UNIQUE|OVERLOADED} FUNCTION ...

I'm not proposing that; in general I am very wary of changes to the type
system. I'm just saying that, if we do have special rules, we should
have a way to make sure that users know when the rules are changing.

Regards,
Jeff Davis






Re: [HACKERS] gset updated patch

2012-11-27 Thread Piyush Newe
Just another thought !

When we are setting up the variable using \gset, I feel there should be a
provision to drop a particular variable.

Internally, all the variables are set into the VariableSpace linked list.
We should provide a command to drop a particular variable, because in some
cases the variable count increases unnecessarily, consuming VariableSpace.

We might use two different variables for two different queries, but if we
are not going to use the first variable in further execution, then we are
unnecessarily consuming space for the first variable in the VariableSpace.
In such cases, the user will want to drop the first variable.

This particular mechanism is useful for queries which return a single row,
so a user will be using this technique across multiple queries and might
need to create multiple variables. Hence my suggestion.

Let me know if such a mechanism already exists and I am missing it.


On Mon, Nov 19, 2012 at 9:42 PM, Dimitri Fontaine dimi...@2ndquadrant.fr wrote:

 Karl O. Pinc k...@meme.com writes:
  Yes. I'm wrong.  For some reason I thought you could use DO to make
  an anonymous code block that would act as a SETOF function,
  allowing RETURN NEXT expr (et-al) to be used in the
  plpgsql code, allowing DO to return table results.
  (Or, perhaps, instead, be used in place of a table in a SELECT
  statement.)  Oh well.

 My key for remembering about that point is that DO is a utility command,
 not a query. Now, the proposal I pushed last time we opened that very
 can of worms was to have inline functions rather than anonymous code
 blocks:

WITH FUNCTION foo(integer) returns bigint language SQL AS $$
 SELECT $1 + 1;
$$,

 Not sure how much that relates to $topic, but still something that
 raises in my mind with enough presence that I need to write about it so
 that it stops calling for attention :)

 Regards,
 --
 Dimitri Fontaine
 http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support






-- 
-- 
Piyush S Newe
Principal Engineer
EnterpriseDB
office: +91 20 3058 9500
www.enterprisedb.com

Website: www.enterprisedb.com
EnterpriseDB Blog: http://blogs.enterprisedb.com/
Follow us on Twitter: http://www.twitter.com/enterprisedb



Re: [HACKERS] ilist.h fails cpluspluscheck

2012-11-27 Thread Andres Freund
On 2012-11-27 01:14:27 -0500, Tom Lane wrote:
 Peter Eisentraut pete...@gmx.net writes:
  In file included from ./src/include/utils/catcache.h:25:0,
   from /tmp/cpluspluscheck.bt8VZr/test.cpp:3:
  src/include/lib/ilist.h: In function ‘dlist_node* 
  dlist_head_node(dlist_head*)’:
  src/include/lib/ilist.h:470:39: error: invalid conversion from ‘void*’ to 
  ‘dlist_node*’ [-fpermissive]

  Maybe some ifndef __cplusplus would help.

 Or maybe we need to recommend use of -fpermissive?  If C++ thinks
 casting void * to something else is illegitimate, it's basically not
 going to cope with most things we might try to inline in Postgres.
 And I don't think that saying you don't get to call these fundamental
 support functions from C++ is likely to fly, so just hiding the
 functions won't help much.

It's rather easy to fix in the ilist code at least - the cases it points
out are those where I took a slightly ugly shortcut to reduce some very
minor code duplication...

Some casts fix it; alternatively, the

 Assert(!dlist_is_empty());
 return head->head.next;
could be moved into [ds]list_(head|tail)_node instead of relying on the
*_element_off support functions.

Greetings,

Andres

--
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
From 8089edcc6d448529dbd091ec95a69e138a55efa6 Mon Sep 17 00:00:00 2001
From: Andres Freund and...@anarazel.de
Date: Tue, 27 Nov 2012 10:55:37 +0100
Subject: [PATCH] Fix some warnings in ilist.h pointed out by cpluspluscheck
 via Peter

---
 src/include/lib/ilist.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/src/include/lib/ilist.h b/src/include/lib/ilist.h
index bc684b8..8f58486 100644
--- a/src/include/lib/ilist.h
+++ b/src/include/lib/ilist.h
@@ -467,7 +467,7 @@ dlist_head_element_off(dlist_head *head, size_t off)
 STATIC_IF_INLINE dlist_node *
 dlist_head_node(dlist_head *head)
 {
-	return dlist_head_element_off(head, 0);
+	return (dlist_node *) dlist_head_element_off(head, 0);
 }
 
 /* internal support function to get address of tail element's struct */
@@ -484,7 +484,7 @@ dlist_tail_element_off(dlist_head *head, size_t off)
 STATIC_IF_INLINE dlist_node *
 dlist_tail_node(dlist_head *head)
 {
-	return dlist_tail_element_off(head, 0);
+	return (dlist_node *) dlist_tail_element_off(head, 0);
 }
 #endif   /* PG_USE_INLINE || ILIST_INCLUDE_DEFINITIONS */
 
@@ -677,7 +677,7 @@ slist_head_element_off(slist_head *head, size_t off)
 STATIC_IF_INLINE slist_node *
 slist_head_node(slist_head *head)
 {
-	return slist_head_element_off(head, 0);
+	return (slist_node *) slist_head_element_off(head, 0);
 }
 #endif   /* PG_USE_INLINE || ILIST_INCLUDE_DEFINITIONS */
 
-- 
1.7.12.289.g0ce9864.dirty




Re: [HACKERS] MySQL search query is not executing in Postgres DB

2012-11-27 Thread Jeff Davis
On Wed, 2012-11-21 at 15:27 +, Simon Riggs wrote:
 It would be useful if we issued a NOTICE when an ambiguity is
 introduced, rather than when using it.
 
 Like Bison's reporting of reduce conflicts.

This brings up a very important point, which is that a lot of the code
is frozen in applications yet invisible at DDL time. So we have to be
careful that DDL changes have a reasonable impact on the ability to
continue to compile and execute the previously-working SQL received from
the applications.

In other words, as I said in another reply, we want to avoid cases where
something seemingly innocuous (like creating a function) causes
previously-working SQL to fail due to ambiguity.

As Tom said, detecting the ambiguity at DDL time is not easy, so I'm not
suggesting that. And I know that creating a function can already cause
previously-working SQL to fail. I'm just saying we should be careful of
these situations and not make them more likely than necessary.

Regards,
Jeff Davis





Re: [HACKERS] gset updated patch

2012-11-27 Thread Pavel Stehule
Hello

2012/11/27 Piyush Newe piyush.n...@enterprisedb.com:

 Just another thought !

 When we are setting up the variable using \gset, I feel there should be a
 provision to drop a particular variable.

 Internally, all the variables are set into the VariableSpace linked list.
 We should provide a command to drop a particular variable, because in some
 cases the variable count increases unnecessarily, consuming VariableSpace.

 We might use two different variables for two different queries, but if we
 are not going to use the first variable in further execution, then we are
 unnecessarily consuming space for the first variable in the VariableSpace.
 In such cases, the user will want to drop the first variable.

 This particular mechanism is useful for queries which return a single row,
 so a user will be using this technique across multiple queries and might
 need to create multiple variables. Hence my suggestion.


Sorry, I don't understand.

Currently, when we set a variable to an empty value, then we remove the
variable. Isn't that enough?

regards

Pavel

 Let me know if such a mechanism already exists and I am missing it.


 On Mon, Nov 19, 2012 at 9:42 PM, Dimitri Fontaine dimi...@2ndquadrant.fr
 wrote:

 Karl O. Pinc k...@meme.com writes:
  Yes. I'm wrong.  For some reason I thought you could use DO to make
  an anonymous code block that would act as a SETOF function,
  allowing RETURN NEXT expr (et-al) to be used in the
  plpgsql code, allowing DO to return table results.
  (Or, perhaps, instead, be used in place of a table in a SELECT
  statement.)  Oh well.

 My key for remembering about that point is that DO is a utility command,
 not a query. Now, the proposal I pushed last time we opened that very
 can of worms was to have inline functions rather than anonymous code
 blocks:

WITH FUNCTION foo(integer) returns bigint language SQL AS $$
 SELECT $1 + 1;
$$,

 Not sure how much that relates to $topic, but still something that
 raises in my mind with enough presence that I need to write about it so
 that it stops calling for attention :)

 Regards,
 --
 Dimitri Fontaine
 http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support






 --
 --
 Piyush S Newe
 Principal Engineer
 EnterpriseDB
 office: +91 20 3058 9500
 www.enterprisedb.com

 Website: www.enterprisedb.com
 EnterpriseDB Blog: http://blogs.enterprisedb.com/
 Follow us on Twitter: http://www.twitter.com/enterprisedb






Re: [HACKERS] MySQL search query is not executing in Postgres DB

2012-11-27 Thread Pavel Stehule
Hello all

2012/11/27 Jeff Davis pg...@j-davis.com:
 On Wed, 2012-11-21 at 15:27 +, Simon Riggs wrote:
 It would be useful if we issued a NOTICE when an ambiguity is
 introduced, rather than when using it.

 Like Bison's reporting of reduce conflicts.

 This brings up a very important point, which is that a lot of the code
 is frozen in applications yet invisible at DDL time. So we have to be
 careful that DDL changes have a reasonable impact on the ability to
 continue to compile and execute the previously-working SQL received from
 the applications.

 In other words, as I said in another reply, we want to avoid cases where
 something seemingly innocuous (like creating a function) causes
 previously-working SQL to fail due to ambiguity.

 As Tom said, detecting the ambiguity at DDL time is not easy, so I'm not
 suggesting that. And I know that creating a function can already cause
 previously-working SQL to fail. I'm just saying we should be careful of
 these situations and not make them more likely than necessary.


From my view, the current design works well, but for someone who sees pg
for the first time, there can be a lot of surprises:

a) PostgreSQL reports missing functions -- but the real issue is in the
parameters
b) PostgreSQL requires explicitly casting string literals to text -- and
again it reports an uninformative message

So minimally we can enhance the error messages.

Regards

Pavel

 Regards,
 Jeff Davis







Re: [HACKERS] Re: Problem Observed in behavior of Create Index Concurrently and Hot Update

2012-11-27 Thread Andres Freund
On 2012-11-26 16:39:39 -0500, Tom Lane wrote:
 Andres Freund and...@2ndquadrant.com writes:
  On 2012-10-31 11:41:37 +0530, Amit Kapila wrote:
  There seems to be a problem in behavior of Create Index Concurrently and
  Hot Update in HEAD code.

  At pgcon.it I had a chance to discuss with Simon how to fix this
  bug. Please check the attached patches - and their commit messages - for
  the fix and some regression tests.

 I looked at this a bit.  I am very unhappy with the proposed kluge to
 open indexes NoLock in some places.  Even if that's safe today, which
 I don't believe in the least, any future change in this area could break
 it.

I am not happy with it either. But I failed to see a better way. Note
that in most of the cases a lock actually is acquired shortly
afterwards, just a ->indisvalid or an ->indisready check away. The only
exception is RelationGetIndexAttrBitmap which actually needs to look at
!indisvalid && !indisready indexes because of HOT. All the former cases
could just do a syscache lookup beforehand and skip if it doesn't match
their criterion. I wasn't sure about the (performance, notational)
overhead of doing a second syscache lookup in those paths.

Note that any ->indisvalid/indisready change is protected by waiting for
all concurrent backends to finish, so I don't think the separate
syscache lookup/index_open would be a problem.

For RelationGetIndexAttrBitmap, I don't really see a point in the locks
acquired - after all we're not protecting the RelationGetIndexList
either.

Greetings,

Andres

--
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Materialized views WIP patch

2012-11-27 Thread Dimitri Fontaine
Peter Eisentraut pete...@gmx.net writes:
 I'm not fond of overloading LOAD as the refresh command.  Maybe you
 could go the Oracle route here as well and use a stored procedure.  That
 would also allow things like SELECT pg_refresh_mv(oid) FROM ... more
 easily.

I would like that we have a way to refresh a Materialized View by
calling a stored procedure, but I don't think it should be the main UI.

The wholesale refreshing of a matview appears to me to be comparable to
TRUNCATE in that it's both a DDL and a DML. The incremental refreshing
modes we want to have later are clearly DML only, either on commit
refresh or incrementally on demand.

I would then propose that we use ALTER MATERIALIZED VIEW as the UI for
the wholesale on demand refreshing operation, and UPDATE MATERIALIZED
VIEW as the incremental command (to come later).

So my proposal for the current feature would be:

  ALTER MATERIALIZED VIEW mv UPDATE [ CONCURRENTLY ];
  UPDATE MATERIALIZED VIEW mv;

The choice of keywords and syntax here hopefully clearly hint the user
about the locking behavior of the commands, too. And as we said, the
bare minimum for this patch does *not* include the CONCURRENTLY option,
which we still all want to have (someday). :)

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support





Re: [HACKERS] Re: Problem Observed in behavior of Create Index Concurrently and Hot Update

2012-11-27 Thread Andres Freund
On 2012-11-26 22:31:50 -0500, Tom Lane wrote:
 I wrote:
  I'm a bit inclined to think that DROP INDEX CONCURRENTLY should have an
  additional step that somehow marks the pg_index row in a new way that
  makes RelationGetIndexList ignore it, and then wait out existing
  transactions before taking the final step of dropping the index.  The
  Right Way to do this would likely have been to add another bool column,
  but we don't have that option anymore in 9.2.  Maybe we could abuse
  indcheckxmin somehow, ie use a state that can't otherwise occur (in
  combination with the values of indisvalid/indisready) to denote an index
  being dropped.

 I looked into this some more.  There are currently three possible states
 of the indisvalid/indisready flags:

 indisvalid = false, indisready = false

   initial state during CREATE INDEX CONCURRENTLY; this should
   result in sessions honoring the index for HOT-safety decisions,
   but not using it for searches or insertions

 indisvalid = false, indisready = true

   sessions should now insert into the index, but not use it
   for searches

 indisvalid = true, indisready = true

   normal, fully valid index

 Either state of indcheckxmin is valid with all three of these
 combinations, so the specific kluge I was contemplating above doesn't
 work.  But there is no valid reason for an index to be in this state:

 indisvalid = true, indisready = false

 I suggest that to fix this for 9.2, we could abuse these flags by
 defining that combination as meaning ignore this index completely,
 and having DROP INDEX CONCURRENTLY set this state during its final
 wait before removing the index.

 Obviously, an additional flag column would be a much cleaner fix,
 and I think we should add one in HEAD.  But it's too late to do
 that in 9.2.

While I agree that more states would make some stuff cleaner, as long
as we're still locking entries in RelationGetIndexAttrBitmap that have
indisvalid = false, indisready = false we still have broken DROP INDEX
CONCURRENTLY due to the exclusive lock acquired during actually dropping
the index. Which can take quite a while on a big index.

Greetings,

Andres Freund

--
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Materialized views WIP patch

2012-11-27 Thread Pavel Stehule
2012/11/27 Dimitri Fontaine dimi...@2ndquadrant.fr:
 Peter Eisentraut pete...@gmx.net writes:
 I'm not fond of overloading LOAD as the refresh command.  Maybe you
 could go the Oracle route here as well and use a stored procedure.  That
 would also allow things like SELECT pg_refresh_mv(oid) FROM ... more
 easily.

 I would like that we have a way to refresh a Materialized View by
 calling a stored procedure, but I don't think it should be the main UI.

 The wholesale refreshing of a matview appears to me to be comparable to
 TRUNCATE in that it's both a DDL and a DML. The incremental refreshing
 modes we want to have later are clearly DML only, either on commit
 refresh or incrementally on demand.

 I would then propose that we use ALTER MATERIALIZED VIEW as the UI for
 the wholesale on demand refreshing operation, and UPDATE MATERIALIZED
 VIEW as the incremental command (to come later).

 So my proposal for the current feature would be:

   ALTER MATERIALIZED VIEW mv UPDATE [ CONCURRENTLY ];
   UPDATE MATERIALIZED VIEW mv;

 The choice of keywords and syntax here hopefully clearly hint the user
 about the locking behavior of the commands, too. And as we said, the
 bare minimum for this patch does *not* include the CONCURRENTLY option,
 which we still all want to have (someday). :)


+1

Pavel

 Regards,
 --
 Dimitri Fontaine
 http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support







Re: [HACKERS] [WIP] pg_ping utility

2012-11-27 Thread Dimitri Fontaine
Peter Eisentraut pete...@gmx.net writes:
 Sure, PQping is useful for this very specific use case of seeing whether
 the server has finished starting up.  If someone came with a plausible

Rename the utility to pg_isready?

-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support




Re: [HACKERS] Re: Problem Observed in behavior of Create Index Concurrently and Hot Update

2012-11-27 Thread Andres Freund
On 2012-11-27 10:18:37 +0530, Pavan Deolasee wrote:
 On Tue, Nov 27, 2012 at 9:01 AM, Tom Lane t...@sss.pgh.pa.us wrote:

  I wrote:
 
  Either state of indcheckxmin is valid with all three of these
  combinations, so the specific kluge I was contemplating above doesn't
  work.  But there is no valid reason for an index to be in this state:
 
  indisvalid = true, indisready = false
 
  I suggest that to fix this for 9.2, we could abuse these flags by
  defining that combination as meaning ignore this index completely,
  and having DROP INDEX CONCURRENTLY set this state during its final
  wait before removing the index.
 
 
 Yeah, this looks much better, given our inability to add a new catalog
 column in 9.2. Can we cheat a little though and use a value other than 0
 and 1 for indisvalid or indisready to tell the server to interpret it
 differently ? I would assume not, but can't see a reason unless these
 values are converted to other types and back to boolean.

 Andres complained about the fact that many callers of RelationGetIndexList
 are probably not ready to process invalid or not-yet-ready indexes and
 suggested that those changes should be backpatched to even older releases.
 But IMO we should do that with a test case that demonstrates that there is
 indeed a bug.

I haven't yet looked deeply enough to judge whether there are actually
bugs. But I can say that, e.g., the missing indisvalid checks in
transformFkeyCheckAttrs make me pretty uneasy. Vacuum not checking
whether indexes are ready isn't nice either.

 Also, we should teach RelationGetIndexList to take a flags
 argument and filter out indexes that the caller is not interested in,
 instead
 of every caller doing the checks separately.

Given that RelationGetIndexList currently is just returning a cached
list I don't see how that's going to work without significant overhead.

Greetings,

Andres Freund

--
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Re: Problem Observed in behavior of Create Index Concurrently and Hot Update

2012-11-27 Thread Andres Freund
On 2012-11-27 11:48:08 +0100, Andres Freund wrote:
 On 2012-11-27 10:18:37 +0530, Pavan Deolasee wrote:
  On Tue, Nov 27, 2012 at 9:01 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 
   I wrote:
  
   Either state of indcheckxmin is valid with all three of these
   combinations, so the specific kluge I was contemplating above doesn't
   work.  But there is no valid reason for an index to be in this state:
  
   indisvalid = true, indisready = false
  
   I suggest that to fix this for 9.2, we could abuse these flags by
   defining that combination as meaning ignore this index completely,
   and having DROP INDEX CONCURRENTLY set this state during its final
   wait before removing the index.
  
  
  Yeah, this looks much better, given our inability to add a new catalog
  column in 9.2. Can we cheat a little though and use a value other than 0
  and 1 for indisvalid or indisready to tell the server to interpret it
  differently ? I would assume not, but can't see a reason unless these
  values are converted to other types and back to boolean.
 
  Andres complained about the fact that many callers of RelationGetIndexList
  are probably not ready to process invalid or not-yet-ready indexes and
  suggested that those changes should be backpatched to even older releases.
  But IMO we should do that with a test case that demonstrates that there is
  indeed a bug.
 
 I haven't yet looked deeply enough to judge whether there are actually
 bugs. But I can say that, e.g., the missing indisvalid checks in
 transformFkeyCheckAttrs make me pretty uneasy. Vacuum not checking
 whether indexes are ready isn't nice either.

At least the former was easy enough to verify after thinking about it
for a minute (<=9.1):

CREATE TABLE clusterbug(id serial primary key, data int);
INSERT INTO clusterbug(data) VALUES(2),(2);
CREATE UNIQUE INDEX CONCURRENTLY clusterbug_data ON clusterbug(data);
CREATE TABLE clusterbug_ref(id serial primary key, clusterbug_id int references 
clusterbug(data));

Now a !indisready index is getting queried (strangely enough that
doesn't cause an error).

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Re: Problem Observed in behavior of Create Index Concurrently and Hot Update

2012-11-27 Thread Andres Freund
On 2012-11-27 11:52:11 +0100, Andres Freund wrote:
 On 2012-11-27 11:48:08 +0100, Andres Freund wrote:
  On 2012-11-27 10:18:37 +0530, Pavan Deolasee wrote:
   On Tue, Nov 27, 2012 at 9:01 AM, Tom Lane t...@sss.pgh.pa.us wrote:
  
I wrote:
   
Either state of indcheckxmin is valid with all three of these
combinations, so the specific kluge I was contemplating above doesn't
work.  But there is no valid reason for an index to be in this state:
   
indisvalid = true, indisready = false
   
I suggest that to fix this for 9.2, we could abuse these flags by
defining that combination as meaning ignore this index completely,
and having DROP INDEX CONCURRENTLY set this state during its final
wait before removing the index.
   
   
   Yeah, this looks much better, given our inability to add a new catalog
   column in 9.2. Can we cheat a little though and use a value other than 0
   and 1 for indisvalid or indisready to tell the server to interpret it
   differently ? I would assume not, but can't see a reason unless these
   values are converted to other types and back to boolean.
  
   Andres complained about the fact that many callers of RelationGetIndexList
   are probably not ready to process invalid or not-yet-ready indexes and
   suggested that those changes should be backpatched to even older releases.
   But IMO we should do that with a test case that demonstrates that there is
   indeed a bug.
 
  I haven't yet looked deeply enough to judge whether there are actually
  bugs. But I can say that the e.g. the missing indisvalid checks in
  transformFkeyCheckAttrs makes me pretty uneasy. Vacuum not checking
  whether indexes are ready isn't nice either.

 At least the former was easy enough to verify after thinking about it
 for a minute (<=9.1):

 CREATE TABLE clusterbug(id serial primary key, data int);
 INSERT INTO clusterbug(data) VALUES(2),(2);
 CREATE UNIQUE INDEX CONCURRENTLY clusterbug_data ON clusterbug(data);
 CREATE TABLE clusterbug_ref(id serial primary key, clusterbug_id int 
 references clusterbug(data));

 Now a !indisready index is getting queried (strangely enough that
 doesn't cause an error).

That should work in 9.2 as well, although it's slightly more
complicated. You just need an indisready && !indisvalid index and the
foreign key will happily be created. Easy enough to achieve with two
sessions.
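
For anyone following along, the leftover index state is easy to inspect from
the catalog. A quick sketch (table and index names taken from the example
above; the exact flag values depend on which phase of CREATE INDEX
CONCURRENTLY failed):

```sql
-- After the failed CREATE UNIQUE INDEX CONCURRENTLY, the index is left
-- behind in a not-ready state; inspect its flags in pg_index:
SELECT indexrelid::regclass AS index_name, indisvalid, indisready
FROM pg_index
WHERE indrelid = 'clusterbug'::regclass;
-- clusterbug_data should show indisready = f, yet the subsequent
-- REFERENCES clusterbug(data) clause is accepted anyway.
```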

Greetings,

Andres Freund

--
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Re: Problem Observed in behavior of Create Index Concurrently and Hot Update

2012-11-27 Thread Pavan Deolasee
On Tue, Nov 27, 2012 at 4:22 PM, Andres Freund and...@2ndquadrant.comwrote:

 On 2012-11-27 11:48:08 +0100, Andres Freund wrote:

 
  I haven't yet looked deeply enough to judge whether there are actually
  bugs. But I can say that the e.g. the missing indisvalid checks in
  transformFkeyCheckAttrs makes me pretty uneasy. Vacuum not checking
  whether indexes are ready isn't nice either.

 At least the former was easy enough to verify after thinking about it
 for a minute (<=9.1):

 CREATE TABLE clusterbug(id serial primary key, data int);
 INSERT INTO clusterbug(data) VALUES(2),(2);
 CREATE UNIQUE INDEX CONCURRENTLY clusterbug_data ON clusterbug(data);
 CREATE TABLE clusterbug_ref(id serial primary key, clusterbug_id int
 references clusterbug(data));

 Now a !indisready index is getting queried (strangely enough that
 doesn't cause an error).


There might be a bug there as you are suggesting, but for me the CREATE
INDEX itself fails (and rightly so) because of duplicate keys. Do I need to
run these statements in different sessions ?

Thanks,
Pavan


Re: [HACKERS] Bugs in CREATE/DROP INDEX CONCURRENTLY

2012-11-27 Thread Andres Freund
On 2012-11-26 22:44:49 -0500, Tom Lane wrote:
 Andres Freund and...@2ndquadrant.com writes:
  On 2012-10-05 19:56:40 -0400, Tom Lane wrote:
  I think this could possibly be fixed by using nontransactional
  update-in-place when we're trying to change indisvalid and/or
  indisready, but I've not really thought through the details.

  I couldn't really think of any realistic method to fix this other than
  update in place. I thought about it for a while and I think it should
  work, but I have to say it makes me slightly uneasy.

 I looked through the code a bit, and I think we should be able to make
 this work, but note there are also places that update indcheckxmin using
 heap_update, and that's just as dangerous as changing the other two
 flags via heap_update, if you don't have exclusive lock on the table.
 This is going to need some careful thought, because we currently expect
 that such an update will set the pg_index row's xmin to the current XID,
 which is something an in-place update can *not* do.  I think this is a
 non-problem during construction of a new index, since the xmin ought to
 be the current XID already anyway, but it's less clear what to do in
 REINDEX.  In the short term there may be no problem since REINDEX takes
 exclusive lock on the parent table anyway (and hence could safely do
 a transactional update) but if we ever want REINDEX CONCURRENTLY we'll
 need a better answer.

Isn't setting indcheckxmin already problematic in the case of CREATE
INDEX CONCURRENTLY? index_build already runs in a separate transaction
there.

I am really surprised that we haven't seen any evidence of a problem
with this. On a database with a busy & bigger catalog that ought to be
a real problem.
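
For context, the visibility rule under discussion is observable from SQL; a
sketch (catalog columns as of 9.2; 'some_table' is a placeholder name):

```sql
-- indcheckxmin = true means the index is usable only by snapshots that can
-- see the pg_index row's xmin; an in-place (non-transactional) update could
-- not move that xmin, which is the crux of the problem discussed here.
SELECT indexrelid::regclass AS index_name, indcheckxmin, xmin
FROM pg_index
WHERE indrelid = 'some_table'::regclass;
```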

I wonder whether we couldn't fix it by introducing a variant/wrapper of
heap_fetch et al. that follows t_ctid if the tuple became invisible
recently. That ought to be able to fix most of these issues in a
pretty general fashion.
That would make a nice implementation of REINDEX CONCURRENTLY easier as
well...

Btw, even if we manage to find a sensible fix for this I would suggest
postponing it after the next back branch release.

Greetings,

Andres Freund

--
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Re: Problem Observed in behavior of Create Index Concurrently and Hot Update

2012-11-27 Thread Andres Freund
On 2012-11-27 16:39:58 +0530, Pavan Deolasee wrote:
 On Tue, Nov 27, 2012 at 4:22 PM, Andres Freund and...@2ndquadrant.comwrote:
 
  On 2012-11-27 11:48:08 +0100, Andres Freund wrote:
 
  
   I haven't yet looked deeply enough to judge whether there are actually
   bugs. But I can say that the e.g. the missing indisvalid checks in
   transformFkeyCheckAttrs makes me pretty uneasy. Vacuum not checking
   whether indexes are ready isn't nice either.
 
  At least the former was easy enough to verify after thinking about it
  for a minute (<=9.1):
 
  CREATE TABLE clusterbug(id serial primary key, data int);
  INSERT INTO clusterbug(data) VALUES(2),(2);
  CREATE UNIQUE INDEX CONCURRENTLY clusterbug_data ON clusterbug(data);
  CREATE TABLE clusterbug_ref(id serial primary key, clusterbug_id int
  references clusterbug(data));
 
  Now a !indisready index is getting queried (strangely enough that
  doesn't cause an error).
 
 
 There might be a bug there as you are suggesting, but for me the CREATE
 INDEX itself fails (and rightly so) because of duplicate keys. Do I need to
 run these statements in different sessions ?

Sure, the CREATE INDEX fails. The point is that we successfully can
create the foreign key afterwards anyway.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Re: Problem Observed in behavior of Create Index Concurrently and Hot Update

2012-11-27 Thread Pavan Deolasee
On Tue, Nov 27, 2012 at 11:38 AM, Tom Lane t...@sss.pgh.pa.us wrote:

 Pavan Deolasee pavan.deola...@gmail.com writes:
  On Tue, Nov 27, 2012 at 9:01 AM, Tom Lane t...@sss.pgh.pa.us wrote:
  Either state of indcheckxmin is valid with all three of these
  combinations, so the specific kluge I was contemplating above doesn't
  work.  But there is no valid reason for an index to be in this state:
  indisvalid = true, indisready = false

  Yeah, this looks much better, given our inability to add a new catalog
  column in 9.2. Can we cheat a little though and use a value other than 0
  and 1 for indisvalid or indisready to tell the server to interpret it
  differently ?

 No, not unless you'd like "select * from pg_index" to misbehave.
 Personally, I like being able to look at catalogs.


Hmm.. I thought so. Thanks.

If we add a new column to the catalog for HEAD, I think it will be a good
idea to add an indflags kind of column which can store a bitmap of flags.
We probably don't get into this kind of situation often, but if we do, then
we get more flexibility without requiring another catalog change. My two
cents.

Thanks,
Pavan


Re: [HACKERS] Re: Problem Observed in behavior of Create Index Concurrently and Hot Update

2012-11-27 Thread Pavan Deolasee
On Tue, Nov 27, 2012 at 4:49 PM, Andres Freund and...@2ndquadrant.comwrote:

 On 2012-11-27 16:39:58 +0530, Pavan Deolasee wrote:

  
  There might be a bug there as you are suggesting, but for me the CREATE
  INDEX itself fails (and rightly so) because of duplicate keys. Do I need
 to
  run these statements in different sessions ?

 Sure, the CREATE INDEX fails. The point is that we successfully can
 create the foreign key afterwards anyway.


Ah OK. Got it.

Thanks,
Pavan


Re: [HACKERS] Review: Patch to compute Max LSN of Data Pages

2012-11-27 Thread Amit kapila
Friday, November 23, 2012 5:38 AM Muhammad Usama wrote:

 Observations and Comments
 ---
 - If no option is given to pg_computemaxlsn utility then we get a wrong
 error message
 ./pg_computemaxlsn  ../data
 pg_computemaxlsn: file or directory (null) exists
 In my opinion we should consider the -P (Print max LSN from whole data
 directory) as default in case no option is specified, or otherwise throw a
 more descriptive error.



Fixed, by making the whole data directory the default option.


 - Options -p and -P should be mutually exclusive

Fixed. Both options will not be allowed at the same time.



- Long options missing for -p and -P



Fixed. Introduced long options --path & --data-directory.


 - Help string for -P  should be something like print max LSN from whole
 data directory instead of  print max LSN from whole database

Fixed, by changing the message as suggested.

 - Help menu is silent about -V or --version option

Fixed.

 - I think when finding the max LSN of single file the utility should
 consider all relation segments.

IMHO, as this utility is not aware of relations or any other logical entity
and deals only with files or directories, it is okay to compute the max LSN
for just that file.



 - There is no check for extra arguments
 e.g
 ./pg_computemaxlsn  -P . ../data/ ../data/base/ ../data2/

Fixed.
- Extra options gives error.
- Multiple -p options also gives error.
Eg:
./pg_computemaxlsn -p base/12286/12200 -p base/12286/12201 data



 - For -p {file | dir} option the utility expects the file path relative to
 the specified data directory path, which makes things a little confusing.
 For example:
 ./pg_computemaxlsn -p data/base/12286/12200 data
 pg_computemaxlsn: file or directory data/base/12286/12200 does not exists
 Although data/base/12286/12200 is a valid file path, we get a file does
 not exist error.
As the path of the file is relative to the data directory, the error prints
the path as data/base/12286/12200.
Do you have any suggestion what should be done for this?

 - Utility should print the max LSN number in hexadecimal notation and LSN
 format (logid/segno) for consistency with PG,

Fixed
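
For readers unfamiliar with the notation: the 64-bit LSN is conventionally
printed as two hex halves, %X/%X. On a running 9.2-era server the same
representation can be observed with the built-in functions (pg_computemaxlsn
itself is meant for a stopped cluster); a sketch:

```sql
-- High half / low half of the current write-ahead log location:
SELECT pg_current_xlog_location();                    -- e.g. 0/16B3740
-- WAL segment file containing that location:
SELECT pg_xlogfile_name(pg_current_xlog_location());
```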

 and may also print the file name which contains that LSN.

It might be confusing to users in cases where the utility has to follow a
link (pg_tblspc) and then print some absolute path which is not related to
the relative path given by the user.
Also I am not sure it would be of any use to the user.



 - When finding max LSN from directory, the utility does not considers the
 sub directories, which I think could be useful in some cases, like if we
 want to find the max LSN in tablespace directory. With the current
 implementation we would require to run the utility multiple times, once for
 each database.

It will consider sub-directories as well.


 Code Review
 -

 Code generally looks good, I have few small points.

 - In main() function variable maxLSN is defined as uint64,
 Although XLogRecPtr and uint64 are the same thing, I think we should
 use XLogRecPtr instead.
Changed as per suggestion.

 - Just a small thing; although it will hardly improve anything, still for
 the sake of saving a few cycles, I think you should use XLByteLT ( < )
 instead of XLByteLE ( <= )
 to find if the previous max LSN is less than the current.

Changed as per suggestion.

 - '\n' is missing from one of the printf calls in the code :-)
 fprintf(stderr, _("%s: file or directory \"%s\" does not exists"),
 progname, LsnSearchPath);

Changed as per suggestion.

 - While finding the max LSN from single file, you are not verifying that
 the file actually exists inside the data directory path provided. and can
 end up looking
at wrong data directory for checking if the server is running.
 For example:
 bin/pg_computemaxlsn -p $PWD/data/base/12286/12200
 $PWD/otherinstallation/data
 Maximum LSN found is: 0, WAL segment file name (fileid,seg):
 

 - Code is not verifying, if the provided data directory is the valid data
 directory, This could make pg_computemaxlsn to compute max LSN from the
 running server file.
 For example:
 bin/pg_computemaxlsn -p $PWD/data/base/12286/12200 NONDATADIR/
 Maximum LSN found is: 0, WAL segment file name (fileid,seg):


Fixed, by allowing only a relative path for the file or directory and
checking that it is relative and below the data directory with the function
path_is_relative_and_below_cwd().

 Questions
 -

 - I am wondering why are you not following the link inside the pg_tblspc
 directory to get to the table space location instead of building the
 directory path. I am thinking if you follow the link instead, The utility
 can be used across the different versions of PG.
Changed as per suggestion.


Apart from above, updated the documentation.
With Regards,
Amit Kapila.


pg_computemaxlsn.v2.patch
Description: pg_computemaxlsn.v2.patch


Re: [HACKERS] [WIP] pg_ping utility

2012-11-27 Thread Michael Paquier
On Tue, Nov 27, 2012 at 7:35 PM, Dimitri Fontaine dimi...@2ndquadrant.frwrote:

 Peter Eisentraut pete...@gmx.net writes:
  Sure, PQping is useful for this very specific use case of seeing whether
  the server has finished starting up.  If someone came with a plausible

 Rename the utility to pg_isready?

+1, the current version of the patch is already fitted for that and would
not need extra options like the number of packets sent.
-- 
Michael Paquier
http://michael.otacoo.com


Re: [HACKERS] review: pgbench - aggregation of info written into log

2012-11-27 Thread Pavel Stehule
Hello

2012/11/12 Tomas Vondra t...@fuzzy.cz:
 Hi,

 attached is a v4 of the patch. There are not many changes, mostly some
 simple tidying up, except for handling the Windows.

 After a bit more investigation, it seems to me that we can't really get
 the same behavior as in other systems - basically the timestamp is
 unavailable so we can't log the interval start. So it seems to me the
 best we can do is to disable this option on Windows (and this is done in
 the patch). So when you try to use --aggregate-interval on Win it will
 complain it's an unknown option.

 Now that I think of it, there might be a better solution that would not
 mimic the Linux/Unix behaviour exactly - we do have elapsed time since
 the start of the benchmark, so maybe we should use this instead of the
 timestamp? I mean on systems with reasonable gettimeofday we'd get

 1345828501 5601 1542744 483552416 61 2573
 1345828503 7884 1979812 565806736 60 1479
 1345828505 7208 1979422 567277552 59 1391
 1345828507 7685 1980268 569784714 60 1398
 1345828509 7073 1979779 573489941 236 1411

 but on Windows we'd get

 0 5601 1542744 483552416 61 2573
 2 7884 1979812 565806736 60 1479
 4 7208 1979422 567277552 59 1391
 6 7685 1980268 569784714 60 1398
 8 7073 1979779 573489941 236 1411

 i.e. the first column is seconds since start of the test. Hmmm, and
 maybe we could call 'gettimeofday' at the beginning, to get the
 timestamp of the test start (AFAIK the issue on Windows is the
 resolution, but it should be enough). Then we could add it up with the
 elapsed time and we'd get the same output as on Linux.

 And maybe this could be done in regular pgbench execution too? But I
 really need help from someone who knows Windows and can test this.

 Comments regarding Pavel's reviews are below:

 On 2.10.2012 20:08, Pavel Stehule wrote:
 Hello

 I did review of pgbench patch
 http://archives.postgresql.org/pgsql-hackers/2012-09/msg00737.php

 * this patch is cleanly patched

 * I had a problem with doc

 make[1]: Entering directory `/home/pavel/src/postgresql/doc/src/sgml'
 openjade  -wall -wno-unused-param -wno-empty -wfully-tagged -D . -D .
 -c /usr/share/sgml/docbook/dsssl-stylesheets/catalog -d stylesheet.dsl
 -t sgml -i output-html -V html-index postgres.sgml
 openjade:pgbench.sgml:767:8:E: document type does not allow element
 TITLE here; missing one of ABSTRACT, AUTHORBLURB, MSGSET,
 CALLOUTLIST, ITEMIZEDLIST, ORDEREDLIST, SEGMENTEDLIST,
 VARIABLELIST, CAUTION, IMPORTANT, NOTE, TIP, WARNING,
 FORMALPARA, BLOCKQUOTE, EQUATION, EXAMPLE, FIGURE, TABLE,
 PROCEDURE, SIDEBAR, QANDASET, REFSECT3 start-tag
 make[1]: *** [HTML.index] Error 1
 make[1]: *** Deleting file `HTML.index'
 make[1]: Leaving directory `/home/pavel/src/postgresql/doc/src/sgml

 Hmmm, I've never got the docs to build at all, all I do get is a heap of
 some SGML/DocBook related issues. Can you investigate a bit more where's
 the issue? I don't remember messing with the docs in a way that might
 cause this ... mostly copy'n'paste edits.

 * pgbench is compiled without warnings

 * there is a few issues in source code

 +
 + /* should we aggregate the results or not? */
 + if (use_log_agg)
 + {
 +
 + /* are we still in the same interval? if yes, accumulate the
 +  * values (print them otherwise) */
 + if (agg->start_time + use_log_agg <= INSTR_TIME_GET_DOUBLE(now))
 + {
 +

 Errr, so what are the issues here?

using a use_log_agg variable - a bad name for a variable - it is fixed



 * a real time range for aggregation is dynamic (pgbench is not
 realtime application), so I am not sure if following constraint has
 sense

 + if ((duration > 0) && (use_log_agg > 0) && (duration % use_log_agg != 0)) {
 + fprintf(stderr, "duration (%d) must be a multiple of aggregation interval (%d)\n", duration, use_log_agg);
 + exit(1);
 + }

 I'm not sure what might be the issue here either. If the test duration
 is set (using -T option), then this kicks in and says that the
 duration should be a multiple of aggregation interval. Seems like a
 sensible assumption to me. Or do you think this is check should be removed?

 * it doesn't flush last aggregated data (it is mentioned by Tomas:
 Also, I've realized the last interval may not be logged at all - I'll
 take a look into this in the next version of the patch.). And it can
 be significant for higher values of use_log_agg

 Yes, and I'm still not sure how to fix this in practice. But I do have
 this on my TODO.


 * a name of variable use_log_agg is not good - maybe log_agg_interval ?

 I've changed it to agg_interval.

ok

issues:

* empty lines with invisible chars (tabs) + and sometimes empty lines
after and before {}

* adjustment of start_time

+   * the desired interval */
+   

Re: [HACKERS] Materialized views WIP patch

2012-11-27 Thread Kevin Grittner
Pavel Stehule wrote:
 2012/11/27 Dimitri Fontaine dimi...@2ndquadrant.fr:

 I would like that we have a way to refresh a Materialized View
 by calling a stored procedure, but I don't think it should be
 the main UI.

I agree. I saw that Oracle uses a function for that without any
statement-level support, and that would probably be easier to
implement; but it felt wrong to do it that way. I couldn't think of
any other cases where similar action is taken without statement
syntax for it.

 The wholesale refreshing of a matview appears to me to be
 comparable to TRUNCATE in that it's both a DDL and a DML. The
 incremental refreshing modes we want to have later are clearly
 DML only, either on commit refresh or incrementally on demand.

Personally, I expect the most popular update method to eventually
become a queued update. I've looked ahead far enough to see that I
want to structure the incremental updates to be controlled through
an API where changes to supporting tables produce records saying
what was done which are fed to consumers which do the updating.
Then it becomes a matter of whether that consumer performs the
related updates to the MV during commit processing of the producer,
by pulling from a queue, or by reading accumulated rows when the MV
is referenced.

But I'm getting ahead of things with such speculation...

 I would then propose that we use ALTER MATERIALIZED VIEW as the
 UI for the wholesale on demand refreshing operation, and UPDATE
 MATERIALIZED VIEW as the incremental command (to come later).

Honestly, I have managed to keep myself from speculating on syntax
for incremental updates. There will be enough complexity involved
that I expect months of bikeshedding. :-/

 So my proposal for the current feature would be:

 ALTER MATERIALIZED VIEW mv UPDATE [ CONCURRENTLY ];
 UPDATE MATERIALIZED VIEW mv;

An ALTER MATERIALIZED VIEW option was my first thought on syntax to
do what LOAD does in the current patch. But it bothered me that I
couldn't think of any other cases where ALTER some-object-type
only changed the data contained within the object and had no other
impact. Are you both really comfortable with an ALTER MATERIALIZED
VIEW which has no effect other than to update the data? It seems
wrong to me.

 The choice of keywords and syntax here hopefully clearly hint
 the user about the locking behavior of the commands, too. And as
 we said, the bare minimum for this patch does *not* include the
 CONCURRENTLY option, which we still all want to have (someday).
 :)
 
 +1

Sure -- a CONCURRENTLY option for LMV (or AMVU) seems like one of
the next steps. I'll feel more confident about implementing that
when it appears that we have shaken the last bugs out of
CREATE/DROP INDEX CONCURRENTLY, since anything which affects those
statements will probably also matter here.

-Kevin




Re: [HACKERS] review: Deparsing DDL command strings

2012-11-27 Thread Dimitri Fontaine
Hi,

Pavel Stehule pavel.steh...@gmail.com writes:
 ** missing doc

I still would prefer to have an ack about the premises of this patch
before spending time on writing docs for it, namely:

 - we want a normalized command string (schema, auto naming of some
   objects such as constraints and indexes, exporting default values,
   etc);

 - this normalisation can not happen as an extension given the current
   source code organisation and the new ddl_rewrite module is a good fit
   to solve that problem;

 - we offer a new event alias ddl_command_trace that will get
   activated start of a DROP and end of an ALTER or a CREATE command so
   that it's easy to hook at the right place when you want to trace
   things;

 - it's now possible and easy to process GENERATED commands if you wish
   to do so (explicit sequence and index operations for a create table
   containing a serial primary key column).

I think the absence of complaints is encouraging, though, so I will probably
begin the docs effort soonish.

 ** CREATE FUNCTION is not supported

I've added support for create and drop function in the attached patch.
Not all commands are rewritten yet though, and not all of them even have
support for the TG_OBJECTID, TG_OBJECTNAME and TG_SCHEMANAME variables.

I think we could still apply that patch + docs and a limited set of
commands, and continue filling the gaps till the end of this cycle. Once
we agree on how to attack the problem at hand, do we really need support
for ALTER OPERATOR FAMILY for an intermediate commit? You tell me.

 * something wrong in deparsing - doesn't support constraints

I've been fixing that and reviewing ALTER TABLE rewriting, should be ok
now. Note that we have to analyze the alter table work queue, not the
parsetree, if we want to normalize automatic constraints names and such
like.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support



event_trigger_infos.2.patch.gz
Description: Binary data



Re: [HACKERS] DEALLOCATE IF EXISTS

2012-11-27 Thread Heikki Linnakangas

On 09.10.2012 17:44, Vik Reykja wrote:

On Tue, Oct 9, 2012 at 4:09 PM, Tom Lane t...@sss.pgh.pa.us wrote:


Sébastien Lardière slardi...@hi-media.com writes:

Indeed, the brackets were not correct; it's better now (I think), and I
corrected some comments.


Still wrong ... at the very least you missed copyfuncs/equalfuncs.
In general, when adding a field to a struct, it's good practice to
grep for all uses of that struct.



I don't see Sébastien's message, but I made the same mistake in my patch.
Another one is attached with copyfuncs and equalfuncs.  I did a grep for
DeallocateStmt and I don't believe I have missed anything else.

Also, I'm changing the subject so as not to hijack this thread any further.


I fail to see the point of DEALLOCATE IF EXISTS. Do you have a real use 
case for this, or was this just a case of adding IF EXISTS to all 
commands for the sake of completeness?


Usually the client knows what statements have been prepared, but perhaps 
you want to make sure everything is deallocated in some error handling 
case or similar. But in that case, you might as well just issue a 
regular DEALLOCATE and ignore errors. Or even more likely, you'll want 
to use DEALLOCATE ALL.
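
To make the comparison concrete, a minimal sketch of the alternatives
mentioned:

```sql
PREPARE q AS SELECT 1;

DEALLOCATE q;    -- fine the first time; a second DEALLOCATE q raises an
                 -- error, which a client can simply ignore
DEALLOCATE ALL;  -- always succeeds, dropping every prepared statement
                 -- in the session whether or not any exist
```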


--
- Heikki




Re: [HACKERS] Materialized views WIP patch

2012-11-27 Thread David Johnston
On Nov 27, 2012, at 5:25, Dimitri Fontaine dimi...@2ndquadrant.fr wrote:

 
 So my proposal for the current feature would be:
 
  ALTER MATERIALIZED VIEW mv UPDATE [ CONCURRENTLY ];
  UPDATE MATERIALIZED VIEW mv;
 
 The choice of keywords and syntax here hopefully clearly hint the user
 about the locking behavior of the commands, too. And as we said, the
 bare minimum for this patch does *not* include the CONCURRENTLY option,
 which we still all want to have (someday). :)
 

I dislike using ALTER syntax to perform a data-only action.

The other advantage of non-functional syntax is that you could more easily 
supply some form of WHERE clause should you only want to perform a partial 
refresh. With a function call that becomes more obtuse.

David J.



Re: [HACKERS] Materialized views WIP patch

2012-11-27 Thread Dimitri Fontaine
Kevin Grittner kgri...@mail.com writes:
 An ALTER MATERIALIZED VIEW option was my first thought on syntax to
 do what LOAD does in the current patch. But it bothered me that I
 couldn't think of any other cases where ALTER some-object-type
 only changed the data contained within the object and had no other
 impact. Are you both really comfortable with an ALTER MATERIALIZED
 VIEW which has no effect other than to update the data? It seems
 wrong to me.

I think you can already do that with some clever use of alter table ...
type using, or alter table set default.

 Sure -- a CONCURRENTLY option for LMV (or AMVU) seems like one of
 the next steps. I'll feel more confident about implementing that
 when it appears that we have shaken the last bugs out of
 CREATE/DROP INDEX CONCURRENTLY, since anything which affects those
 statements will probably also matter here.

Sure.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support




Re: [HACKERS] Materialized views WIP patch

2012-11-27 Thread Kevin Grittner
Dimitri Fontaine wrote:
 Kevin Grittner kgri...@mail.com writes:
 An ALTER MATERIALIZED VIEW option was my first thought on syntax
 to do what LOAD does in the current patch. But it bothered me
 that I couldn't think of any other cases where ALTER
 some-object-type only changed the data contained within the
 object and had no other impact. Are you both really comfortable
 with an ALTER MATERIALIZED VIEW which has no effect other than
 to update the data? It seems wrong to me.
 
 I think you can already do that with some clever use of alter
 table ... type using, or alter table set default.

You mean, specifying an ALTER TABLE which appears to specify a
change to some non-data aspect of a table but which is really
setting it to the existing state? And it therefore rewrites the
table? I suppose that with USING you could actually even have it
rewritten with data different from what was there before without
changing the structure of the table. Somehow I don't find that
persuasive as an argument that ALTER MATERIALIZED VIEW should
rescan the source relations and build a whole new set of data for
exactly the same MV definition.

Consider that in relational theory a table is considered a relation
variable. ALTER is supposed to change the definition of the
variable in some way. Other statements are used to change the value
contained in the variable. Sure there are some grey areas already,
but I don't see where we need to muddy the waters in this case.

-Kevin




Re: [HACKERS] [WIP] pg_ping utility

2012-11-27 Thread Phil Sorber
On Tue, Nov 27, 2012 at 8:45 AM, Michael Paquier
michael.paqu...@gmail.com wrote:


 On Tue, Nov 27, 2012 at 7:35 PM, Dimitri Fontaine dimi...@2ndquadrant.fr
 wrote:

 Peter Eisentraut pete...@gmx.net writes:
  Sure, PQping is useful for this very specific use case of seeing whether
  the server has finished starting up.  If someone came with a plausible

 Rename the utility to pg_isready?

 +1, the current version of the patch is already fitted for that and would
 not need extra options like the number of packets sent.

I am 100% fine with this. I can make this change tomorrow.

 --
 Michael Paquier
 http://michael.otacoo.com

Sorry I haven't jumped in on this thread more, I have limited
connectivity where I am.




Re: [HACKERS] Materialized views WIP patch

2012-11-27 Thread Dimitri Fontaine
Kevin Grittner kgri...@mail.com writes:
 changing the structure of the table. Somehow I don't find that
 persuasive as an argument that ALTER MATERIALIZED VIEW should
 rescan the source relations and build a whole new set of data for
 exactly the same MV definition.

Fair enough.

 Consider that in relational theory a table is considered a relation
 variable. ALTER is supposed to change the definition of the
 variable in some way. Other statements are used to change the value
 contained in the variable. Sure there are some grey areas already,
 but I don't see where we need to muddy the waters in this case.

Under that light, using ALTER is strange indeed. I still don't like
using LOAD that much, allow me to try a last syntax proposal. Well all I
can find just now would be:

  UPDATE MATERIALIZED VIEW mv FOR EACH ROW;
  UPDATE MATERIALIZED VIEW mv FOR EACH STATEMENT [ CONCURRENTLY ];

The only value of such a proposal is that it's not LOAD and it's still
not introducing any new keyword. Oh it's also avoiding to overload the
SNAPSHOT keyword. Well, it still does not look like the best candidate.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support




Re: [HACKERS] foreign key locks

2012-11-27 Thread Alvaro Herrera
Alvaro Herrera wrote:
 Here's version 24.

Old review emails still contain some items that didn't lead to any
changes.  I tried to keep close track of those.  To that list I add a
couple of things of my own.  Here they are, for those following along.
I welcome suggestions.


- Fix the multixact cache in multixact.c -- see mXactCacheEnt.

- ResetHintBitMulti() was removed; need to remove old XMAX_IS_MULTI
  somewhere; perhaps during HOT page pruning?

- EvalPlanQual and ExecLockRows maybe need a different overall locking
  strategy.  Right now follow_updates=false is used to avoid deadlocks.

- Ensure multixact.c behaves sanely on wraparound.

- Is the case of a non-key-update followed by a key-update actually handled
  when doing a heap_lock_tuple with mode = LockTupleKeyShare and follow_updates
  = false? I don't really see how, so it seems to deserve at least a comment.

- if oldestMultiXactId + db is set and then that database is dropped we seem to
  have a problem because MultiXactAdvanceOldest won't overwrite those
  values. Should probably use SetMultiXactIdLimit directly.

- what stops multixacts from only being filled out (i.e. RecordNewMultiXact())
  *after* the XLogInsert() *and* after a MultiXactGetCheckptMulti()? AFAICS
  MultiXactGenLock is not held in CreateMultiXactId(). If we crash at that
  moment we lose the multixact data, which now means potential data loss...

- multixact member group data crossing 512-byte sector boundaries makes me uneasy
  (as it's 5 bytes). I don't really see a scenario where it's dangerous, but
  ... Does anybody see a problem here?

- there are quite a few places that do
	multiStopLimit = multiWrapLimit - 100;
	if (multiStopLimit < FirstMultiXactId)
		multiStopLimit -= FirstMultiXactId;

  perhaps MultiXactIdAdvance and MultiXactIdRetreat macros are in order?

- not really padding, MultiXactStatus is 4 bytes...
	/*
	 * XXX Note: there's a lot of padding space in MultiXactMember.  We could
	 * find a more compact representation of this Xlog record -- perhaps all
	 * the status flags in one XLogRecData, then all the xids in another one?
	 * Not clear that it's worth the trouble though.
	 */
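The MultiXactIdAdvance/MultiXactIdRetreat macros suggested in the list above could be sketched roughly as follows. This is only an illustration, assuming FirstMultiXactId is 1 with 0 reserved as InvalidMultiXactId (as in multixact.c); the real macros, if added, may well differ:

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t MultiXactId;

/* Assumed values: 0 is reserved as invalid, so valid ids start at 1. */
#define InvalidMultiXactId ((MultiXactId) 0)
#define FirstMultiXactId   ((MultiXactId) 1)

/* Advance by one, skipping the reserved values when the counter wraps. */
#define MultiXactIdAdvance(multi) \
	do { \
		if (++(multi) < FirstMultiXactId) \
			(multi) = FirstMultiXactId; \
	} while (0)

/* Retreat by one, wrapping backwards past the reserved values. */
#define MultiXactIdRetreat(multi) \
	do { \
		(multi)--; \
	} while ((multi) < FirstMultiXactId)
```

With such macros, open-coded wraparound checks like the stop-limit snippet quoted above would collapse into a single call site.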

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Materialized views WIP patch

2012-11-27 Thread Tom Lane
Dimitri Fontaine dimi...@2ndquadrant.fr writes:
 Under that light, using ALTER is strange indeed.

Agreed, seems like a poor choice.

 I still don't like
 using LOAD that much, allow me to try a last syntax proposal. Well all I
 can find just now would be:

   UPDATE MATERIALIZED VIEW mv FOR EACH ROW;
   UPDATE MATERIALIZED VIEW mv FOR EACH STATEMENT [ CONCURRENTLY ];

 The only value of such a proposal is that it's not LOAD and it's still
 not introducing any new keyword. Oh it's also avoiding to overload the
 SNAPSHOT keyword. Well, it still does not look like the best candidate.

I think this syntax would require making MATERIALIZED (and possibly also
VIEW) fully reserved keywords, which would be better avoided.

regards, tom lane




Re: [HACKERS] Do we need so many hint bits?

2012-11-27 Thread Robert Haas
On Mon, Nov 19, 2012 at 2:46 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Jeff Davis pg...@j-davis.com writes:
 As I said elsewhere in the thread, I'm not planning to introduce any
 additional locking. There is already precedent in IndexOnlyNext.

 Of course, that just begs the question of whether the code in
 IndexOnlyNext is correct.  And after looking at it, it seems pretty
 broken, or at least the argument for its correctness is broken.
 That comment seems to consider only the case where the target tuple has
 just been inserted --- but what of the case where the target tuple is in
 process of being deleted?  The deleter will not make a new index entry
 for that.  Of course, there's a pretty long code path from marking a
 tuple deleted to committing the deletion, and it seems safe to assume
 that the deleter will hit a write barrier in that path.  But I don't
 believe the argument here that the reader must hit a read barrier in
 between.

Ouch.  I don't know how I missed that case when writing that comment,
but I agree it should have been discussed there.  The timing
requirements for updates and deletes seem to be generally less strict,
because the update or delete won't generally be visible to our
snapshot anyway, so it doesn't really matter if we think the page is
all-visible for slightly longer than it really is.  However, it would
be a problem if we thought the page was still all-visible when in fact
it contained a tuple that was not visible to our snapshot.  For that
to happen, the updating or deleting transaction would have to commit
before we took our snapshot, yet the visibility map update would have
to still be visible after we took our snapshot.  That clearly can't
happen, since both committing and taking a snapshot require LWLock
acquire/release.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Materialized views WIP patch

2012-11-27 Thread David Johnston
 -Original Message-
 From: pgsql-hackers-ow...@postgresql.org [mailto:pgsql-hackers-
 ow...@postgresql.org] On Behalf Of Dimitri Fontaine
 Sent: Tuesday, November 27, 2012 10:03 AM
 To: Kevin Grittner
 Cc: Pavel Stehule; Peter Eisentraut; Pgsql Hackers
 Subject: Re: [HACKERS] Materialized views WIP patch
 
 Kevin Grittner kgri...@mail.com writes:
  changing the structure of the table. Somehow I don't find that
  persuasive as an argument that ALTER MATERIALIZED VIEW should
  rescan the source relations and build a whole new set of data for
  exactly the same MV definition.
 
 Fair enough.
 
  Consider that in relational theory a table is considered a relation
  variable. ALTER is supposed to change the definition of the variable
  in some way. Other statements are used to change the value contained
  in the variable. Sure there are some grey areas already, but I don't
  see where we need to muddy the waters in this case.
 
 Under that light, using ALTER is strange indeed. I still don't like using
 LOAD that much, allow me to try a last syntax proposal. Well all I can find
 just now would be:
 
   UPDATE MATERIALIZED VIEW mv FOR EACH ROW;
   UPDATE MATERIALIZED VIEW mv FOR EACH STATEMENT [ CONCURRENTLY ];
 
 The only value of such a proposal is that it's not LOAD and it's still not
 introducing any new keyword. Oh it's also avoiding to overload the
 SNAPSHOT keyword. Well, it still does not look like the best candidate.
 
 Regards,

Just a thought but how about something like:

DO REFRESH OF MATERIALIZED VIEW mat_view;

In effect we begin overloading the meaning of DO to not only mean
anonymous code blocks but to also call pre-defined internal routines that
can be executed without having to use function-call syntax.  MATERIALIZED
VIEW could be made more generic (e.g., TABLE) if the need arises, the
REFRESH action is generic, and additional clauses can be added after the
object name (FOR, CONCURRENTLY, WHERE, etc.).

David J.



Re: [HACKERS] Materialized views WIP patch

2012-11-27 Thread Pavel Stehule
2012/11/27 David Johnston pol...@yahoo.com:
 -Original Message-
 From: pgsql-hackers-ow...@postgresql.org [mailto:pgsql-hackers-
 ow...@postgresql.org] On Behalf Of Dimitri Fontaine
 Sent: Tuesday, November 27, 2012 10:03 AM
 To: Kevin Grittner
 Cc: Pavel Stehule; Peter Eisentraut; Pgsql Hackers
 Subject: Re: [HACKERS] Materialized views WIP patch

 Kevin Grittner kgri...@mail.com writes:
  changing the structure of the table. Somehow I don't find that
  persuasive as an argument that ALTER MATERIALIZED VIEW should
  rescan the source relations and build a whole new set of data for
  exactly the same MV definition.

 Fair enough.

  Consider that in relational theory a table is considered a relation
  variable. ALTER is supposed to change the definition of the variable
  in some way. Other statements are used to change the value contained
  in the variable. Sure there are some grey areas already, but I don't
  see where we need to muddy the waters in this case.

 Under that light, using ALTER is strange indeed. I still don't like using
 LOAD that much, allow me to try a last syntax proposal. Well all I can find
 just now would be:

   UPDATE MATERIALIZED VIEW mv FOR EACH ROW;
   UPDATE MATERIALIZED VIEW mv FOR EACH STATEMENT [ CONCURRENTLY ];

 The only value of such a proposal is that it's not LOAD and it's still not
 introducing any new keyword. Oh it's also avoiding to overload the
 SNAPSHOT keyword. Well, it still does not look like the best candidate.

 Regards,

 Just a thought but how about something like:

 DO REFRESH OF MATERIALIZED VIEW mat_view;

 In effect we begin overloading the meaning of DO to not only mean
 anonymous code blocks but to also call pre-defined internal routines that
 can be executed without having to use function-call syntax.  MATERIALIZED
 VIEW can be more generic e.g., TABLE if the need arises, the REFRESH
 Action is generic, and additional clauses can be added after the object
 name (FOR, CONCURRENTLY, WHERE, etc...)

-1

I dislike using the keyword DO for this purpose, since we already use it
for anonymous blocks.

Regards

Pavel


 David J.












Re: [HACKERS] ilist.h fails cpluspluscheck

2012-11-27 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 On 2012-11-27 01:14:27 -0500, Tom Lane wrote:
 Peter Eisentraut pete...@gmx.net writes:
 Maybe some ifndef __cplusplus would help.

 It's rather easy to fix in the ilist code at least - the cases it points
 out are those where I took a slightly ugly shortcut to reduce some very
 minor code duplication...

 Some casts fix it,

Seems reasonable, committed.

I wonder if we could get a buildfarm member or two running
cpluspluscheck.

regards, tom lane




Re: [HACKERS] splitting *_desc routines

2012-11-27 Thread Tom Lane
Alvaro Herrera alvhe...@2ndquadrant.com writes:
 Tom Lane wrote:
 FWIW, I'd vote for dumping all of these into *one* rmgrdesc directory
 (which may as well be under access/ since that's where the xlog code is),
 regardless of where the corresponding replay code is in the source tree.
 I don't think splitting them up into half a dozen directories adds
 anything except confusion.  If you want, the header comment for each
 file could mention where the corresponding replay code lives.

 Done that way.

Looks pretty sane to me.  The only thing that seems worth discussing is
whether to put the smgr-related XLOG record declarations in storage.h
as you have here, or to invent a new, more private header for them.

There are XLOG record declarations cluttering a lot of other fairly
public headers; but given that we've invented files like heapam_xlog.h
of late, maybe we should start trying to push those declarations to
less-visible spots.  Or maybe it's not worth the trouble.

regards, tom lane




Re: [HACKERS] MySQL search query is not executing in Postgres DB

2012-11-27 Thread Bruce Momjian
On Tue, Nov 27, 2012 at 01:59:04AM -0800, Jeff Davis wrote:
 On Wed, 2012-11-21 at 15:27 +, Simon Riggs wrote:
  It would be useful if we issued a NOTICE when an ambiguity is
  introduced, rather than when using it.
  
  Like Bison's reporting of reduce conflicts.
 
 This brings up a very important point, which is that a lot of the code
 is frozen in applications yet invisible at DDL time. So we have to be
 careful that DDL changes have a reasonable impact on the ability to
 continue to compile and execute the previously-working SQL received from
 the applications.
 
 In other words, as I said in another reply, we want to avoid cases where
 something seemingly innocuous (like creating a function) causes
 previously-working SQL to fail due to ambiguity.
 
 As Tom said, detecting the ambiguity at DDL time is not easy, so I'm not
 suggesting that. And I know that creating a function can already cause
 previously-working SQL to fail. I'm just saying we should be careful of
 these situations and not make them more likely than necessary.

For me this highlights why looking at how application languages handle
overloading might not be as relevant --- most languages don't have
possibly-conflicting functions being created at run-time the way a database
does.  The parallels in how other databases treat overloading are more
relevant.

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + It's impossible for everything to be true. +




Re: [HACKERS] MySQL search query is not executing in Postgres DB

2012-11-27 Thread Pavel Stehule
2012/11/27 Bruce Momjian br...@momjian.us:
 On Tue, Nov 27, 2012 at 01:59:04AM -0800, Jeff Davis wrote:
 On Wed, 2012-11-21 at 15:27 +, Simon Riggs wrote:
  It would be useful if we issued a NOTICE when an ambiguity is
  introduced, rather than when using it.
 
  Like Bison's reporting of reduce conflicts.

 This brings up a very important point, which is that a lot of the code
 is frozen in applications yet invisible at DDL time. So we have to be
 careful that DDL changes have a reasonable impact on the ability to
 continue to compile and execute the previously-working SQL received from
 the applications.

 In other words, as I said in another reply, we want to avoid cases where
 something seemingly innocuous (like creating a function) causes
 previously-working SQL to fail due to ambiguity.

 As Tom said, detecting the ambiguity at DDL time is not easy, so I'm not
 suggesting that. And I know that creating a function can already cause
 previously-working SQL to fail. I'm just saying we should be careful of
 these situations and not make them more likely than necessary.

 For me this highlights why looking at how application languages handle
 overloading might not be as relevant --- most languages don't have
 possibly-conflicting functions being created at run-time the way a database
 does.  The parallels in how other databases treat overloading are more
 relevant.

it is a basic problem - PostgreSQL has unique possibilities -
polymorphic parameters - and almost all other databases don't support
overloading

probably our system is most similar to Haskell's


 --
   Bruce Momjian  br...@momjian.us        http://momjian.us
   EnterpriseDB                             http://enterprisedb.com

   + It's impossible for everything to be true. +






Re: [HACKERS] [PATCH] binary heap implementation

2012-11-27 Thread Robert Haas
[ Sorry for the slow response on this, Thanksgiving interfered. ]

On Wed, Nov 21, 2012 at 3:41 PM, Andres Freund and...@2ndquadrant.com wrote:
 One very minor nitpick I unfortunately just found now, not sure when
 that changed:
 binaryheap_replace_first() hardcodes the indices for the left/right node
 of the root node. I would rather have it use (left|right)_offset(0).

Hmm, yeah... but come to think of it, why do we need that special case
at all?  Why not just call sift_down on the root node and call it
good?  See the attached version, which simplifies the code
considerably and also makes some comment adjustments per Abhijit's
comments.
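For illustration, here is a toy min-heap (hypothetical code, not the actual binaryheap.c or the attached patch) showing why no special case is needed: once sift_down handles an arbitrary node, replacing the root is just an assignment plus sift_down on index 0, with no hardcoded child indices anywhere.

```c
#include <assert.h>
#include <stddef.h>

/* Toy min-heap over ints, purely illustrative. */
typedef struct
{
	int    *nodes;
	size_t  size;
} miniheap;

#define left_offset(i)  (2 * (i) + 1)
#define right_offset(i) (2 * (i) + 2)

static void
sift_down(miniheap *h, size_t i)
{
	for (;;)
	{
		size_t smallest = i;
		size_t l = left_offset(i);
		size_t r = right_offset(i);

		if (l < h->size && h->nodes[l] < h->nodes[smallest])
			smallest = l;
		if (r < h->size && h->nodes[r] < h->nodes[smallest])
			smallest = r;
		if (smallest == i)
			break;				/* heap property restored */

		int tmp = h->nodes[i];
		h->nodes[i] = h->nodes[smallest];
		h->nodes[smallest] = tmp;
		i = smallest;			/* continue sifting from the swapped child */
	}
}

/* Replace the root and restore the heap property -- no special case. */
static void
replace_first(miniheap *h, int v)
{
	h->nodes[0] = v;
	sift_down(h, 0);
}
```

The root's children are just left_offset(0) and right_offset(0), which sift_down already computes, so the special-cased version buys nothing.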

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


binaryheap-rmh4.patch
Description: Binary data



Re: [HACKERS] Re: Problem Observed in behavior of Create Index Concurrently and Hot Update

2012-11-27 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 On 2012-11-26 16:39:39 -0500, Tom Lane wrote:
 I looked at this a bit.  I am very unhappy with the proposed kluge to
 open indexes NoLock in some places.  Even if that's safe today, which
 I don't believe in the least, any future change in this area could break
 it.

 I am not happy with it either. But I failed to see a better way. Note
 that in most of the cases a lock actually is acquired shortly
 afterwards, just a ->indisvalid or an ->indisready check away.

I think you're missing the point.  The proposed patch will result in
RelationGetIndexAttrBitmap trying to do index_open() on an index that
might be getting dropped *right now* by a concurrent DROP INDEX
CONCURRENTLY.  If the DROP commits while index_open is running, its
SnapshotNow catalog searches will stop finding relevant rows.  From
the point of view of index_open, the relation data structure is then
corrupt (for example, it might not find pg_attribute entries for all
the columns) and so it will throw an error.

We need to fix things so that the final pre-deletion state of the
pg_index row tells observer backends not to even try to open the index.
(Thus, it will not matter whether they come across the pg_index row just
before or just after the deleting transaction commits --- either way,
they don't try to touch the index.)  So that has to be a different state
from the initial state used during CREATE INDEX CONCURRENTLY.

The point of not wanting to open the index NoLock (and for that matter
of having DROP INDEX CONCURRENTLY take AccessExclusiveLock once it
thinks nobody is touching the index) is to make sure that if somehow
somebody is touching the index anyway, they don't see the index's
catalog entries as corrupt.  They'll either all be there or all not.
It's only a belt-and-suspenders safety measure, but I don't want to
give it up.

regards, tom lane




Re: [HACKERS] MySQL search query is not executing in Postgres DB

2012-11-27 Thread Merlin Moncure
On Tue, Nov 27, 2012 at 10:52 AM, Pavel Stehule pavel.steh...@gmail.com wrote:
 it is a basic problem - PostgreSQL has unique possibilities -
 polymorphic parameters and almost all databases don't support
 overloading

Speaking of polymorphism, why not just implement lpad()'s first
argument as 'anyelement'?  ISTM this comes up mostly in porting
code from other databases that use standard SQL functions.
This should be appropriate when the function's basic functionality and
argument signature are not dependent on the input type (contrast:
to_timestamp) and there is a good portability case to be made.
Essentially, this applies to a handful of string processing routines
AFAICT.
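As a toy model of the proposed semantics (hypothetical C, not PostgreSQL source): an anyelement-style lpad would first render its argument as text and then pad, so the padding logic itself never depends on the input type. Here sketched for an int input with an assumed single-character fill:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Toy lpad: render the value as text, then left-pad with 'fill' up to
 * 'width' characters.  The padding step is type-agnostic; only the
 * rendering step varies per input type, which is the portability point. */
static void
lpad_int(int value, size_t width, char fill, char *out, size_t outlen)
{
	char	digits[32];

	snprintf(digits, sizeof(digits), "%d", value);

	size_t	dlen = strlen(digits);
	size_t	pad = (dlen < width) ? width - dlen : 0;

	if (pad + dlen + 1 > outlen)	/* truncate padding rather than overflow */
		pad = 0;
	memset(out, fill, pad);
	strcpy(out + pad, digits);
}
```

An SQL-level anyelement lpad would presumably do the same: coerce to text once, then reuse the existing text padding code.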

merlin




Re: [HACKERS] Re: Problem Observed in behavior of Create Index Concurrently and Hot Update

2012-11-27 Thread Robert Haas
On Tue, Nov 27, 2012 at 12:02 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 The point of not wanting to open the index NoLock (and for that matter
 of having DROP INDEX CONCURRENTLY take AccessExclusiveLock once it
 thinks nobody is touching the index) is to make sure that if somehow
 somebody is touching the index anyway, they don't see the index's
 catalog entries as corrupt.  They'll either all be there or all not.
 It's only a belt-and-suspenders safety measure, but I don't want to
 give it up.

+1.  There's a whole crapload of commits that I did for 9.2 with
commit messages like "improve concurrent DDL in case XYZ".  A lot of
that had to do with fixing cases where we were examining system
catalogs in unsafe ways before locks had been taken.  I didn't manage
to fix them all, unfortunately, but it's significantly better than it
used to be, and I'd really like it if we could try not to go
backwards.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] MySQL search query is not executing in Postgres DB

2012-11-27 Thread Robert Haas
On Tue, Nov 27, 2012 at 4:46 AM, Jeff Davis pg...@j-davis.com wrote:
 Let's say you have only one function foo. All your queries are coded
 into your application, and everything works fine, using assignment casts
 where necessary.

 Then the user is foolish enough to CREATE FUNCTION foo... and now their
 queries start failing left and right.

 In other words, "only one possible candidate exists" should be followed
 by "right now" to be more precise.

 That's a major violation of the principle of least astonishment, that
 CREATE FUNCTION could cause such a disaster. I know that it can now, but
 what you're proposing will come into play much more frequently because
 most people start off with just one function by a particular name and
 define more as needed.

I admit that there are cases where this could happen, and that it will
happen a little more than it does now.  But, as you say, this can
happen now, and yet we get very few if any complaints about it,
whereas we get regular complaints about the need to insert casts that
other database systems do not require.  The fact is that most
functions are not overloaded, so the esoterica of overloading affect
only a tiny number of relatively sophisticated users.  The need for
extra casts cuts a much broader swath through our user base.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




[HACKERS] Re: Problem Observed in behavior of Create Index Concurrently and Hot Update

2012-11-27 Thread Andres Freund
On 2012-11-27 12:02:09 -0500, Tom Lane wrote:
 Andres Freund and...@2ndquadrant.com writes:
  On 2012-11-26 16:39:39 -0500, Tom Lane wrote:
  I looked at this a bit.  I am very unhappy with the proposed kluge to
  open indexes NoLock in some places.  Even if that's safe today, which
  I don't believe in the least, any future change in this area could break
  it.

  I am not happy with it either. But I failed to see a better way. Note
  that in most of the cases a lock actually is acquired shortly
  afterwards, just a ->indisvalid or an ->indisready check away.

 I think you're missing the point.

True.

  The proposed patch will result in
 RelationGetIndexAttrBitmap trying to do index_open() on an index that
 might be getting dropped *right now* by a concurrent DROP INDEX
 CONCURRENTLY.  If the DROP commits while index_open is running, its
 SnapshotNow catalog searches will stop finding relevant rows.  From
 the point of view of index_open, the relation data structure is then
 corrupt (for example, it might not find pg_attribute entries for all
 the columns) and so it will throw an error.

 We need to fix things so that the final pre-deletion state of the
 pg_index row tells observer backends not to even try to open the index.
 (Thus, it will not matter whether they come across the pg_index row just
 before or just after the deleting transaction commits --- either way,
 they don't try to touch the index.)  So that has to be a different state
 from the initial state used during CREATE INDEX CONCURRENTLY.

 The point of not wanting to open the index NoLock (and for that matter
 of having DROP INDEX CONCURRENTLY take AccessExclusiveLock once it
 thinks nobody is touching the index) is to make sure that if somehow
 somebody is touching the index anyway, they don't see the index's
 catalog entries as corrupt.  They'll either all be there or all not.
 It's only a belt-and-suspenders safety measure, but I don't want to
 give it up.

So the idea would be to skip about-to-be-dropped indexes in
RelationGetIndexList directly - we don't ever need to watch those, not
even for HOT - because we only have the necessary knowledge there. The
normal valid/ready checks will be done at the callsites of
RelationGetIndexList(). But those checks can be done in a locked manner
now.

Are you already working on a fix? Or shall I work on an updated patch?

I vote for introducing wrapper functions/macros to do the
about-to-be-dropped check; it's hard enough to understand as-is.

Greetings,

Andres Freund

--
 Andres Freund                 http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] MySQL search query is not executing in Postgres DB

2012-11-27 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 I admit that there are cases where this could happen, and that it will
 happen a little more than it does now.  But, as you say, this can
 happen now, and yet we get very few if any complaints about it,
 whereas we get regular complaints about the need to insert casts that
 other database systems do not require.  The fact is that most
 functions are not overloaded, so the esoterica of overloading affect
 only a tiny number of relatively sophisticated users.  The need for
 extra casts cuts a much broader swath through our user base.

I find this argument a bit specious.  It probably is true that most
*user defined* functions aren't overloaded --- but that's not so true
for system-defined functions, and even less true for operators.  So
the parser's behavior with overloaded calls affects all users, whether
they know it or not.  It also affects developers, in that adding a
new overloaded version of a system function (that previously wasn't
overloaded) could actually reduce the number of cases for which the
function works without an explicit cast.

We have got to be really careful with changing the parser's behavior
here, or we're going to break cases that work today.

regards, tom lane




Re: [HACKERS] Problem Observed in behavior of Create Index Concurrently and Hot Update

2012-11-27 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 So the idea would be to skip about-to-be-dropped indexes in
 RelationGetIndexList directly - we don't ever need to watch those, not
 even for HOT - because we only have the necessary knowledge there. The
 normal valid/ready checks will be done at the callsites of
 RelationGetIndexList(). But those checks can be done in a locked manner
 now.

Right.

 Are you already working on a fix? Or shall I work on an updated patch?

I'll work on it.

 I vote for introducing wrapper functions/macro to do the
 about-to-be-dropped check, its hard enough to understand as-is.

Meh.  If it's only going to be done in RelationGetIndexList, I'm
not sure that a wrapper macro is worth the trouble.  If we needed
the test in multiple places I'd agree, but what other spots do you
see?

regards, tom lane




[HACKERS] Re: Problem Observed in behavior of Create Index Concurrently and Hot Update

2012-11-27 Thread Andres Freund
On 2012-11-27 12:50:36 -0500, Tom Lane wrote:
 Andres Freund and...@2ndquadrant.com writes:
  I vote for introducing wrapper functions/macros to do the
  about-to-be-dropped check; it's hard enough to understand as-is.

 Meh.  If it's only going to be done in RelationGetIndexList, I'm
 not sure that a wrapper macro is worth the trouble.  If we needed
 the test in multiple places I'd agree, but what other spots do you
 see?

I don't really see any other querying locations - but such a macro would
make the code easier to backpatch when we introduce a third column for
the about-to-be-dropped case.

Anyway, don't feel all too strongly about it.

Greetings,

Andres Freund

--
 Andres Freund                 http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Bugs in CREATE/DROP INDEX CONCURRENTLY

2012-11-27 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 On 2012-11-26 22:44:49 -0500, Tom Lane wrote:
 I looked through the code a bit, and I think we should be able to make
 this work, but note there are also places that update indcheckxmin using
 heap_update, and that's just as dangerous as changing the other two
 flags via heap_update, if you don't have exclusive lock on the table.

 Isn't setting indcheckxmin already problematic in the case of CREATE
 INDEX CONCURRENTLY? index_build already runs in a separate transaction
 there.

Yeah, you are right, except that AFAICS indcheckxmin is really only
needed for regular non-concurrent CREATE INDEX, which needs it because
it commits without waiting for readers that might be bothered by broken
HOT chains.  In a concurrent CREATE INDEX, we handle that problem by
waiting out all such readers before setting indisvalid.  So the
concurrent code path should not be bothering to set indcheckxmin at all,
I think.  (This is underdocumented.)

Looking closer at the comment in reindex_index, what it's really full of
angst about is that simple_heap_update will update the tuple's xmin
*when we would rather that it didn't*.  So switching to update-in-place
there will solve a problem, not create one.

In short, all flag changes in pg_index should be done by
update-in-place, and the tuple's xmin will never change from the
original creating transaction (until frozen).

What we want the xmin to do, for indcheckxmin purposes, is reflect the
time at which the index started to be included in HOT-safety decisions.
Since we're never going to move the xmin, that means that *we cannot
allow REINDEX to mark a dead index live again*.  Once DROP INDEX
CONCURRENTLY has reached the final state, you can't revalidate the index
by reindexing it, you'll have to drop it and then make a brand new one.
That seems like a pretty minor compromise.

What I propose to do next is create a patch for HEAD that includes a
new pg_index flag column, since I think the logic will be clearer
with that.  Once we're happy with that, we can back-port the patch
into a form that uses the existing flag columns.

Anybody feel like bikeshedding the flag column name?  I'm thinking
indislive but maybe there's a better name.

Note: I'm not impressed by the proposal to replace these columns with
a single integer flag column.  Aside from any possible incompatibility
with existing client code, it just isn't going to be easy to read the
index's state manually if we do that.  We could maybe dodge that
complaint with a char (pseudo-enum) status column but I don't think
that will simplify the code at all, and it's got the same or worse
compatibility issues.

 Btw, even if we manage to find a sensible fix for this I would suggest
 postponing it after the next back branch release.

AFAICS this is a data loss/corruption problem, and as such a must fix.
If we can't have it done by next week, I'd rather postpone the releases
until it is done.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] MySQL search query is not executing in Postgres DB

2012-11-27 Thread Jeff Davis
On Tue, 2012-11-27 at 12:19 -0500, Robert Haas wrote:
 I admit that there are cases where this could happen, and that it will
 happen a little more than it does now.  But, as you say, this can
 happen now, and yet we get very few if any complaints about it,
 whereas we get regular complaints about the need to insert casts that
 other database systems do not require.  The fact is that most
 functions are not overloaded, so the esoterica of overloading affect
 only a tiny number of relatively sophisticated users.  The need for
 extra casts cuts a much broader swath through our user base.

Well, I did offer a suggestion that would make your idea safer, which is
to explicitly opt out of the overloading feature at the time the
function is created, rather than making it implicit based on how many
functions happen to have the same name.

The fact that it can only hurt sophisticated users is not convincing to
me. For one thing, our users are programmers, so they should all feel
comfortable defining their own functions, and I don't want to make them
any less so. Next, sophisticated users also make mistakes.

I could also make a security argument. Even today, any user who can
create a function in your search path can make your queries start
failing. If we locked down most of the system-defined functions as
non-overloadable, and allowed users to do the same for their functions
(maybe even the default one day?), then that would greatly reduce the
exposure.

The current strictness of the overloaded functions tends to make users
more explicit about argument types, which reduces the chance of problems
at the expense of usability and compatibility. Not ideal, but if we make
it more permissive then we are permanently stuck with less information
about what types the user intended and which function they intended to
call. In such an extensible system, that worries me on several fronts.

That being said, I'm not outright in opposition to the idea of making
improvements like this, I just think we should do so cautiously.

Regards,
Jeff Davis





[HACKERS] Enabling frontend-only xlog desc routines

2012-11-27 Thread Alvaro Herrera
I mentioned the remaining issues in a previous email (see
message-id 20121025161751.ge6...@alvh.no-ip.org).  Attached is a
patch that enables xlogdump to #include xlog_internal.h by way of
removing that file's inclusion of fmgr.h, which is problematic.  I don't
think this should be too contentious.

The other interesting question remaining is what to do about the rm_desc
function in rmgr.c.  We're split between these two ideas:

1. Have this in rmgr.c:

#ifdef FRONTEND
#define RMGR_REDO_FUNC(func) NULL
#else
#define RMGR_REDO_FUNC(func) func
#endif /* FRONTEND */

and then use RMGR_REDO_FUNC() in the table.


2. Have this in rmgr.c:

#ifndef RMGR_REDO_FUNC
#define RMGR_REDO_FUNC(func) func
#endif

And then have the xlogdump Makefile use -D to define a suitable
RMGR_REDO_FUNC.

Opinions please?

-- 
Álvaro Herrera    http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
commit 74986720979c1c2bb39a58133563fd8da82c301b
Author: Alvaro Herrera alvhe...@alvh.no-ip.org
Date:   Tue Nov 27 15:33:18 2012 -0300

Split SQL-callable function declarations into xlog_fn.h

This lets xlog_internal.h go without including fmgr.h, which is useful
to let it compile in a frontend-only environment.

diff --git a/src/backend/access/transam/xlogfuncs.c b/src/backend/access/transam/xlogfuncs.c
index d345761..40c0bd6 100644
--- a/src/backend/access/transam/xlogfuncs.c
+++ b/src/backend/access/transam/xlogfuncs.c
@@ -18,6 +18,7 @@
 
 #include "access/htup_details.h"
 #include "access/xlog.h"
+#include "access/xlog_fn.h"
 #include "access/xlog_internal.h"
 #include "access/xlogutils.h"
 #include "catalog/catalog.h"
diff --git a/src/include/access/xlog_fn.h b/src/include/access/xlog_fn.h
new file mode 100644
index 000..65376fe
--- /dev/null
+++ b/src/include/access/xlog_fn.h
@@ -0,0 +1,35 @@
+/*
+ * xlog_fn.h
+ *
+ * PostgreSQL transaction log SQL-callable function declarations
+ *
+ * Portions Copyright (c) 1996-2012, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/access/xlog_fn.h
+ */
+#ifndef XLOG_FN_H
+#define XLOG_FN_H
+
+#include "fmgr.h"
+
+extern Datum pg_start_backup(PG_FUNCTION_ARGS);
+extern Datum pg_stop_backup(PG_FUNCTION_ARGS);
+extern Datum pg_switch_xlog(PG_FUNCTION_ARGS);
+extern Datum pg_create_restore_point(PG_FUNCTION_ARGS);
+extern Datum pg_current_xlog_location(PG_FUNCTION_ARGS);
+extern Datum pg_current_xlog_insert_location(PG_FUNCTION_ARGS);
+extern Datum pg_last_xlog_receive_location(PG_FUNCTION_ARGS);
+extern Datum pg_last_xlog_replay_location(PG_FUNCTION_ARGS);
+extern Datum pg_last_xact_replay_timestamp(PG_FUNCTION_ARGS);
+extern Datum pg_xlogfile_name_offset(PG_FUNCTION_ARGS);
+extern Datum pg_xlogfile_name(PG_FUNCTION_ARGS);
+extern Datum pg_is_in_recovery(PG_FUNCTION_ARGS);
+extern Datum pg_xlog_replay_pause(PG_FUNCTION_ARGS);
+extern Datum pg_xlog_replay_resume(PG_FUNCTION_ARGS);
+extern Datum pg_is_xlog_replay_paused(PG_FUNCTION_ARGS);
+extern Datum pg_xlog_location_diff(PG_FUNCTION_ARGS);
+extern Datum pg_is_in_backup(PG_FUNCTION_ARGS);
+extern Datum pg_backup_start_time(PG_FUNCTION_ARGS);
+
+#endif   /* XLOG_FN_H */
diff --git a/src/include/access/xlog_internal.h b/src/include/access/xlog_internal.h
index b70a620..5802d1d 100644
--- a/src/include/access/xlog_internal.h
+++ b/src/include/access/xlog_internal.h
@@ -17,7 +17,6 @@
 #define XLOG_INTERNAL_H
 
 #include "access/xlog.h"
-#include "fmgr.h"
 #include "pgtime.h"
 #include "storage/block.h"
 #include "storage/relfilenode.h"
@@ -253,26 +252,4 @@ extern bool XLogArchiveCheckDone(const char *xlog);
 extern bool XLogArchiveIsBusy(const char *xlog);
 extern void XLogArchiveCleanup(const char *xlog);
 
-/*
- * These aren't in xlog.h because I'd rather not include fmgr.h there.
- */
-extern Datum pg_start_backup(PG_FUNCTION_ARGS);
-extern Datum pg_stop_backup(PG_FUNCTION_ARGS);
-extern Datum pg_switch_xlog(PG_FUNCTION_ARGS);
-extern Datum pg_create_restore_point(PG_FUNCTION_ARGS);
-extern Datum pg_current_xlog_location(PG_FUNCTION_ARGS);
-extern Datum pg_current_xlog_insert_location(PG_FUNCTION_ARGS);
-extern Datum pg_last_xlog_receive_location(PG_FUNCTION_ARGS);
-extern Datum pg_last_xlog_replay_location(PG_FUNCTION_ARGS);
-extern Datum pg_last_xact_replay_timestamp(PG_FUNCTION_ARGS);
-extern Datum pg_xlogfile_name_offset(PG_FUNCTION_ARGS);
-extern Datum pg_xlogfile_name(PG_FUNCTION_ARGS);
-extern Datum pg_is_in_recovery(PG_FUNCTION_ARGS);
-extern Datum pg_xlog_replay_pause(PG_FUNCTION_ARGS);
-extern Datum pg_xlog_replay_resume(PG_FUNCTION_ARGS);
-extern Datum pg_is_xlog_replay_paused(PG_FUNCTION_ARGS);
-extern Datum pg_xlog_location_diff(PG_FUNCTION_ARGS);
-extern Datum pg_is_in_backup(PG_FUNCTION_ARGS);
-extern Datum pg_backup_start_time(PG_FUNCTION_ARGS);
-
 #endif   /* XLOG_INTERNAL_H */


Re: [HACKERS] MySQL search query is not executing in Postgres DB

2012-11-27 Thread Robert Haas
On Tue, Nov 27, 2012 at 12:47 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 I admit that there are cases where this could happen, and that it will
 happen a little more than it does now.  But, as you say, this can
 happen now, and yet we get very few if any complaints about it,
 whereas we get regular complaints about the need to insert casts that
 other database systems do not require.  The fact is that most
 functions are not overloaded, so the esoterica of overloading affect
 only a tiny number of relatively sophisticated users.  The need for
 extra casts cuts a much broader swath through our user base.

 I find this argument a bit specious.  It probably is true that most
 *user defined* functions aren't overloaded --- but that's not so true
 for system-defined functions, and even less true for operators.  So
 the parser's behavior with overloaded calls affects all users, whether
 they know it or not.  It also affects developers, in that adding a
 new overloaded version of a system function (that previously wasn't
 overloaded) could actually reduce the number of cases for which the
 function works without an explicit cast.

 We have got to be really careful with changing the parser's behavior
 here, or we're going to break cases that work today.

Well, the whole point of writing the patch the way I did was that it
*doesn't* break any cases that work today.

But as to your point about the system catalogs, it is true that adding
an additional function could reduce the number of cases where things
work today.  But I think in many cases it would eliminate the need for
overloading that we already have, and simplify things for future
developers.  Right now, quote_literal() allows implicit casts to text
by having a second version that takes an anyelement argument; on the
other hand, concat() allows implicit casts to text by accepting "any"
rather than text as an argument; and || allows implicit casts to text
by defining operators for anynonarray || text, text || anynonarray,
and text || text.  So we've got three quite different methods to
create implicit-cast-to-text behavior in particular cases.  That's got
developer complexity too, and while this proposal wouldn't do anything
about the third case since || actually sometimes has a different
meaning, namely array concatenation, the first two wouldn't need
overloading any more.  They'd just work.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Bugs in CREATE/DROP INDEX CONCURRENTLY

2012-11-27 Thread Andres Freund
On 2012-11-27 13:45:08 -0500, Tom Lane wrote:
 Andres Freund and...@2ndquadrant.com writes:
  On 2012-11-26 22:44:49 -0500, Tom Lane wrote:
  I looked through the code a bit, and I think we should be able to make
  this work, but note there are also places that update indcheckxmin using
  heap_update, and that's just as dangerous as changing the other two
  flags via heap_update, if you don't have exclusive lock on the table.

  Isn't setting indcheckxmin already problematic in the case of CREATE
  INDEX CONCURRENTLY? index_build already runs in a separate transaction
  there.

 Yeah, you are right, except that AFAICS indcheckxmin is really only
 needed for regular non-concurrent CREATE INDEX, which needs it because
 it commits without waiting for readers that might be bothered by broken
 HOT chains.  In a concurrent CREATE INDEX, we handle that problem by
 waiting out all such readers before setting indisvalid.  So the
 concurrent code path should not be bothering to set indcheckxmin at all,
 I think.  (This is underdocumented.)

Seems to be correct to me.

 Looking closer at the comment in reindex_index, what it's really full of
 angst about is that simple_heap_update will update the tuple's xmin
 *when we would rather that it didn't*.  So switching to update-in-place
 there will solve a problem, not create one.

It strikes me that the whole idea of reusing xmin when indcheckxmin is
set is overly clever and storing the xid itself would have been
better... Too late though.

 In short, all flag changes in pg_index should be done by
 update-in-place, and the tuple's xmin will never change from the
 original creating transaction (until frozen).

Hm. That doesn't sound that easy to guarantee. Several ALTER TABLE and
ALTER INDEX operations are expected to work transactionally right
now. As far as I can see, there is nothing that prohibits an
indcheckxmin-requiring index from being altered while indcheckxmin is
still required?

 What we want the xmin to do, for indcheckxmin purposes, is reflect the
 time at which the index started to be included in HOT-safety decisions.
 Since we're never going to move the xmin, that means that *we cannot
 allow REINDEX to mark a dead index live again*.  Once DROP INDEX
 CONCURRENTLY has reached the final state, you can't revalidate the index
 by reindexing it, you'll have to drop it and then make a brand new one.
 That seems like a pretty minor compromise.

That would be a regression compared with the current state though. We
have officially documented REINDEX as a way out of INVALID indexes...

If we store the xid of the reindexing transaction there, that might be
pessimal (because there should be no HOT-safety problems) but should
always be correct, or am I missing something?

 What I propose to do next is create a patch for HEAD that includes a
 new pg_index flag column, since I think the logic will be clearer
 with that.  Once we're happy with that, we can back-port the patch
 into a form that uses the existing flag columns.

 Anybody feel like bikeshedding the flag column name?  I'm thinking
 indislive but maybe there's a better name.

I personally would slightly favor indisdead instead...

  Btw, even if we manage to find a sensible fix for this I would suggest
  postponing it after the next back branch release.

 AFAICS this is a data loss/corruption problem, and as such a must fix.
 If we can't have it done by next week, I'd rather postpone the releases
 until it is done.

Ok, just seemed like a rather complex fix in a short time for something
that seemingly hasn't been noticed since 8.3. I am a bit worried about
introducing something worse while fixing this.

Greetings,

Andres Freund

--
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Problem Observed in behavior of Create Index Concurrently and Hot Update

2012-11-27 Thread Tom Lane
BTW, I was thinking that the DROP INDEX CONCURRENTLY logic needed to be:

1. Unset indisvalid, commit, wait out all reading transactions.

2. Unset indisready, commit, wait out all writing transactions.

3. Unset indislive, commit (with parent table relcache flush),
wait out all reading-or-writing transactions.

4. Drop the index.

However, I wonder whether we couldn't combine steps 2 and 3, ie once
there are no readers of the index go directly to the dead state.
I don't see a need for a period where the index isn't being inserted
into but is still used for HOT-safety decisions.

regards, tom lane




[HACKERS] Re: Problem Observed in behavior of Create Index Concurrently and Hot Update

2012-11-27 Thread Andres Freund
On 2012-11-27 14:08:13 -0500, Tom Lane wrote:
 BTW, I was thinking that the DROP INDEX CONCURRENTLY logic needed to be:

 1. Unset indisvalid, commit, wait out all reading transactions.

 2. Unset indisready, commit, wait out all writing transactions.

 3. Unset indislive, commit (with parent table relcache flush),
 wait out all reading-or-writing transactions.

 4. Drop the index.

 However, I wonder whether we couldn't combine steps 2 and 3, ie once
 there are no readers of the index go directly to the dead state.
 I don't see a need for a period where the index isn't being inserted
 into but is still used for HOT-safety decisions.

I think you're right, that state isn't interesting for anyone.

Greetings,

Andres Freund

--
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] MySQL search query is not executing in Postgres DB

2012-11-27 Thread Robert Haas
On Tue, Nov 27, 2012 at 1:45 PM, Jeff Davis pg...@j-davis.com wrote:
 On Tue, 2012-11-27 at 12:19 -0500, Robert Haas wrote:
 I admit that there are cases where this could happen, and that it will
 happen a little more than it does now.  But, as you say, this can
 happen now, and yet we get very few if any complaints about it,
 whereas we get regular complaints about the need to insert casts that
 other database systems do not require.  The fact is that most
 functions are not overloaded, so the esoterica of overloading affect
 only a tiny number of relatively sophisticated users.  The need for
 extra casts cuts a much broader swath through our user base.

 Well, I did offer a suggestion that would make your idea safer, which is
 to explicitly opt out of the overloading feature at the time the
 function is created, rather than making it implicit based on how many
 functions happen to have the same name.

 The fact that it can only hurt sophisticated users is not convincing to
 me. For one thing, our users are programmers, so they should all feel
 comfortable defining their own functions, and I don't want to make them
 any less so. Next, sophisticated users also make mistakes.

 I could also make a security argument. Even today, any user who can
 create a function in your search path can make your queries start
 failing. If we locked down most of the system-defined functions as
 non-overloadable, and allowed users to do the same for their functions
 (maybe even the default one day?), then that would greatly reduce the
 exposure.

 The current strictness of the overloaded functions tends to make users
 more explicit about argument types, which reduces the chance of problems
 at the expense of usability and compatibility. Not ideal, but if we make
 it more permissive then we are permanently stuck with less information
 about what types the user intended and which function they intended to
 call. In such an extensible system, that worries me on several fronts.

 That being said, I'm not outright in opposition to the idea of making
 improvements like this, I just think we should do so cautiously.

Fair enough.  I certainly admit that I wouldn't like to release with
this code in place and then find out that it's got some critical flaw,
security or otherwise.  A couple of embarrassing bugs have been found
recently in patches I wrote and committed, and I'm not looking to up
that number.  That having been said, I remain unconvinced that any of
the things proposed so far are compelling reasons not to do this.
That doesn't mean there aren't any such reasons, but I am personally
unconvinced that we've found them yet.  Most of the arguments so far
advanced seem to involve overloading (where this proposal doesn't
change anything vs. today); I think you're the only one who has
proposed a situation where it causes a problem (namely, a function
that is overloaded later) but in my personal opinion that's not going
to happen often enough to justify the amount of user pain the current
system imposes.  Of course that's a judgement call.

I do think that applying some kind of explicit flag to the function
indicating whether it should allow implicit assignment
casting/implicit casting to text/overloading/whatever is a possibly
interesting alternative.  It seems clear from our system catalogs that
implicit casting to text is sometimes a desirable behavior and
sometimes not, so it's reasonable to think that perhaps we should put
that under user control.  What I like about "my" proposal (really
Tom's idea) is that it seems like it solves a pretty high percentage
of the problem cases without requiring any explicit user action.  I
actually suspect we could get the right behavior even more often by
attaching flags to the function or argument position, but that would
also put more of the onus on the user to get the flags right -- and we
might not even agree amongst ourselves on how the flags should be set.
 The fact that quote_literal() allows (by the expedient of
overloading) implicit casts to text and that lpad() does not seems
fairly random to me in hindsight; is there a general principle there
that we'd all sign on to?  The nice thing about this proposal is that
it doesn't require any explicit user action.  Of course that's no help
if it does the wrong thing, but since it only fixes cases that are
unambiguous and which currently fail, it's hard for me to see how
that's a real danger.  That doesn't mean there ISN'T a real danger,
but I want to make sure that if we don't do this we have a clear and
understandable reason, and not just bad memories of the last time we
made a change in this area.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] MySQL search query is not executing in Postgres DB

2012-11-27 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 But as to your point about the system catalogs, it is true that adding
 an additional function could reduce the number of cases where things
 work today.  But I think in many cases it would eliminate the need for
 overloading that we already have, and simplify things for future
 developers.  Right now, quote_literal() allows implicit casts to text
 by having a second version that takes an anyelement argument; on the
 other hand, concat() allows implicit casts to text by accepting "any"
 rather than text as an argument; and || allows implicit casts to text
 by defining operators for anynonarray || text, text || anynonarray,
 and text || text.  So we've got three quite different methods to
 create implicit-cast-to-text behavior in particular cases.  That's got
 developer complexity too, and while this proposal wouldn't do anything
 about the third case since || actually sometimes has a different
 meaning, namely array concatenation, the first two wouldn't need
 overloading any more.  They'd just work.

Uh, no, not really, and I think that assertion just goes to show that
this area is more subtle than you think.  quote_literal() for instance
presently works for any datatype that has an explicit cast to text.
After making the change you propose above, it would only work for types
for which the cast was assignment-grade or less.  concat() is even
looser: as now implemented, it works for *anything at all*, because it
relies on datatype output functions not casts to text.  I'm dubious that
that inconsistency is a good thing, actually, but that's how the
committed code is written.

Now, some of us might think that backing these conversions down to only
allowing assignment-grade casts would be an improvement, in the sense
that it would actually make the type system tighter not looser than it
is today for these particular functions.  But I suspect you wouldn't see
it as an improvement, given the position you're arguing from.

In fact, I'm afraid that making this change would result in requests to
downgrade existing explicit casts to be assignment-only, so that people
could be even lazier about not casting function arguments; and that is
something up with which I will not put.

regards, tom lane




Re: [HACKERS] WIP json generation enhancements

2012-11-27 Thread Andrew Dunstan


On 11/26/2012 12:31 PM, Robert Haas wrote:

On Mon, Nov 26, 2012 at 11:43 AM, Andrew Dunstan and...@dunslane.net wrote:

I don't understand why you would want to create such a cast. If the cast
doesn't exist it will do exactly what it does now, i.e. use the type's
output function and then json quote and escape it, which in the case of
citext is the Right Thing (tm):

andrew=# select to_json('foo"bar'::citext);
   to_json
------------
 "foo\"bar"

I'm not sure either, but setting up a system where seemingly innocuous
actions can in fact have surprising and not-easily-fixable
consequences in other parts of the system doesn't seem like good
design to me.  Of course, maybe I'm misunderstanding what will happen;
I haven't actually tested it myself.




I'm all for caution, but the argument here seems a bit nebulous. We 
could create a sort of auxiliary type, as has been previously suggested, 
that would be in all respects the same as the json type except that it 
would be the target of the casts that would be used in to_json() and 
friends. But, that's a darned ugly thing to have to do, so I'd want a 
good concrete reason for doing it. Right now I'm having a hard time 
envisioning a problem that could be caused by just using the 
straightforward solution that's in my latest patch.


cheers

andrew






Re: [HACKERS] Bugs in CREATE/DROP INDEX CONCURRENTLY

2012-11-27 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 On 2012-11-27 13:45:08 -0500, Tom Lane wrote:
 In short, all flag changes in pg_index should be done by
 update-in-place, and the tuple's xmin will never change from the
 original creating transaction (until frozen).

 Hm. That doesn't sound that easy to guarantee. Several ALTER TABLE and
 ALTER INDEX operations are expected to work transactionally right
 now. As far as I can see, there is nothing that prohibits an
 indcheckxmin-requiring index from being altered while indcheckxmin is
 still required?

I said in pg_index.  There is no reason to ever alter an index's
pg_index entry transactionally, because we don't support redefining
the index columns.  The stuff you are allowed to ALTER is pretty much
irrelevant to the index's life as an index.

 What we want the xmin to do, for indcheckxmin purposes, is reflect the
 time at which the index started to be included in HOT-safety decisions.
 Since we're never going to move the xmin, that means that *we cannot
 allow REINDEX to mark a dead index live again*.

 That would be a regression compared with the current state though. We
 have officially documented REINDEX as a way out of INVALID indexes...

It's a way out of failed CREATE operations.  If DROP fails at the last
step, you won't be able to go back, but why would you want to?  Just
do the DROP again.

 Anybody feel like bikeshedding the flag column name?  I'm thinking
 indislive but maybe there's a better name.

 I personally would slightly favor indisdead instead...

Meh ... the other two flags are positive, in the sense of
true-is-the-normal-state, so I thought this one should be too.

I had also toyed with indishot, to reflect the idea that this controls
whether the index is included in HOT-safety decisions, but that seems
maybe a little too cute.

 Btw, even if we manage to find a sensible fix for this I would suggest
 postponing it after the next back branch release.

 AFAICS this is a data loss/corruption problem, and as such a must fix.
 If we can't have it done by next week, I'd rather postpone the releases
 until it is done.

 Ok, just seemed like a rather complex fix in a short time for something
 that seemingly hasn't been noticed since 8.3. I am a bit worried about
 introducing something worse while fixing this.

Hm?  The fact that the DROP patch broke CREATE INDEX CONCURRENTLY is a
new and very nasty bug in 9.2.  I would agree with you if we were
considering the unsafe-row-update problem alone, but it seems like we
might as well fix both aspects while we're looking at this code.

There is a question of whether we should risk trying to back-patch the
in-place-update changes further than 9.2.  Given the lack of complaints
I'm inclined not to, but could be persuaded differently.

regards, tom lane




Re: [HACKERS] plpgsql_check_function - rebase for 9.3

2012-11-27 Thread Pavel Stehule
Hello

I am sending an updated version - now it is prepared for event triggers
and it is a little bit more robust.

I ran pgindent, but I have no experience with it, so I am not sure it
succeeded.

Regards

Pavel Stehule


2012/10/7 Selena Deckelmann sel...@chesnok.com:
 Hi!

 On Tue, Jun 26, 2012 at 1:19 AM, Pavel Stehule pavel.steh...@gmail.com 
 wrote:

 I am sending a lightly refreshed patch for checking plpgsql functions.

 I checked different implementations, but without success: a) enhancing
 SPI with some fake mode can have a negative impact on applications, and
 the patch was not clean; b) a generic plpgsql walker doesn't save lines
 either.

 I welcome any ideas on how to improve this patch.

 I reviewed this and did a clean up for bitrot and a little whitespace.
 In particular, it needed to learn a little about event triggers.

 This patch is a follow on from an earlier review thread I found:
 http://archives.postgresql.org/message-id/d960cb61b694cf459dcfb4b0128514c2072df...@exadv11.host.magwien.gv.at

 I dug through that thread a bit, and I believe issues raised by
 Laurenz, Petr and Alvaro were resolved by Pavel over time.

 All tests pass, and after a read-through, the code seems fine.

 This also represents my inaugural use of pg_bsd_indent. I ran it on
 pl_check.c - which made things mostly better. Happy to try and fix it
 up more if someone can explain to me what (if anything) I did
 incorrectly when using it.

 -selena

 --
 http://chesnok.com


plpgsql_check_function_20121127_01.patch.gz
Description: GNU Zip compressed data



Re: [HACKERS] Bugs in CREATE/DROP INDEX CONCURRENTLY

2012-11-27 Thread Andres Freund
On 2012-11-27 14:41:34 -0500, Tom Lane wrote:
 Andres Freund and...@2ndquadrant.com writes:
  On 2012-11-27 13:45:08 -0500, Tom Lane wrote:
  In short, all flag changes in pg_index should be done by
  update-in-place, and the tuple's xmin will never change from the
  original creating transaction (until frozen).

  Hm. That doesn't sound that easy to guarantee. Several ALTER TABLE and
  ALTER INDEX operations are expected to work transactionally right
  now. As far as I can see, there is nothing that prohibits an
  indcheckxmin-requiring index from being altered while indcheckxmin is
  still required?

 I said in pg_index.  There is no reason to ever alter an index's
 pg_index entry transactionally, because we don't support redefining
 the index columns.  The stuff you are allowed to ALTER is pretty much
 irrelevant to the index's life as an index.

Isn't indisprimary updated when an ALTER TABLE ... ADD PRIMARY KEY
... USING someindex; is done? Also I think indoption might be written
to as well.
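
For concreteness, a minimal session sketching the case raised here (table
and index names are made up; whether this catalog change makes a new tuple
version, which is what would break the xmin scheme, is exactly the open
question):

```sql
CREATE TABLE t (id int NOT NULL);
CREATE UNIQUE INDEX t_id_idx ON t (id);

SELECT indisprimary FROM pg_index
 WHERE indexrelid = 't_id_idx'::regclass;   -- false

-- Promotes the existing index to a primary key, which has to
-- flip indisprimary in the index's pg_index row.
ALTER TABLE t ADD CONSTRAINT t_pkey PRIMARY KEY USING INDEX t_id_idx;

SELECT indisprimary FROM pg_index
 WHERE indexrelid = 't_id_idx'::regclass;   -- true
```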

  What we want the xmin to do, for indcheckxmin purposes, is reflect the
  time at which the index started to be included in HOT-safety decisions.
  Since we're never going to move the xmin, that means that *we cannot
  allow REINDEX to mark a dead index live again*.

  That would be a regression compared with the current state though. We
  have officially documented REINDEX as a way out of INVALID indexes...

 It's a way out of failed CREATE operations.  If DROP fails at the last
 step, you won't be able to go back, but why would you want to?  Just
 do the DROP again.

Oh, sorry, misunderstood you.


  Anybody feel like bikeshedding the flag column name?  I'm thinking
  indislive but maybe there's a better name.

  I personally would slightly favor indisdead instead...

 Meh ... the other two flags are positive, in the sense of
 true-is-the-normal-state, so I thought this one should be too.

Good point.

 I had also toyed with indishot, to reflect the idea that this controls
 whether the index is included in HOT-safety decisions, but that seems
 maybe a little too cute.

indislive seems better than that, yes.

  Btw, even if we manage to find a sensible fix for this I would suggest
  postponing it after the next back branch release.

  AFAICS this is a data loss/corruption problem, and as such a must fix.
  If we can't have it done by next week, I'd rather postpone the releases
  until it is done.

  Ok, just seemed like a rather complex fix in a short time for something
  that seemingly hasn't been noticed since 8.3. I am a bit worried about
  introducing something worse while fixing this.

 Hm?  The fact that the DROP patch broke CREATE INDEX CONCURRENTLY is a
 new and very nasty bug in 9.2.  I would agree with you if we were
 considering the unsafe-row-update problem alone, but it seems like we
 might as well fix both aspects while we're looking at this code.

 There is a question of whether we should risk trying to back-patch the
 in-place-update changes further than 9.2.  Given the lack of complaints
 I'm inclined not to, but could be persuaded differently.

Oh, I only was talking about the inplace changes. The DROP INDEX
CONCURRENTLY breakage definitely needs to get backpatched.

Greetings,

Andres Freund

--
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] MySQL search query is not executing in Postgres DB

2012-11-27 Thread Jeff Davis
On Tue, 2012-11-27 at 14:13 -0500, Robert Haas wrote:
 I do think that applying some kind of explicit flag to the function
 indicating whether it should allow implicit assignment
 casting/implicit casting to text/overloading/whatever is a possibly
 interesting alternative.  It seems clear from our system catalogs that
 implicit casting to text is sometimes a desirable behavior and
 sometimes not, so it's reasonable to think that perhaps we should put
 that under user control.  What I like about my proposal (really
 Tom's idea) is that it seems like it solves a pretty high percentage
 of the problem cases without requiring any explicit user action.

What user action are you concerned about? If we (eventually) made the
non-overloaded case the default, would that resolve your concerns?

Regards,
Jeff Davis





Re: [HACKERS] MySQL search query is not executing in Postgres DB

2012-11-27 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 I do think that applying some kind of explicit flag to the function
 indicating whether it should allow implicit assignment
 casting/implicit casting to text/overloading/whatever is a possibly
 interesting alternative.

That idea seems possibly worth pursuing.  The thing that I find scary
about the current proposal is that it applies to all functions (and
operators) willy-nilly, which seems to raise the risk of unexpected
side effects pretty high.  If we could confine the behavioral change
to a relatively small number of functions for which there was consensus
that they should accept most anything, I'd feel better about it.

(Of course, we might then conclude that something close to the
quote_literal solution would work as well as a new function property.
But it's worth thinking about.)

  The fact that quote_literal() allows (by the expedient of
 overloading) implicit casts to text and that lpad() does not seems
 fairly random to me in hindsight; is there a general principle there
 that we'd all sign on to?

I don't find that random in the slightest.  The entire purpose of
quote_literal is to manufacture a SQL-literal string representation of
this value, and that clearly might apply to data of any type.  lpad()
is, first, last, and only, a textual operation.  Somebody who thinks it
should apply directly to an integer is guilty of sloppy thinking at
best, or not even understanding what a data type is at worst.

regards, tom lane




Re: [HACKERS] MySQL search query is not executing in Postgres DB

2012-11-27 Thread Stephen Frost
* Tom Lane (t...@sss.pgh.pa.us) wrote:
 It also affects developers, in that adding a
 new overloaded version of a system function (that previously wasn't
 overloaded) could actually reduce the number of cases for which the
 function works without an explicit cast.
 
 We have got to be really careful with changing the parser's behavior
 here, or we're going to break cases that work today.

For my 2c- we have to be really careful making changes to the system
functions as well as the parser's behavior.

If we're worried about users creating overloaded versions of system
functions, well, I'd probably suggest a "don't do that, then" kind of
approach.

Thanks,

Stephen




Re: [HACKERS] MySQL search query is not executing in Postgres DB

2012-11-27 Thread Robert Haas
On Tue, Nov 27, 2012 at 2:26 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Uh, no, not really, and I think that assertion just goes to show that
 this area is more subtle than you think.  quote_literal() for instance
 presently works for any datatype that has an explicit cast to text.

That doesn't appear to be the behavior I'm seeing:

rhaas=# select quote_literal(17);
 quote_literal
---------------
 '17'
(1 row)

rhaas=# select * from pg_cast where castsource = 'int4'::regtype and
casttarget = 'text'::regtype;
 castsource | casttarget | castfunc | castcontext | castmethod
------------+------------+----------+-------------+------------
(0 rows)


 After making the change you propose above, it would only work for types
 for which the cast was assignment-grade or less.

...but that's everything, because there's a hardcoded exception in the
code that dictates that even if there is no entry in pg_cast, an
assignment cast to text exists for every data type.

 concat() is even
 looser: as now implemented, it works for *anything at all*, because it
 relies on datatype output functions not casts to text.  I'm dubious that
 that inconsistency is a good thing, actually, but that's how the
 committed code is written.

I argued at the time that CONCAT should take variadic text rather than
variadic any and was roundly ignored on the grounds that the implicit
casting to text behavior was what everyone wanted in that particular
case.  My feeling is that we need to adopt a solution to this problem
partly so that people don't keep inventing (even in core code!)
one-off, hackish solutions that make certain cases behave completely
differently from the general rules.

 Now, some of us might think that backing these conversions down to only
 allowing assignment-grade casts would be an improvement, in the sense
 that it would actually make the type system tighter not looser than it
 is today for these particular functions.  But I suspect you wouldn't see
 it as an improvement, given the position you're arguing from.

Actually, I think it wouldn't matter a bit, because of the exception
that says there's an assignment cast to text for everything.

 In fact, I'm afraid that making this change would result in requests to
 downgrade existing explicit casts to be assignment-only, so that people
 could be even lazier about not casting function arguments; and that is
 something up with which I will not put.

While I'm personally not excited about it, it is certainly imaginable
that someone might prefer something like text -> xml to be an
assignment cast rather than an explicit cast.  But we've got an easy
response to that, which is: "fine, change it for your database, but
we're not changing it in the upstream copy."  As a compatibility issue
with other databases, it's not really an issue; I can't remember a
single complaint about needing an explicit cast from text to xml or
integer to boolean or any of the other things that appear in pg_cast
with castcontext = 'e'.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] MySQL search query is not executing in Postgres DB

2012-11-27 Thread Robert Haas
On Tue, Nov 27, 2012 at 2:59 PM, Jeff Davis pg...@j-davis.com wrote:
 What user action are you concerned about? If we (eventually) made the
 non-overloaded case the default, would that resolve your concerns?

I can't quite see how a non-overloaded flag would work, unless we get
rid of schemas.  But I think there are a variety of other kinds of
labeling that I think would work.  I'm still not sure that's as good
as a general solution, because if nothing else it relies on us to make
the right decision as to which type to use in each case, and
considering that neither Tom nor I are particularly sold on what we
did with CONCAT(), nor am I sure that we even agree with each other on
what the right thing to do would have been there, I'm a bit skeptical
about our ability to get these decisions right.  But it might still be
an improvement.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] MySQL search query is not executing in Postgres DB

2012-11-27 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Tue, Nov 27, 2012 at 2:26 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 After making the change you propose above, it would only work for types
 for which the cast was assignment-grade or less.

 ...but that's everything, because there's a hardcoded exception in the
 code that dictates that even if there is no entry in pg_cast, an
 assignment cast to text exists for every data type.

Ugh.  I had been thinking that automatic CoerceViaIO casting only
happened for explicit casts.  If that can be invoked via assignment
casts, then what you're proposing really destroys the type system
entirely, at least for functions taking text: there is absolutely
no argument such a function won't accept.  I cannot support this.

regards, tom lane




Re: [HACKERS] why can't plpgsql return a row-expression?

2012-11-27 Thread Alvaro Herrera
Asif Rehman escribió:
 Hi,
 
 I have tried to solve this issue. Please see the attached patch.
 
 With this patch, any expression is allowed in the return statement. For any
 invalid expression an error is generated without doing any special handling.
 When a row expression is used in the return statement then the resultant
 tuple will have rowtype in a single column that needed to be extracted.
 Hence I have handled that case in exec_stmt_return().
 
 any comments/suggestions?
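
To illustrate what the quoted patch is meant to allow (a hypothetical
sketch; historically plpgsql rejected a bare row expression in RETURN for
a composite-returning function, which is the restriction being lifted):

```sql
CREATE TYPE pair AS (a int, b text);

-- With the proposed patch applied, the row expression in RETURN is
-- accepted directly instead of requiring a record variable of the
-- matching row type.
CREATE FUNCTION make_pair(x int, y text) RETURNS pair AS $$
BEGIN
  RETURN (x, y);
END;
$$ LANGUAGE plpgsql;

SELECT * FROM make_pair(1, 'one');
```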

Hmm.  We're running an I/O cast during do_tup_convert() now, and looking
up the required functions for each tuple.  I think if we're going to
support this operation at that level, we need to look up the necessary
functions at convert_tuples_by_foo level, and then apply unconditionally
if they've been set up.

Also, what are the implications for existing users of tupconvert?  Do we
want to apply casting during ANALYZE for example?  AFAICS the patch
shouldn't break any case that works today, but I don't see that there
has been any analysis of this.

(I looked at the patch posted in the thread started by Pavel elsewhere.
I'm replying to both emails in the interest of keeping things properly
linked.)

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] MySQL search query is not executing in Postgres DB

2012-11-27 Thread Robert Haas
On Tue, Nov 27, 2012 at 3:21 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 I do think that applying some kind of explicit flag to the function
 indicating whether it should allow implicit assignment
 casting/implicit casting to text/overloading/whatever is a possibly
 interesting alternative.

 That idea seems possibly worth pursuing.  The thing that I find scary
 about the current proposal is that it applies to all functions (and
 operators) willy-nilly, which seems to raise the risk of unexpected
 side effects pretty high.  If we could confine the behavioral change
 to a relatively small number of functions for which there was consensus
 that they should accept most anything, I'd feel better about it.

 (Of course, we might then conclude that something close to the
 quote_literal solution would work as well as a new function property.
 But it's worth thinking about.)

  The fact that quote_literal() allows (by the expedient of
 overloading) implicit casts to text and that lpad() does not seems
 fairly random to me in hindsight; is there a general principle there
 that we'd all sign on to?

 I don't find that random in the slightest.  The entire purpose of
 quote_literal is to manufacture a SQL-literal string representation of
 this value, and that clearly might apply to data of any type.  lpad()
 is, first, last, and only, a textual operation.  Somebody who thinks it
 should apply directly to an integer is guilty of sloppy thinking at
 best, or not even understanding what a data type is at worst.

Well, considering I made that mistake while working with PostgreSQL
8.2, and considering further that other databases allow it, I'm a
little reluctant to accept this theory.  I'm willing to bet that the
fine folks in Redwood understand what a data type is just fine, and
I'm pretty sure that I do, too.  Sloppy thinking?  Perhaps.  But I
think you could make a perfectly fine argument that the function of
lpad() is to concatenate something onto the string representation of a
value, or conversely that the function of quote_literal() is to escape
a string.  You might not agree with either of those arguments but I
don't care to label someone who does as an idiot.  The problem I have
with the explicit labeling approach is that it seems to depend heavily
on how you conceptualize what the function is trying to do, and not
everyone is going to conceptualize that the same way.  Clearly there
are a lot of people who expect at least some string operators to work
on numbers, including the OP, and are confused when they don't.  We
can call those people nasty names but that's not going to win us many
friends.

Anyway, I'm not averse to thinking about some kind of labeling
solution but I'm not exactly sure what would work well - and I'd still
like to see some hard evidence that the collateral damage from my, er,
your proposal is unacceptably high.  The most plausible scenario for
how this could break things that has been presented thus far is that
someone might create a function, use it with a data type that requires
assignment-casting, and then create another function, and have things
break.  But as Jeff pointed out, that can happen already: in fact, it
did, in core, with pg_size_pretty(), and while you had doubts about
that change at the time, none of us realized exactly what the failure
scenario was until it was too late to change it.  Would that kind of
thing happen more often under this proposal?  Kind of hard to say, but
if it made us think twice before overloading system catalog functions,
it might even work out to a net positive.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] PQconninfo function for libpq

2012-11-27 Thread Peter Eisentraut
On 11/22/12 6:44 AM, Magnus Hagander wrote:
 I'm thinking it might be a better idea to just have PG_CONNINFO_NORMAL
 and PG_CONNINFO_PASSWORD (which still make sense to separate),

What is the use case for separating them?

 and
 then plug in the exclusion of specific parameters in pg_basebackup, in
 the CreateRecoveryConf() function. pg_basebackup will of course always
 know what pg_basebackup is doing with these things.

Agreed on that in principle.




Re: [HACKERS] PQconninfo function for libpq

2012-11-27 Thread Magnus Hagander
On Tue, Nov 27, 2012 at 10:13 PM, Peter Eisentraut pete...@gmx.net wrote:
 On 11/22/12 6:44 AM, Magnus Hagander wrote:
 I'm thinking it might be a better idea to just have PG_CONNINFO_NORMAL
 and PG_CONNINFO_PASSWORD (which still make sense to separate),

 What is the use case for separating them?

Getting a connection string that doesn't contain sensitive information
like a password. In order to show it in a dialog box, for example.

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/




Re: [HACKERS] Bugs in CREATE/DROP INDEX CONCURRENTLY

2012-11-27 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 On 2012-11-27 14:41:34 -0500, Tom Lane wrote:
 The stuff you are allowed to ALTER is pretty much
 irrelevant to the index's life as an index.

  Isn't indisprimary updated when an ALTER TABLE ... ADD PRIMARY KEY
  ... USING someindex; is done? Also I think indoption might be written
 to as well.

Ugh, you're right.  Somebody wasn't paying attention when those ALTER
commands were added.

We could probably alleviate the consequences of this by having those
operations reset indcheckxmin if the tuple's old xmin is below the
GlobalXmin horizon.  That's something for later though --- it's not
a data corruption issue, it just means that the index might unexpectedly
not be used for queries for a little bit after an ALTER.

 It strikes me that the whole idea of reusing xmin when indcheckxmin is
 set is overly clever and storing the xid itself would have been
 better... Too late though.

Well, the reason for that is documented in README.HOT: if the XID were
in an ordinary column there'd be no nice way to get it frozen when it
gets too old.  As-is, VACUUM takes care of that problem automatically.
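
A quick way to see that reuse in action (a sketch; runnable against any
database, though it returns no rows unless some index currently has the
flag set):

```sql
-- When indcheckxmin is true, the planner compares this tuple's own xmin
-- against the query snapshot before it will consider using the index.
SELECT indexrelid::regclass, xmin, indcheckxmin
  FROM pg_index
 WHERE indcheckxmin;
```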

regards, tom lane




Re: [HACKERS] PQconninfo function for libpq

2012-11-27 Thread Peter Eisentraut
On 11/27/12 4:20 PM, Magnus Hagander wrote:
 On Tue, Nov 27, 2012 at 10:13 PM, Peter Eisentraut pete...@gmx.net wrote:
 On 11/22/12 6:44 AM, Magnus Hagander wrote:
 I'm thinking it might be a better idea to just have PG_CONNINFO_NORMAL
 and PG_CONNINFO_PASSWORD (which still make sense to separate),

 What is the use case for separating them?
 
 Getting a connection string that doesn't contain sensitive information
 like a password. In order to show it in a dialog box, for example.

There is already the PQconninfoOption.dispchar field for this purpose.





Re: [HACKERS] MySQL search query is not executing in Postgres DB

2012-11-27 Thread Peter Eisentraut
On 11/25/12 6:36 PM, Robert Haas wrote:
 I think that is true.  But for whatever it's worth, and at the risk of
 beating a horse that seems not to be dead yet in spite of the fact
 that I feel I've already administered one hell of a beating, the LPAD
 case is unambiguous, and therefore it is hard to see what sort of
 programming mistake we are protecting users against.

Upstream applications passing wrong data down to the database.




Re: [HACKERS] PQconninfo function for libpq

2012-11-27 Thread Alvaro Herrera
Peter Eisentraut wrote:
 On 11/27/12 4:20 PM, Magnus Hagander wrote:
  On Tue, Nov 27, 2012 at 10:13 PM, Peter Eisentraut pete...@gmx.net wrote:
  On 11/22/12 6:44 AM, Magnus Hagander wrote:
  I'm thinking it might be a better idea to just have PG_CONNINFO_NORMAL
  and PG_CONNINFO_PASSWORD (which still make sense to separate),
 
  What is the use case for separating them?
  
  Getting a connection string that doesn't contain sensitive information
  like a password. In order to show it in a dialog box, for example.
 
 There is already the PQconninfoOption.dispchar field for this purpose.

I had the impression that that field would go away with this patch, but
then again it might not be worth the compatibility break.  I find the
dispchar thingy somewhat unsightly.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] MySQL search query is not executing in Postgres DB

2012-11-27 Thread Peter Eisentraut
On 11/25/12 7:21 PM, Robert Haas wrote:
 Sure, in theory that is true, but no other RDBMS that I know about
 feels a need to error out in that situation.  I'm skeptical of the
 contention that we're smarter than everyone else.

Well, I think in most programming languages that have typed function
prototypes,

define foo(string)

call foo(55)

is an error.  I could be convinced otherwise, but that's my current
impression.  So the principle of rejecting this is not new.
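
The pseudocode above can be mimicked as a runnable stand-in (Python is
dynamically typed, so the signature check here is explicit and purely
illustrative of what a statically typed language rejects at compile time;
the function name and body are invented):

```python
def foo(s: str) -> str:
    # Stand-in for a typed prototype: reject non-string arguments
    # instead of silently stringifying them.
    if not isinstance(s, str):
        raise TypeError(f"foo() expects a string, got {type(s).__name__}")
    return s.upper()

print(foo("abc"))   # accepted: argument matches the declared type
try:
    foo(55)         # rejected, as foo(55) would be in a typed language
except TypeError as e:
    print("error:", e)
```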

If, on the other hand, we want to align more with other RDBMS that
apparently allow this, we should look closer into by what rules they do
this.  If they use assignment casts (that is, the same rules that apply
when running INSERT, for example), we could look into using that as
well, but then we should do that all the time, and not only as a
fallback of some sort.  Because that's (a) arbitrary, and (b) causes
failures when overloaded functions are added, which shouldn't happen
(the existing cases where adding overloaded functions cause an existing
function to fail are surely warts that should be removed, not new ones
designed in).

I wonder what implicit casts would be good for if assignment casts
applied for function and operator calls.




Re: [HACKERS] Review: Patch to compute Max LSN of Data Pages

2012-11-27 Thread Alvaro Herrera
Amit Kapila escribió:
 
 Friday, November 23, 2012 5:38 AM Muhammad Usama wrote:

 - I think when finding the max LSN of single file the utility should
  consider all relation segments.
   Would you like to find all relation-related segments, like
   12345, 12345.1 ... 12345.n, or
   12345, 12345.1 ... and 12345_vm, 12345.1_vm?

   But how about if the user asks for 12345.4, do we need to find all
   greater than 12345.4?
 
   IMHO, as this utility is not aware of relations or any other logical
   entities and deals only with files and directories, it is okay to
   search only that file.

Hmm.  I think I'd expect that if I give pg_computemaxlsn a number, it
should consider that it is a relfilenode, and so it should get a list of
all segments for all forks of it.  So if I give 12345 it should get
12345, 12345.1 and so on, and also 12345_vm 12345_vm.1 and so on.
However, if what I give it is a path, i.e. it contains a slash, I think
it should only consider the specific file mentioned.  In that light, I'm
not sure that command line options chosen are the best set.
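
The naming rule proposed here can be sketched as a filename filter
(assumptions: PostgreSQL's on-disk layout with fork suffixes _fsm, _vm,
and _init plus numeric segment suffixes; the helper name is made up):

```python
import re

def matches_relfilenode(filename: str, relfilenode: str) -> bool:
    """True if filename is a segment of any fork of relfilenode,
    e.g. '12345', '12345.1', '12345_vm', '12345_vm.1'."""
    pattern = rf"^{re.escape(relfilenode)}(_fsm|_vm|_init)?(\.\d+)?$"
    return re.match(pattern, filename) is not None

files = ["12345", "12345.1", "12345_vm", "12345_vm.1", "12346", "12345x"]
print([f for f in files if matches_relfilenode(f, "12345")])
# ['12345', '12345.1', '12345_vm', '12345_vm.1']
```

A path argument (anything containing a slash) would bypass this filter and
be taken as one specific file, per the distinction suggested above.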


-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] MySQL search query is not executing in Postgres DB

2012-11-27 Thread Peter Eisentraut
On 11/27/12 12:07 PM, Merlin Moncure wrote:
 Speaking of polymorphism, why not just implement lpad()'s first
 argument as 'anyelement'?

One of the arguments made here was that lpad(not-text) *should* fail.




Re: [HACKERS] Bugs in CREATE/DROP INDEX CONCURRENTLY

2012-11-27 Thread Andres Freund
On 2012-11-27 16:31:15 -0500, Tom Lane wrote:
 Andres Freund and...@2ndquadrant.com writes:
  On 2012-11-27 14:41:34 -0500, Tom Lane wrote:
  The stuff you are allowed to ALTER is pretty much
  irrelevant to the index's life as an index.

  Isn't indisprimary updated when an ALTER TABLE ... ADD PRIMARY KEY
  ... USING someindex; is done? Also I think indoption might be written
  to as well.

 Ugh, you're right.  Somebody wasn't paying attention when those ALTER
 commands were added.

 We could probably alleviate the consequences of this by having those
 operations reset indcheckxmin if the tuple's old xmin is below the
 GlobalXmin horizon.  That's something for later though --- it's not
 a data corruption issue, it just means that the index might unexpectedly
 not be used for queries for a little bit after an ALTER.

mark_index_clustered() does the same, btw. It's not a problem in the
CLUSTER ... USING ... case because that creates a new pg_index entry
anyway, but in the ALTER TABLE one that's not the case.

Greetings,

Andres Freund

--
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] MySQL search query is not executing in Postgres DB

2012-11-27 Thread Merlin Moncure
On Tue, Nov 27, 2012 at 4:09 PM, Peter Eisentraut pete...@gmx.net wrote:
 On 11/27/12 12:07 PM, Merlin Moncure wrote:
 Speaking of polymorphism, why not just implement lpad()'s first
 argument as 'anyelement'?

 One of the arguments made here was that lpad(not-text) *should* fail.

Well, sure.  My point is: why do we need to break the casting
machinery when we can simply tweak a few of the standard functions on
portability grounds?

Robert's case on lpad() has merit in the sense it has unambiguous
meaning regardless of input type; polymorphic input types were
designed to solve *exactly* that problem.  SQL portability is a
secondary but also important argument.

That said, md5() surely needs some type of cast or interpretation of
non-text types.  ditto to_timestamp(), etc.  So messing around with
the casting rules is surely the wrong answer. I think if you relaxed
the function sigs of a few functions on this page
(http://www.postgresql.org/docs/9.2/interactive/functions-string.html),
most reported problems would go away.

One thing that worries me is introducing ambiguous cases where
previously there weren't any though.

merlin




Re: [HACKERS] MySQL search query is not executing in Postgres DB

2012-11-27 Thread Jeff Davis
On Tue, 2012-11-27 at 15:41 -0500, Robert Haas wrote:
 I can't quite see how a non-overloaded flag would work, unless we get
 rid of schemas.

It may work to pick the first schema in the search path that has any
functions by that name, and then choose the overloaded (or not)
candidate from among those functions in that one schema. Then,
non-overloaded function names would be unique within a schema.

If there are multiple functions of the same name in multiple schemas in
the search path, it does not make sense to me to lump them all together
and choose an overloaded candidate from all of them (although I think
that's what we do now). That sounds like a mistake, to me. Do you know
of any useful examples of doing that?
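
The resolution rule proposed in the paragraph above can be sketched like
so (the toy catalog contents and function names are invented; real
resolution also ranks candidates by cast cost, which is elided here):

```python
# Hypothetical catalog: schema -> {function name -> list of signatures}
catalog = {
    "myschema": {"lpad": [("text", "int")]},
    "pg_catalog": {"lpad": [("text", "int", "text"), ("text", "int")]},
}

def candidates(search_path, name):
    """Take candidates only from the FIRST schema on the search path
    that defines the name at all, rather than merging candidates from
    every schema (the current behavior being questioned above)."""
    for schema in search_path:
        sigs = catalog.get(schema, {}).get(name)
        if sigs:
            return [(schema, s) for s in sigs]
    return []

print(candidates(["myschema", "pg_catalog"], "lpad"))
# [('myschema', ('text', 'int'))]
```

Under this rule, a name is "non-overloaded" exactly when its chosen
schema defines a single signature for it.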

   But I think there are a variety of other kinds of
 labeling that I think would work.

Worth exploring.

   I'm still not sure that's as good
 as a general solution, because if nothing else it relies on us to make
 the right decision as to which type to use in each case, and
 considering that neither Tom nor I are particularly sold on what we
 did with CONCAT(), nor am I sure that we even agree with each other on
 what the right thing to do would have been there, I'm a bit skeptical
 about our ability to get these decisions right.  But it might still be
 an improvement.

I'm not entirely clear on the benefits of a general solution, nor why
your solution is more general. You are still categorizing functions into
overloaded and non-overloaded, but you are doing so at runtime based
on the current contents of the catalog.

Regards,
Jeff Davis





Re: [HACKERS] MySQL search query is not executing in Postgres DB

2012-11-27 Thread Tom Lane
Merlin Moncure mmonc...@gmail.com writes:
 ... I think if you relaxed
 the function sigs of a few functions on this page
 (http://www.postgresql.org/docs/9.2/interactive/functions-string.html),
 most reported problems would go away.

That's an interesting way of approaching it.  Do we have any data on
exactly which functions people do complain about?

 One thing that worries me is introducing ambiguous cases where
 previously there weren't any though.

Right, but at least we'd be confining the ambiguity to a small number
of function names.  Tweaking the casting rules could have a lot of
unforeseen consequences.

regards, tom lane




Re: [HACKERS] MySQL search query is not executing in Postgres DB

2012-11-27 Thread Merlin Moncure
On Tue, Nov 27, 2012 at 4:41 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Merlin Moncure mmonc...@gmail.com writes:
 ... I think if you relaxed
 the function sigs of a few functions on this page
 (http://www.postgresql.org/docs/9.2/interactive/functions-string.html),
 most reported problems would go away.

 That's an interesting way of approaching it.  Do we have any data on
 exactly which functions people do complain about?

After a few minutes of google-fu, lpad seems to top the list (if you
don't count operator related issues which I think are mostly coding
bugs).

see: http://drupal.org/node/1338188.
also upper: 
http://sourceforge.net/tracker/?func=detail&aid=3447417&group_id=171772&atid=859223
also substring: http://archives.postgresql.org/pgsql-bugs/2008-01/msg1.php

note in two of the above cases the bugs were raised through 3rd party
issue trackers :/.  Interestingly, if you look at popular
postgresql-only functions, such as regexp_replace, google seems to
indicate there's not much problem there.  This, IMNSHO, reinforces
Robert's point -- but it seems to mostly bite people porting code,
running cross database code bases, or having a strong background in
other systems.

I found a few non-string cases, especially round().
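For the lpad case above, a minimal SQL sketch of the kind of relaxed signature being discussed. The overload below is purely illustrative, not a concrete proposal:

```sql
-- Works in MySQL, but fails in PostgreSQL with
-- ERROR:  function lpad(integer, integer, unknown) does not exist
--   SELECT lpad(42, 5, '0');

-- One user-level workaround: add an overload that accepts any type
-- and casts it to text explicitly (illustrative only).
CREATE FUNCTION lpad(anyelement, integer, text) RETURNS text
LANGUAGE sql IMMUTABLE
AS $$ SELECT lpad($1::text, $2, $3) $$;

SELECT lpad(42, 5, '0');  -- now resolves via the new overload
```

The flip side is the concern raised above: each such overload widens the set of calls the parser must disambiguate.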

merlin




Re: [HACKERS] MySQL search query is not executing in Postgres DB

2012-11-27 Thread John R Pierce

On 11/27/12 2:41 PM, Tom Lane wrote:

Tweaking the casting rules could have a lot of
unforeseen consequences.


understatement of the year.  IMHO.   $0.02 worth etc.


--
john r pierce                      N 37, W 122
santa cruz ca                      mid-left coast





Re: [HACKERS] PQconninfo function for libpq

2012-11-27 Thread Tom Lane
Alvaro Herrera alvhe...@2ndquadrant.com writes:
 Peter Eisentraut wrote:
 There is already the PQconninfoOption.dispchar field for this purpose.

 I had the impression that that field would go away with this patch, but
 then again it might not be worth the compatibility break.  I find the
 dispchar thingy somewhat unsightly.

It is that, without a doubt, but that API has been that way longer than
any of us have been working on the code.  I'm not excited about changing
it just to have an allegedly-cleaner API.  And we cannot have the field
simply go away, because it's been exposed to applications for lo these
many years, and surely some of them are using it.  So in practice we'd
be maintaining both the old API and the new one.

I think we should leave it as-is until there are more reasons to change
it than seem to be provided in this thread.

regards, tom lane




Re: [HACKERS] the number of pending entries in GIN index with FASTUPDATE=on

2012-11-27 Thread Kyotaro HORIGUCHI
  1. This patch applies current git head cleanly, but installation
ends up with failure because of the lack of
pgstattuple--1.0--1.1.sql which added in Makefile.
 
 Added pgstattuple--1.0--1.1.sql.

Good. Installation completed and ALTER EXTENSION UPDATE works
with that.

  2. I feel somewhat uneasy with size for palloc's (it's too long),
and BuildTupleFromCString used instead of heap_form_tuple. But
it would lead to additional modification for existing similar code.
 
 OK. I changed the code as you suggested.

Thank you. It looks simpler than the last one. (although the way
differs from pgstatindex.)

  3. pgstatginidex shows only version, pending_pages, and
pendingtuples. Why don't you show the other properties such as
entry pages, data pages, entries, and total pages as
pgstatindex does?
 
 I didn't expose those because they are accurate as of last VACUUM.
 But if you think they are useful, I don't object to expose them.

Ok, my point was the apparent consistency of the functions. I
don't have any distinct wish about this.

I'll mark this as Ready for Committer.
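For readers following along, the interface under review can be exercised roughly like this. The function and column names are as discussed in this thread; the exact signature is whatever the committed patch ships:

```sql
CREATE EXTENSION pgstattuple;  -- needs version 1.1, with this patch applied

CREATE TABLE docs (body tsvector);
CREATE INDEX docs_gin ON docs USING gin (body) WITH (fastupdate = on);
INSERT INTO docs
SELECT to_tsvector('english', 'sample text ' || i)
FROM generate_series(1, 1000) AS i;

-- Shows version, pending_pages, and pending_tuples for the
-- FASTUPDATE pending list.
SELECT * FROM pgstatginindex('docs_gin');
```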

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: [HACKERS] Further pg_upgrade analysis for many tables

2012-11-27 Thread Bruce Momjian
On Mon, Nov 26, 2012 at 05:26:42PM -0500, Bruce Momjian wrote:
 I have developed the attached proof-of-concept patch to test this idea. 
 Unfortunately, I got poor results:
 
          ------- pg_upgrade -------
         dump  restore  dmp|res      git  dmp/res
     1   0.12     0.07     0.13    11.16    13.03
  1000   3.80     2.83     5.46    18.78    20.27
  2000   5.39     5.65    13.99    26.78    28.54
  4000  16.08    12.40    28.34    41.90    44.03
  8000  32.77    25.70    57.97    78.61    80.09
 16000  57.67    63.42   134.43   158.49   165.78
 32000 131.84   176.27   302.85   380.11   389.48
 64000 270.37   708.30  1004.39  1085.39  1094.70
 
 The last two columns show the patch didn't help at all, and the third
 column shows it is just executing the pg_dump, then the restore, not in
 parallel, i.e. column 1 + column 2 ~= column 3.
...
 I will now test using PRIMARY KEY and custom dump format with pg_restore
 --jobs to see if I can get parallelism that way.

I have some new interesting results (in seconds, test script attached):

          -Fc    ------ dump | pg_restore/psql ------    -- pg_upgrade --
       dump restore     -Fc  -Fc|-1  -Fc|-j     -Fp  -Fp|-1      git    patch
    1  0.14    0.08    0.14    0.16    0.19    0.13    0.15    11.04    13.07
 1000  3.08    3.65    6.53    6.60    5.39    6.37    6.54    21.05    22.18
 2000  6.06    6.52   12.15   11.78   10.52   12.89   12.11    31.93    31.65
 4000 11.07   14.68   25.12   24.47   22.07   26.77   26.77    56.03    47.03
 8000 20.85   32.03   53.68   45.23   45.10   59.20   51.33   104.99    85.19
16000 40.28   88.36  127.63   96.65  106.33  136.68  106.64   221.82   157.36
32000 93.78  274.99  368.54  211.30  294.76  376.36  229.80   544.73   321.19
64000 197.79 1109.22 1336.83  577.83 1117.55 1327.98  567.84  1766.12   763.02

I tested custom format with pg_restore -j and -1, as well as text
restore.  The winner was pg_dump -Fc | pg_restore -1;  even -j could not
beat it.  (FYI, Andrew Dunstan told me that indexes can be restored in
parallel with -j.)  That is actually helpful because we can use process
parallelism to restore multiple databases at the same time without
having to use processes for -j parallelism.  

Attached is my pg_upgrade patch for this.  I am going to polish it up
for 9.3 application.

 A further parallelism would be to allow multiple database to be
 dump/restored at the same time.  I will test for that once this is done.

I will work on this next.

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +
:

. traprm

export QUIET=$((QUIET + 1))

> /rtmp/out

export PGOPTIONS="-c synchronous_commit=off"

for CYCLES in 1 1000 2000 4000 8000 16000 32000 64000
do
	echo $CYCLES >> /rtmp/out

	for DIR in /pgsql/CURRENT
	do	echo $DIR >> /rtmp/out
		cd $DIR
		pginstall
		cd -

		# need for +16k
		pipe sed 's/#max_locks_per_transaction = 64/max_locks_per_transaction = 64000/' /u/pg/data/postgresql.conf
		pipe sed 's/shared_buffers = 128MB/shared_buffers = 1GB/' /u/pg/data/postgresql.conf
		pipe sed 's/#work_mem = 1MB/work_mem = 500MB/' /u/pg/data/postgresql.conf
		pipe sed 's/#maintenance_work_mem = 16MB/maintenance_work_mem = 500MB/' /u/pg/data/postgresql.conf
		pgrestart
		sleep 2

		echo "table creation" >> /rtmp/out
		newdb test
		for JOT in $(jot $CYCLES); do echo "CREATE TABLE test$JOT (x SERIAL PRIMARY KEY);"; done | sql test

		echo "pg_dump creation" >> /rtmp/out
		/usr/bin/time --output=/rtmp/out --append --format '%e' aspg pg_dump --schema-only --format=custom test > $TMP/1

		echo "pg_dump restore" >> /rtmp/out
		newdb test
		/usr/bin/time --output=/rtmp/out --append --format '%e' aspg pg_restore --exit-on-error --dbname=test $TMP/1

		echo "dump -Fc|restore" >> /rtmp/out
		newdb test2
		/usr/bin/time --output=/rtmp/out --append --format '%e' sh -c "aspg pg_dump --schema-only --format=custom test | aspg pg_restore --exit-on-error --dbname=test2" $TMP/1

		echo "dump -Fc|restore, --single-transaction" >> /rtmp/out
		newdb test2
		/usr/bin/time --output=/rtmp/out --append --format '%e' sh -c "aspg pg_dump --schema-only --format=custom test | aspg pg_restore --exit-on-error --single-transaction --dbname=test2" $TMP/1

		echo "combined dump -Fc/restore -j" >> /rtmp/out
		newdb 

Re: [HACKERS] Bugs in CREATE/DROP INDEX CONCURRENTLY

2012-11-27 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 On 2012-11-27 16:31:15 -0500, Tom Lane wrote:
 Andres Freund and...@2ndquadrant.com writes:
 Isn't indisprimary updated when an ALTER TABLE ... ADD PRIMARY KEY
 ... USING someindex ; is done? Also I think indoption might be written
 to as well.

 Ugh, you're right.  Somebody wasn't paying attention when those ALTER
 commands were added.

On closer look, indoption is never updated --- perhaps you were thinking
about pg_class.reloptions.  indisprimary, indimmediate, and
indisclustered are all subject to post-creation updates though.

 We could probably alleviate the consequences of this by having those
 operations reset indcheckxmin if the tuple's old xmin is below the
 GlobalXmin horizon.  That's something for later though --- it's not
 a data corruption issue, it just means that the index might unexpectedly
 not be used for queries for a little bit after an ALTER.

 mark_index_clustered() does the same btw, it's not a problem in the
 CLUSTER ... USING ...; case because that creates a new pg_index entry
 anyway but in the ALTER TABLE one that's not the case.

After further study I think the situation is that

(1) The indisvalid/indisready/indislive updates in CREATE/DROP INDEX
CONCURRENTLY can, and must, be done in-place since we don't have
exclusive lock on the parent table.

(2) All the other updates can be transactional because we hold
sufficient locks to ensure that nothing bad will happen.  The proposed
reductions in ALTER TABLE lock strength would break this in several
cases, but that's a problem for another day.
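For concreteness, the flags in question can be watched from a second session while a concurrent build is in progress — assuming a server with this patch applied, since indislive does not exist before it (table t and column x are placeholders):

```sql
-- Session 1: build an index without an exclusive lock on the table.
CREATE INDEX CONCURRENTLY idx_t_x ON t (x);

-- Session 2: the in-place flag updates of case (1) are visible
-- mid-build, before session 1's command completes.
SELECT indexrelid::regclass AS index,
       indisvalid, indisready, indislive
FROM pg_index
WHERE indexrelid = 'idx_t_x'::regclass;
```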

Attached is a very preliminary draft patch for this.  I've not addressed
the question of whether we can clear indcheckxmin during transactional
updates of pg_index rows, but I think it covers everything else talked
about in this thread.

regards, tom lane

diff --git a/contrib/tcn/tcn.c b/contrib/tcn/tcn.c
index 6a8a96f60339118f7edb11668c5db23fbf85c211..6172137512f5ee72d39262a394c5bd1bef183e14 100644
*** a/contrib/tcn/tcn.c
--- b/contrib/tcn/tcn.c
*** triggered_change_notification(PG_FUNCTIO
*** 141,148 ****
  		if (!HeapTupleIsValid(indexTuple))		/* should not happen */
  			elog(ERROR, "cache lookup failed for index %u", indexoid);
  		index = (Form_pg_index) GETSTRUCT(indexTuple);
! 		/* we're only interested if it is the primary key */
! 		if (index->indisprimary)
  		{
  			int			numatts = index->indnatts;
  
--- 141,148 ----
  		if (!HeapTupleIsValid(indexTuple))		/* should not happen */
  			elog(ERROR, "cache lookup failed for index %u", indexoid);
  		index = (Form_pg_index) GETSTRUCT(indexTuple);
! 		/* we're only interested if it is the primary key and valid */
! 		if (index->indisprimary && index->indisvalid)
  		{
  			int			numatts = index->indnatts;
  
  
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index f99919093ca0da8c59ee4f4df0643837dfbdb38b..5f270404bf1279ad2ba745ca417ab00b0b5dbb08 100644
*** a/doc/src/sgml/catalogs.sgml
--- b/doc/src/sgml/catalogs.sgml
***
*** 3480,3486 ****
     index is possibly incomplete: it must still be modified by
     <command>INSERT</>/<command>UPDATE</> operations, but it cannot safely
     be used for queries. If it is unique, the uniqueness property is not
!    true either.
    </entry>
   </row>
  
--- 3480,3486 ----
     index is possibly incomplete: it must still be modified by
     <command>INSERT</>/<command>UPDATE</> operations, but it cannot safely
     be used for queries. If it is unique, the uniqueness property is not
!    guaranteed true either.
    </entry>
   </row>
  
***
*** 3508,3513 ****
--- 3508,3523 ----
   </row>
  
+  <row>
+   <entry><structfield>indislive</structfield></entry>
+   <entry><type>bool</type></entry>
+   <entry></entry>
+   <entry>
+    If false, the index is in process of being dropped, and should be
+    ignored for all purposes (including HOT-safety decisions)
+   </entry>
+  </row>
+ 
   <row>
    <entry><structfield>indkey</structfield></entry>
    <entry><type>int2vector</type></entry>
    <entry><literal><link linkend="catalog-pg-attribute"><structname>pg_attribute</structname></link>.attnum</literal></entry>
diff --git a/src/backend/access/heap/README.HOT b/src/backend/access/heap/README.HOT
index f12cad44e56c363e62fa617497bbedfe1ba8c1fe..4cf3c3a0d4c2db96a57e73e46fdd7463db439f79 100644
*** a/src/backend/access/heap/README.HOT
--- b/src/backend/access/heap/README.HOT
*** from the index, as well as ensuring that
*** 386,391 
--- 386,419 
  rows in a broken HOT chain (the first condition is stronger than the
  second).  Finally, we can mark the index valid for searches.
  
+ Note that we do not need to set pg_index.indcheckxmin in this code path,
+ because we have outwaited any transactions that would need to avoid using
+ the index.  (indcheckxmin is only needed because non-concurrent CREATE
+ INDEX doesn't want to wait; its stronger lock would create too 

Re: [HACKERS] Further pg_upgrade analysis for many tables

2012-11-27 Thread Jeff Janes
On Tue, Nov 27, 2012 at 8:13 PM, Bruce Momjian br...@momjian.us wrote:

 I have some new interesting results (in seconds, test script attached):

           -Fc    ------ dump | pg_restore/psql ------    -- pg_upgrade --
        dump restore     -Fc  -Fc|-1  -Fc|-j     -Fp  -Fp|-1      git    patch
     1  0.14    0.08    0.14    0.16    0.19    0.13    0.15    11.04    13.07
  1000  3.08    3.65    6.53    6.60    5.39    6.37    6.54    21.05    22.18
  2000  6.06    6.52   12.15   11.78   10.52   12.89   12.11    31.93    31.65
  4000 11.07   14.68   25.12   24.47   22.07   26.77   26.77    56.03    47.03
  8000 20.85   32.03   53.68   45.23   45.10   59.20   51.33   104.99    85.19
 16000 40.28   88.36  127.63   96.65  106.33  136.68  106.64   221.82   157.36
 32000 93.78  274.99  368.54  211.30  294.76  376.36  229.80   544.73   321.19
 64000 197.79 1109.22 1336.83  577.83 1117.55 1327.98  567.84  1766.12   763.02

 I tested custom format with pg_restore -j and -1, as well as text
 restore.  The winner was pg_dump -Fc | pg_restore -1;

I don't have the numbers at hand, but if my relcache patch is
accepted, then -1 stops being faster.

-1 gets rid of the AtEOXact relcache N^2 behavior, but at the cost of
invoking a different N^2, that one in the stats system.


Cheers,

Jeff




Re: [HACKERS] Review: Patch to compute Max LSN of Data Pages

2012-11-27 Thread Amit Kapila
On Wednesday, November 28, 2012 3:36 AM Alvaro Herrera wrote:
 Amit Kapila escribió:
 
  Friday, November 23, 2012 5:38 AM Muhammad Usama wrote:
 
  - I think when finding the max LSN of single file the utility should
   consider all relation segments.
Would you like to find for all relation related segments:
  Like 12345, 12345.1 ... 12345.n, or
  12345, 12345.1 ... and 12345_vm, 12345.1_vm
 
But how about if user asks for 12345.4, do we need to find all
 greater
  than 12345.4?
 
IMHO, as this utility is not aware of relation or any other logical
entity and deals with only file or directory, it is okay to find
only for that file.
 
 Hmm.  I think I'd expect that if I give pg_computemaxlsn a number, it
 should consider that it is a relfilenode, and so it should get a list of
 all segments for all forks of it.  So if I give 12345 it should get
 12345, 12345.1 and so on, and also 12345_vm 12345_vm.1 and so on.

Yes, I think this can be done either by having it as a new option or
internally the code can interpret it as you suggest.
Also, can't we think of it as an extended usecase and implement it on top of
the base, if this usecase is not urgent?

 However, if what I give it is a path, i.e. it contains a slash, I think
 it should only consider the specific file mentioned.  In that light, I'm
 not sure that command line options chosen are the best set.

Yes, for sure the command line options can be improved/extended based on usecase.
Please feel free to give suggestions regarding the command line options if you
feel the current ones are not extendable.

One way is -p should be for file path and may be -r for relfile number.

With Regards,
Amit Kapila.





Re: [HACKERS] MySQL search query is not executing in Postgres DB

2012-11-27 Thread Pavel Stehule
2012/11/28 Merlin Moncure mmonc...@gmail.com:
 On Tue, Nov 27, 2012 at 4:41 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Merlin Moncure mmonc...@gmail.com writes:
 ... I think if you relaxed
 the function sigs of a few functions on this page
 (http://www.postgresql.org/docs/9.2/interactive/functions-string.html),
 most reported problems would go away.

 That's an interesting way of approaching it.  Do we have any data on
 exactly which functions people do complain about?

 After a few minutes of google-fu, lpad seems to top the list (if you
 don't count operator related issues which I think are mostly coding
 bugs).

 see: http://drupal.org/node/1338188.
 also upper: 
 http://sourceforge.net/tracker/?func=detail&aid=3447417&group_id=171772&atid=859223
 also substring: http://archives.postgresql.org/pgsql-bugs/2008-01/msg1.php

some of these issues are bugs, and I am happy that they don't work now.

http://archives.postgresql.org/pgsql-bugs/2008-01/msg1.php

and these issues can be simply solved by overloading.

Pavel


 note in two of the above cases the bugs were raised through 3rd party
 issue trackers :/.  Interestingly, if you look at popular
 postgresql-only functions, such as regexp_replace, google seems to
 indicate there's not much problem there.  This, IMNSHO, reinforces
 Robert's point -- but it seems to mostly bite people porting code,
 running cross database code bases, or having a strong background in
 other systems.

 I found a few non-string cases, especially round().

 merlin




Re: [HACKERS] PQconninfo function for libpq

2012-11-27 Thread Magnus Hagander
On Nov 28, 2012 1:54 AM, Tom Lane t...@sss.pgh.pa.us wrote:

 Alvaro Herrera alvhe...@2ndquadrant.com writes:
  Peter Eisentraut wrote:
  There is already the PQconninfoOption.dispchar field for this purpose.

  I had the impression that that field would go away with this patch, but
  then again it might not be worth the compatibility break.  I find the
  dispchar thingy somewhat unsightly.

 It is that, without a doubt, but that API has been that way longer than
 any of us have been working on the code.  I'm not excited about changing
 it just to have an allegedly-cleaner API.  And we cannot have the field
 simply go away, because it's been exposed to applications for lo these
 many years, and surely some of them are using it.  So in practice we'd
 be maintaining both the old API and the new one.

 I think we should leave it as-is until there are more reasons to change
 it than seem to be provided in this thread.

I think removing that would be a really bad idea. I'm not sure anybody is
actually relying on it, but it would also change the size of the struct and
thus break things for anybody using those functions.

If people prefer we remove the password classifier for the new function
since it at least partially duplicates that field we can certainly do that,
but I think leaving it in allows those who write new code to use a slightly
neater api, at pretty much no cost in maintenance for us.

/Magnus


Re: [HACKERS] Review: Patch to compute Max LSN of Data Pages

2012-11-27 Thread Muhammad Usama
On Tue, Nov 27, 2012 at 5:52 PM, Amit kapila amit.kap...@huawei.com wrote:

  Friday, November 23, 2012 5:38 AM Muhammad Usama wrote:

  Observations and Comments
  ---

  - If no option is given to pg_computemaxlsn utility then we get a wrong
  error message
  ./pg_computemaxlsn  ../data
  pg_computemaxlsn: file or directory (null) exists
  In my opinion we should consider the -P (Print max LSN from whole data
  directory) as default in case no option is specified, or otherwise throw
 a
  more descriptive error.



 Fixed, by considering the whole data directory as the default.


  - Options -p and -P should be mutually exclusive

 Fixed. Both options will not be allowed at the same time.



 - Long options missing for -p and -P



 Fixed. Introduced long options as --path & --data-directory.


  - Help string for -P  should be something like print max LSN from whole
  data directory instead of  print max LSN from whole database

 Fixed, by changing the message as suggested.


  - Help menu is silent about -V or --version option

 Fixed.


  - I think when finding the max LSN of single file the utility should
  consider all relation segments.

 IMHO, as this utility is not aware of relation or any other logical entity
 and deals with only file or directory,

   it is okay to find only for that file.



  - There is no check for extra arguments
  e.g
  ./pg_computemaxlsn  -P . ../data/ ../data/base/ ../data2/

 Fixed.
 - Extra options gives error.
 - Multiple -p options also gives error.
 Eg:
 ./pg_computemaxlsn -p base/12286/12200 -p base/12286/12201 data



  - For -p {file | dir}  option the utility expects the file path relative
 to
  the specified data directory path which makes thing little confusing
  for example
  ./pg_computemaxlsn -p data/base/12286/12200 data
  pg_computemaxlsn: file or directory data/base/12286/12200 does not
 exists
  although data/base/12286/12200 is valid file path but we gets file does
 not
  exists error
  As the path of file is relative to data directory, that's why in
 error it prints the path as data/base/12286/12200.
 Do you have any suggestion what should be done for this?


I think we should expect the provided path to be relative to the current
directory, or consider it relative to either the data directory or the CWD.
Normally we expect a path to be relative to the CWD if a program asks for a
path on the command line, and tab-completion doesn't work on the terminal if
the path is not absolute or relative to the current directory.
To handle this, I think we should not change the current directory to the
data directory, and should generate the file paths (like for postmaster.pid)
where required by appending the data directory path.
And before throwing an error, we should check whether the path is valid
relative to either the data directory or the CWD, something like:

if (lstat(LsnSearchPath, &fst) < 0)
{
    if (errno == ENOENT)
    {
        /* Check if the path is relative to provided data directory */
        snprintf(path, MAXPGPATH, "%s/%s", DataDir, LsnSearchPath);
        if (lstat(path, &fst) < 0)
        {
            if (errno == ENOENT)
            {
                fprintf(stderr, _("%s: file or directory \"%s\" does not exist.\n"),
                        progname, LsnSearchPath);
            }
            else
            {
                fprintf(stderr, _("%s: could not stat file or directory \"%s\": %s\n"),
                        progname, path, strerror(errno));
            }
            exit(1);
        }
        else /* Path is relative to data directory */
            LsnSearchPath = strdup(path);
    }
    else
    {
        fprintf(stderr, _("%s: could not stat file or directory \"%s\": %s\n"),
                progname, LsnSearchPath, strerror(errno));
        exit(1);
    }
}



  - Utility should print the max LSN number in hexadecimal notation and LSN
  format (logid/segno) for consistency with PG,

 Fixed

  and may also print the file name which contains that LSN.

 It might be confusing to users for the cases where it has to follow link
 (pg_tblspc) and then print some absolute path which is not related to
 relative path given by user.
 Also I am not sure if it will be of any use to user.



  - When finding max LSN from directory, the utility does not considers the
  sub directories, which I think could be useful in some cases, like if we
  want to find the max LSN in tablespace directory. With the current
  implementation we would require to run the utility multiple times, once
 for
  each database.

 It will consider sub-directories as well.



  Code Review
  -

  Code generally looks good, I have few small points.

  - In main() function variable maxLSN is defined as uint64,
  Although XLogRecPtr and uint64 are the same thing, but I think we should
  use XLogRecPtr instead.
  Changed as per suggestion.

  - Just a small thing although will hardly improve anything but still for
  sake of saving few cycles, I think you should use XLByteLT ( < ) instead
  of XLByteLE ( <= )
  to find if the previous max LSN is lesser than current.

 Changed as per suggestion.


  - '\n' is missing from one of the printfs in the code :-)
  fprintf(stderr, _(%s: file 

Re: [HACKERS] Review: Patch to compute Max LSN of Data Pages

2012-11-27 Thread Muhammad Usama
On Wed, Nov 28, 2012 at 3:06 AM, Alvaro Herrera alvhe...@2ndquadrant.com wrote:

 Amit Kapila escribió:
 
  Friday, November 23, 2012 5:38 AM Muhammad Usama wrote:

  - I think when finding the max LSN of single file the utility should
   consider all relation segments.
Would you like to find for all relation related segments:
Like 12345, 12345.1 ... 12345.nOr
12345, 12345.1 ... and 12345_vm, 12345.1_vm
 
But how about if user asks for 12345.4, do we need to find all greater
  than 12345.4?
 
IMHO, as this utility is not aware of relation or any other logical
entity and deals with only file or directory, it is okay to find
only for that file.

 Hmm.  I think I'd expect that if I give pg_computemaxlsn a number, it
 should consider that it is a relfilenode, and so it should get a list of
 all segments for all forks of it.  So if I give 12345 it should get
 12345, 12345.1 and so on, and also 12345_vm 12345_vm.1 and so on.
 However, if what I give it is a path, i.e. it contains a slash, I think
 it should only consider the specific file mentioned.  In that light, I'm
 not sure that command line options chosen are the best set.

 +1


 --
 Álvaro Herrera                http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



[HACKERS] pg_basebackup is taking backup of extra files inside a tablespace directory

2012-11-27 Thread Hari Babu
pg_basebackup is taking backup of extra files other than database-related
files inside a TABLESPACE directory. 

Scenario: 
1)  Create tablespace in existing directory '/opt/tblspc' having
some extra files and folders. 
create tablespace tbs1 location '/opt/tblspc'; 

2) Now execute the pg_basebackup command; 
We can see it will copy the extra files in '/opt/tblspc'
directory 

I think the backup should include only the files and folders present inside
the '/opt/tblspc/PG_*' directory (TABLESPACE_VERSION_DIRECTORY), 
not all the files and folders inside the '/opt/tblspc' directory. 


Is it ok to fix in the following way? 

In function perform_base_backup, 

while sending the tablespaces one by one, we can send the header for
Linkpath/TABLESPACE_VERSION_DIRECTORY as a separate header, and pass
Linkpath/TABLESPACE_VERSION_DIRECTORY as the path to sendDir. 


Regards, 
Hari babu. 

 



Re: [HACKERS] pg_basebackup is taking backup of extra files inside a tablespace directory

2012-11-27 Thread Michael Paquier
On Wed, Nov 28, 2012 at 3:55 PM, Hari Babu haribabu.ko...@huawei.com wrote:

 pg_basebackup is taking backup of extra files other than database related
 files in side a TABLESPACE directory.

And why do you have files not related to your PG server inside a folder
dedicated to a PG server?
-- 
Michael Paquier
http://michael.otacoo.com