Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)

2014-04-04 Thread Dean Rasheed
On 1 April 2014 20:58, Florian Pflug f...@phlo.org wrote:
 On Apr1, 2014, at 10:08 , Dean Rasheed dean.a.rash...@gmail.com wrote:
 On 31 March 2014 01:58, Florian Pflug f...@phlo.org wrote:
 Attached are updated patches that include the EXPLAIN changes mentioned
 above and updated docs.


 These patches need re-basing --- they no longer apply to HEAD.

 Rebased to head (554bb3beba27bf4a49edecc40f6c0f249974bc7c). I had to
 re-assign OIDs in the dependent patches again (those which actually
 add the various inverse transition functions).


Looking at the new explain output, that is not exactly what I was
suggesting upthread. In particular, in cases where the WindowAgg node
is executed more than once, I think that the reported transition
counts should be the averages per-execution of the node. That way the
number of transition function calls reported for the node is directly
comparable with its rows value. Thus I think the output should be
divided by nloops, which would be more consistent with the way other
similar values are reported in explain output (cf.
show_instrumentation_count).

I started looking at the arith patch, and I got as far as looking at
the changes to count(*) and count(val), which seem reasonable. But
then I started testing performance, and I found cases where the
improvement is not nearly what I expected.

The example cited at the start of this thread is indeed orders of
magnitude faster than HEAD:

SELECT SUM(n::int) OVER (ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING)
FROM generate_series(1,2) g(n);

This executes in 20ms on my box, vs 30sec on HEAD, which reflects the
change from an O(n^2) to an O(n) algorithm.

However, if both ends of the frame move, the benefits are far less impressive:

SELECT SUM(n::int) OVER (ROWS BETWEEN CURRENT ROW AND 10 FOLLOWING)
FROM generate_series(1,2) g(n);
-- 33ms

SELECT SUM(n::int) OVER (ROWS BETWEEN CURRENT ROW AND 100 FOLLOWING)
FROM generate_series(1,2) g(n);
-- 173ms

SELECT SUM(n::int) OVER (ROWS BETWEEN CURRENT ROW AND 1000 FOLLOWING)
FROM generate_series(1,2) g(n);
-- 1467ms

SELECT SUM(n::int) OVER (ROWS BETWEEN CURRENT ROW AND 1 FOLLOWING)
FROM generate_series(1,2) g(n);
-- 7709ms

This is still exhibiting the behaviour of an O(n^2) algorithm.

The problem lies in window_gettupleslot() which steps one row at a
time from the last position to the new required position. So if both
ends of the frame are moving, it needs to step forwards and backwards
through the entire window, for each row processed - hence the O(n^2)
behaviour.

Looking at window_gettupleslot, there is an obvious potential
micro-optimisation that can be made if there are multiple rows to be
skipped --- instead of calling tuplestore_gettupleslot() multiple
times, call tuplestore_advance() multiple times, then call
tuplestore_gettupleslot() once to fetch the final required tuple, thus
avoiding a lot of unnecessary tuple copying. That of course won't
eliminate the O(n^2) behaviour, but it will reduce the overall factor,
and so is probably worth doing.

However, to fix the fundamental O(n^2) problem, I think there needs to
be separate tuplestore read pointers for the head and tail of the
frame. There may also be a case for another read pointer for the
current row too, and possibly one for general purpose user-triggered
fetches. One approach might be to have up to a small number N (say 3
or 4) of read pointers, with window_gettupleslot() automatically
choosing the one nearest to the requested row. Possibly there are
better approaches. I think a little more investigation is needed.

I'm not sure how much additional work is required to sort this out,
but to me it looks more realistic to target 9.5 than 9.4, so at this
point I tend to think that the patch ought to be marked as returned
with feedback.

Thoughts?

Regards,
Dean


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WAL format and API changes (9.5)

2014-04-04 Thread Heikki Linnakangas

On 04/04/2014 02:40 AM, Andres Freund wrote:

On 2014-04-03 19:33:12 -0400, Tom Lane wrote:

Andres Freund and...@2ndquadrant.com writes:

On 2014-04-03 19:08:27 -0400, Tom Lane wrote:

A somewhat more relevant concern is where are we going to keep the state
data involved in all this.  Since this code is, by definition, going to be
called in critical sections, any solution involving internal pallocs is
right out.



We actually already allocate memory in XLogInsert() :(, although only in
the first XLogInsert() a backend does...


Ouch.  I wonder if we should put an Assert(not-in-critical-section)
into MemoryContextAlloc.


Good idea!


XLogInsert() is using malloc() directly, so that wouldn't detect this
case...

It's not a bad idea tho. I wonder how far the regression tests
get...

Not even through initdb. GetVirtualXIDsDelayingChkpt() is to blame.


Hmm, the checkpointer process seems to ignore the rule, and just hopes 
for the best. The allocation in GetVirtualXIDsDelayingChkpt() was 
probably just an oversight, and that call could be moved outside the 
critical section, but we also have this in AbsorbFsyncRequests():



/*
 * We have to PANIC if we fail to absorb all the pending requests (eg,
 * because our hashtable runs out of memory).  This is because the system
 * cannot run safely if we are unable to fsync what we have been told to
 * fsync.  Fortunately, the hashtable is so small that the problem is
 * quite unlikely to arise in practice.
 */
START_CRIT_SECTION();


Perhaps we could fix that by just calling fsync() directly if the 
allocation fails, like we do if ForwardFsyncRequest() fails.



But if we give the checkpointer process a free pass, running the 
regression tests with an assertion in AllocSetAlloc catches five genuine 
bugs:


1. _bt_newroot
2. XLogFileInit
3. spgPageIndexMultiDelete
4. PageRepairFragmentation
5. PageIndexMultiDelete

I'll commit the attached patch to fix those shortly.

- Heikki
diff --git a/src/backend/access/nbtree/nbtinsert.c b/src/backend/access/nbtree/nbtinsert.c
index d2ca8d9..922412e 100644
--- a/src/backend/access/nbtree/nbtinsert.c
+++ b/src/backend/access/nbtree/nbtinsert.c
@@ -1995,8 +1995,10 @@ _bt_newroot(Relation rel, Buffer lbuf, Buffer rbuf)
 	BTPageOpaque lopaque;
 	ItemId		itemid;
 	IndexTuple	item;
-	Size		itemsz;
-	IndexTuple	new_item;
+	IndexTuple	left_item;
+	Size		left_item_sz;
+	IndexTuple	right_item;
+	Size		right_item_sz;
 	Buffer		metabuf;
 	Page		metapg;
 	BTMetaPageData *metad;
@@ -2016,6 +2018,26 @@ _bt_newroot(Relation rel, Buffer lbuf, Buffer rbuf)
 	metapg = BufferGetPage(metabuf);
 	metad = BTPageGetMeta(metapg);
 
+	/*
+	 * Create downlink item for left page (old root).  Since this will be the
+	 * first item in a non-leaf page, it implicitly has minus-infinity key
+	 * value, so we need not store any actual key in it.
+	 */
+	left_item_sz = sizeof(IndexTupleData);
+	left_item = (IndexTuple) palloc(left_item_sz);
+	left_item->t_info = left_item_sz;
+	ItemPointerSet(&(left_item->t_tid), lbkno, P_HIKEY);
+
+	/*
+	 * Create downlink item for right page.  The key for it is obtained from
+	 * the high key position in the left page.
+	 */
+	itemid = PageGetItemId(lpage, P_HIKEY);
+	right_item_sz = ItemIdGetLength(itemid);
+	item = (IndexTuple) PageGetItem(lpage, itemid);
+	right_item = CopyIndexTuple(item);
+	ItemPointerSet(&(right_item->t_tid), rbkno, P_HIKEY);
+
 	/* NO EREPORT(ERROR) from here till newroot op is logged */
 	START_CRIT_SECTION();
 
@@ -2034,16 +2056,6 @@ _bt_newroot(Relation rel, Buffer lbuf, Buffer rbuf)
 	metad->btm_fastlevel = rootopaque->btpo.level;
 
 	/*
-	 * Create downlink item for left page (old root).  Since this will be the
-	 * first item in a non-leaf page, it implicitly has minus-infinity key
-	 * value, so we need not store any actual key in it.
-	 */
-	itemsz = sizeof(IndexTupleData);
-	new_item = (IndexTuple) palloc(itemsz);
-	new_item->t_info = itemsz;
-	ItemPointerSet(&(new_item->t_tid), lbkno, P_HIKEY);
-
-	/*
 	 * Insert the left page pointer into the new root page.  The root page is
 	 * the rightmost page on its level so there is no high key in it; the
 	 * two items will go into positions P_HIKEY and P_FIRSTKEY.
@@ -2051,32 +2063,20 @@ _bt_newroot(Relation rel, Buffer lbuf, Buffer rbuf)
 	 * Note: we *must* insert the two items in item-number order, for the
 	 * benefit of _bt_restore_page().
 	 */
-	if (PageAddItem(rootpage, (Item) new_item, itemsz, P_HIKEY,
+	if (PageAddItem(rootpage, (Item) left_item, left_item_sz, P_HIKEY,
 	false, false) == InvalidOffsetNumber)
 		elog(PANIC, "failed to add leftkey to new root page"
 			 " while splitting block %u of index \"%s\"",
 			 BufferGetBlockNumber(lbuf), RelationGetRelationName(rel));
-	pfree(new_item);
-
-	/*
-	 * Create downlink item for right page.  The key for it is obtained from
-	 * the high key position in the left page.
-	 */
-	itemid = PageGetItemId(lpage, P_HIKEY);
-	itemsz = 

Re: [HACKERS] WAL format and API changes (9.5)

2014-04-04 Thread Andres Freund
Hi,

On 2014-04-04 10:48:32 +0300, Heikki Linnakangas wrote:
 But if we give the checkpointer process a free pass, running the regression
 tests with an assertion in AllocSetAlloc catches five genuine bugs:
 
 1. _bt_newroot
 2. XLogFileInit
 3. spgPageIndexMultiDelete
 4. PageRepairFragmentation
 5. PageIndexMultiDelete

Some of those, like PageRepairFragmentation, are somewhat bad...

 @@ -484,10 +483,11 @@ PageRepairFragmentation(Page page)
   ((PageHeader) page)->pd_upper = pd_special;
   }
   else
 - {   /* nstorage != 0 */
 + {
   /* Need to compact the page the hard way */
 - itemidbase = (itemIdSort) palloc(sizeof(itemIdSortData) * nstorage);
 - itemidptr = itemidbase;
 + itemIdSortData itemidbase[MaxHeapTuplesPerPage];
 + itemIdSort  itemidptr = itemidbase;
 +

That's a fair bit of stack, and it can be called somewhat deep on the
stack via heap_page_prune_opt(). I wonder if we ought to add a
check_stack_depth() somewhere.

Thanks,

Andres Freund




Re: [HACKERS] WAL format and API changes (9.5)

2014-04-04 Thread Heikki Linnakangas

On 04/04/2014 11:41 AM, Andres Freund wrote:

On 2014-04-04 10:48:32 +0300, Heikki Linnakangas wrote:

@@ -484,10 +483,11 @@ PageRepairFragmentation(Page page)
((PageHeader) page)->pd_upper = pd_special;
}
else
-   {   /* nstorage != 0 */
+   {
	/* Need to compact the page the hard way */
-   itemidbase = (itemIdSort) palloc(sizeof(itemIdSortData) * nstorage);
-   itemidptr = itemidbase;
+   itemIdSortData itemidbase[MaxHeapTuplesPerPage];
+   itemIdSort  itemidptr = itemidbase;
+


That's a fair bit of stack, and it can be called somewhat deep on the
stack via heap_page_prune_opt(). I wonder if we ought to add a
check_stack_depth() somewhere.


Hmm, on my 64-bit laptop, that array is 24*291=6984 bytes with 8k block 
size. That's fairly large, but not unheard of; there are a few places 
where we allocate a BLCKSZ-sized buffer from stack.


We could easily reduce the stack consumption here by changing itemIdSort 
to use 16-bit ints; all the lengths and offsets that 
PageRepairFragmentation deals with fit in 16 bits.


But overall I wouldn't worry about it. check_stack_depth() leaves a fair 
amount of headroom: STACK_DEPTH_SLOP is 512kB. As long as we don't 
recurse, that's plenty.


- Heikki




Re: [HACKERS] WAL format and API changes (9.5)

2014-04-04 Thread Andres Freund
On 2014-04-04 12:50:25 +0300, Heikki Linnakangas wrote:
 On 04/04/2014 11:41 AM, Andres Freund wrote:
 On 2014-04-04 10:48:32 +0300, Heikki Linnakangas wrote:
 @@ -484,10 +483,11 @@ PageRepairFragmentation(Page page)
 ((PageHeader) page)->pd_upper = pd_special;
 }
 else
 -   {   /* nstorage != 0 */
 +   {
 	/* Need to compact the page the hard way */
 -   itemidbase = (itemIdSort) palloc(sizeof(itemIdSortData) * nstorage);
 -   itemidptr = itemidbase;
 +   itemIdSortData itemidbase[MaxHeapTuplesPerPage];
 +   itemIdSort  itemidptr = itemidbase;
 +
 
 That's a fair bit of stack, and it can be called somewhat deep on the
 stack via heap_page_prune_opt(). I wonder if we ought to add a
 check_stack_depth() somewhere.
 
 Hmm, on my 64-bit laptop, that array is 24*291=6984 bytes with 8k block
 size. That's fairly large, but not unheard of; there are a few places where
 we allocate a BLCKSZ-sized buffer from stack.

Yea, I am not complaining about using so much stack. Seems sensible here.

 But overall I wouldn't worry about it. check_stack_depth() leaves a fair
 amount of headroom: STACK_DEPTH_SLOP is 512kB. As long as we don't recurse,
 that's plenty.

Well, there are no checks nearby, AFAIR. That's why I was
wondering... But I don't have a big problem with not checking, I just
wanted to bring it up.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




[HACKERS] Observed an issue in CREATE TABLE syntax

2014-04-04 Thread Rajeev rastogi
I observed an issue where, even if invalid syntax is provided for CREATE TABLE,
the table is created successfully.

The table creations below succeed even though the same constraint
is given multiple times.
None of these constraints has any meaning when given
multiple times.

postgres=# create table new (id int NULL NULL);
CREATE TABLE

postgres=# create table new1(id serial NOT NULL NOT NULL);
CREATE TABLE

postgres=# create table new2 (id int unique unique);
CREATE TABLE

Should we not throw an error for the above syntaxes?

Please find the attached patch with the fix.

Thanks and Regards,
Kumar Rajeev Rastogi



multiconstissuev1.patch
Description: multiconstissuev1.patch



Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)

2014-04-04 Thread Florian Pflug
 ), which seem reasonable. But
 then I started testing performance, and I found cases where the
 improvement is not nearly what I expected.
 
 The example cited at the start of this thread is indeed orders of
 magnitude faster than HEAD:
 
 SELECT SUM(n::int) OVER (ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING)
 F
 
 
 
 
 
 
 I'm not sure how much additional work is required to sort this out,
 but to me it looks more realistic to target 9.5 than 9.4, so at this
 point I tend to think that the patch ought to be marked as returned
 with feedback.
 
 Thoughts?
 
 Regards,
 Dean




Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)

2014-04-04 Thread Florian Pflug

 On 04.04.2014, at 09:40, Dean Rasheed dean.a.rash...@gmail.com wrote:
 
 I'm not sure how much additional work is required to sort this out,
 but to me it looks more realistic to target 9.5 than 9.4, so at this
 point I tend to think that the patch ought to be marked as returned
 with feedback.

I think the patch is worthwhile, even without this additional optimization. In 
fact, if the optimization were part of the patch, there would probably be calls 
to factor it out, on the grounds that the patch is already rather large.

I don't see what bumping the whole thing to 9.5 buys, compared to applying what 
we have now, and optimizing further in 9.5.

best regards,
Florian Pflug

PS: Sorry for the broken mail I sent earlier - mis-touched on my phone ;-(



Re: [HACKERS] Securing make check (CVE-2014-0067)

2014-04-04 Thread YAMAMOTO Takashi
 On Fri, Apr 04, 2014 at 02:36:05AM +, YAMAMOTO Takashi wrote:
  Thanks.  To avoid socket path length limitations, I lean toward placing the
  socket temporary directory under /tmp rather than placing under the CWD:
  
  http://www.postgresql.org/message-id/flat/20121129223632.ga15...@tornado.leadboat.com
 
 openvswitch has some tricks to overcome the socket path length
 limitation using symlink.  (or procfs where available)
 iirc these were introduced for debian builds which use deep CWD.
 
 That's another reasonable approach.  Does it have a notable advantage over
 placing the socket in a subdirectory of /tmp?  Offhand, the security and
 compatibility consequences look similar.

An advantage is that the socket can be placed under the CWD
and thus automatically obeys that directory's permissions, etc.

YAMAMOTO Takashi

 
 -- 
 Noah Misch
 EnterpriseDB http://www.enterprisedb.com
 
 




Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)

2014-04-04 Thread Andres Freund
On 2014-04-04 12:56:55 +0200, Florian Pflug wrote:

  On 04.04.2014, at 09:40, Dean Rasheed dean.a.rash...@gmail.com wrote:
 
  I'm not sure how much additional work is required to sort this out,
  but to me it looks more realistic to target 9.5 than 9.4, so at this
  point I tend to think that the patch ought to be marked as returned
  with feedback.

 I think the patch is worthwhile, even without this additional
 optimization. In fact, If the optimization was part of the patch,
 there would probably be calls to factor it out, on the ground that the
 patch is already rather large.

 I don't see what bumping the whole thing to 9.5 buys, compared to
 applying what we have now, and optimizing further in 9.5.

From my POV applying this patch can't be considered a very high priority
for 9.4x. It came *really* late to the game for a relatively complex
patch. A significant portion of the development only happened *after*
the start of the last commitfest.

Greetings,

Andres Freund

--
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Allocations in critical section (was Re: [HACKERS] WAL format and API changes (9.5))

2014-04-04 Thread Heikki Linnakangas
Ok, I fixed the issues that the assertion caught. I also committed a 
patch to add the assertion itself; let's see if the buildfarm finds more 
cases that violate the rule.


It ignores the checkpointer, because it's known to violate the rule, and 
allocations in ErrorContext, which is used during error recovery, e.g. if 
you indeed PANIC while in a critical section for some other reason.


I didn't backpatch this. Although you shouldn't be running with 
assertions enabled in production, it nevertheless seems too risky. There 
might be some obscure cases where there is no real risk, e.g. because the 
current memory context always has enough free space because of a 
previous pfree, and it doesn't seem worth tracking down and fixing such 
issues in backbranches. You have to be pretty unlucky to run out of 
memory in a critical section to begin with.


- Heikki




[HACKERS] HOT Update || want to use a different page for updated tuple

2014-04-04 Thread Rohit Goyal
Hi All,

I was comparing PostgreSQL performance and was just curious about the
performance in case I want to store the updated index tuple version on a
different page.
I was looking into the code of heapam.c, but was not able to find the
code I should change so that PostgreSQL uses a different buffer for the
updated index tuple version.

Please guide me on how I can achieve this.

Regards,
Rohit Goyal


Re: [HACKERS] HOT Update || want to use a different page for updated tuple

2014-04-04 Thread Heikki Linnakangas

On 04/04/2014 02:53 PM, Rohit Goyal wrote:

Hi All,

I was comparing postgresql performance and was just curious about
performance in case i want to store the updated index tuple version on a
different page.
I was looking into the code of heapam.c, but was not able to find loop
which i should remove so that postgresql use a different buffer for updated
index tuple version.


Did you mean to:

1. Force the old and new tuple to always be stored on different pages?

Hack heap_update so that it chooses a new page. I think you'll also need 
to hack RelationGetBufferForTuple to not return the same buffer.


2. Disable the HOT optimization, so that HOT is not used even if the old 
and new tuple are stored on the same page?


In heap_update, force satisfies_hot variable to false.

3. Allow HOT to be used even though the old and new tuple are stored on 
different pages?


This is not feasible.

- Heikki




Re: [HACKERS] HOT Update || want to use a different page for updated tuple

2014-04-04 Thread Rohit Goyal
On Fri, Apr 4, 2014 at 2:03 PM, Heikki Linnakangas
hlinnakan...@vmware.comwrote:

 On 04/04/2014 02:53 PM, Rohit Goyal wrote:

 Hi All,

 I was comparing postgresql performance and was just curious about
 performance in case i want to store the updated index tuple version on a
 different page.
 I was looking into the code of heapam.c, but was not able to find loop
 which i should remove so that postgresql use a different buffer for
 updated
 index tuple version.


 Did you mean to:

 1. Force the old and new tuple to always be stored on different pages?

 Hack heap_update so that it chooses a new page. I think you'll also need
 to hack RelationGetBufferForTuple to not return the same buffer.

 Can you explain a bit more about where in the code I have to make the
changes? :) My focus is just to store the index tuple of an updated tuple
on a different page or in a different buffer.


 2. Disable the HOT optimization, so that HOT is not used even if the old
 and new tuple are stored on the same page?

 In heap_update, force satisfies_hot variable to false.

If I change only this variable to false, does that mean I have
disabled HOT updates?


 3. Allow HOT to be used even though the old and new tuple are stored on
 different pages?

 This is not feasible..

 - Heikki


Thanks in advance! :)

Regards,
Rohit Goyal


-- 
Regards,
Rohit Goyal


[HACKERS] Proposal: COUNT(*) (and related) speedup

2014-04-04 Thread Joshua Yanovski
Hey all,

First off, please feel free to let me know if this idea is a waste of time :)

I was thinking about COUNT(*) and its slow performance today, and I
thought about whether we could get better performance by taking a page
out of index-only-scans.

Essentially, the idea is that you would store a counter (let's say, as
a special index type) that would initially (on index creation) be set
to the total count of
all rows on fully visible pages (visibility map bit set to 1).
Whenever a query was made on the table for COUNT(*), we would not even
bother to visit any pages
that were in the visibility map, but instead visit the heap for every
page that was not marked visible and compute the delta between the
count of tuples visible
at relfrozenxid and tuples currently visible.  The deltas on all the
pages would then be summed with the original counter and returned as
the count of the table.
The counter would be updated only by VACUUM, which would perform the
same computation performed by the COUNT operation but add it
permanently to the counter just before
it sets the visibility map bit to 1 (I am not certain whether this would
require unusual changes to WAL replay or not).  There may be some
issues with this idea, since
I am not too familiar with the Postgres codebase, but Andrew Gierth
was kind enough to put up with my bugging him in the IRC channel all
night and helped me work through
some of this stuff so I think it should at least be coherent.

The upshot of the above proposal is that in situations where an index
only scan would have helped before, now you might be able to skip
almost all the I/O for COUNT(*),
while still getting a completely accurate result.

The disadvantages are pretty clear, I think, but let me list them
anyway (that I can think of):
* Adding an extra index type that can do only one thing
* Adding an index type that can only do that one thing efficiently if
you happen to want to scan over the entire table
* Adding an index type for a use case that isn't very compelling
(SELECT COUNT(*) FROM table)
* There may be additional concerns about this being costed correctly,
although I think there is probably enough information to make an
informed decision here.
* It can't be added as a contrib module (probably) and also be crash-safe.

However, if the patch for negative transition aggregate functions
lands, I think this idea becomes a bit more compelling.  Then instead
of just being a technique for
COUNT(*), it effectively becomes a very low-memory way of speeding up
a large number of aggregate queries (any that have an inverse) while
maintaining accuracy, in
situations where most of the pages never get modified.  The only
change to the above algorithm is that delta is computed as application
of the inverse function to old
tuples and the direct function to new ones on the page (I think this
can likely be done during vacuum index cleanup).

Please critique this idea and let me know whether it is worth pursuing further.

Thanks!

-- Joshua Yanovski




Re: [HACKERS] Including replication slot data in base backups

2014-04-04 Thread Michael Paquier
On Wed, Apr 2, 2014 at 10:27 PM, Andres Freund and...@2ndquadrant.com wrote:
 The new master won't necessarily have all the necessary WAL available, no?
No, they won't have it, and things begin to get bad once a base backup
includes a slot that has a non-null value of restart_lsn. I imagine
that if we want to guarantee the correctness of a replication slot we
would need to fetch from archives the necessary WAL files needed for
it when a node is in recovery, which is not something that this patch
does...
-- 
Michael




Re: [HACKERS] Including replication slot data in base backups

2014-04-04 Thread Andres Freund
On 2014-04-04 21:42:33 +0900, Michael Paquier wrote:
 On Wed, Apr 2, 2014 at 10:27 PM, Andres Freund and...@2ndquadrant.com wrote:
  The new master won't necessarily have all the necessary WAL available, no?

 No, they won't have it, and things begin to get bad once a base backup
 includes a slot that has a non-null value of restart_lsn.

In other words, every slot that has been used.

 I imagine
 that if we want to guarantee the correctness of a replication slot we
 would need to fetch from archives the necessary WAL files needed for
 it when a node is in recovery, which is not something that this patch
 does...

Does that mean you retract the patch?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Including replication slot data in base backups

2014-04-04 Thread Michael Paquier
On Fri, Apr 4, 2014 at 9:44 PM, Andres Freund and...@2ndquadrant.com wrote:
 I imagine
 that if we want to guarantee the correctness of a replication slot we
 would need to fetch from archives the necessary WAL files needed for
 it when a node is in recovery, which is not something that this patch
 does...

 Does that mean you retract the patch?
For 9.4, clearly yes; this would change the semantics of recovery, and
that is not something wise to do at the end of a development cycle.
For 9.5 though, this is a different story. It clearly depends on whether
this is thought useful enough to change how recovery fetches WAL
files (in this case by scanning existing replication slots). There are other
things to consider as well, for example: do we reset the
restart_lsn of a slot if the needed WAL files are not there anymore, or
abort recovery? I haven't worked much with replication slots though...
-- 
Michael




Re: [HACKERS] Proposal: COUNT(*) (and related) speedup

2014-04-04 Thread Joshua Yanovski
From feedback on IRC, two immediately obvious technical problems:

* Heap pruning can happen at any time, not just during VACUUM and HOT
updates.  This is obviously a pretty significant issue and I doubt the
easy solution (don't do heap pruning for tables with an index like
this) would be acceptable.
* While relfrozenxid works for this in 9.4, it's not really related to
this purpose and its implementation could change at any time, so it is
not reasonable to rely on it.  A solution that would still work, but
not be as fast, would be to just not have a minimum for calculating
deltas.

On Fri, Apr 4, 2014 at 5:38 AM, Joshua Yanovski pythones...@gmail.com wrote:
 Hey all,

 First off, please feel free to let me know if this idea is a waste of time :)

 I was thinking about COUNT(*) and its slow performance today, and I
 thought about whether we could get better performance by taking a page
 out of index-only-scans.

 Essentially, the idea is that you would store a counter (let's say, as
 a special index type) that would initially (on index creation) be set
 to the total count of
 all rows on fully visible pages (visibility map bit set to 1).
 Whenever a query was made on the table for COUNT(*), we would not even
 bother to visit any pages
 that were in the visibility map, but instead visit the heap for every
 page that was not marked visible and compute the delta between the
 count of tuples visible
 at relfrozenxid and tuples currently visible.  The deltas on all the
 pages would then be summed with the original counter and returned as
 the count of the table.
 The counter would be updated only by VACUUM, which would perform the
 same computation performed by the COUNT operation but add it
 permanently to counter just before
 it set the visible_map bit to 1 (I am not certain whether this would
 require unusual changes to WAL replay or not).  There may be some
 issues with this idea, since
 I am not too familiar with the Postgres codebase, but Andrew Gierth
 was kind enough to put up with my bugging him in the IRC channel all
 night and helped me work through
 some of this stuff so I think it should at least be coherent.

 The upshot of the above proposal is that in situations where an index
 only scan would have helped before, now you might be able to skip
 almost all the I/O for COUNT(*),
 while still getting a completely accurate result.

 The disadvantages are pretty clear, I think, but let me list them
 anyway (that I can think of):
 * Adding an extra index type that can do only one thing
 * Adding an index type that can only do that one thing efficiently if
 you happen to want to scan over the entire table
 * Adding an index type for a use case that isn't very compelling
 (SELECT COUNT(*) FROM table)
 * There may be additional concerns about this being costed correctly,
 although I think there is probably enough information to make an
 informed decision here.
 * It can't be added as a contrib module (probably) and also be crash-safe.

 However, if the patch for negative transition aggregate functions
 lands, I think this idea becomes a bit more compelling.  Then instead
 of just being a technique for
 COUNT(*), it effectively becomes a very low-memory way of speeding up
 a large number of aggregate queries (any that have an inverse) while
 maintaining accuracy, in
 situations where most of the pages never get modified.  The only
 change to the above algorithm is that delta is computed as application
 of the inverse function to old
 tuples and the direct function to new ones on the page (I think this
 can likely be done during vacuum index cleanup).
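
As a toy illustration of that delta computation for an invertible aggregate such as SUM (plain Python with invented names — not the actual transition-function machinery from the patch):

```python
# Hypothetical per-page delta for an invertible aggregate (SUM): apply
# the inverse transition to the tuples as of the last VACUUM, and the
# forward transition to the tuples visible now.

def page_delta(old_values, new_values):
    state = 0
    for v in old_values:
        state -= v          # inverse transition function
    for v in new_values:
        state += v          # forward transition function
    return state

# Aggregate folded in for all-visible pages, plus a delta for the one
# page that changed since the last VACUUM (30 replaced by 25 and 5):
vacuumed_sum = 10 + 20 + 30
delta = page_delta(old_values=[30], new_values=[25, 5])
print(vacuumed_sum + delta)  # 60: matches SUM over the current tuples
```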

 Please critique this idea and let me know whether it is worth pursuing 
 further.

 Thanks!

 -- Joshua Yanovski



-- 
Josh


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] ipc_test

2014-04-04 Thread Robert Haas
Does anybody care about being able to compile ipc_test as a standalone
binary any more?

I ask because, while working on some of the outstanding cleanup issues
around dynamic shared memory, I made sure to test whether it required
further adjustments based on the changes that I'd done, only to
discover that it was already broken by the huge pages patch:

[rhaas pgsql]$ make -C src/backend/port ipc_test
gcc -O2 -Wall -Wmissing-prototypes -Wpointer-arith
-Wdeclaration-after-statement -Wendif-labels
-Wmissing-format-attribute -Wformat-security -fno-strict-aliasing
-fwrapv -g -Wall -Werror -L../../../src/port -L../../../src/common
-L/opt/local/lib -Wl,-dead_strip_dylibs  -Wall -Werror   ipc_test.o
pg_sema.o pg_shmem.o -lpgcommon -lpgport -lintl -lxml2 -lssl -lcrypto
-lz -lreadline -lm  -o ipc_test
Undefined symbols for architecture x86_64:
  _huge_pages, referenced from:
  _PGSharedMemoryCreate in pg_shmem.o
ld: symbol(s) not found for architecture x86_64
collect2: ld returned 1 exit status
make: *** [ipc_test] Error 1

My judgement is that ipc_test has insufficient utility to justify
keeping it around and updating it every time sysv_shmem.c references a
new symbol or experiences a relevant calling signature change.  So I'd
favor just ripping it out.  I doubt we're likely to have any
completely new semaphore or shared memory implementations that require
this kind of testing any time soon, and even if we do I think whoever
is doing it can quite easily put together a bespoke testing framework
that will likely serve their needs at least as well as ipc_test does.
It was probably quite reasonable to add this 12 years ago but I just
don't think we need it any more.

All that having been said, if somebody feels strongly that this is
still useful, I'd rather just fix it than argue about it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] ipc_test

2014-04-04 Thread Andres Freund
Hi,

On 2014-04-04 09:31:01 -0400, Robert Haas wrote:
 Does anybody care about being able to compile ipc_test as a standalone
 binary any more?

I don't.

But if we want to keep it, it should be built during a normal build to
make sure it doesn't get broken.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Securing make check (CVE-2014-0067)

2014-04-04 Thread Tom Lane
y...@netbsd.org (YAMAMOTO Takashi) writes:
 On Fri, Apr 04, 2014 at 02:36:05AM +, YAMAMOTO Takashi wrote:
 openvswitch has some tricks to overcome the socket path length
 limitation using symlink.  (or procfs where available)
 iirc these were introduced for debian builds which use deep CWD.

 That's another reasonable approach.  Does it have a notable advantage over
 placing the socket in a subdirectory of /tmp?  Offhand, the security and
 compatibility consequences look similar.

 an advantage is that the socket can be placed under CWD
 and thus automatically obeys its directory permissions etc.

I'm confused.  The proposed alternative is to make a symlink in /tmp
or someplace like that, pointing to a socket that might be deeply buried?
How is that any better from a security standpoint from putting the socket
right in /tmp?  If /tmp is not sticky then an attacker can replace the
symlink, no?

regards, tom lane




Re: Allocations in critical section (was Re: [HACKERS] WAL format and API changes (9.5))

2014-04-04 Thread Tom Lane
Heikki Linnakangas hlinnakan...@vmware.com writes:
 Ok, I fixed the issues that the assertion fixed. I also committed a 
 patch to add the assertion itself; let's see if the buildfarm finds more 
 cases that violate the rule.

 It ignores the checkpointer, because it's known to violate the rule,

... uh, isn't that a bug to be fixed?

 and 
 allocations in ErrorContext, which is used during error recovery, e.g if 
 you indeed PANIC while in a critical section for some other reason.

Yeah, I realized we'd have to do something about elog's own allocations.
Not sure if a blanket exemption for ErrorContext is the best way.  I'd
been thinking of having a way to turn off the complaint once processing
of an elog(PANIC) has started.

 I didn't backpatch this.

Agreed.

BTW, I'm pretty sure you added some redundant assertions in mcxt.c.
eg, palloc does not need its own copy.

regards, tom lane




Re: [HACKERS] GiST support for inet datatypes

2014-04-04 Thread Andres Freund
Hi,

On 2014-03-08 23:40:31 +0200, Emre Hasegeli wrote:
 Fourth version of the patch attached. It is rebased to the HEAD (8879fa0).
 Operator name formatting patch rebased on top of it. I will put
 the selectivity estimation patch to the next commit fest.
 
 This version of the patch does not touch to the btree_gist extension,
 does not set the operator class as the default. It adds support to
 the not equals operator to make the operator class compatible with
 the btree_gist extension.

This patch looks like it can be applied much more realistically, but it
looks too late for 9.4. I suggest moving it to the next CF?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Dynamic Shared Memory stuff

2014-04-04 Thread Robert Haas
On Wed, Jan 22, 2014 at 10:17 AM, Noah Misch n...@leadboat.com wrote:
 Yeah, abandoning the state file is looking attractive.

Here's a draft patch getting rid of the state file.  This should
address concerns raised by Heikki and Fujii Masao and echoed by Tom
that dynamic shared memory behaves differently than the main shared
memory segment.  The control segment ID is stored in the System V
shared memory block, so we're guaranteed that when using either System
V or POSIX shared memory we'll always latch onto the control segment
that matches up with the main shared memory segment we latched on to.
Cleanup for the file-mmap and Windows methods doesn't need to change,
because the former always just means clearing out $PGDATA/pg_dynshmem,
and the latter is done automatically by the operating system.

Comments?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
commit 6095f077a0d9a45e422087be1c7e8791c247a4e9
Author: Robert Haas rh...@postgresql.org
Date:   Fri Apr 4 09:06:23 2014 -0400

Get rid of the dynamic shared memory state file.

Instead, store the control segment ID in the header for the main
shared memory segment, and recover it from there when we restart after
a crash.  This ensures that we never latch onto the main shared memory
segment from one previous run and the dynamic shared memory control
segment from a different one.  It also avoids problems if you take a
base backup and start it up as hot standby on the same machine while
the previous postmaster is still running.

diff --git a/src/backend/port/ipc_test.c b/src/backend/port/ipc_test.c
index 6259919..936c1a4 100644
--- a/src/backend/port/ipc_test.c
+++ b/src/backend/port/ipc_test.c
@@ -30,6 +30,7 @@
 #include <unistd.h>
 
 #include "miscadmin.h"
+#include "storage/dsm.h"
 #include "storage/ipc.h"
 #include "storage/pg_sema.h"
 #include "storage/pg_shmem.h"
@@ -214,12 +215,13 @@ int
 main(int argc, char **argv)
 {
 	MyStorage  *storage;
+	PGShmemHeader *shim;
 	int			cpid;
 
 	printf(Creating shared memory ... );
 	fflush(stdout);
 
-	storage = (MyStorage *) PGSharedMemoryCreate(8192, false, 5433);
+	storage = (MyStorage *) PGSharedMemoryCreate(8192, false, 5433, &shim);
 
 	storage->flag = 1234;
 
diff --git a/src/backend/port/sysv_shmem.c b/src/backend/port/sysv_shmem.c
index 51c1a2b..5e3850b 100644
--- a/src/backend/port/sysv_shmem.c
+++ b/src/backend/port/sysv_shmem.c
@@ -30,6 +30,7 @@
 
 #include "miscadmin.h"
 #include "portability/mem.h"
+#include "storage/dsm.h"
 #include "storage/ipc.h"
 #include "storage/pg_shmem.h"
 #include "utils/guc.h"
@@ -421,7 +422,8 @@ CreateAnonymousSegment(Size *size)
  * zero will be passed.
  */
 PGShmemHeader *
-PGSharedMemoryCreate(Size size, bool makePrivate, int port)
+PGSharedMemoryCreate(Size size, bool makePrivate, int port,
+	 PGShmemHeader **shim)
 {
 	IpcMemoryKey NextShmemSegID;
 	void	   *memAddress;
@@ -509,10 +511,13 @@ PGSharedMemoryCreate(Size size, bool makePrivate, int port)
 
 		/*
 		 * The segment appears to be from a dead Postgres process, or from a
-		 * previous cycle of life in this same process.  Zap it, if possible.
+		 * previous cycle of life in this same process.  Zap it, if possible,
+		 * and any associated dynamic shared memory segments, as well.
 		 * This probably shouldn't fail, but if it does, assume the segment
 		 * belongs to someone else after all, and continue quietly.
 		 */
+		if (hdr->dsm_control != 0)
+			dsm_cleanup_using_control_segment(hdr->dsm_control);
 		shmdt(memAddress);
 		if (shmctl(shmid, IPC_RMID, NULL) < 0)
 			continue;
@@ -539,6 +544,7 @@ PGSharedMemoryCreate(Size size, bool makePrivate, int port)
 	hdr = (PGShmemHeader *) memAddress;
 	hdr->creatorPID = getpid();
 	hdr->magic = PGShmemMagic;
+	hdr->dsm_control = 0;
 
 	/* Fill in the data directory ID info, too */
 	if (stat(DataDir, &statbuf) < 0)
@@ -554,6 +560,7 @@ PGSharedMemoryCreate(Size size, bool makePrivate, int port)
 	 */
 	hdr->totalsize = size;
 	hdr->freeoffset = MAXALIGN(sizeof(PGShmemHeader));
+	*shim = hdr;
 
 	/* Save info for possible future use */
 	UsedShmemSegAddr = memAddress;
@@ -608,6 +615,7 @@ PGSharedMemoryReAttach(void)
 	if (hdr != origUsedShmemSegAddr)
 		elog(FATAL, reattaching to shared memory returned unexpected address (got %p, expected %p),
 			 hdr, origUsedShmemSegAddr);
+	dsm_set_control_handle(((PGShmemHeader *) hdr)->dsm_control);
 
 	UsedShmemSegAddr = hdr;		/* probably redundant */
 }
diff --git a/src/backend/port/win32_shmem.c b/src/backend/port/win32_shmem.c
index dca371c..3a0ded4 100644
--- a/src/backend/port/win32_shmem.c
+++ b/src/backend/port/win32_shmem.c
@@ -117,7 +117,8 @@ PGSharedMemoryIsInUse(unsigned long id1, unsigned long id2)
  *
  */
 PGShmemHeader *
-PGSharedMemoryCreate(Size size, bool makePrivate, int port)
+PGSharedMemoryCreate(Size size, bool makePrivate, int port,
+	 PGShmemHeader **shim)
 {
 	void	   *memAddress;
 	PGShmemHeader *hdr;
@@ -245,12 +246,14 @@ PGSharedMemoryCreate(Size 

Re: [HACKERS] Create function prototype as part of PG_FUNCTION_INFO_V1

2014-04-04 Thread Andres Freund
On 2014-02-17 10:30:16 -0300, Alvaro Herrera wrote:
 Tom Lane wrote:
  Peter Eisentraut pete...@gmx.net writes:
   On 2/15/14, 10:22 AM, Tom Lane wrote:
   Yes it does; people who fail to remove their manual externs will get
   Windows-only build failures (or at least warnings; it's not very clear
   which declaration will win).
  
   The manual externs and the automatically provided ones are exactly the
   same.  Why would that fail?
  
  Maybe I'm remembering the wrong patch.  I thought what this patch was
  intending was to put PGDLLEXPORT into the automatically-provided externs.
 
 This hunk is the essence of this patch:
 
  #define PG_FUNCTION_INFO_V1(funcname) \
 +Datum funcname(PG_FUNCTION_ARGS); \
  extern PGDLLEXPORT const Pg_finfo_record * CppConcat(pg_finfo_,funcname)(void); \
 
 Note that PGDLLEXPORT is already there.  This patch is just about
 additionally providing the prototype.

The PGDLLEXPORT is attached to the variable, not the function, though. If
somebody previously tried to do the correct thing and attached
PGDLLEXPORT to their own *function* prototype, it would cause problems
now.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services




Re: [HACKERS] Optimize kernel readahead using buffer access strategy

2014-04-04 Thread Andres Freund
Hi,

On 2014-01-14 20:58:20 +0900, KONDO Mitsumasa wrote:
 I will test some join sqls performance and TPC-3 benchmark in this or next 
 week.

This patch has been marked as Waiting For Author for nearly two months
now. Marked as Returned with Feedback.

Greetings,

Andres Freund




Re: [HACKERS] ipc_test

2014-04-04 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 On 2014-04-04 09:31:01 -0400, Robert Haas wrote:
 Does anybody care about being able to compile ipc_test as a standalone
 binary any more?

 I don't.

I can't remember the last time I had use for it either.  +1 for removal.

 But if we want to keep it, it should be built during a normal build to
 make sure it doesn't get broken.

Yup.

regards, tom lane




Re: Allocations in critical section (was Re: [HACKERS] WAL format and API changes (9.5))

2014-04-04 Thread Heikki Linnakangas

On 04/04/2014 04:56 PM, Tom Lane wrote:

Heikki Linnakangas hlinnakan...@vmware.com writes:

Ok, I fixed the issues that the assertion fixed. I also committed a
patch to add the assertion itself; let's see if the buildfarm finds more
cases that violate the rule.



It ignores the checkpointer, because it's known to violate the rule,


... uh, isn't that a bug to be fixed?


Yes. One step at a time ;-).


and
allocations in ErrorContext, which is used during error recovery, e.g if
you indeed PANIC while in a critical section for some other reason.


Yeah, I realized we'd have to do something about elog's own allocations.
Not sure if a blanket exemption for ErrorContext is the best way.  I'd
been thinking of having a way to turn off the complaint once processing
of an elog(PANIC) has started.


Hmm. PANIC processing should avoid allocations too, except in 
ErrorContext, because otherwise you might get an out-of-memory during 
PANIC processing.


ErrorContext also covers elog(DEBUG2, ...). I presume we'll want to 
ignore that too. Although I also tried without the exemption for 
ErrorContext at first, and didn't get any failures from the regression 
tests, so I guess we never do that in a critical section. I was a bit 
surprised by that.




BTW, I'm pretty sure you added some redundant assertions in mcxt.c.
eg, palloc does not need its own copy.


palloc is copy-pasted from MemoryContextAlloc - it does need its own copy.

- Heikki




Re: [HACKERS] Proposal: COUNT(*) (and related) speedup

2014-04-04 Thread Tom Lane
Joshua Yanovski pythones...@gmail.com writes:
 Essentially, the idea is that you would store a counter (let's say, as
 a special index type) that would initially (on index creation) be set
 to the total count of
 all rows on fully visible pages (visibility map bit set to 1).

It seems to me this can't possibly work because of race conditions.
In particular, what happens when some query dirties a page and thereby
clears its fully-visible bit?  Presumably, any such query would have
to (1) recompute the number of all-visible rows on that page (already
an expensive thing) and then (2) go and subtract that from the counter
(meaning the counter becomes a serialization bottleneck for all updates
on the table, which is exactly the reason we don't just have a globally
maintained row counter already).  But worse, what happens if a count(*)
is in progress?  It might or might not have scanned this page already,
and there's no way to get the right answer in both cases.  Counter
updates done by VACUUM would have a similar race-condition problem.

 Please critique this idea and let me know whether it is worth pursuing 
 further.

I doubt it.

regards, tom lane




Re: [HACKERS] Observed an issue in CREATE TABLE syntax

2014-04-04 Thread Tom Lane
Rajeev rastogi rajeev.rast...@huawei.com writes:
 Should we not throw error for above syntaxes?

No.  There's nothing wrong with those statements, and complaining about
them will accomplish nothing except to break applications that used to
work.  Admittedly, code that generates such declarations would be a bit
sloppy, but people won't thank us for breaking it.

regards, tom lane




Re: [HACKERS] tests for client programs

2014-04-04 Thread Andres Freund
Hi,

I personally would very much like to get this patch commited. It doesn't
have much risk in destabilizing stuff, rather the contrary.

Peter, what's your opinion about the current state?

On 2014-02-27 21:44:48 -0500, Peter Eisentraut wrote:
 diff --git a/doc/src/sgml/regress.sgml b/doc/src/sgml/regress.sgml
 index 16b3621..aee049a 100644
 --- a/doc/src/sgml/regress.sgml
 +++ b/doc/src/sgml/regress.sgml
 @@ -204,6 +204,12 @@ <title>Additional Test Suites</title>
   located in <filename>src/test/isolation/</filename>.
  </para>
 </listitem>
 +   <listitem>
 +    <para>
 +     Tests of client programs under <filename>src/bin</filename>.  See
 +     also <xref linkend="regress-tap">.
 +    </para>
 +   </listitem>
   </itemizedlist>
  
   <para>
 
 @@ -660,6 +666,28 @@ <title>Variant Comparison Files</title>
 
   </sect1>
 
 +  <sect1 id="regress-tap">
 +   <title>TAP Tests</title>
 +
 +   <para>
 +    The client program tests under <filename>src/bin</filename> use the Perl
 +    TAP tools and are run by <command>prove</command>.  You can pass
 +    command-line options to <command>prove</command> by setting
 +    the <command>make</command> variable <varname>PROVE_FLAGS</varname>, for example:
 +<programlisting>
 +make -C src/bin check PROVE_FLAGS='--reverse'
 +</programlisting>
 +    The default is <literal>--verbose</literal>.  See the manual page
 +    of <command>prove</command> for more information.
 +   </para>
 +
 +   <para>
 +    The tests written in Perl require the Perl
 +    module <literal>IPC::Run</literal>, otherwise most tests will be skipped.
 +    This module is available from CPAN or an operating system package.
 +   </para>
 +  </sect1>
 +

There are actually also some binaries in /contrib, so maybe phrase this a
bit more generally?

 <sect1 id="regress-coverage">

  lcov.info: $(gcda_files)
   rm -f *.gcov
 - $(if $^,$(LCOV) -d . -c -o $@ $(LCOVFLAGS))
 + $(if $^,$(LCOV) -d . -c -o $@ $(LCOVFLAGS) --gcov-tool $(GCOV))

Looks unrelated, but whatever.

 +open HBA, ">>$tempdir/pgdata/pg_hba.conf";
 +print HBA "local replication all trust\n";
 +print HBA "host replication all 127.0.0.1/32 trust\n";
 +print HBA "host replication all ::1/128 trust\n";
 +close HBA;

Given the recent make check security discussions, this doesn't seem like
a good idea...

 +issues_sql_like(['createdb', 'foobar1'], qr/statement: CREATE DATABASE foobar1/, 'SQL CREATE DATABASE run');
 +issues_sql_like(['createdb', 'foobar2', '-l', 'C', '-E', 'LATIN1', '-T', 'template0'], qr/statement: CREATE DATABASE foobar2 ENCODING 'LATIN1'/, 'create database with encoding');

Hm. Are all platforms guaranteed to provide latin1?

 diff --git a/src/test/perl/TestLib.pm b/src/test/perl/TestLib.pm
 +if (!$ENV{PGPORT}) {
 + $ENV{PGPORT} = 65432;
 +}
 +
 +$ENV{PGPORT} = int($ENV{PGPORT}) % 65536;

Hm. I think this should use logic similar to what pg_regress is using,
namely test a few ports.
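
For illustration, a rough sketch of that try-a-few-ports approach — probe each candidate by attempting to bind it, as pg_regress effectively does (the function name is invented, and like pg_regress this leaves a small window between the probe and actual use):

```python
import socket

def pick_free_port(candidates):
    """Return a usable TCP port, trying candidates in order; a rough
    stand-in for pg_regress's port-probing loop."""
    for port in candidates:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
            try:
                s.bind(("127.0.0.1", port))
            except OSError:
                continue               # port in use, try the next one
            # With candidate 0 the OS picks a free port; getsockname()
            # reports the port actually bound in either case.
            return s.getsockname()[1]
    raise RuntimeError("no free port among candidates")

print(pick_free_port([65432, 65433, 0]))
```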

 +sub start_test_server {
 + my ($tempdir) = @_;
 + my $ret;
 +
 + system "initdb -D $tempdir/pgdata -A trust -N >/dev/null";
 + $ret = system 'pg_ctl', '-D', "$tempdir/pgdata", '-s', '-w', '-l', "$tempdir/logfile", '-o', "--fsync=off -k $tempdir --listen-addresses='' --log-statement=all", 'start';
 +
 + if ($ret != 0) {
 + system('cat', "$tempdir/logfile");
 + BAIL_OUT("pg_ctl failed");
 + }
 +
 + $ENV{PGHOST} = $tempdir;
 + $test_server_datadir = "$tempdir/pgdata";
 + $test_server_logfile = "$tempdir/logfile";
 +}

Should stuff like --fsync=off, -k, really be on by default?

I think the code to massage pg_hba.conf should also be here, there'll be
a fair number of tests that need it.


Some questions:
* I haven't looked very careful, but does this set PATH correctly to
  pick up programs?
* What does installcheck mean for this?
* I think there should be support for contrib modules to use this
  automatically, without overwriting makefile targets.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] GiST support for inet datatypes

2014-04-04 Thread Andreas Karlsson

On 04/04/2014 04:01 PM, Andres Freund wrote:

This patch looks like it can be applied much more realistically, but it
looks too late for 9.4. I suggest moving it to the next CF?


If it does not change the default operator class I do not see anything 
preventing it from being applied to 9.4, as long as the committers have 
the time to look at this. My review is done and I think the first patch 
is ok and useful by itself.


--
Andreas




Re: [HACKERS] integrate pg_upgrade analyze_new_cluster.sh into vacuumdb

2014-04-04 Thread Andres Freund
On 2014-01-14 22:22:08 -0500, Peter Eisentraut wrote:
 +  <listitem>
 +   <para>
 +    Only calculate statistics for use by the optimizer (no vacuum),
 +    like <option>--analyze-only</option>.  Run several stages of analyze
 +    with different configuration settings, to produce usable statistics
 +    faster.
 +   </para>
 +
 +   <para>
 +    This option is useful to analyze a database that was newly populated
 +    from a restored dump or by <command>pg_upgrade</command>.  This option
 +    will try to create some statistics as fast as possible, to make the
 +    database usable, and then produce full statistics in the subsequent
 +    stages.
 +   </para>
 +  </listitem>
 + </varlistentry>
 +

If it's intended to be useful independent from pg_upgrade, shouldn't
this document in how many stages it runs?

 + const char *stage_commands[] = {
 + "SET default_statistics_target=1; SET vacuum_cost_delay=0;",
 + "SET default_statistics_target=10; RESET vacuum_cost_delay;",
 + "RESET default_statistics_target;"
 + };
 + const char *stage_messages[] = {
 + gettext_noop("Generating minimal optimizer statistics (1 target)"),
 + gettext_noop("Generating medium optimizer statistics (10 targets)"),
 + gettext_noop("Generating default (full) optimizer statistics (100 targets?)")
 + };

Imo 100 shouldn't be listed here, as it's actually using the database's
default.

This whole thing won't work for relations with per-column statistics
targets btw...

Other than that, it looks fine to me. And certainly nicer than that
chain of shell/bat pg_upgrade used to output.

Greetings,

Andres Freund




[HACKERS] Re: [COMMITTERS] pgsql: In checkpoint, move the check for in-progress xacts out of criti

2014-04-04 Thread Robert Haas
On Fri, Apr 4, 2014 at 10:32 AM, Heikki Linnakangas
heikki.linnakan...@iki.fi wrote:
 In checkpoint, move the check for in-progress xacts out of critical section.

 GetVirtualXIDsDelayingChkpt calls palloc, which isn't safe in a critical
 section. I thought I covered this case with the exemption for the
 checkpointer, but CreateCheckPoint is also called from the startup process.

This commit un-broke make check, which you might have wanted to run
before committing, but make -C src/test/isolation check is still
broken.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] GiST support for inet datatypes

2014-04-04 Thread Andres Freund
On 2014-04-04 16:50:36 +0200, Andreas Karlsson wrote:
 On 04/04/2014 04:01 PM, Andres Freund wrote:
 This patch looks like it can be applied much more realistically, but it
 looks too late for 9.4. I suggest moving it to the next CF?
 
 If it does not change the default operator class I do not see anything
 preventing it from being applied to 9.4, as long as the committers have the
 time to look at this. My review is done and I think the first patch is ok
 and useful by itself.

Well, the patch was marked as needs review, not ready for committer. So,
if that's your position, re-check the latest version and mark it as
ready.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Re: [COMMITTERS] pgsql: In checkpoint, move the check for in-progress xacts out of criti

2014-04-04 Thread Heikki Linnakangas

On 04/04/2014 06:06 PM, Robert Haas wrote:

On Fri, Apr 4, 2014 at 10:32 AM, Heikki Linnakangas
heikki.linnakan...@iki.fi wrote:

In checkpoint, move the check for in-progress xacts out of critical section.

GetVirtualXIDsDelayingChkpt calls palloc, which isn't safe in a critical
section. I thought I covered this case with the exemption for the
checkpointer, but CreateCheckPoint is also called from the startup process.


This commit un-broke make check, which you might have wanted to run
before committing, but make -C src/test/isolation check is still
broken.


Yeah.. Fixed now.

- Heikki




Re: [HACKERS] jsonb is also breaking the rule against nameless unions

2014-04-04 Thread Robert Haas
On Thu, Apr 3, 2014 at 11:28 AM, Andres Freund and...@2ndquadrant.com wrote:
 On 2014-04-02 23:50:19 +0200, Andres Freund wrote:
   I just tried it on clang. It builds clean with -Wc11-extensions except
   warning about _Static_assert(). That's possibly fixable with some
   autoconf trickery.
 
  Ah.  That sounds promising.  What clang version is that?

 It's debian's clang-3.5, which is built from trunk IIUC: 1:3.5~svn201651-1

 I have some patches that fix the configure tests to properly use
 -Werror, but I am too tired to check their validity now.

 So, three patches attached:
 1) fix a couple of warnings clang outputs in -pedantic. All of them
somewhat valid and not too ugly to fix.
 2) Use -Werror in a couple more configure checks. That allows to compile
pg using both -Wc11-extensions and -Wc99-extensions in a halfway sane
fashion. I think this is sane, but I'd welcome comments.
 3) Fix C89 compliance issue in pgbench, plus minor cleanups. I don't
like the fix here, but I don't really have a better idea.

I've committed the first one of these, which looks uncontroversial.
The others seem like they might need more discussion, or at least more
thought than I can give them right now.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] GiST support for inet datatypes

2014-04-04 Thread Andreas Karlsson

On 03/08/2014 10:40 PM, Emre Hasegeli wrote:

Fourth version of the patch attached. It is rebased to the HEAD (8879fa0).
Operator name formatting patch rebased on top of it. I will put
the selectivity estimation patch to the next commit fest.

This version of the patch does not touch to the btree_gist extension,
does not set the operator class as the default. It adds support to
the not equals operator to make the operator class compatible with
the btree_gist extension.


The patch looks good but it does not apply anymore against master. If 
you could fix that and the duplicate OIDs I think this is ready for a 
committer.


Andreas





Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2014-04-04 Thread Noah Misch
On Thu, Apr 03, 2014 at 11:44:46PM -0400, Tom Lane wrote:
 Peter Geoghegan p...@heroku.com writes:
  I think that those are objectively very large reductions in a cost
  that figures prominently in most workloads. Based solely on those
  facts, but also on the fairly low complexity of the patch, it may be
  worth considering committing this before 9.4 goes into feature freeze,
 
 Personally, I have paid no attention to this thread and have no intention
 of doing so before feature freeze.  There are three dozen patches at
 https://commitfest.postgresql.org/action/commitfest_view?id=21
 that have moral priority for consideration for 9.4.  Not all of them are
 going to get in, certainly, and I'm already feeling a lot of guilt about
 the small amount of time I've been able to devote to reviewing/committing
 patches this cycle.  Spending time now on patches that didn't even exist
 at the submission deadline feels quite unfair to me.
 
 Perhaps I shouldn't lay my own guilt trip on other committers --- but
 I think it would be a bad precedent to not deal with the existing patch
 queue first.

+1

-- 
Noah Misch
EnterpriseDB http://www.enterprisedb.com




Re: [HACKERS] GSOC13 proposal - extend RETURNING syntax

2014-04-04 Thread Andres Freund
Hi,

Some comments about the patch:
* Coding Style:
  * multiline comments have both /* and */ on their own lines.
  * I think several places indent by two tabs.
  * Spaces around operators
  * ...
* Many of the new comments would enjoy a bit TLC by a native speaker.

* The way RTE_ALIAS creeps in many place where it doesn't seem to belong
  seems to indicate that the design isn't yet ready. I share Robert's
  suspicion that this would be better solved by referring to a special
  range table entry.

Based on the lack of activity around this and the fact that this needs a
*significant* chunk of work before being committable, I am going to mark
this as returned with feedback.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2014-04-04 Thread Robert Haas
On Fri, Apr 4, 2014 at 12:13 PM, Noah Misch n...@leadboat.com wrote:
 On Thu, Apr 03, 2014 at 11:44:46PM -0400, Tom Lane wrote:
 Peter Geoghegan p...@heroku.com writes:
  I think that those are objectively very large reductions in a cost
  that figures prominently in most workloads. Based solely on those
  facts, but also on the fairly low complexity of the patch, it may be
  worth considering committing this before 9.4 goes into feature freeze,

 Personally, I have paid no attention to this thread and have no intention
 of doing so before feature freeze.  There are three dozen patches at
 https://commitfest.postgresql.org/action/commitfest_view?id=21
 that have moral priority for consideration for 9.4.  Not all of them are
 going to get in, certainly, and I'm already feeling a lot of guilt about
 the small amount of time I've been able to devote to reviewing/committing
 patches this cycle.  Spending time now on patches that didn't even exist
 at the submission deadline feels quite unfair to me.

 Perhaps I shouldn't lay my own guilt trip on other committers --- but
 I think it would be a bad precedent to not deal with the existing patch
 queue first.

 +1

+1

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Using indices for UNION.

2014-04-04 Thread Andres Freund
Hi,

On 2014-01-14 18:10:40 +0900, Kyotaro HORIGUCHI wrote:
 This is cont'd from CF3.
 
 http://www.postgresql.org/message-id/20131122.165927.27412386.horiguchi.kyot...@lab.ntt.co.jp
 
 The issue in brief is that UNION, unlike UNION ALL, is never
 flattened, so UNION cannot make use of an index scan even when one is
 usable.
 
 This patch flattens UNION just as is currently done for UNION ALL.
 
 This patch depends on another 'UNION ALL' patch, and gains further
 from yet another 'Widening application of indices' patch ('Ready for
 Commit' now in CF3).
 
 http://www.postgresql.org/message-id/20140114.180447.145186052.horiguchi.kyot...@lab.ntt.co.jp
 http://www.postgresql.org/message-id/20140114.180810.122352231.horiguchi.kyot...@lab.ntt.co.jp
 
 
 You can see the detailed outlines in the message at here,
 
 http://www.postgresql.org/message-id/20131031.194251.12577697.horiguchi.kyot...@lab.ntt.co.jp
 
 The attached patch is rebased to the current 9.4dev HEAD; make check
 at the topmost directory and in src/test/isolation passes without
 error.

I haven't really followed this topic, so please excuse my ignorance.

This is still marked as needs review in
https://commitfest.postgresql.org/action/patch_view?id=1374 . But I am
not sure the patch as is is relevant after
a87c729153e372f3731689a7be007bc2b53f1410?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Proposal: COUNT(*) (and related) speedup

2014-04-04 Thread Joshua Yanovski
 It seems to me this can't possibly work because of race conditions.
 In particular, what happens when some query dirties a page and thereby
 clears its fully-visible bit?  Presumably, any such query would have
 to (1) recompute the number of all-visible rows on that page (already
 an expensive thing) and then (2) go and subtract that from the counter
 (meaning the counter becomes a serialization bottleneck for all updates
 on the table, which is exactly the reason we don't just have a globally
 maintained row counter already).
Unless what you're saying is that the index creation itself invites
race conditions, I don't think this is true.  The general idea here is
to not actually compute the number of rows on a page (or whatever
other aggregate) unless (1) the index is being created, (2) it's
requested by the user, (3) VACUUM is occurring, and not to update the
counter except on (1) and (3).  I am not sure off the top of my head
about (1), but it seems from comments in vac_update_relstats that we
are already making the implicit assumption that VACUUM is not running
concurrently with another VACUUM on the same table.  A query that
dirties a page doesn't have to perform any additional work than it
does already, under any circumstances, under this model.

 But worse, what happens if a count(*)
 is in progress?  It might or might not have scanned this page already,
 and there's no way to get the right answer in both cases.  Counter
 updates done by VACUUM would have a similar race-condition problem.
I don't think the first part really matters.  If the page was visible
when COUNT(*) started and then got dirtied by some other transaction,
any changes by the second transaction shouldn't alter the actual count
in our transaction anyway, so whether we scan the page needlessly or
not seems beside the point.  If that did matter, index only scans
using COUNT(*) wouldn't work properly, since they make the same basic
assumption.

VACUUM counter updates, on the other hand, initially seem more
problematic, since if we grab the value of the counter, then VACUUM
updates the counter and the visbility bits, and then we check which
visibility bits are set, we might skip pages we really need to check
(since we're using an old value for the counter).  But in practice I
am pretty sure this is a nonissue, because as long as our COUNT(*)
transaction is going, no pages can have their visibility bits marked
unless they haven't been updated since the transaction started, which
consequently means we won't see the counter updated (the counter is
intended to be updated immediately before the visibility bit is set
while VACUUMing).  Took me a while to figure this out, so thanks for
pointing it out :)


 Please critique this idea and let me know whether it is worth pursuing 
 further.

 I doubt it.
So do I, though not for the reasons you listed, but because we can't rely
on dead tuples sticking around for us to count them and generate
deltas consistently.  I think there is probably a way to get around
that problem in a sane way with a more traditional index, though.

 regards, tom lane



-- 
Josh




Re: [HACKERS] GSoC proposal - make an unlogged table logged

2014-04-04 Thread Robert Haas
On Thu, Apr 3, 2014 at 7:26 AM, Heikki Linnakangas
hlinnakan...@vmware.com wrote:
 On 04/01/2014 08:39 PM, Heikki Linnakangas wrote:
 On 03/07/2014 05:36 AM, Tom Lane wrote:
 =?ISO-8859-1?Q?Fabr=EDzio_de_Royes_Mello?= fabriziome...@gmail.com
 writes:
 Do you think is difficult to implement ALTER TABLE ... SET UNLOGGED
 too?
 Thinking in a scope of one GSoC, of course.

 I think it's basically the same thing.  You might hope to optimize it;
 but you have to create (rather than remove) an init fork, and there's
 no way to do that in exact sync with the commit.

 You just have to include that information with the commit WAL record, no?

 No-one's replied yet, but perhaps the worry is that after you've written the
 commit record, you have to go ahead with removing/creating the init fork,
 and that is seen as too risky. If a creat() or unlink() call fails, that
 will have to be a PANIC, and crash recovery will likewise have to PANIC if
 the forks still cannot be removed/created.

Yeah, that's the concern.

If I may digress for a moment, unlogged materialized views are not
supported.  This is because we have this facility where if a
materialized view hasn't been populated yet, you get an error when you
try to scan it.  If we allowed unlogged materialized views, then
they'd get reset to empty rather than to not-populated, because the
not-populated status is stored in the catalog, not the filesystem.  I
still wish we'd never added the notion of populated in the first
place, but Kevin felt it was essential, so we ended up here.

Anyway, the idea that I had for fixing the unlogged materialized view
case was to add a new 64-bit integer to the control file that gets
bumped every time we start crash recovery, and which also gets
recorded in pg_class.  The value 0 would be reserved, and all pg_class
entries for non-unlogged relations would store 0.  For unlogged
relations, we could check whether the value in pg_class equals the
current value; if not, the relation should be viewed as not-populated.

This is not too far from a solution from the problem we need to solve
here.  If we want to make an unlogged relation logged, we can go ahead
and remove the init forks right away, knowing that the pg_class update
changing relpersistence and this new value won't take effect until
commit.  If the system meanwhile crashes, a backend connected to the
relevant database has enough state to recognize that the relation is
in this in-between state.  Before we can again use that relation, we
need to rebuild the init fork and reset it.  Of course, it's not too
clear exactly how that state cleanup happens; as one option, we could
just require the user to run a manual TRUNCATE.
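For concreteness, the conversion and the crash-cleanup step being
discussed might look like this (the SET LOGGED syntax is the proposal
under discussion here, not something that existed in core at the time,
and the table name is purely illustrative):

```sql
-- Proposed syntax from this thread (hypothetical at the time of writing):
CREATE UNLOGGED TABLE audit_buffer (id int, payload text);

ALTER TABLE audit_buffer SET LOGGED;   -- init fork removed right away;
                                       -- relpersistence change takes effect at commit

-- If the server crashes between those two steps, the relation is left in an
-- in-between state; under the sketched scheme, the user puts it right manually:
TRUNCATE audit_buffer;
```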

This would not be totally without precedent, because CREATE INDEX
CONCURRENTLY leaves crap behind that the user has to reindex or drop,
but it's clearly not ideal.  Another option would be to try to make
autovacuum put things right, but that of course might not happen right
away.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Proposal: COUNT(*) (and related) speedup

2014-04-04 Thread Joshua Yanovski
 VACUUM counter updates, on the other hand, initially seem more
 problematic, since if we grab the value of the counter, then VACUUM
 updates the counter and the visbility bits, and then we check which
 visibility bits are set, we might skip pages we really need to check
 (since we're using an old value for the counter).  But in practice I
 am pretty sure this is a nonissue, because as long as our COUNT(*)
 transaction is going, no pages can have their visibility bits marked
 unless they haven't been updated since the transaction started, which
 consequently means we won't see the counter updated (the counter is
 intended to be updated immediately before the visibility bit is set
 while VACUUMing).  Took me a while to figure this out, so thanks for
 pointing it out :)

Clarification: the counter may be updated, but never in a way that
affects the final result, because it can only be updated to reflect
statistics on pages that are fully visible to all current
transactions.  As long as we scan every page that was really dirty
at the start of our transaction, we can't really go wrong, since we
can recompute that information ourselves--we can just be better or
worse at avoiding redundant heap fetches.

-- 
Josh




Re: [HACKERS] Using indices for UNION.

2014-04-04 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 On 2014-01-14 18:10:40 +0900, Kyotaro HORIGUCHI wrote:
  This patch flattens UNION just as is currently done for UNION ALL.

 I haven't really followed this topic, so please excuse my ignorance.

 This is still marked as needs review in
 https://commitfest.postgresql.org/action/patch_view?id=1374 . But I am
 not sure the patch as is is relevant after
 a87c729153e372f3731689a7be007bc2b53f1410?

I think it's an independent issue.

After a quick read of the patch (which is really badly explained :-()
I think the idea is that a nest of UNIONs with no datatype conversions
can be flattened into a UNION ALL appendrel, with the required duplicate
elimination handled by sticking a DISTINCT onto the top level.

However, it's not clear to me that this is worth the trouble.  The
DISTINCT would act as an optimization fence preventing the subquery from
being flattened any further, so it doesn't seem like there would be any
global benefit just because it now contains a simple appendrel rather than
a setop tree.  And a nest of conversion-free UNIONs already results in a
plan that's a flat Append followed by sort/uniq, which seems like the same
thing you'd get from the DISTINCT.  So what's the point?
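The flattening under discussion rewrites, roughly, the first form into
the second (a schematic example with made-up table names, not taken
from the patch itself):

```sql
-- A nest of conversion-free UNIONs ...
SELECT x FROM t1
UNION
SELECT x FROM t2;

-- ... becomes a UNION ALL appendrel with duplicate elimination on top:
SELECT DISTINCT x
FROM (SELECT x FROM t1
      UNION ALL
      SELECT x FROM t2) s;
```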

regards, tom lane




Re: [HACKERS] Proposal: COUNT(*) (and related) speedup

2014-04-04 Thread Tom Lane
Joshua Yanovski pythones...@gmail.com writes:
 But worse, what happens if a count(*)
 is in progress?  It might or might not have scanned this page already,
 and there's no way to get the right answer in both cases.  Counter
 updates done by VACUUM would have a similar race-condition problem.

 I don't think the first part really matters.  If the page was visible
 when COUNT(*) started and then got dirtied by some other transaction,
 any changes by the second transaction shouldn't alter the actual count
 in our transaction anyway, so whether we scan the page needlessly or
 not seems beside the point.

The question is not whether you scan a page needlessly or not, it's
whether you double-count the tuples on it.  I don't think it's possible to
be sure that when you add the central counter value into your local sum,
you are neither double-counting nor omitting pages whose all-visible state
changed while you were scanning the table.

regards, tom lane




[HACKERS] Another thought about search_path semantics

2014-04-04 Thread Tom Lane
Back in 9.2 (commit 880bfc328) we decided that nonexistent schemas listed
in search_path should be silently ignored, reasoning by analogy with Unix
PATH settings where nonexistent directories in the path don't result in
error reports.  This remains imperfect though, cf commit 15386281a and
the similar report today at
http://www.postgresql.org/message-id/533ed1ec.4080...@abshere.net

It strikes me that the real issue here is that the analogy to PATH is
fine for search_path's role as a *search* path, but it's not so good for
determining the creation target schema.  I wonder if we should further
redefine things so that the creation target schema is always the first
thing named in search_path, and if that doesn't exist, we throw an
error rather than silently creating in some schema further down the
list.
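
To illustrate the behavior being questioned (hypothetical schema name;
the ERROR line shows the proposed behavior, not the current one):

```sql
SET search_path = no_such_schema, public;

-- Today: the missing schema is silently skipped and the table lands in public.
-- Under the proposal: creation would fail instead.
CREATE TABLE t (id int);
-- proposed: ERROR: schema "no_such_schema" does not exist
```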

Thoughts?

regards, tom lane




Re: [HACKERS] Another thought about search_path semantics

2014-04-04 Thread Andres Freund
Hi,

On 2014-04-04 13:33:59 -0400, Tom Lane wrote:
 It strikes me that the real issue here is that the analogy to PATH is
 fine for search_path's role as a *search* path, but it's not so good for
 determining the creation target schema.  I wonder if we should further
 redefine things so that the creation target schema is always the first
 thing named in search_path, and if that doesn't exist, we throw an
 error rather than silently creating in some schema further down the
 list.

Wouldn't that devolve into an even messier behaviour because of the
historical $user,public search path?

I wonder if we could extend the search path syntax to specify whether a
schema should be used for creation of objects or not. Sounds somewhat
nasty, but I don't really have a better idea :(. Something like
search_path=public,!pg_catalog.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Another thought about search_path semantics

2014-04-04 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 On 2014-04-04 13:33:59 -0400, Tom Lane wrote:
 It strikes me that the real issue here is that the analogy to PATH is
 fine for search_path's role as a *search* path, but it's not so good for
 determining the creation target schema.  I wonder if we should further
 redefine things so that the creation target schema is always the first
 thing named in search_path, and if that doesn't exist, we throw an
 error rather than silently creating in some schema further down the
 list.

 Wouldn't that devolve into an even messier behaviour because of the
 historical $user,public search path?

Ugh, right.  I think we had this discussion before actually, I'd just
forgotten it.

 I wonder if we could extend the search path syntax to specify whether a
 schema should be used for creation of objects or not. Sounds somewhat
 nasty, but I don't really have a better idea :(. Something like
  search_path=public,!pg_catalog.

Hm ... doesn't fix the problem for existing dump files, which are going to
say search_path = foo, pg_catalog.  However, we could modify it a bit,
so that the marker is put on schemas that can be skipped if missing for
creation purposes.  Then the default could look like search_path =
!$user, public, while we still get safe behavior for pg_dump's commands.
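
Under that modified marker scheme (hypothetical syntax), settings would
look something like:

```sql
-- Marked schemas may be skipped if missing when choosing the creation target:
SET search_path = "!$user", public;    -- default: $user optional, public required

-- pg_dump output stays unmarked, so a missing target schema raises an error:
SET search_path = foo, pg_catalog;     -- errors at CREATE if foo does not exist
```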

regards, tom lane




Re: [HACKERS] Another thought about search_path semantics

2014-04-04 Thread Josh Berkus
On 04/04/2014 01:47 PM, Andres Freund wrote:
 I wonder if we could extend the search path syntax to specify whether a
 schema should be used for creation of objects or not. Sounds somewhat
 nasty, but I don't really have a better idea :(. Something like
  search_path=public,!pg_catalog.

No, if we're fixing this, then we should have a separate
creation_target_schema GUC.   The fact that the only way to designate
creation target schema was to put it at the start of the search path has
*always* been a problem, since 7.3.

Non-atomic data sucks.  ;-b

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




[HACKERS] gsoc knn spgist

2014-04-04 Thread Костя Кузнецов
Hello. I want to implement knn for spgist. I don't have a question about knn
itself, but I do have questions about the implementation of the interface.
I modified pg_am.h (set amcanorderbyop to true for the spgist index). I also
modified pg_amop.h (added DATA(insert (    4015   600 600 15 o 517 4000 1970 )); ).

explain SELECT * FROM quad_point_tbl ORDER BY p <-> '-2,50';
 Sort  (cost=1219.31..1246.82 rows=11003 width=16)
   Sort Key: ((p <-> '(-2,50)'::point))
   ->  Index Only Scan using sp_quad_ind on quad_point_tbl  (cost=0.15..480.70 rows=11003 width=16)

What am I doing wrong? With a gist index we have:

regression=# explain SELECT * FROM point_tbl ORDER BY f1 <-> '-3,0';
 QUERY PLAN
 Index Scan using gpointind on point_tbl  (cost=0.13..8.27 rows=7 width=16)
   Order By: (f1 <-> '(-3,0)'::point)
 Planning time: 0.166 ms
(3 rows)

Are there documents about modifying access methods? Thanks.

Constantine Kuznetsov

Re: [HACKERS] Another thought about search_path semantics

2014-04-04 Thread Andres Freund
On 2014-04-04 13:58:53 -0400, Tom Lane wrote:
 Andres Freund and...@2ndquadrant.com writes:
  I wonder if we could extend the search path syntax to specify whether a
  schema should be used for creation of objects or not. Sounds somewhat
  nasty, but I don't really have a better idea :(. Something like
   search_path=public,!pg_catalog.
 
 Hm ... doesn't fix the problem for existing dump files, which are going to
 say search_path = foo, pg_catalog.  However, we could modify it a bit,
 so that the marker is put on schemas that can be skipped if missing for
 creation purposes.  Then the default could look like search_path =
 !$user, public, while we still get safe behavior for pg_dump's commands.

Unfortunately the current search_path is probably enshrined in a couple
of thousand postgresql.confs...

How about simply refusing to create anything in pg_catalog unless it's
explicitly schema qualified? Looks a bit nasty to implement but doable?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Another thought about search_path semantics

2014-04-04 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 On 2014-04-04 13:58:53 -0400, Tom Lane wrote:
 Hm ... doesn't fix the problem for existing dump files, which are going to
 say search_path = foo, pg_catalog.  However, we could modify it a bit,
 so that the marker is put on schemas that can be skipped if missing for
 creation purposes.  Then the default could look like search_path =
 !$user, public, while we still get safe behavior for pg_dump's commands.

  Unfortunately the current search_path is probably enshrined in a couple
 of thousand postgresql.confs...

Uncommented?  Anyway, we never have and never will promise that you don't
have to revisit your postgresql.conf during a major version upgrade.

 How about simply refusing to create anything in pg_catalog unless it's
 explicitly schema qualified? Looks a bit nasty to implement but doable?

That's what happens already.  The point is to do better.  What we want
for pg_dump's case is to get a complaint that schema foo doesn't exist,
*not* an attempt to create in pg_catalog.  That's what you got (though
at the SET command not the CREATE command) in all versions before 9.2.

regards, tom lane




Re: [HACKERS] Another thought about search_path semantics

2014-04-04 Thread Andres Freund
On 2014-04-04 14:13:43 -0400, Tom Lane wrote:
  How about simply refusing to create anything in pg_catalog unless it's
  explicitly schema qualified? Looks a bit nasty to implement but doable?
 
 That's what happens already.  The point is to do better.  What we want
 for pg_dump's case is to get a complaint that schema foo doesn't exist,
 *not* an attempt to create in pg_catalog.  That's what you got (though
 at the SET command not the CREATE command) in all versions before 9.2.

I was thinking - but not saying explicitly - of rigging things so that
pg_catalog is ignored when searching for the target schema for object
creation unless explicitly specified. So if there's no other schema in
the search path you'd get the error about no schema has been
selected to create in, even if pg_catalog is somewhere in there.
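
Concretely, under that rigging (hypothetical example; the ERROR reflects
the proposed behavior, not today's):

```sql
SET search_path = missing_schema, pg_catalog;

CREATE TABLE t (id int);
-- proposed: ERROR: no schema has been selected to create in
-- (pg_catalog would be ignored as a creation target unless explicitly named)

-- Explicit qualification would still be honored, subject to the usual
-- restrictions on creating objects in system catalogs:
CREATE TABLE pg_catalog.t (id int);
```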

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Another thought about search_path semantics

2014-04-04 Thread Tom Lane
Josh Berkus j...@agliodbs.com writes:
 No, if we're fixing this, then we should have a separate
 creation_target_schema GUC.   The fact that the only way to designate
 creation target schema was to put it at the start of the search path has
 *always* been a problem, since 7.3.

Well, if we were doing this in a green field we might do that, but we
don't have a green field.  7.3 was released in 2002.  We need to find
some reasonably upward-compatible reinterpretation of what pg_dump
has been doing since then.

I'm not saying that we couldn't *also* invent a creation_target_schema
GUC, as long as its default setting means consult search_path for the
schema to use.  What I'm saying is that having such a GUC won't solve
the existing problem for existing dump files.  (And in fact, a non-default
setting of it could completely break existing dump files, so we'd have
to tread carefully.)

regards, tom lane




Re: [HACKERS] Another thought about search_path semantics

2014-04-04 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 I was thinking - but not saying explicitly - of rigging things so that
 pg_catalog is ignored when searching for the target schema for object
 creation unless explicitly specified. So if there's no other schema in
  the search path you'd get the error about no schema has been
 selected to create in, even if pg_catalog is somewhere in there.

Hm.  Seems pretty grotty, but it'd at least fix pg_dump's problem,
since pg_dump's lists are always foo, pg_catalog with no third
schema mentioned.  I think what we'd actually need is to say
pg_catalog cannot be selected as the creation target unless it's
the *first* entry in the search_path list.

The larger issue here is that if search_path is say a, b, c and
a doesn't exist, is it really sane to create in b instead?

Another relatively narrow fix we could consider is to revert to
treating $user specially, such that it can be skipped over if
nonexistent for the purpose of selecting a creation target, but
actual schema names cannot be.

regards, tom lane




Re: [HACKERS] Another thought about search_path semantics

2014-04-04 Thread Andres Freund
On 2014-04-04 14:32:46 -0400, Tom Lane wrote:
 Andres Freund and...@2ndquadrant.com writes:
  I was thinking - but not saying explicitly - of rigging things so that
  pg_catalog is ignored when searching for the target schema for object
  creation unless explicitly specified. So if there's no other schema in
   the search path you'd get the error about no schema has been
  selected to create in, even if pg_catalog is somewhere in there.
 
 Hm.  Seems pretty grotty, but it'd at least fix pg_dump's problem,
 since pg_dump's lists are always foo, pg_catalog with no third
 schema mentioned.  I think what we'd actually need is to say
 pg_catalog cannot be selected as the creation target unless it's
 the *first* entry in the search_path list.

I was actually suggesting that the only way to create something in
pg_catalog is to do it with a explicit schema qualified id. I realize
that that's not something backpatchable...

 The larger issue here is that if search_path is say a, b, c and
 a doesn't exist, is it really sane to create in b instead?

I think $user really nailed that behaviour down. Everything else just
seems really confusing. We could of course make that behave specially as
you suggest, but yuck.


Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Another thought about search_path semantics

2014-04-04 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 On 2014-04-04 14:32:46 -0400, Tom Lane wrote:
 Hm.  Seems pretty grotty, but it'd at least fix pg_dump's problem,
 since pg_dump's lists are always foo, pg_catalog with no third
 schema mentioned.  I think what we'd actually need is to say
 pg_catalog cannot be selected as the creation target unless it's
 the *first* entry in the search_path list.

 I was actually suggesting that the only way to create something in
  pg_catalog is to do it with an explicit schema-qualified id. I realize
 that that's not something backpatchable...

I don't find that to be a good idea at all.  pg_dump is probably not the
only code that believes it can select a creation target with search_path,
no matter what that target is.

As for back-patchability, I was initially thinking of only fixing this in
HEAD.  If the behavior change is small enough, maybe we could get away
with back-patching 9.2 and 9.3; but I don't think we should start with
the assumption that we must do that.

regards, tom lane




Re: [HACKERS] [bug fix] PostgreSQL fails to start on Windows if it crashes after tablespace creation

2014-04-04 Thread Andres Freund
On 2014-03-25 21:09:13 +0900, MauMau wrote:
 ! /*
 !  * Remove old symlink in recovery, in case it points to the wrong place.
 !  * On Windows, lstat() reports junction points as directories.
 !  */
   if (InRecovery)
   {
  ! if (lstat(linkloc, &st) == 0 && S_ISDIR(st.st_mode))
  ! {
  ! if (rmdir(linkloc) < 0)
  ! ereport(ERROR,
  ! (errcode_for_file_access(),
  !  errmsg("could not remove directory \"%s\": %m",
  ! linkloc)));
  ! }
  ! else
  ! {
  ! if (unlink(linkloc) < 0 && errno != ENOENT)
  ! ereport(ERROR,
  ! (errcode_for_file_access(),
  !  errmsg("could not remove symbolic link \"%s\": %m",
  ! linkloc)));
  ! }
   }

if (..)
...
else
{
if (...)
   ...
}

is pretty odd code layout.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Using indices for UNION.

2014-04-04 Thread Tom Lane
I wrote:
 However, it's not clear to me that this is worth the trouble.  The
 DISTINCT would act as an optimization fence preventing the subquery from
 being flattened any further, so it doesn't seem like there would be any
 global benefit just because it now contains a simple appendrel rather than
 a setop tree.  And a nest of conversion-free UNIONs already results in a
 plan that's a flat Append followed by sort/uniq, which seems like the same
 thing you'd get from the DISTINCT.  So what's the point?

Oh, after re-reading the earlier part of the thread I get the point ---
after making this change, the planner will consider some plan types for
the DISTINCT that it wouldn't have found in the setop-tree planning path.
Specifically it can make use of a mergeappend of sorted paths for the
individual union leaf queries.  That can't happen in the generic setop
planning code because we have no ability to consider more than one plan
for any leaf query.

I still think this stuff mostly needs to be thrown away and rewritten
in Path-creation style, but that's a long-term project.  In the meantime
this seems like a relatively small increment of complexity, so maybe it's
worth applying.  I'm concerned about the method for building a new
DISTINCT clause though; the best that can be said for that is that it's
a crude hack, and I'm less than convinced that there are no cases where
it'll dump core.

regards, tom lane




[HACKERS] Pending 9.4 patches

2014-04-04 Thread Andres Freund
Hi,

I today tried to cleanup the state of the pending patches a bit. I hope
I haven't bloodied too many toes.

Here's a summary of all patches that aren't either committed, returned
or rejected:

Pending patches waiting for committer are:
c01) Custom Scan APIs
 This really seems to need Tom's attention.
c02) cache-only table scan
 This unfortunately also seems to need Tom's attention.
c03) ALTER TABLE lock strength reduction
 Simon should cleanup the cosmetic issues noticed by Noah and commit
 it.
c04) Foreign table inheritance.
 No idea. Too big to look at.
c05) Widening application of indices.
 Is this really ready? I am rather doubtful.
c06) Enable CREATE FOREIGN TABLE (... LIKE ... )
 I *personally* don't like some parts of the approach, but a committer
 should decide. My concerns are more around aesthetics and
 maintainability than any real problems.
c07) Updatable security barrier views.
 This needs a serious look by a committer.
c08) Bugfix for timeout in LDAP connection parameter resolution.
 Looks pretty straightforward to me.
c09) Problem with displaying wide tables in psql
 Not really sure. I wonder if there are situations where the new
 output will be more confusing than the old.
c10) PostgreSQL fails to start on Windows if it crashes after tablespace
 creation
 Looks good to me, trivial code formatting issue.
c11) pg_ctl fails with config-only directory
 Not a big fan, but I don't have a better idea.
c12) pg_ctl always uses the same event source
 I personally think this is pointless complication. But I am also not a
Windows person.
c12) pg_ctl stop times out when it should respond quickly
c13) PostgreSQL Service on Windows does not start if data directory given is 
relative path
 I unfortunately couldn't convince myself to care enough to have an
 opinion on these.

The complicated patches that seem to need serious committer time are:
c07, c01, c02, c04, c05.

My feeling is that 01 and 02 came in pretty darn late in their current
form to be fully considered for 9.4, independent of their state. 07
seems to be somewhat large this late, but there really hasn't been much
change recently.

Pending patches waiting for review are:
r01) Inverse Aggregate Transition Functions
 There's some argument about whether an already-useful subset should
 be committed in 9.4 or whether everything should be in 9.5. The 9.4
 thing doesn't seem to be too far away from being ready.
r02) Using indices for UNION
 Tom seems to be looking at it atm. Not that big.
r03) Optimization in regexp handling in pg_trgm
 Unsure whether the complications are worth the gains.
r04) Row-security based on Updatable security barrier views
 This one's fate seems to be hard to judge without c07.
r05) WAL rate limiting
 I don't think we can get consensus on this for 9.4, thus it
 probably should be returned with feedback.
r06) Add min, max, and stdev execute statement time in pg_stat_statement
 The discussion here seems to have stalled on the questions which
 additional columns should be added. Doesn't look too hard to make
 committable once agreement has been reached.
r07) ECPG cursor readahead
 I have no friggin idea.
r08) vacuumdb: Add option --analyze-in-stages
 Should imo be committed after some absolutely minor cleanup.
r09) transforms
 It's a bit sad to see this patch limping along for more than a
 year now. But it still doesn't seem to be ready :(. Peter wrote
 more than two months back that he wants to make another pass after
 a couple nights of sleep. I'll have a look so there's some
 progress, but I don't think it's 9.4 material at this point.
r10) PL/pgSQL Warnings - plpgsql.warn_shadow
 I haven't followed, and the thread is too long.
r11) extension_control_path
 Minefield.
r12) Create function prototype as part of PG_FUNCTION_INFO_V1
 Some potential issues, but it'd reduce the amount extension authors
 have to type.
r13) Correctly place DLLs for ECPG apps in bin folder
 msvc issue, doesn't seem to be fully ready. Seems like a bug worthy
 of fixing tho.
r14) Issue with PGC_BACKEND parameters on Windows
 Looks simple enough.
r15) multibyte messages are displayed incorrectly on the client
 Seems like a bigger can of worms than it's worth.
r16) tests for client programs
 Should imo be committed after some simple cleanup.

The bigger things that might have a bearing for 9.4 seem to be: r01,
r04. Lots of the other stuff just need someone paying a bit of
attention.

Waiting for author:
w01) GiST support for inet datatypes
 Needs a rebase as noticed today, apparently ready for committer
 otherwise.
w02) variant of regclass etc.
 Robert noticed some issues today when he wanted to commit.

I hope that we can prune this list to the patches that we think have a
chance for 9.4, so we can avoid delaying the feature freeze. The CF is
officially closed, 

Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2014-04-04 Thread Greg Stark
On Fri, Apr 4, 2014 at 12:15 PM, Robert Haas robertmh...@gmail.com wrote:

  Perhaps I shouldn't lay my own guilt trip on other committers --- but
  I think it would be a bad precedent to not deal with the existing patch
  queue first.
 
  +1

 +1


I don't think we have to promise a strict priority queue and preempt all
other development. But I agree loosely that processing patches that have
been around should be a higher priority.

I've been meaning to do more review for a while and just took a skim
through the queue. There are only a couple I feel I can contribute with so
I'm going to work on those and then if it's still before the feature freeze
I would like to go ahead with Peter's patch. I think it's generally a good
patch.

Two questions I have:

1) Would it make more sense to use a floating point instead of an integer?
I saw a need for a function like this when I was looking into doing GPU
sorts. But GPUs expect floating point values.

2) I would want to see a second data type, probably numeric, before
committing to be sure we had a reasonably generic api. But it's pretty
simple to do so.


-- 
greg


Re: [HACKERS] [COMMITTERS] pgsql: Add ALTER TABLESPACE ... MOVE command

2014-04-04 Thread Stephen Frost
* Tom Lane (t...@sss.pgh.pa.us) wrote:
 Commit d86d51a95 was pretty damn awful in this regard as well, but
 let's clean them both up, not make it worse.

Fair enough.  I recall being a bit surprised at it also but didn't spend
much time thinking about it.  I'll clean it up.

Thanks,

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] Another thought about search_path semantics

2014-04-04 Thread Andres Freund
On 2014-04-04 14:56:54 -0400, Tom Lane wrote:
 Andres Freund and...@2ndquadrant.com writes:
  I was actually suggesting that the only way to create something in
  pg_catalog is to do it with a explicit schema qualified id. I realize
  that that's not something backpatchable...
 
 I don't find that to be a good idea at all.  pg_dump is probably not the
 only code that believes it can select a creation target with search_path,
 no matter what that target is.

Sure, but how many of those are trying to put things in pg_catalog?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Using indices for UNION.

2014-04-04 Thread Tom Lane
I wrote:
 I still think this stuff mostly needs to be thrown away and rewritten
 in Path-creation style, but that's a long-term project.  In the meantime
 this seems like a relatively small increment of complexity, so maybe it's
 worth applying.  I'm concerned about the method for building a new
 DISTINCT clause though; the best that can be said for that is that it's
 a crude hack, and I'm less than convinced that there are no cases where
 it'll dump core.

OK, after still more time thinking about it, here's the issues with that
way of generating a DISTINCT clause corresponding to the UNION:

1. Calling transformDistinctClause with a NULL pstate seems pretty unsafe.
It might accidentally fail to fail, today, but it's surely fragile.
It's violating the API contract not only of transformDistinctClause
itself (which is not documented as accepting a NULL pstate) but of
addTargetToGroupList (q.v.).  A minimum requirement before I'd accept a
patch that did that is that it extended the header comments of those
functions to specify under what circumstances a NULL pstate can be passed.
However, that's not the direction to go, because of #2.

2. This approach potentially changes the semantics of the UNION.  (This is
only important for a datatype that has more than one btree equality
operator, but let's posit that.)  transformDistinctClause, as per its
header comment, allows the outer ORDER BY to influence which equality
operator it picks for DISTINCT.  However, in the case of
(SELECT ... UNION SELECT ...) ORDER BY ..., the UNION semantics were
chosen without reference to the ORDER BY, so it's possible that the
equality operators cited in the SetOperationStmt's groupClauses list are
not what you'd get from applying transformDistinctClause as the patch does.
It is not okay for the planner to override the parser's choice of
semantics like that.

Now I'm fairly sure that planner.c is expecting that the query's
distinctClause be a superset of the sortClause if any (cf comments for
SortGroupClause in parsenodes.h), so it wouldn't be okay to just blindly
build a distinctClause from topop->groupClauses.  I think what you need
to do is check that topop->groupClauses is compatible with the sortClause
if any (skipping the flattening transformation if not) and then build a
distinctClause by extending the sort clause list with any missing items
from topop->groupClauses.  So this is sort of like what
transformDistinctClause does, but different in detail, and the failure
case is to not do the transformation, rather than ereport'ing.  (See also
generate_setop_grouplist, which you could almost use except that it's not
expecting to have to merge with a sortClause list.)

So this is all doable enough, but you're probably going to end up with
a new distinctClause-generating function that's at least twice the size
of the patch's former net code addition.  So the question is whether it's
worth the trouble, considering that all this code has (I hope) a life
expectancy of no more than one or two more release cycles.

I'm going to set the patch back to "Waiting on Author", but unless you
are prepared to rewrite the distinctClause creation code in the next
couple of days, you should change it to "Returned with Feedback".

regards, tom lane




Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2014-04-04 Thread Claudio Freire
On Fri, Apr 4, 2014 at 5:29 PM, Greg Stark st...@mit.edu wrote:
 Two questions I have:

 1) Would it make more sense to use a floating point instead of an integer? I
 saw a need for a function like this when I was looking into doing GPU sorts.
 But GPUs expect floating point values.


In the context of this patch, I don't think you want to add
uncertainty to the != 0 or == 0 case (which is what FP would do).




Re: [HACKERS] Another thought about search_path semantics

2014-04-04 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 On 2014-04-04 14:56:54 -0400, Tom Lane wrote:
 I don't find that to be a good idea at all.  pg_dump is probably not the
 only code that believes it can select a creation target with search_path,
 no matter what that target is.

 Sure, but how many of those are trying to put things in pg_catalog?

Maybe not many, but pg_dump itself certainly can try to do that.
(Most of the time, pg_dump won't dump things in pg_catalog, but there
are exceptions, eg --binary-upgrade dump of an extension containing
objects in pg_catalog.)

regards, tom lane




Re: [HACKERS] Another thought about search_path semantics

2014-04-04 Thread Andres Freund
On 2014-04-04 17:24:00 -0400, Tom Lane wrote:
 Andres Freund and...@2ndquadrant.com writes:
  On 2014-04-04 14:56:54 -0400, Tom Lane wrote:
  I don't find that to be a good idea at all.  pg_dump is probably not the
  only code that believes it can select a creation target with search_path,
  no matter what that target is.
 
  Sure, but how many of those are trying to put things in pg_catalog?
 
 Maybe not many, but pg_dump itself certainly can try to do that.
 (Most of the time, pg_dump won't dump things in pg_catalog, but there
 are exceptions, eg --binary-upgrade dump of an extension containing
 objects in pg_catalog.)

If we're not backpatching, fixing that seems easy enough? pg_upgrade
definitely needs the pg_dump around, so that should be fine.

I don't like my own suggestion, which isn't a good sign, but I haven't
heard anything I like more :(.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Another thought about search_path semantics

2014-04-04 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 On 2014-04-04 17:24:00 -0400, Tom Lane wrote:
 Maybe not many, but pg_dump itself certainly can try to do that.
 (Most of the time, pg_dump won't dump things in pg_catalog, but there
 are exceptions, eg --binary-upgrade dump of an extension containing
 objects in pg_catalog.)

 If we're not backpatching, fixing that seems easy enough?

Not especially.  As I said, pg_dump believes that setting search_path is
an appropriate way to control where things get created, and that's wired
into its structure pretty deeply.  I would have exactly zero faith in a
hack that tried to change that just for objects in pg_catalog.  In any
case, it's not clear that the case can't arise in pre-existing dump files,
even if you discount --binary-upgrade cases.

 I don't like my own suggestion, which isn't a good sign, but I haven't
 heard anything I like more :(.

Don't see why you don't find what I suggested to be just a small variant
on it.

regards, tom lane




Re: [HACKERS] Proposal: COUNT(*) (and related) speedup

2014-04-04 Thread Joshua Yanovski
Yeah, you're right,  I believe that every code path in VACUUM that
leads to the visibility map bit being set also leads to all remaining
tuples on the page being frozen.  So in a world without heap pruning,
"frozen" should be a reliable proxy for the value of the tuple the last
time it was added to the counter.  If you count -1 for a frozen
tuple, and 1 for a visible tuple, you should end up at precisely the
delta from the last time the page was turned visible.

It seems to me that with this definition of delta, the concurrent
COUNT(*) case is fine.  The counter isn't affected by the dirtying of
the page, nor is the delta for the page.  The result will be
completely consistent with what it would have been had the page dirty
never happened. But concurrent VACUUM under this definition fails.  So
yeah, I'll drop this until I have a concrete idea of how (or if) to
proceed.  Thanks for the feedback.




[HACKERS] Idea for aggregates

2014-04-04 Thread Greg Stark
Simon, Dmitri, Peter Eisentraut, and I were chatting at PGConfNYC and
we had an idea for something interesting to do with aggregates.
Interestingly Simon and I came at this from two different directions
but realized we needed the same functionality to implement what we
wanted.

The basic idea is to separate all the properties of the aggregate
functions except the final function into a separate object. Giving the
optimizer the knowledge that multiple aggregate functions share the
same basic machinery and semantics for the state is the magic sauce
that's a prerequisite for the several ideas we were each thinking of.

I'm imagining something like (though I'm really not wedded to this
syntax at all):

CREATE AGGREGATE CLASS class_name ( input_data_type [ , ... ] ) (
SFUNC = sfunc,
STYPE = state_data_type
[ , INITCOND = initial_condition ]
[ , SORTOP = sort_operator ]
)

CREATE AGGREGATE func_name ( class_name ) (
 SFUNC = sfunc
)

The idea then is that this should enable a number of useful optimizations:

1) If the planner sees multiple aggregates in the target list
belonging to the same class and having the same arguments then it
knows it can keep just one transition state varible for all of them.

2) For materialized views the materialized view can keep just the
state variable rather than the final aggregate needed for any
aggregates in the materialized view. This would enable users to query
any other aggregate function in the same class on the same column even
if it wasn't included in the original materialized view definition.

This kind of thing is probably even more important down the road for
incrementally updating materialized views.

3) I envision treating the aggregate classes as real types so you
could do something like

CREATE TABLE user (userid integer, page_timing aggregate class
numeric_agg_class)
And later update the row by doing something like update user set
page_timing = page_timing + $1 where id = $2
And still later call select page_timing.count(), page_timing.sum(),
page_timing.stddev() from user where id = $1

Except of course this syntax wouldn't work quite right. I haven't
thought of a better syntax yet though.

-- 
greg




Re: [HACKERS] [PATCH] Add transforms feature

2014-04-04 Thread Andres Freund
On 2014-01-15 21:13:18 -0500, Peter Eisentraut wrote:
 The attached patch will probably fail to apply because of the pg_proc
 changes.  So if you want to try it out, look into the header for the Git
 hash it was based off.  I'll produce a properly merged version when this
 approach is validated.  Also, some implementation details could probably
 take some revising after a couple of nights of sleep. ;-)

You've slept since? ;)

So, I am only doing a look through the patch, to see where it has gone
in the past year.

 index 4e476c3..5ac9f05 100644
 --- a/src/Makefile.shlib
 +++ b/src/Makefile.shlib
 @@ -133,7 +133,7 @@ ifeq ($(PORTNAME), darwin)
else
  # loadable module
  DLSUFFIX = .so
 -LINK.shared  = $(COMPILER) -bundle -multiply_defined suppress
 +LINK.shared  = $(COMPILER) -bundle -multiply_defined suppress -Wl,-undefined,dynamic_lookup
endif
BUILD.exports  = $(AWK) '/^[^\#]/ {printf "_%s\n",$$1}' $< >$@
exports_file   = $(SHLIB_EXPORTS:%.txt=%.list)

Hm. Do the linkers on other platforms support this behaviour? Linux
does, by default, but I have zap clue about the rest.

Why do we need this? I guess because a transform from e.g. hstore to $pl
needs symbols out of hstore.so and the $pl?

I wonder if it wouldn't be better to rely on explicit symbol lookups for
this. That'd avoid the need for the order dependency and linker changes
like this.

 + case OBJECT_TRANSFORM:
 + {
 + TypeName   *typename = (TypeName *) linitial(objname);
 + char   *langname = (char *) linitial(objargs);
 + Oid type_id = typenameTypeId(NULL, typename);
 + Oid lang_id = get_language_oid(langname, false);
 +
 + address.classId = TransformRelationId;
 + address.objectId =
 + get_transform_oid(type_id, lang_id, missing_ok);
 + address.objectSubId = 0;
 + }
 + break;

Hm. I wonder if missing_ok should be forwarded to get_language_oid() and
(by changing the way things are done atm) to typenameTypeId?

 + case OCLASS_TRANSFORM:
 + {
 + HeapTuple   trfTup;
 + Form_pg_transform trfForm;
 +
 + trfTup = SearchSysCache1(TRFOID,
 +   ObjectIdGetDatum(object->objectId));
 + if (!HeapTupleIsValid(trfTup))
 + elog(ERROR, "could not find tuple for transform %u",
 +  object->objectId);
 +
 + trfForm = (Form_pg_transform) GETSTRUCT(trfTup);
 +
 + appendStringInfo(buffer, _("transform for %s language %s"),
 +  format_type_be(trfForm->trftype),
 +  get_language_name(trfForm->trflang, false));
 +
 + ReleaseSysCache(trfTup);
 + break;
 + }
 +

Why deviate from the usual "cache lookup failed for .."? elog doesn't
translate so it's not particularly relevant, but ...

   referenced.objectSubId = 0;
   recordDependencyOn(myself, referenced, DEPENDENCY_NORMAL);
  
 + /* dependency on transform used by return type, if any */
 + if ((trfid = get_transform_oid(returnType, languageObjectId, true)))
 + {
 + referenced.classId = TransformRelationId;
 + referenced.objectId = trfid;
 + referenced.objectSubId = 0;
 + recordDependencyOn(myself, referenced, DEPENDENCY_NORMAL);
 + }
 +

Should be compared to InvalidOid imo, rather than implicitly assuming
that InvalidOid evaluates to false.

 +/*
 + * CREATE TRANSFORM
 + */
 +Oid
 +CreateTransform(CreateTransformStmt *stmt)
 +{
...
 + if (!pg_type_ownercheck(typeid, GetUserId()))
 + aclcheck_error_type(ACLCHECK_NOT_OWNER, typeid);
 +
 + aclresult = pg_type_aclcheck(typeid, GetUserId(), ACL_USAGE);
 + if (aclresult != ACLCHECK_OK)
 + aclcheck_error_type(aclresult, typeid);
 +
 + /*
 +  * Get the language
 +  */
 + langid = get_language_oid(stmt->lang, false);
 +
 + aclresult = pg_language_aclcheck(langid, GetUserId(), ACL_USAGE);
 + if (aclresult != ACLCHECK_OK)
 + aclcheck_error(aclresult, ACL_KIND_LANGUAGE, stmt->lang);

Hm. Is USAGE really sufficient here? Should we possibly make it
dependent on 

Re: [HACKERS] [review] PostgreSQL Service on Windows does not start if data directory given is relative path

2014-04-04 Thread Tom Lane
Rajeev rastogi rajeev.rast...@huawei.com writes:
 [ pgctl_win32service_rel_dbpath_v6.patch ]

Committed with minor corrections, mostly but not all cosmetic.

regards, tom lane




Re: [HACKERS] Idea for aggregates

2014-04-04 Thread Tom Lane
Greg Stark st...@mit.edu writes:
 The basic idea is to separate all the properties of the aggregate
 functions except the final function into a separate object. Giving the
 optimizer the knowledge that multiple aggregate functions share the
 same basic machinery and semantics for the state is the magic sauce
 that's a prerequisite for the several ideas we were each thinking of.

Why exactly do you need to invent an aggregate class concept for this?
Can't the planner just look in pg_aggregate to see that the
sfunc/stype/initcond are the same for two different aggregates?

regards, tom lane




Re: [HACKERS] [bug fix] pg_ctl always uses the same event source

2014-04-04 Thread Tom Lane
MauMau maumau...@gmail.com writes:
 [ pg_ctl_eventsrc_v6.patch ]

I looked at this patch a bit.  As a non-Windows person I have no intention
of being the committer, since I can't test the Windows-specific changes.
However, I do want to object to the business about having pg_ctl use
postgres -C to try to determine the event source name to use.  The code
that you repurposed was okay for its original use, namely finding out
where a config directory would point the data directory to, but I don't
think it's up to snuff for identifying the value of event_source that a
running server is using.  The actual value might have been set from a
command line option for instance.  Moreover, the whole thing seems rather
circular since what pg_ctl wants the event source name for is to report
errors.  What if it fails to get the event source name, or needs to
report an error before the (rather late) point at which it even tries
to get it?

regards, tom lane




Re: [HACKERS] [bug fix] pg_ctl fails with config-only directory

2014-04-04 Thread Tom Lane
MauMau maumau...@gmail.com writes:
 From: Heikki Linnakangas hlinnakan...@vmware.com
 Hmm, why do this only on Windows? If postgres -C is safe enough to run 
 as Administrator on Windows, why not allow running it as root on Unix as 
 well? Even if there's no particular need to allow it as root on Unix, 
 fewer #ifdefs is good.

 Yes, I limited the change only to Windows, because I thought I might get 
 objections against an unnecessary change.  Plus, --single should not be allowed 
 for root, because root cannot be the PostgreSQL user account.

Indeed, and why would that not apply to Windows as well?  AFAIK, inclusion
of --single in this list is just plain nuts.  The most likely result of
running that way is creation of root-owned files inside $PGDATA, which
would cause havoc for later normally-privileged postmasters.

I will go and commit this, without the #ifdefs and without the --single
exclusion.

regards, tom lane




Re: [HACKERS] [bug fix] pg_ctl fails with config-only directory

2014-04-04 Thread Tom Lane
I wrote:
 I will go and commit this, without the #ifdefs and without the --single
 exclusion.

On closer inspection I realized that the switch parsing was still far too
risky, because it would treat -C in any word of the process command line
as a reason not to check for root.  Quite aside from the fact that some of
those words might be switch arguments not switches, main.c is also the
front end for other operating modes that have switches unrelated to the
postmaster's switches.  --boot mode doesn't have any -C switch today, but
it might do so tomorrow, and that would result in a hard-to-notice hole in
our root protections.

However, there is a reasonably simple way around that objection, which is
to only skip the root check if -C is the first switch.  pg_ctl can easily
be changed to call it that way, and we're not really here to make -C easy
for root users to call manually, so I'm not too concerned about that
aspect of it.  --describe-config is only accepted as the first switch
anyway, so there's no issue there either.

Committed with appropriate changes.

regards, tom lane




Re: [HACKERS] [bug fix] PostgreSQL fails to start on Windows if it crashes after tablespace creation

2014-04-04 Thread Tom Lane
MauMau maumau...@gmail.com writes:
 [ remove_tblspc_symlink_v6.patch ]

Committed, thanks.

regards, tom lane




Re: [HACKERS] Idea for aggregates

2014-04-04 Thread Tom Lane
Greg Stark st...@mit.edu writes:
 Well in many cases stype will just be "internal" for many of them. That
 doesn't mean they're the same.

 Hm, I suppose it might if they have the same sfunc.

 This is actually where I started but we concluded that we needed some
 declaration that the aggregates were actually related and would interpret
 the state the same way and not just that it happened to use the same
 storage format.

Well, in practice you'd need to also compare the input datatype (consider
polymorphic aggregates) and initcond.  But the sfunc isn't told which
finalfunc will be applied, so any aggregates that share the same sfunc and
have the other conditions the same *must* have the identical transition
data behavior.  I don't see any reason to invent new syntax, and there
are good reasons not to if we don't have to.

regards, tom lane




Re: [HACKERS] Idea for aggregates

2014-04-04 Thread Greg Stark
Well in many cases stype will just be "internal" for many of them. That
doesn't mean they're the same.

Hm, I suppose it might if they have the same sfunc.

This is actually where I started but we concluded that we needed some
declaration that the aggregates were actually related and would interpret
the state the same way and not just that it happened to use the same
storage format.


Re: [HACKERS] [bug fix] multibyte messages are displayed incorrectly on the client

2014-04-04 Thread Tom Lane
MauMau maumau...@gmail.com writes:
 Then, as a happy medium, how about disabling message localization only if 
 the client encoding differs from the server one?  That is, compare the 
 client_encoding value in the startup packet with the result of 
 GetPlatformEncoding().  If they don't match, call 
 disable_message_localization().

AFAICT this is not what was agreed to in this thread.  It puts far too
much credence in the server-side default for client_encoding, which up to
now has never been thought to be very interesting; indeed I doubt most
people bother to set it at all.  The reason that this issue is even on
the table is that that default is too likely to be wrong, no?

Also, whatever possessed you to use pg_get_encoding_from_locale to
identify the server's encoding?  That's expensive and seems fairly
unlikely to yield the right answer.  I don't remember offhand where we
keep the postmaster's idea of what encoding messages should be in, but I'm
fairly sure it's stored explicitly somewhere.  Or if it isn't, we can for
sure do better than recalculating it during every connection attempt.

Having said all that, though, I'm unconvinced that this cure isn't worse
than the disease.  Somebody claimed upthread that no very interesting
messages would be delocalized by a change like this, but that's complete
nonsense: in particular, *every* message associated with client
authentication will be sent in English if we go down this path.  Given
the nearly complete lack of complaints in the many years that this code
has worked like this, I'm betting that most people will find a change
like this to be a net reduction in friendliness.
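
For reference, the decision rule being debated can be sketched as follows.
This is a minimal Python sketch of the *proposed semantics only*, not
PostgreSQL source; the function name and arguments are hypothetical
stand-ins for the server-side logic (the real check would compare the
startup packet's client_encoding against the server's message encoding in C):

```python
# Hypothetical sketch of MauMau's proposal: keep localized (translated)
# messages only when the client encoding from the startup packet matches
# the server's platform encoding; otherwise disable localization so the
# client is not sent bytes it cannot display.
def keep_localized_messages(startup_client_encoding, server_platform_encoding):
    """Return True if localized messages should be kept for this connection."""
    if startup_client_encoding is None:
        # Client sent no client_encoding in the startup packet; fall back
        # to the server-side default (which, as noted above, may be wrong).
        return True
    return startup_client_encoding.upper() == server_platform_encoding.upper()

# A UTF8 client on a UTF8 server keeps localized authentication messages;
# a mismatched client would get untranslated (English) messages instead.
assert keep_localized_messages("UTF8", "UTF8") is True
assert keep_localized_messages("SJIS", "UTF8") is False
```

Tom's objection above is precisely that the mismatch branch fires for
every message sent before the encoding is actually established, including
all authentication errors.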

Given the changes here to extract client_encoding from the startup packet
ASAP, I wonder whether the right thing isn't just to set the client
encoding immediately when we do that.  Most application libraries pass
client encoding in the startup packet anyway (libpq certainly does).

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: Fwd: [HACKERS] Proposal: variant of regclass

2014-04-04 Thread Amit Kapila
On Fri, Apr 4, 2014 at 8:48 PM, Robert Haas robertmh...@gmail.com wrote:
 I see.  Here's an updated patch with a bit of minor refactoring to
 clean that up, and some improvements to the documentation.

 I was all ready to commit this when I got cold feet.  What's bothering
 me is that the patch, as written, mimics the exact behavior of the
 text-regproc cast, including the fact that supplying an OID,
 written as a number, will always return that OID, whether it exists or
 not:

 rhaas=# select to_regclass('1259'), to_regclass('pg_class');
  to_regclass | to_regclass
 -+-
  pg_class| pg_class
 (1 row)

 rhaas=# select to_regclass('12590'), to_regclass('does_not_exist');
  to_regclass | to_regclass
 -+-
  12590   |
 (1 row)

 I think that's unacceptably weird behavior.

Agreed.  Actually, I had also noticed this behaviour, but ignored it,
thinking that we should only consider a behavior change for the to_*
functions in case a name is passed.  However, now that you have pointed
it out, it looks pretty wrong for the OID case as well.

The reason for this behavior is that in the output functions
(regclassout), we return the OID as-is in case it doesn't exist.  One way
to fix this is, for OID input parameters, to check in the to_* functions
whether the passed OID exists and return NULL if it doesn't.  That way we
can retain the similarity of input parameters between the types and the
to_* functions, and having the to_* functions return NULL when the OID
doesn't exist makes them behave the same way as when the user passes
a name.
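
To make the two behaviours under discussion concrete, here is a minimal
Python sketch (not PostgreSQL source); the catalog contents, function
names, and the use of None for SQL NULL are all hypothetical
illustrations of the semantics, not the actual C implementation:

```python
# Hypothetical one-table "catalog" mapping relation names to OIDs.
catalog = {"pg_class": 1259}
known_oids = set(catalog.values())

def to_regclass_current(text):
    """Current patch: mimics regclass input, so numeric strings pass
    through unchecked, whether or not the OID exists."""
    if text.isdigit():
        return int(text)             # returned as-is, even if nonexistent
    return catalog.get(text)         # None (NULL) for an unknown name

def to_regclass_checked(text):
    """Suggested fix: validate numeric OIDs too, returning NULL (None)
    when the OID does not exist, matching the unknown-name case."""
    if text.isdigit():
        oid = int(text)
        return oid if oid in known_oids else None
    return catalog.get(text)

assert to_regclass_current("12590") == 12590   # the weird case above
assert to_regclass_checked("12590") is None    # proposed: NULL instead
```

Robert's alternative, of course, is stricter still: to_regclass() would
reject the numeric form entirely and perform only a name lookup.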

 My suggestion is to
 restructure the code so that to_regclass() only accepts a name, not an
 OID, and make its charter precisely to perform a name lookup and
 return an OID (as regclass) or NULL if there's no match.

How will we restrict the user from passing some number in string form?
Do you mean that in case the user passes an OID, the to_* functions
should return NULL, or that they should raise an error when the input
turns out to be an OID?

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

