Re: [HACKERS] NUMERIC private methods?

2014-12-16 Thread Heikki Linnakangas

On 12/16/2014 08:34 AM, David Fetter wrote:

Folks,

While noodling with some weighted statistics
https://github.com/davidfetter/weighted_stats, I noticed I was
having to jump through a lot of hoops because of all the private
methods in numeric.c, especially NumericVar.  Would there be some
major objection to exposing NumericVar as an opaque blob?


Hmm. You'd want to make add_var, mul_var etc. non-static?

Looking at the weighted_stats code, this probably illustrates the hoops 
you had to jump through:



/* sqrt((n/(n-1)) * ((s0*s2 - s1*s1)/(s0*s0))) */

result =
	DirectFunctionCall1(
		numeric_sqrt,
		DirectFunctionCall2(
			numeric_mul,
			DirectFunctionCall2(
				numeric_div,
				n_prime,
				DirectFunctionCall2(
					numeric_sub,
					n_prime,
					/*
					 * This rather convoluted way to compute the value 1
					 * gives us a result which should have at least as big
					 * a decimal scale as s_2 does, which should guarantee
					 * that our result is as precise as the input...
					 */
					DirectFunctionCall2(
						numeric_add,
						DirectFunctionCall2(
							numeric_sub,
							state->s_2,
							state->s_2
						),
						make_numeric(1)
					)
				)
			),
			DirectFunctionCall2(
				numeric_div,
				DirectFunctionCall2(
					numeric_sub,
					DirectFunctionCall2(
						numeric_mul,
						state->s_0,
						state->s_2
					),
					DirectFunctionCall2(
						numeric_mul,
						state->s_1,
						state->s_1
					)
				),
				DirectFunctionCall2(
					numeric_mul,
					state->s_0,
					state->s_0
				)
			)
		)
	);


As a start, it would help a lot to #define a few helper macros like:

#define ADD(a, b) DirectFunctionCall2(numeric_add, a, b)
#define MUL(a, b) DirectFunctionCall2(numeric_mul, a, b)

in your extension. That would already make that a lot shorter.
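For example, with a couple more macros in the same style, the whole
expression above collapses to something like this (a sketch only, reusing
the variable names from the snippet above):

#define SUB(a, b)  DirectFunctionCall2(numeric_sub, a, b)
#define DIV(a, b)  DirectFunctionCall2(numeric_div, a, b)
#define SQRT(a)    DirectFunctionCall1(numeric_sqrt, a)

/* sqrt((n/(n-1)) * ((s0*s2 - s1*s1)/(s0*s0))) */
result = SQRT(MUL(DIV(n_prime,
                      SUB(n_prime,
                          ADD(SUB(state->s_2, state->s_2),
                              make_numeric(1)))),
                  DIV(SUB(MUL(state->s_0, state->s_2),
                          MUL(state->s_1, state->s_1)),
                      MUL(state->s_0, state->s_0))));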

You might also be worrying about performance. The above snippet is from 
the aggregate's final function, which isn't performance-critical, but you 
have some numeric operations in the transition function too. I wonder how 
big the impact really is, though; init_var_from_num and make_result look 
quite cheap, but certainly not free.


- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Streaming replication and WAL archive interactions

2014-12-16 Thread Borodin Vladimir

On 12 Dec 2014, at 16:46, Heikki Linnakangas hlinnakan...@vmware.com 
wrote:

 There have been a few threads on the behavior of WAL archiving, after a 
 standby server is promoted [1] [2]. In short, it doesn't work as you might 
 expect. The standby will start archiving after it's promoted, but it will not 
 archive files that were replicated from the old master via streaming 
 replication. If those files were not already archived in the master before 
 the promotion, they are not archived at all. That's not good if you wanted to 
 restore from a base backup + the WAL archive later.
 
 The basic setup is a master server, a standby, a WAL archive that's shared by 
 both, and streaming replication between the master and standby. This should 
 be a very common setup in the field, so how are people doing it in practice? 
 Just live with the risk that you might miss some files in the archive if you 
 promote? Don't even realize there's a problem? Something else?

Yes, I do live like that (with streaming replication and shared archive between 
master and replicas) and don’t even realize there’s a problem :( And I think 
I’m not the only one. Maybe at least a note should be added to the 
documentation?

 
 And how would we like it to work?
 
 There was some discussion in August on enabling WAL archiving in the standby, 
 always [3]. That's a related idea, but it assumes that you have a separate 
 archive in the master and the standby. The problem at promotion happens when 
 you have a shared archive between the master and standby.

AFAIK most people use the scheme with shared archive.

 
 [1] 
 http://www.postgresql.org/message-id/CAHGQGwHVYqbX=a+zo+avfbvhlgoypo9g_qdkbabexgxbvgd...@mail.gmail.com
 
 [2] http://www.postgresql.org/message-id/20140904175036.310c6466@erg
 
 [3] 
 http://www.postgresql.org/message-id/CAHGQGwHNMs-syU=mevsesthna+exd9pfo_ohhfpjcwovayr...@mail.gmail.com.
 
 - Heikki
 
 
 -- 
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers


--
Vladimir






Re: [HACKERS] TABLESAMPLE patch

2014-12-16 Thread Petr Jelinek

On 16/12/14 08:43, Jaime Casanova wrote:

On Wed, Dec 10, 2014 at 6:24 PM, Petr Jelinek p...@2ndquadrant.com wrote:

Hello,

Attached is a basic implementation of the TABLESAMPLE clause. It's a SQL
standard clause, and a couple of people have tried to submit it before, so
I think I don't need to explain at length what it does - basically it
returns a random sample of a table using a specified sampling method.
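As a quick illustration of the syntax (a sketch; the table name is made
up, and SYSTEM/BERNOULLI are the SQL-standard methods, which may or may
not match exactly what this patch provides):

-- sample roughly 10 percent of the table's pages (SYSTEM) or rows (BERNOULLI)
SELECT * FROM orders TABLESAMPLE SYSTEM (10);
SELECT * FROM orders TABLESAMPLE BERNOULLI (10) REPEATABLE (42);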



Tablesample, yay!

Sadly, when the jsonb functions patch was committed a few OIDs were
used, so you should update the ones you are using, at least to make
the patch easier to test.


Will do.



The test added for this failed; attached is the diff. I didn't look
into why it failed.



It isn't immediately obvious to me why, will look into it.



Finally, I created a view with a tablesample clause. I used the view
and the tablesample worked; then I dumped and restored, and the
tablesample clause went away... actually pg_get_viewdef() didn't see
it at all.



Yeah, as I mentioned in the submission, the ruleutils support is not 
there yet, so that's expected.



Will look at the patch more closely tomorrow when my brain wakes up ;)



Thanks!



--
 Petr Jelinek  http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] NUMERIC private methods?

2014-12-16 Thread Andrew Gierth
>>>>> "Heikki" == Heikki Linnakangas hlinnakan...@vmware.com writes:

 Heikki> Looking at the weighted_stats code, this probably illustrates
 Heikki> the hoops you had to jump through:

Actually that hoop-jumping expression is almost irrelevant.

The part that hurts (and yes, it's performance that's at issue here,
and not code aesthetics) is not being able to use NumericVar in the
aggregate's transition variable, because that means that every
computed intermediate value is palloc'd and pfree'd twice (once as the
digit buffer of a NumericVar and again as a Numeric datum).
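To make that concrete, here is a sketch of the round-trip Andrew
describes, using the static helper names from numeric.c (illustrative
only; x and y are hypothetical NumericVars, and this only compiles inside
numeric.c itself, which is exactly the problem):

NumericVar	tmp;
Numeric		res;

init_var(&tmp);
add_var(&x, &y, &tmp);		/* palloc #1: tmp's digit buffer */
res = make_result(&tmp);	/* palloc #2: the Numeric datum */
free_var(&tmp);			/* pfree #1: the digit buffer */
/* ... and the Numeric datum itself is pfree'd once consumed (pfree #2),
 * so every intermediate value costs two palloc/pfree pairs. */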

-- 
Andrew (irc:RhodiumToad)


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_rewind in contrib

2014-12-16 Thread Satoshi Nagayasu

Hi,

On 2014/12/12 23:13, Heikki Linnakangas wrote:
 Hi,

 I'd like to include pg_rewind in contrib. I originally wrote it as an
 external project so that I could quickly get it working with the
 existing versions, and because I didn't feel it was quite ready for
 production use yet. Now, with the WAL format changes in master, it is a
 lot more maintainable than before. Many bugs have been fixed since the
 first prototypes, and I think it's fairly robust now.

 I propose that we include pg_rewind in contrib/ now. Attached is a patch
 for that. It just includes the latest sources from the current pg_rewind
 repository at https://github.com/vmware/pg_rewind. It is released under
 the PostgreSQL license.

 For those who are not familiar with pg_rewind, it's a tool that allows
 repurposing an old master server as a new standby server, after
 promotion, even if the old master was not shut down cleanly. That's a
 very often requested feature.

I'm looking into pg_rewind with a first test scenario.
My scenario is here:

https://github.com/snaga/pg_rewind_test/blob/master/pg_rewind_test.sh

At least, I think the file descriptor srcfd should be closed before
exiting copy_file_range(). I got a "can't open file" error with
"too many open files" while running pg_rewind.


diff --git a/contrib/pg_rewind/copy_fetch.c b/contrib/pg_rewind/copy_fetch.c
index bea1b09..5a8cc8e 100644
--- a/contrib/pg_rewind/copy_fetch.c
+++ b/contrib/pg_rewind/copy_fetch.c
@@ -280,6 +280,8 @@ copy_file_range(const char *path, off_t begin, off_t end, bool trunc)

write_file_range(buf, begin, readlen);
begin += readlen;
}
+
+   close(srcfd);
 }

 /*


And I have one question here.

pg_rewind assumes that the source PostgreSQL server has had at least one
checkpoint after getting promoted. I think the target timeline ID in the
pg_control file is only available after the first checkpoint. Right?

Regards,








--
NAGAYASU Satoshi sn...@uptime.jp


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GiST kNN search queue (Re: KNN-GiST with recheck)

2014-12-16 Thread Heikki Linnakangas

On 12/15/2014 11:59 PM, Jeff Janes wrote:

On Mon, Dec 15, 2014 at 5:08 AM, Heikki Linnakangas hlinnakan...@vmware.com

wrote:


Here's a new version of the patch. It now uses the same pairing heap code

that I posted in the other thread ("advance local xmin more aggressively",
http://www.postgresql.org/message-id/5488acf0.8050...@vmware.com). The
pairingheap_remove() function is unused in this patch, but it is needed by
that other patch.


Under --enable-cassert, this tries to call pairingheap_empty, but that
function doesn't exist.

I looked in the other patch and didn't find it defined there, either.


Ah, I renamed pairingheap_empty to pairingheap_is_empty at the last 
minute, and missed the Asserts. Here's a corrected version.


- Heikki

diff --git a/src/backend/access/gist/gistget.c b/src/backend/access/gist/gistget.c
index 7a8692b..e5eb6f6 100644
--- a/src/backend/access/gist/gistget.c
+++ b/src/backend/access/gist/gistget.c
@@ -18,6 +18,7 @@
 #include "access/relscan.h"
 #include "miscadmin.h"
 #include "pgstat.h"
+#include "lib/pairingheap.h"
 #include "utils/builtins.h"
 #include "utils/memutils.h"
 #include "utils/rel.h"
@@ -243,8 +244,6 @@ gistScanPage(IndexScanDesc scan, GISTSearchItem *pageItem, double *myDistances,
 	GISTPageOpaque opaque;
 	OffsetNumber maxoff;
 	OffsetNumber i;
-	GISTSearchTreeItem *tmpItem = so->tmpTreeItem;
-	bool		isNew;
 	MemoryContext oldcxt;
 
 	Assert(!GISTSearchItemIsHeap(*pageItem));
@@ -275,18 +274,15 @@ gistScanPage(IndexScanDesc scan, GISTSearchItem *pageItem, double *myDistances,
 		oldcxt = MemoryContextSwitchTo(so->queueCxt);
 
 		/* Create new GISTSearchItem for the right sibling index page */
-		item = palloc(sizeof(GISTSearchItem));
-		item->next = NULL;
+		item = palloc(SizeOfGISTSearchItem(scan->numberOfOrderBys));
 		item->blkno = opaque->rightlink;
 		item->data.parentlsn = pageItem->data.parentlsn;
 
 		/* Insert it into the queue using same distances as for this page */
-		tmpItem->head = item;
-		tmpItem->lastHeap = NULL;
-		memcpy(tmpItem->distances, myDistances,
+		memcpy(item->distances, myDistances,
 			   sizeof(double) * scan->numberOfOrderBys);
 
-		(void) rb_insert(so->queue, (RBNode *) tmpItem, &isNew);
+		pairingheap_add(so->queue, &item->phNode);
 
 		MemoryContextSwitchTo(oldcxt);
 	}
@@ -348,8 +344,7 @@ gistScanPage(IndexScanDesc scan, GISTSearchItem *pageItem, double *myDistances,
 			oldcxt = MemoryContextSwitchTo(so->queueCxt);
 
 			/* Create new GISTSearchItem for this item */
-			item = palloc(sizeof(GISTSearchItem));
-			item->next = NULL;
+			item = palloc(SizeOfGISTSearchItem(scan->numberOfOrderBys));
 
 			if (GistPageIsLeaf(page))
 			{
@@ -372,12 +367,10 @@ gistScanPage(IndexScanDesc scan, GISTSearchItem *pageItem, double *myDistances,
 			}
 
 			/* Insert it into the queue using new distance data */
-			tmpItem->head = item;
-			tmpItem->lastHeap = GISTSearchItemIsHeap(*item) ? item : NULL;
-			memcpy(tmpItem->distances, so->distances,
+			memcpy(item->distances, so->distances,
 			   sizeof(double) * scan->numberOfOrderBys);
 
-			(void) rb_insert(so->queue, (RBNode *) tmpItem, &isNew);
+			pairingheap_add(so->queue, &item->phNode);
 
 			MemoryContextSwitchTo(oldcxt);
 		}
@@ -390,44 +383,24 @@ gistScanPage(IndexScanDesc scan, GISTSearchItem *pageItem, double *myDistances,
  * Extract next item (in order) from search queue
  *
  * Returns a GISTSearchItem or NULL.  Caller must pfree item when done with it.
- *
- * NOTE: on successful return, so->curTreeItem is the GISTSearchTreeItem that
- * contained the result item.  Callers can use so->curTreeItem->distances as
- * the distances value for the item.
  */
 static GISTSearchItem *
 getNextGISTSearchItem(GISTScanOpaque so)
 {
-	for (;;)
-	{
-		GISTSearchItem *item;
-
-		/* Update curTreeItem if we don't have one */
-		if (so->curTreeItem == NULL)
-		{
-			so->curTreeItem = (GISTSearchTreeItem *) rb_leftmost(so->queue);
-			/* Done when tree is empty */
-			if (so->curTreeItem == NULL)
-				break;
-		}
+	GISTSearchItem *item;
 
-		item = so->curTreeItem->head;
-		if (item != NULL)
-		{
-			/* Delink item from chain */
-			so->curTreeItem->head = item->next;
-			if (item == so->curTreeItem->lastHeap)
-				so->curTreeItem->lastHeap = NULL;
-			/* Return item; caller is responsible to pfree it */
-			return item;
-		}
-
-		/* curTreeItem is exhausted, so remove it from rbtree */
-		rb_delete(so->queue, (RBNode *) so->curTreeItem);
-		so->curTreeItem = NULL;
+	if (!pairingheap_is_empty(so->queue))
+	{
+		item = (GISTSearchItem *) pairingheap_remove_first(so->queue);
+	}
+	else
+	{
+		/* Done when both heaps are empty */
+		item = NULL;
 	}
 
-	return NULL;
+	/* Return item; caller is responsible to pfree it */
+	return item;
 }
 
 /*
@@ -458,7 +431,7 @@ getNextNearest(IndexScanDesc scan)
 			/* visit an index page, extract its items into queue */
 			CHECK_FOR_INTERRUPTS();
 
-			gistScanPage(scan, item, so->curTreeItem->distances, NULL, NULL);
+			gistScanPage(scan, item, item->distances, NULL, NULL);
 		}
 
 		pfree(item);
@@ -491,7 +464,6 @@ 

Re: [HACKERS] pg_basebackup vs. Windows and tablespaces

2014-12-16 Thread Amit Kapila
On Tue, Dec 16, 2014 at 12:58 PM, Amit Kapila amit.kapil...@gmail.com
wrote:

 On Sun, Dec 14, 2014 at 11:54 AM, Amit Kapila amit.kapil...@gmail.com
wrote:
  On Sat, Dec 13, 2014 at 10:48 PM, Tom Lane t...@sss.pgh.pa.us wrote:
   Sorry, I was not paying very close attention to this thread and missed
   the request for comments.  A few such:
  
   1. The patch is completely naive about what might be in the symlink
   path string; eg embedded spaces in the path would break it.  On at
   least some platforms, newlines could be in the path as well.  I'm not
   sure about how to guard against this while maintaining human
readability
   of the file.
  
 
  I will look into this and see what best can be done.
 

 I have chosen #3 ("Make pg_basebackup check for and fail on symlinks
 containing characters (currently newline only) it can't handle") from the
 different options suggested by Tom.  This keeps the format the same as
 before and human-readable.


Actually, instead of an error a warning is issued here, and that particular
path (containing a newline) will be skipped.  This is similar to what
is already done when there is any problem reading link paths.
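In other words, something along these lines (a sketch of the described
behaviour, not the actual patch; linkpath and fullpath are illustrative
names):

/* Skip tablespace links whose target contains a newline, with a warning,
 * instead of erroring out. */
if (strchr(linkpath, '\n') != NULL)
{
	ereport(WARNING,
			(errmsg("symbolic link \"%s\" target contains a newline, skipping",
					fullpath)));
	continue;
}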

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] Commitfest problems

2014-12-16 Thread Mark Cave-Ayland
On 15/12/14 19:08, Robert Haas wrote:

 On Sun, Dec 14, 2014 at 2:24 PM, Mark Cave-Ayland
 mark.cave-ayl...@ilande.co.uk wrote:
 However if it were posted as part of patchset with a subject of [PATCH]
 gram.y: add EXCLUDED pseudo-alias to bison grammar then this may pique
 my interest enough to review the changes to the grammar rules, with the
 barrier for entry reduced to understanding just the PostgreSQL-specific
 parts.
 
 Meh.  Such a patch couldn't possibly compile or work without modifying
 other parts of the tree.  And I'm emphatically opposed to splitting a
 commit that would have taken the tree from one working state to
 another working state into a series of patches that would have taken
 it from a working state through various non-working states and
 eventually another working state.  Furthermore, if you really want to
 review that specific part of the patch, just look for what changed in
 gram.y, perhaps by applying the patch and doing git diff
 src/backend/parser/gram.y.  This is really not hard.

Okay, I agree that the suggested subject above was a little misleading,
so let me try to clarify this further.

If I were aiming to deliver this as an individual patch as part of a
patchset, my target for this particular patch would be to combine the
bison grammar changes *and* the minimal underlying code/structure changes
into a patch that compiles but adds no new features.

So why is this useful? Because the parser in PostgreSQL is important, and
people have sweated many hours to ensure its performance is the best it
can be. By having a checkpoint with just the basic parser changes, it
becomes really easy to bisect performance issues down to just this one
particular area rather than to the feature as a whole.

And as per my original message, it is a much lower barrier to entry for a
potential reviewer who has previous bison experience (and is curious
about PostgreSQL) to review the basic rule changes than a complete new
feature.

 I certainly agree that there are cases where patch authors could and
 should put more work into decomposing large patches into bite-sized
 chunks.  But I don't think that's always possible, and I don't think
 that, for example, applying BRIN in N separate pieces would have been
 a step forward.  It's one feature, not many.

I completely agree with you here. Maybe this isn't exactly the same for
PostgreSQL but in general for a new QEMU feature I could expect a
patchset of around 12 patches to be posted. Of those 12 patches,
probably patches 1-9 are internal API changes, code refactoring and
preparation work, patch 10 is a larger patch containing the feature, and
patches 11-12 are for tidy-up and removing unused code sections.

Even with the best patch review process in the world, there will still
be patches that introduce bugs, and the bugs are pretty much guaranteed
to be caused by patches 1-9.

Imagine a subtle bug in an internal API change which exhibits itself not
in the feature added by the patchset itself but in another, mostly
unrelated part of the system; maybe a pass-by-value API was changed to a
pass-by-reference API in patches 1-9, and this tickles a bug due to an
unexpected-lifetime heap access elsewhere, causing random data to be
returned. Being able to bisect this issue down to a single specific patch
suddenly becomes very useful indeed.


ATB,

Mark.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_rewind in contrib

2014-12-16 Thread Heikki Linnakangas

On 12/16/2014 11:23 AM, Satoshi Nagayasu wrote:

Hi,

On 2014/12/12 23:13, Heikki Linnakangas wrote:
   Hi,
  
   I'd like to include pg_rewind in contrib. I originally wrote it as an
   external project so that I could quickly get it working with the
   existing versions, and because I didn't feel it was quite ready for
   production use yet. Now, with the WAL format changes in master, it is a
   lot more maintainable than before. Many bugs have been fixed since the
   first prototypes, and I think it's fairly robust now.
  
   I propose that we include pg_rewind in contrib/ now. Attached is a patch
   for that. It just includes the latest sources from the current pg_rewind
   repository at https://github.com/vmware/pg_rewind. It is released under
   the PostgreSQL license.
  
   For those who are not familiar with pg_rewind, it's a tool that allows
   repurposing an old master server as a new standby server, after
   promotion, even if the old master was not shut down cleanly. That's a
   very often requested feature.

I'm looking into pg_rewind with a first test scenario.
My scenario is here:

https://github.com/snaga/pg_rewind_test/blob/master/pg_rewind_test.sh

At least, I think the file descriptor srcfd should be closed before
exiting copy_file_range(). I got a "can't open file" error with
"too many open files" while running pg_rewind.


diff --git a/contrib/pg_rewind/copy_fetch.c b/contrib/pg_rewind/copy_fetch.c
index bea1b09..5a8cc8e 100644
--- a/contrib/pg_rewind/copy_fetch.c
+++ b/contrib/pg_rewind/copy_fetch.c
@@ -280,6 +280,8 @@ copy_file_range(const char *path, off_t begin, off_t end, bool trunc)
  write_file_range(buf, begin, readlen);
  begin += readlen;
  }
+
+   close(srcfd);
   }

   /*



Yep, good catch. I pushed a fix to the pg_rewind repository at github.


And I have one question here.

pg_rewind assumes that the source PostgreSQL server has had at least one
checkpoint after getting promoted. I think the target timeline ID in the
pg_control file is only available after the first checkpoint. Right?


Yes, it does assume that the source server (= old standby, new master) 
has had at least one checkpoint after promotion. It probably should be 
more explicit about it: if there hasn't been a checkpoint, you will 
currently get the error "source and target cluster are both on the same 
timeline", which isn't very informative.


I assume that by "target timeline ID" you meant the timeline ID of the 
source server, i.e. the timeline that the target server should be 
rewound to.


- Heikki



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Commitfest problems

2014-12-16 Thread Mark Cave-Ayland
On 15/12/14 19:24, Andrew Dunstan wrote:

 On 12/15/2014 02:08 PM, Robert Haas wrote:
 On Sun, Dec 14, 2014 at 2:24 PM, Mark Cave-Ayland
 mark.cave-ayl...@ilande.co.uk wrote:
 However if it were posted as part of patchset with a subject of [PATCH]
 gram.y: add EXCLUDED pseudo-alias to bison grammar then this may pique
 my interest enough to review the changes to the grammar rules, with the
 barrier for entry reduced to understanding just the PostgreSQL-specific
 parts.
 Meh.  Such a patch couldn't possibly compile or work without modifying
 other parts of the tree.  And I'm emphatically opposed to splitting a
 commit that would have taken the tree from one working state to
 another working state into a series of patches that would have taken
 it from a working state through various non-working states and
 eventually another working state.  Furthermore, if you really want to
 review that specific part of the patch, just look for what changed in
 gram.y, perhaps by applying the patch and doing git diff
 src/backend/parser/gram.y.  This is really not hard.

 I certainly agree that there are cases where patch authors could and
 should put more work into decomposing large patches into bite-sized
 chunks.  But I don't think that's always possible, and I don't think
 that, for example, applying BRIN in N separate pieces would have been
 a step forward.  It's one feature, not many.

 
 +1
 
 I have in the past found the bunch of patches approach to be more that
 somewhat troublesome, especially when one gets here is an update to
 patch nn of the series and one has to try to compose a coherent set of
 patches in order to test them. A case in point I recall was the original
 Foreign Data Wrapper patchset.

In practice, people don't tend to post updates to individual patches in
that way. What happens is that after a week or so of reviews, people go
away and rework the patch and come back with a complete updated patchset
clearly marked as [PATCHv2] with the same subject line and an updated
cover letter describing the changes, so a complete coherent patchset is
always available.

Now as I mentioned previously, one of the disadvantages of splitting
patches in this way is that mailing list volume tends to grow quite
quickly - hence the use of [PATCH] filters and system identifiers in the
subject line to help mitigate this.

Whilst the volume of mail was a shock to begin with, having stuck with
it for a while I personally find that the benefits to development
outweigh the costs. Each individual project is different, but I believe
that there are good ideas here that can be used to improve workflow,
particularly when it comes to patch review.


ATB,

Mark.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_rewind in contrib

2014-12-16 Thread Satoshi Nagayasu

On 2014/12/16 18:37, Heikki Linnakangas wrote:

On 12/16/2014 11:23 AM, Satoshi Nagayasu wrote:

pg_rewind assumes that the source PostgreSQL server has had at least one
checkpoint after getting promoted. I think the target timeline ID in the
pg_control file is only available after the first checkpoint. Right?


Yes, it does assume that the source server (= old standby, new master)
has had at least one checkpoint after promotion. It probably should be
more explicit about it: if there hasn't been a checkpoint, you will
currently get the error "source and target cluster are both on the same
timeline", which isn't very informative.


Yes, I got the message, so I could find the checkpoint thing.
It could be more informative, or some hint message could be added.


I assume that by "target timeline ID" you meant the timeline ID of the
source server, i.e. the timeline that the target server should be
rewound to.


Yes.
The target timeline I mean here is the timeline ID coming from the promoted 
master (== source server == old standby).


I got it. Thanks.

I'm going to look into more details.

Regards,






--
NAGAYASU Satoshi sn...@uptime.jp


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Using 128-bit integers for sum, avg and statistics aggregates

2014-12-16 Thread David Rowley
On 14 November 2014 at 13:57, Andreas Karlsson andr...@proxel.se wrote:

 On 11/13/2014 03:38 AM, Alvaro Herrera wrote:

 configure is a generated file.  If your patch touches it but not
 configure.in, there is a problem.


 Thanks for pointing it out, I have now fixed it.



Hi Andreas,

These are some very promising performance increases.

I've done a quick pass of reading the patch. I currently don't have a
system with a 128bit int type, but I'm working on that.

Just a couple of things that could do with being fixed:


This fragment needs to be fixed to put braces on new lines:

if (state) {
	numstate.N = state->N;
	int16_to_numericvar(state->sumX, &numstate.sumX);
	int16_to_numericvar(state->sumX2, &numstate.sumX2);
} else {
	numstate.N = 0;
}



It also looks like your OIDs have been nabbed by some jsonb stuff.

DETAIL:  Key (oid)=(3267) is duplicated.

I'm also wondering why in numeric_int16_sum() you're doing:

#else
return numeric_sum(fcinfo);
#endif

but you're not doing return int8_accum() in the #else part
of int8_avg_accum().
The same goes for int8_accum_inv() and int8_avg_accum_inv(), though perhaps
you're doing it here because of the elog() showing the wrong function name.
Although that's a pretty much shouldn't ever happen case that mightn't be
worth worrying about.
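For what it's worth, the symmetric fallback would look something like this
(a sketch; the function names are from the patch under review, but the
guard macro name and the elided 128-bit body are assumptions on my part):

Datum
int8_avg_accum(PG_FUNCTION_ARGS)
{
#ifdef HAVE_INT128
	/* ... 128-bit accumulation path from the patch ... */
#else
	/* no native int128: fall back to the existing 64-bit transition
	 * function, mirroring what numeric_int16_sum does with numeric_sum */
	return int8_accum(fcinfo);
#endif
}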


Also, since I don't currently have a machine with a working int128, I
decided to benchmark master vs. patched to see if there was any sort of
performance regression due to numeric_int16_sum calling numeric_sum, but
I'm a bit confused by the performance results: there seems to be quite a
good increase in performance with the patch, where I'd have expected no
change.

CREATE TABLE t (value bigint not null);
insert into t select a.a from generate_series(1,500) a(a);
vacuum;

int128_bench.sql has select sum(value) from t;

Master:
D:\Postgres\installb\bin\pgbench.exe -f d:\int128_bench.sql -n -T 120
postgres
transaction type: Custom query
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
duration: 120 s
number of transactions actually processed: 92
latency average: 1304.348 ms
tps = 0.762531 (including connections establishing)
tps = 0.762642 (excluding connections establishing)

Patched:
D:\Postgres\install\bin\pgbench.exe -f d:\int128_bench.sql -n -T 120
postgres
transaction type: Custom query
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
duration: 120 s
number of transactions actually processed: 99
latency average: 1212.121 ms
tps = 0.818067 (including connections establishing)
tps = 0.818199 (excluding connections establishing)

Postgresql.conf is the same in both instances.
I've yet to discover why this is any faster.

Regards

David Rowley


Re: [HACKERS] Commitfest problems

2014-12-16 Thread Craig Ringer
On 12/16/2014 05:53 PM, Mark Cave-Ayland wrote:
 In practice, people don't tend to post updates to individual patches in
 that way.

Exactly. Much like if you push a new revision of a working branch, you
repost all the changesets - or should.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Commitfest problems

2014-12-16 Thread Mark Cave-Ayland
On 15/12/14 19:27, Robert Haas wrote:

 On Sun, Dec 14, 2014 at 4:53 PM, Mark Cave-Ayland
 mark.cave-ayl...@ilande.co.uk wrote:
 What I find frustrating is that I've come back from a workflow where
 I've been reviewing/testing patches within months of joining a project
 because the barrier for entry has been so low, and yet even with much
 longer experience of the PostgreSQL codebase I feel I can't do the same
 for patches submitted to the commitfest.

 And why is this? It's because while I know very specific areas of the
 code well, many patches span different areas of the source tree of which
 I have little and/or no experience, which when supplied as a single
 monolithic patch makes it impossible for me to give meaningful review to
 all but a tiny proportion of them.
 
 So, there are certainly some large patches that do that, and they
 typically require a very senior reviewer.  But there are lots of small
 patches too, touching little enough that you can learn enough to give
 them a decent review even if you've never looked at that before.  I
 find myself in that situation as a reviewer and committer pretty
 regularly; being a committer doesn't magically make you an expert on
 the entire code base.  You can (and I do) focus your effort on the
 things you know best, but you have to step outside your comfort zone
 sometimes, or you never learn anything new.

Right, which is why I'm advocating the approach of splitting patches into
relevant chunks so that experts in those areas can review them in
parallel. And even better, if I then want to start digging into other
parts of the system as time and interest allow, I can choose to begin by
picking a subject line from a patchset and going through that small
individual patch in detail rather than a single large patch.

It has often been suggested that people learn best when studying a mix
of 80% of things they already know compared to 20% of things they don't.
At least with more granular patches people can find a comfortable ratio
of new/old material vs. a single large patch which could be 70-80% of
completely new material and therefore much more difficult to pick-up.

Another thought to ponder here: because smaller patches take less time to
review, for the same time cost a reviewer with experience in one specific
area can in theory review and provide feedback on another 2-3 patchsets,
which should help relieve patch-review starvation across patchset
submissions.


ATB,

Mark.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Commitfest problems

2014-12-16 Thread Simon Riggs
On 15 December 2014 at 19:52, Josh Berkus j...@agliodbs.com wrote:
 On 12/15/2014 11:27 AM, Robert Haas wrote:
 I feel like we used to be better at encouraging people to participate
 in the CF even if they were not experts, and to do the best they can
 based on what they did know.  That was a helpful dynamic.  Sure, the
 reviews weren't perfect, but more people helped, and reviewing some of
 the patch well and some of it in a more cursory manner is way better
 than reviewing none of it at all.

 Well, it was strongly expressed to me by a number of senior contributors
 on this list and at the developer meeting that inexpert reviews were not
 really wanted, needed or helpful.  As such, I stopped recruiting new
 reviewers (and, for that matter, doing them myself).  I don't know if
 the same goes for anyone else.

I don't remember saying that, hearing it, or agreeing with it, and I
don't agree with it now.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Commitfest problems

2014-12-16 Thread Mark Cave-Ayland
On 16/12/14 04:57, Noah Misch wrote:

 But that doesn't mean we should be turning anyone away. We should not.
 
 +1.  Some of the best reviews I've seen are ones where the reviewer expressed
 doubts about the review's quality, so don't let such doubts keep you from
 participating.  Every defect you catch saves a committer time; a review that
 finds 3 of the 10 defects in a patch is still a big help.  Some patch
 submissions waste the community's time, but it's almost impossible to waste
 the community's time by posting a patch review.
 
 Confusion may have arisen due to statements that we need more expert
 reviewers, which is also true.  (When an expert writes a sufficiently-complex
 patch, it's important that a second expert examine the patch at some point.)
 If you're a novice reviewer, your reviews do help to solve that problem by
 reducing the workload on expert reviewers.

I should add that by reducing the barrier to entry for patch review,
there is definitely potential to find common errors before a patch gets
analysed in detail by a committer.

For example I may not know a lot about certain PostgreSQL subsystems but
from a decent C coding background I can pick out certain suspicious
things in patches, e.g. potential overflows, pointer arithmetic bugs,
spelling mistakes/incorrect comments etc. and flag them to the original
submitter to double-check.


ATB,

Mark.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Possibly a comment typo in xlogrecord.h

2014-12-16 Thread Rahila Syed
Hello,

The comment before declaration of XLogRecordBlockHeader says

* 'data_length' is the length of the payload data associated with this,
 * and includes the possible full-page image, and rmgr-specific data. It

IIUC, data_length does not include associated full page image length.
Attached patch changes the comment as follows:

- * and includes the possible full-page image, and rmgr-specific data. It
- * does not include the XLogRecordBlockHeader struct itself.
+ * and includes the rmgr-specific data. It does not include the possible
+ * full page image and XLogRecordBlockHeader struct itself.

Thank you,
Rahila Syed


correct_comment_typo_XLogRecordBlockHeader.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Commitfest problems

2014-12-16 Thread Marko Tiikkaja

On 12/16/14 11:26 AM, Mark Cave-Ayland wrote:

On 15/12/14 19:27, Robert Haas wrote:

So, there are certainly some large patches that do that, and they
typically require a very senior reviewer.  But there are lots of small
patches too, touching little enough that you can learn enough to give
them a decent review even if you've never looked at that before.  I
find myself in that situation as a reviewer and committer pretty
regularly; being a committer doesn't magically make you an expert on
the entire code base.  You can (and I do) focus your effort on the
things you know best, but you have to step outside your comfort zone
sometimes, or you never learn anything new.


Right. Which is why I'm advocating the approach of splitting patches in
relevant chunks so that experts in those areas can review them in
parallel.


I don't see how splitting patches up would help with that.  I often look 
at only the parts of patches that touch the things I've worked with 
before.  And in doing that, I've found that having the context there is 
absolutely crucial almost every single time, since I commonly ask myself 
"Why do we need to do this to implement feature X?", and only looking at 
the rest of the complete patch (or patch set, however you want to think 
about it) reveals that.


Of course, me looking at parts of patches, finding nothing questionable 
and not sending an email about my findings (or lack thereof) hardly 
counts as review, so somebody else still has to review the actual 
patch as a whole.  Nor do I get any credit for doing any of that, which 
might be a show-stopper for someone else.  But I think that's just 
because I'm not doing it correctly.  I don't see why someone couldn't 
comment on a patch saying "I've reviewed the grammar changes, and they 
look good to me."



.marko


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Commitfest problems

2014-12-16 Thread Mark Cave-Ayland
On 16/12/14 07:33, David Rowley wrote:

 On 16 December 2014 at 18:18, Josh Berkus j...@agliodbs.com wrote:
 
  Man. You're equating stuff that's not the same. You didn't get your way
  (and I'm tentatively on your side on that one) and take that to imply
  that we don't want more reviewers.
 
 During that thread a couple people said that novice reviewers added no
 value to the review process, and nobody argued with them then.  I've
 also been told this to my face at pgCon, and when I've tried organizing
 patch review events.  I got the message, which is why I stopped trying
 to get new reviewers.
 
 And frankly: if we're opposed to giving credit to patch reviewers, we're
 opposed to having them.
 
 
 
 I'd just like to add something which might be flying below the radar of
 more senior people. There are people out there (like me) working on
 PostgreSQL more for the challenge and perhaps the love of the product,
 who make absolutely zero money out of it. For these people getting
 credit where it's due is very important. I'm pretty happy with this at
 the moment and I can't imagine any situation where not crediting
 reviewers would be beneficial to anyone.

This is exactly where I am at the moment, having previously been more
involved with the development side of PostgreSQL during the past.

Personally having a credit as a patch reviewer isn't particularly
important to me, since mail archives are good enough these days that if
people do query my contributions towards projects then I can point them
towards any reasonable search engine.

The biggest constraint on my ability to contribute is *time*.

Consider my situation as a reviewer: I am currently on the mailing
lists of two well-known open source projects, and I also have a day job
and a home life to contend with.

For the spare time that I have for review, one of these projects
requires me to download attachment(s), apply them to a git tree
(hopefully it still applies), run a complete make check regression
series, try to analyse a patch which will often reference parts I
have no understanding of, and then write up a coherent email and
submit it to the mailing list. Realistically to do all this and provide
a review that is going to be of use to a committer is going to take a
minimum of 1-2 hours, and even then there's a good chance that I've
easily missed obvious bugs in the parts of the system I don't understand
well.

For the second project, I can skim through my inbox daily picking up
specific areas I work on/are interested in, hit reply to add a couple of
lines of inline comments to the patch and send feedback to the
author/list in just a few minutes.

The obvious question is, of course, which project gets the majority
share of my spare review time?


ATB,

Mark.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Commitfest problems

2014-12-16 Thread Mark Cave-Ayland
On 16/12/14 10:49, Marko Tiikkaja wrote:

 On 12/16/14 11:26 AM, Mark Cave-Ayland wrote:
 On 15/12/14 19:27, Robert Haas wrote:
 So, there are certainly some large patches that do that, and they
 typically require a very senior reviewer.  But there are lots of small
 patches too, touching little enough that you can learn enough to give
 them a decent review even if you've never looked at that before.  I
 find myself in that situation as a reviewer and committer pretty
 regularly; being a committer doesn't magically make you an expert on
 the entire code base.  You can (and I do) focus your effort on the
 things you know best, but you have to step outside your comfort zone
 sometimes, or you never learn anything new.

 Right. Which is why I'm advocating the approach of splitting patches in
 relevant chunks so that experts in those areas can review them in
 parallel.
 
 I don't see how splitting patches up would help with that.  I often look
 at only the parts of patches that touch the things I've worked with
 before.  And in doing that, I've found that having the context there is
 absolutely crucial almost every single time, since I commonly ask myself
 "Why do we need to do this to implement feature X?", and only looking at
 the rest of the complete patch (or patch set, however you want to think
 about it) reveals that.

I've already covered this earlier in the thread so I won't go into too
much detail, but the overall flow of the patchset is described by the
cover letter (please feel free to look at the example link I posted).

In terms of individual patches within a patchset, if the combination of
the cover letter and the individual commit message doesn't give you enough
context, then the developer needs to fix this: either the patchset has
been split at a nonsensical place, or the cover letter and/or commit
message need to be revised.

 Of course, me looking at parts of patches, finding nothing questionable
 and not sending an email about my findings (or lack thereof) hardly
 counts as review, so somebody else still has to review the actual
 patch as a whole.  Nor do I get any credit for doing any of that, which
 might be a show-stopper for someone else.  But I think that's just
 because I'm not doing it correctly.  I don't see why someone couldn't
 comment on a patch saying "I've reviewed the grammar changes, and they
 look good to me."

Sure, there is always scope for doing that, which is what splitting
patches and constant review encourage. In terms of the current
commitfest system, the process for review is clearly documented and, as
I've pointed out in my response to David, extremely heavyweight in
comparison.


ATB,

Mark.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] NUMERIC private methods?

2014-12-16 Thread David Fetter
On Tue, Dec 16, 2014 at 09:01:47AM +, Andrew Gierth wrote:
>>>>> "Heikki" == Heikki Linnakangas hlinnakan...@vmware.com writes:
 
  Heikki> Looking at the weighted_stats code, this probably illustrates
  Heikki> the hoops you had to jump through:
 
 Actually that hoop-jumping expression is almost irrelevant.

Right.  Not that it made things fun or easy (at least for me) to
debug, as you can see by the git history.

 The part that hurts (and yes, it's performance that's at issue here,
 and not code aesthetics) is not being able to use NumericVar in the
 aggregate's transition variable, because that means that every
 computed intermediate value is palloc'd and pfree'd twice (once as
 the digit buffer of a NumericVar and again as a Numeric datum).

Yep.  Performance of NUMERIC might yet get some of the love it
deserves.  Perhaps something along the lines of a 128-bit default
structure with a promotion/demotion scheme for larger representations.
Until then, those of us writing extensions are stuck with heaps of
extra instructions that could easily be trimmed away to good
effect.

Cheers,
David.
-- 
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] KNN-GiST with recheck

2014-12-16 Thread Heikki Linnakangas

On 10/06/2014 12:36 PM, Emre Hasegeli wrote:

Thanks. The main question now is design of this patch. Currently, it does
all the work inside access method. We already have some discussion of pro
and cons of this method. I would like to clarify alternatives now. I can
see following way:

1. Implement a new executor node which performs sorting by priority queue.
Let's call it "Priority queue". I think it should be a separate node from
the "Sort" node; although "Priority queue" and "Sort" are essentially similar
from the user's view, they would be completely different in implementation.
2. Implement some interface to transfer distance values from the access
method to the "Priority queue" node.


If we assume that all of them need recheck, maybe it can be done
without passing distance values.


No, the executor needs the lower-bound distance value, as calculated by 
the indexam, so that it knows which tuples it can return from the queue 
already. For example, imagine the following items coming from the index:


tuple #    lower bound    actual distance
1          1              1
2          2              10
3          30             30
4          40             40

After the executor has fetched tuple 2 and re-checked the distance, it 
pushes the tuple to the queue. It then fetches tuple 3, with lower bound 
30, and it can now immediately return tuple # 2 from the queue. Because 
10 < 30, there cannot be any more tuples coming from the index that 
would need to go before tuple # 2.


The executor needs the lower bound as calculated by the index, as well 
as the actual distance it calculates itself, to make those decisions.



3. Somehow tell the planner that it could use "Priority queue" in
corresponding cases. I see two ways of doing this:
   - Add a flag to the operator in the opclass indicating that the index
   can only order by the lower bound of "col op value", not by "col op
   value" itself.
   - Define a new relation between operators: the value of one operator
   could be a lower bound for the value of another operator. So, the
   planner can put a "Priority queue" node when lower-bound ordering is
   possible from the index. Also an ALTER OPERATOR command would be
   reasonable, so extensions could upgrade.


I think, it would be better to make it a property of the operator
class.  We can add a column to pg_amop or define another value for
amoppurpose on pg_amop.  Syntax can be something like this:

CREATE OPERATOR CLASS circle_ops DEFAULT
FOR TYPE circle USING gist AS
OPERATOR 15 <-> (circle, point) FOR ORDER BY pg_catalog.float_ops LOWER BOUND;

While looking at it, I realized that the current version of the patch does
not use the sort operator family defined with the operator class.  It
assumes that the distance function will return values compatible with
the operator.  The operator class definition makes me think that there is
no such assumption.


Yeah. I also noticed that the type of the argument passed to the 
consistent function varies, and doesn't necessarily match that declared 
in pg_proc. Looking at gist_point_consistent, the argument type can be a 
point, a polygon, or a circle, depending on the strategy group. But 
it's declared as a point in pg_proc.



Besides overhead, this way makes significant infrastructural changes, so
it may be over-engineering. However, it's probably the cleaner and more
beautiful solution.
I would like to get some feedback from people familiar with KNN-GiST like
Heikki or Tom. What do you think about this? Any other ideas?


I would be happy to test and review the changes.  I think it is nice
to solve the problem in a generalized way by improving the access method
infrastructure.  Definitely, we should have a consensus on the design
before working on the infrastructure changes.


I took a stab at this. I added the reorder queue directly to the Index 
Scan node, rather than adding a whole new node type for it. It seems 
reasonable, as Index Scan is responsible for rechecking the quals, too, 
even though re-ordering the tuples is more complicated than rechecking 
quals.


To recap, the idea is that the index can define an ordering op, even if 
it cannot return the tuples in exactly the right order. It is enough 
that for each tuple, it returns a lower bound of the expression that is 
used for sorting. For example, for ORDER BY key - column, it is 
enough that it returns a lower bound of key - column for each tuple. 
The index must return the tuples ordered by the lower bounds. The 
executor re-checks the expressions, and re-orders the tuples to the 
correct order.
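In pseudocode, the re-ordering loop amounts to something like this (a
sketch with made-up names, not the patch's actual API):

/* Return the next tuple in true distance order.  'queue' holds
 * already-rechecked tuples ordered by actual distance; the index
 * returns tuples ordered by their lower-bound distance.
 * next_index_lower_bound() is taken as +infinity once the index
 * is exhausted. */
for (;;)
{
	if (!queue_empty(queue) &&
		queue_min_distance(queue) <= next_index_lower_bound(scan))
		return queue_pop(queue);	/* nothing later can sort before it */

	tuple = index_getnext(scan);	/* next tuple, ordered by lower bound */
	if (tuple == NULL)
		return NULL;			/* queue empty and index exhausted */

	tuple->distance = recheck_distance(tuple);	/* compute actual distance */
	queue_push(queue, tuple);
}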


Patch attached. It should be applied on top of my pairing heap patch at 
http://www.postgresql.org/message-id/548ffa2c.7060...@vmware.com. Some 
caveats:


* The signature of the distance function is unchanged; it doesn't get a 
recheck argument. It is just assumed that if the consistent function 
sets the recheck flag, then the distance needs to be rechecked as well. 
We might want to add the recheck argument, like you, Alexander, did in 
your 

Re: [HACKERS] Commitfest problems

2014-12-16 Thread Claudio Freire
On Tue, Dec 16, 2014 at 8:09 AM, Mark Cave-Ayland
mark.cave-ayl...@ilande.co.uk wrote:
 For the spare time that I have for review, one of these projects
 requires me to download attachment(s), apply them to a git tree
 (hopefully it still applies), run a complete make check regression
 series, try to analyse a patch which will often reference parts I
 have no understanding of, and then write up a coherent email and
 submit it to the mailing list. Realistically to do all this and provide
 a review that is going to be of use to a committer is going to take a
 minimum of 1-2 hours, and even then there's a good chance that I've
 easily missed obvious bugs in the parts of the system I don't understand
 well.

 For the second project, I can skim through my inbox daily picking up
 specific areas I work on/are interested in, hit reply to add a couple of
 lines of inline comments to the patch and send feedback to the
 author/list in just a few minutes.

Notice, though, that on the second project the review is made in the
air: you didn't build it or run tests, you're guessing how the code
performs rather than seeing it perform, so the review is lighter than
the first.

Some of the effort of the first review is pointless, but not all of
it. Running make check may be a good task for a CI tool, but if you
ignore the result of make check, your review is missing an important
bit of information to be weighed.

There's something to be learned from the open build service
( http://openbuildservice.org/ ): there, review requests contain the
results of the build and rpmlint (it's for RPMs) right in the interface.
It makes the review easy yet informed.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Possibly a comment typo in xlogrecord.h

2014-12-16 Thread Heikki Linnakangas

On 12/16/2014 12:44 PM, Rahila Syed wrote:

Hello,

The comment before declaration of XLogRecordBlockHeader says


* 'data_length' is the length of the payload data associated with this,
* and includes the possible full-page image, and rmgr-specific data. It


IIUC, data_length does not include associated full page image length.
Attached patch changes the comment as follows:

- * and includes the possible full-page image, and rmgr-specific data. It
- * does not include the XLogRecordBlockHeader struct itself.
+ * and includes the rmgr-specific data. It does not include the possible
+ * full page image and XLogRecordBlockHeader struct itself.


Thanks, fixed! I also reworded the comment slightly, hopefully it's more 
readable now.


- Heikki



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Commitfest problems

2014-12-16 Thread David Fetter
On Tue, Dec 16, 2014 at 11:09:34AM +, Mark Cave-Ayland wrote:
 On 16/12/14 07:33, David Rowley wrote:
 
  On 16 December 2014 at 18:18, Josh Berkus j...@agliodbs.com wrote:
  
   Man. You're equating stuff that's not the same. You didn't get your 
  way
   (and I'm tentatively on your side on that one) and take that to imply
   that we don't want more reviewers.
  
  During that thread a couple people said that novice reviewers added no
  value to the review process, and nobody argued with them then.  I've
  also been told this to my face at pgCon, and when I've tried organizing
  patch review events.  I got the message, which is why I stopped trying
  to get new reviewers.
  
  And frankly: if we're opposed to giving credit to patch reviewers, we're
  opposed to having them.
  
  
  
  I'd just like to add something which might be flying below the radar of
  more senior people. There are people out there (like me) working on
  PostgreSQL more for the challenge and perhaps the love of the product,
  who make absolutely zero money out of it. For these people getting
  credit where it's due is very important. I'm pretty happy with this at
  the moment and I can't imagine any situation where not crediting
  reviewers would be beneficial to anyone.
 
 This is exactly where I am at the moment, having previously been more
 involved with the development side of PostgreSQL during the past.
 
 Personally having a credit as a patch reviewer isn't particularly
 important to me, since mail archives are good enough these days that if
 people do query my contributions towards projects then I can point them
 towards any reasonable search engine.
 
 The biggest constraint on my ability to contribute is *time*.
 
 Imagine the situation as a reviewer that I am currently on the mailing
 list for two well-known open source projects and I also have a day job
 and a home life to contend with.
 
 For the spare time that I have for review, one of these projects
 requires me to download attachment(s), apply them to a git tree
 (hopefully it still applies), run a complete make check regression
  series, try to analyse a patch which will often reference parts I
  have no understanding of, and then write up a coherent email and
 submit it to the mailing list. Realistically to do all this and provide
 a review that is going to be of use to a committer is going to take a
 minimum of 1-2 hours, and even then there's a good chance that I've
 easily missed obvious bugs in the parts of the system I don't understand
 well.
 
 For the second project, I can skim through my inbox daily picking up
 specific areas I work on/are interested in, hit reply to add a couple of
 lines of inline comments to the patch and send feedback to the
 author/list in just a few minutes.

With utmost respect, you've missed something really important that the
first process has and the second lacks, and it frankly isn't terribly
onerous: checking whether the patch actually produces working code.  In
the PostgreSQL case, you can stop as soon as you discover that the patch
doesn't apply to master, that ./configure doesn't work, or that the code
doesn't compile: elapsed time = 5 minutes.  Or you can keep moving until
you have made progress for the time you've allotted.

But the bigger issue, as others have pointed out, has never been a
technical one.  It's motivating people whose time is already much in
demand to spend some of it on reviewing.

I wasn't discouraged by the preliminary patch review process or any
feedback I got.  My absence lately has more to do with other demands
on my time.

Cheers,
David.
-- 
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] REVIEW: Track TRUNCATE via pgstat

2014-12-16 Thread Alex Shulgin
Jim Nasby jim.na...@bluetreble.com writes:

 https://commitfest.postgresql.org/action/patch_view?id=1661 (apologies for 
 not replying to the thread; I can't find it in my inbox)

 Patch applies and passes make check. Code formatting looks good.

Jim,

 The regression test partially tests this. It does not cover 2PC, nor
 does it test rolling back a subtransaction that contains a
 truncate. The latter actually doesn't work correctly.

Thanks for pointing out the missing 2PC test, I've added one.

The test you've added for rolling back a subxact actually works
correctly, if you consider the fact that aborted (sub)xacts still
account for insert/update/delete in pgstat.  I've added this test with
the corrected expected results.

 The test also adds 2.5 seconds of forced pg_sleep. I think that's both
 bad and unnecessary. When I removed the sleeps I still saw times of
 less than 0.1 seconds.

Well, I never liked that part, but the stats don't get updated if we
don't put the session to sleep for at least PGSTAT_STAT_INTERVAL (which
is 500ms).

Removing these extra sleep calls would theoretically not make a
difference as wait_for_trunc_test_stats() seems to have enough sleep
calls itself, but since pgstat_report_stat() is called from the
main loop only, there's no way around placing the explicit pg_sleep at
top level if we want to be able to check the effects reproducibly.

Another idea would be exposing pgstat_report_stat(true) at SQL level.
That would eliminate the need for explicit pg_sleep(>=0.5), but we'll
still need the wait_for_... call to make sure the collector has picked
it up.
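
A minimal sketch of that idea as a loadable C function (the name
pg_stat_flush and the wrapper itself are hypothetical, but
pgstat_report_stat(bool force) is the real backend entry point):

    #include "postgres.h"
    #include "fmgr.h"
    #include "pgstat.h"

    PG_MODULE_MAGIC;

    PG_FUNCTION_INFO_V1(pg_stat_flush);

    /*
     * Push this backend's pending table stats to the collector right
     * away, instead of waiting out PGSTAT_STAT_INTERVAL.
     */
    Datum
    pg_stat_flush(PG_FUNCTION_ARGS)
    {
        pgstat_report_stat(true);
        PG_RETURN_VOID();
    }

The test would then call pg_stat_flush() instead of pg_sleep(), though
as noted the wait_for_... polling loop is still needed on the collector
side.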

 Also, wait_for_trunc_test_stats() should display something if it times
 out; otherwise you'll have a test failure and won't have any
 indication why.

Oh, good catch.  Since I've copied this function from stats.sql, we
might want to update that one too in a separate commit.

 I've attached a patch that adds logging on timeout and contains a test
 case that demonstrates the rollback to savepoint bug.

I'm attaching the updated patch version.

Thank you for the review!
--
Alex

PS: re: your CF comment: I'm producing the patches using

  git format-patch --ext-diff

where diff.external is set to '/bin/bash src/tools/git-external-diff'.

Now that I try to apply it using git, it looks like git doesn't like the
copied context diff very much...

From cc51823a01a194ef6fcd90bc763fa26498837322 Mon Sep 17 00:00:00 2001
From: Alex Shulgin a...@commandprompt.com
Date: Tue, 9 Dec 2014 16:35:14 +0300
Subject: [PATCH] WIP: track TRUNCATEs in pgstat transaction stats.

The n_live_tup and n_dead_tup counters need to be set to 0 after a
TRUNCATE on the relation.  We can't issue a special message to the stats
collector because the xact might be later aborted, so we track the fact
that the relation was truncated during the xact (and reset this xact's
insert/update/delete counters).  When xact is committed, we use the
`truncated' flag to reset the n_live_tup and n_dead_tup counters.
---
 src/backend/commands/tablecmds.c |   3 +
 src/backend/postmaster/pgstat.c  |  52 -
 src/include/pgstat.h |   3 +
 src/test/regress/expected/prepared_xacts.out |  50 
 src/test/regress/expected/truncate.out   | 164 +++
 src/test/regress/sql/prepared_xacts.sql  |  27 +
 src/test/regress/sql/truncate.sql| 118 +++
 7 files changed, 414 insertions(+), 3 deletions(-)

diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
new file mode 100644
index 1e737a0..4f0e3d8
*** a/src/backend/commands/tablecmds.c
--- b/src/backend/commands/tablecmds.c
***
*** 71,76 
--- 71,77 
  #include parser/parse_type.h
  #include parser/parse_utilcmd.h
  #include parser/parser.h
+ #include pgstat.h
  #include rewrite/rewriteDefine.h
  #include rewrite/rewriteHandler.h
  #include rewrite/rewriteManip.h
*** ExecuteTruncate(TruncateStmt *stmt)
*** 1224,1229 
--- 1225,1232 
  			 */
  			reindex_relation(heap_relid, REINDEX_REL_PROCESS_TOAST);
  		}
+ 
+ 		pgstat_count_heap_truncate(rel);
  	}
  
  	/*
diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
new file mode 100644
index f71fdeb..b02e4a1
*** a/src/backend/postmaster/pgstat.c
--- b/src/backend/postmaster/pgstat.c
*** typedef struct TwoPhasePgStatRecord
*** 201,206 
--- 201,207 
  	PgStat_Counter tuples_deleted;		/* tuples deleted in xact */
  	Oid			t_id;			/* table's OID */
  	bool		t_shared;		/* is it a shared catalog? */
+ 	bool		t_truncated;	/* was the relation truncated? */
  } TwoPhasePgStatRecord;
  
  /*
*** pgstat_count_heap_delete(Relation rel)
*** 1864,1869 
--- 1865,1894 
  }
  
  /*
+  * pgstat_count_heap_truncate - update tuple counters due to truncate
+  */
+ void
+ pgstat_count_heap_truncate(Relation rel)
+ {
+ 	PgStat_TableStatus *pgstat_info = 

[HACKERS] .gitignore config.cache and comment about regression.(out|diff)

2014-12-16 Thread Jim Nasby

config.cache is created when you pass -C to configure, which speeds it up 
considerably (3.5s vs 16.5s on my laptop). It would be nice to just ignore the 
cache file it generates.

Originally this patch also ignored the regression output files, until I 
realized why that was a bad idea. Add a comment about that.
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com
From a681953a802230e73e5e4f91607eca9dd99c34f2 Mon Sep 17 00:00:00 2001
From: Jim Nasby jim.na...@bluetreble.com
Date: Mon, 15 Dec 2014 18:35:50 -0600
Subject: [PATCH] Ignore config.cache
Also add a comment about why regression.* aren't listed.
---
 .gitignore  | 1 +
 src/test/regress/.gitignore | 4 
 2 files changed, 5 insertions(+)

diff --git a/.gitignore b/.gitignore
index 681af08..715f817 100644
--- a/.gitignore
+++ b/.gitignore
@@ -28,6 +28,7 @@ lib*.pc
 
 # Local excludes in root directory
 /GNUmakefile
+/config.cache
 /config.log
 /config.status
 /pgsql.sln
diff --git a/src/test/regress/.gitignore b/src/test/regress/.gitignore
index 7573add..d0b055f 100644
--- a/src/test/regress/.gitignore
+++ b/src/test/regress/.gitignore
@@ -5,3 +5,7 @@
 /tmp_check/
 /results/
 /log/
+
+# Note: regression.* are only left behind on a failure; that's why they're not ignored
+#/regression.diffs
+#/regression.out
-- 
2.1.2


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] moving Orafce from pgFoundry - pgFoundry management

2014-12-16 Thread Marc Fournier

Project disabled on pgfoundry … do you want me to remove the mailing lists and 
all those archives?   Or leave them there … ?


 On Dec 13, 2014, at 9:18 AM, David Fetter da...@fetter.org wrote:
 
 On Sat, Dec 13, 2014 at 11:19:08AM +0100, Pavel Stehule wrote:
 Hi
 
 a Orafce package on pgFoundry is obsolete - we migrated to github few years
 ago. Please, can somebody modify a description about this migration? Or
 drop this project there.
 
 Pavel,
 
 I've removed pgFoundry references from the IRC documentation bot and
 replaced them with references to github.
 
 Marc,
 
 Could you please remove the Orafce project from pgFoundry?
 
 Cheers,
 David.
 -- 
 David Fetter da...@fetter.org http://fetter.org/
 Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
 Skype: davidfetter  XMPP: david.fet...@gmail.com
 
 Remember to vote!
 Consider donating to Postgres: http://www.postgresql.org/about/donate



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] explain sortorder

2014-12-16 Thread Timmer, Marius
Hi Mike,

Now I've replaced the spaces at the beginning of the lines with tabs.
Double quotes (") in the expected/explain_sortorder.out file caused „make 
check“ to fail, so I changed them to single quotes (').

I’ve added the corrected patch as an attachment. 


Kind regards,

Marius





explain_sortorder-v3.patch
Description: Binary data

---
Marius Timmer
Zentrum für Informationsverarbeitung
Westfälische Wilhelms-Universität Münster
Einsteinstraße 60


smime.p7s
Description: S/MIME cryptographic signature


Re: [HACKERS] REVIEW: Track TRUNCATE via pgstat

2014-12-16 Thread Alvaro Herrera
Alex Shulgin wrote:
 Jim Nasby jim.na...@bluetreble.com writes:

  The test also adds 2.5 seconds of forced pg_sleep. I think that's both
  bad and unnecessary. When I removed the sleeps I still saw times of
  less than 0.1 seconds.
 
 Well, I never liked that part, but the stats don't get updated if we
 don't put the session to sleep for at least PGSTAT_STAT_INTERVAL (which
 is 500ms).
 
 Removing these extra sleep calls would theoretically not make a
 difference as wait_for_trunc_test_stats() seems to have enough sleep
 calls itself, but due to the pgstat_report_stat() being called from the
 main loop only, there's no way short of placing the explicit pg_sleep at
 top level, if we want to be able to check the effects reproducibly.
 
 Another idea would be exposing pgstat_report_stat(true) at SQL level.
 That would eleminate the need for explicit pg_sleep(=0.5), but we'll
 still need the wait_for_... call to make sure the collector has picked
 it up.

We already have a stats test that sleeps.  Why not add this stuff there,
to avoid making another test slow?

I agree that tests that sleep are annoying.  (Try running the timeout
isolation test a few times and you'll see what I mean.)

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WALWriter active during recovery

2014-12-16 Thread Heikki Linnakangas

On 12/15/2014 08:51 PM, Simon Riggs wrote:

Currently, WALReceiver writes and fsyncs data it receives. Clearly,
while we are waiting for an fsync we aren't doing any other useful
work.

Following patch starts WALWriter during recovery and makes it
responsible for fsyncing data, allowing WALReceiver to progress other
useful actions.


What other useful actions can WAL receiver do while it's waiting? It 
doesn't do much else than receive WAL, and fsync it to disk.


- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes

2014-12-16 Thread Michael Paquier
On Tue, Dec 16, 2014 at 8:35 AM, Michael Paquier michael.paqu...@gmail.com
wrote:
 On Tue, Dec 16, 2014 at 3:46 AM, Robert Haas robertmh...@gmail.com
wrote:
 On Sat, Dec 13, 2014 at 9:36 AM, Michael Paquier
 michael.paqu...@gmail.com wrote:
 Something to be aware of btw is that this patch introduces an
 additional 8 bytes per block image in WAL as it contains additional
 information to control the compression. In this case this is the
 uint16 compress_len present in XLogRecordBlockImageHeader. In the case
 of the measurements done, knowing that 63638 FPWs have been written,
 there is a difference of a bit less than 500k in WAL between HEAD and
 FPW off in favor of HEAD. The gain with compression is welcome,
 still for the default there is a small price to track down if a block
 is compressed or not. This patch still takes advantage of it by not
 compressing the hole present in page and reducing CPU work a bit.

 That sounds like a pretty serious problem to me.
 OK. If that's so much of a problem, I'll switch back to the version using
 1 bit in the block header to identify if a block is compressed or not.
 This way, when switch will be off the record length will be the same
 as HEAD.
And here are attached fresh patches reducing the WAL record size to what it
is in head when the compression switch is off. Looking at the logic in
xlogrecord.h, the block header stores the hole length and hole offset. I
changed that a bit to store the length of the raw block, with hole or
compressed, as the 1st uint16. The second uint16 is used to store the hole
offset, same as HEAD when the compression switch is off. When compression is
on, a special value 0xFFFF is saved (actually only filling 1 in the 16th
bit is fine...). Note that this forces us to fill in the hole with zeros and
to always compress BLCKSZ worth of data.
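
In code form, the zero-filled-hole step might look roughly like this (a
sketch against the current pglz API; "work", "page", and the hole fields
from the block header stand in for the real record-assembly variables):

    char        work[BLCKSZ];
    char        compressed[PGLZ_MAX_OUTPUT(BLCKSZ)];

    /* copy the page, replacing the hole with zeros */
    memcpy(work, page, hole_offset);
    memset(work + hole_offset, 0, hole_length);
    memcpy(work + hole_offset + hole_length,
           page + hole_offset + hole_length,
           BLCKSZ - (hole_offset + hole_length));

    /* compress the full BLCKSZ image */
    if (pglz_compress(work, BLCKSZ, (PGLZ_Header *) compressed,
                      PGLZ_strategy_default))
    {
        /* success: store the compressed image, flag block as compressed */
    }
    else
    {
        /* incompressible: store the original page, as today */
    }
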
Those patches pass make check-world, even WAL replay on standbys.

I have done as well measurements using this patch set, with the following
things that can be noticed:
- When compression switch is off, the same quantity of WAL as HEAD is
produced
- pglz is very bad at compressing the page hole. I mean, really bad. Have a
look at the user CPU particularly when pages are empty and you'll
understand... Other compression algorithms would be better here. Tests are
done with various values of fillfactor, 10 means that after the update 80%
of the page is empty, at 50% the page is more or less completely full.

Here are the results, with 5 test cases:
- FPW on + 2 bytes, compression switch is on, using 2 additional bytes in
block header, resulting in WAL records longer as 8 more bytes are used per
block with lower CPU usage as page holes are not compressed by pglz.
- FPW off + 2 bytes, same as previous, with compression switch to on.
- FPW on + 0 bytes, compression switch to on, the same block header size as
HEAD is used, at the cost of compressing page holes filled with zeros
- FPW off + 0 bytes, same as previous, with compression switch to off
- HEAD, unpatched master (except with hack to calculate user and system CPU)
- Record, the record-level compression, with compression lower-bound set at
0.

=# select test || ', ffactor ' || ffactor, pg_size_pretty(post_update -
pre_update), user_diff, system_diff from results;
           ?column?            | pg_size_pretty | user_diff | system_diff
---++---+-
 FPW on + 2 bytes, ffactor 50  | 582 MB | 42.391894 |0.807444
 FPW on + 2 bytes, ffactor 20  | 229 MB | 14.330304 |0.729626
 FPW on + 2 bytes, ffactor 10  | 117 MB |  7.335442 |0.570996
 FPW off + 2 bytes, ffactor 50 | 746 MB | 25.330391 |1.248503
 FPW off + 2 bytes, ffactor 20 | 293 MB | 10.537475 |0.755448
 FPW off + 2 bytes, ffactor 10 | 148 MB |  5.762775 |0.763761
 FPW on + 0 bytes, ffactor 50  | 585 MB | 54.115496 |0.924891
 FPW on + 0 bytes, ffactor 20  | 234 MB | 26.270404 |0.755862
 FPW on + 0 bytes, ffactor 10  | 122 MB | 19.540131 |0.800981
 FPW off + 0 bytes, ffactor 50 | 746 MB | 25.102241 |1.110677
 FPW off + 0 bytes, ffactor 20 | 293 MB |  9.889374 |0.749884
 FPW off + 0 bytes, ffactor 10 | 148 MB |  5.286767 |0.682746
 HEAD, ffactor 50  | 746 MB | 25.181729 |1.133433
 HEAD, ffactor 20  | 293 MB |  9.962242 |0.765970
 HEAD, ffactor 10  | 148 MB |  5.693426 |0.775371
 Record, ffactor 50| 582 MB | 54.904374 |0.678204
 Record, ffactor 20| 229 MB | 19.798268 |0.807220
 Record, ffactor 10| 116 MB |  9.401877 |0.668454
(18 rows)

Attached are as well the results of the measurements, and the test case
used.
Regards,
-- 
Michael


20141216_fpw_compression_v7.tar.gz
Description: GNU Zip compressed data


results.sql
Description: Binary data


test_compress
Description: Binary data

-- 

Re: [HACKERS] REVIEW: Track TRUNCATE via pgstat

2014-12-16 Thread Alex Shulgin

Alvaro Herrera alvhe...@2ndquadrant.com writes:
 
 Another idea would be exposing pgstat_report_stat(true) at SQL level.
 That would eliminate the need for explicit pg_sleep(>=0.5), but we'll
 still need the wait_for_... call to make sure the collector has picked
 it up.

 We already have a stats test that sleeps.  Why not add this stuff there,
 to avoid making another test slow?

Well, if we could come up with a set of statements to test that would
produce the end result unambiguously, so that we can be certain the stats
we check at the end cannot be a result of neat interaction of buggy
behavior...

I'm not sure this is at all possible, but I know for sure it will make
debugging the possible fails harder.  Even with the current approach of
checking the stats after every isolated case it's sometimes takes quite
a little more than a glance to verify correctness due to side-effects of
rollback (ins/upd/del counters are still updated), and the way stats are
affecting the dead tuples counter.

I'll try to see if the checks can be converged though.

--
Alex


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes

2014-12-16 Thread Alvaro Herrera
Michael Paquier wrote:

 And here are attached fresh patches reducing the WAL record size to what it
 is in head when the compression switch is off. Looking at the logic in
 xlogrecord.h, the block header stores the hole length and hole offset. I
 changed that a bit to store the length of raw block, with hole or
 compressed as the 1st uint16. The second uint16 is used to store the hole
 offset, same as HEAD when compression switch is off. When compression is
  on, a special value 0xFFFF is saved (actually only filling 1 in the 16th
 bit is fine...). Note that this forces to fill in the hole with zeros and
 to compress always BLCKSZ worth of data.

Why do we compress the hole?  This seems pointless, considering that we
know it's all zeroes.  Is it possible to compress the head and tail of
page separately?

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Logical Decoding follows timelines

2014-12-16 Thread Heikki Linnakangas

On 12/15/2014 08:54 PM, Simon Riggs wrote:

Currently, it doesn't.

This patch is a WIP version of doing that, but only currently attempts
to do this in the WALSender.

Objective is to allow cascaded logical replication.

Very WIP, but here for comments.


With the patch, XLogSendLogical uses the same logic to calculate 
SendRqstPtr that XLogSendPhysical does. It would be good to refactor 
that into a common function, rather than copy-paste.


SendRqstPtr isn't actually used for anything in XLogSendLogical.

- Heikki



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] REVIEW: Track TRUNCATE via pgstat

2014-12-16 Thread Alvaro Herrera
Alex Shulgin wrote:

 Alvaro Herrera alvhe...@2ndquadrant.com writes:
  
  Another idea would be exposing pgstat_report_stat(true) at SQL level.
  That would eleminate the need for explicit pg_sleep(=0.5), but we'll
  still need the wait_for_... call to make sure the collector has picked
  it up.
 
  We already have a stats test that sleeps.  Why not add this stuff there,
  to avoid making another test slow?
 
 Well, if we could come up with a set of statements to test that would
  produce the end result unambiguously, so that we can be certain the stats
 we check at the end cannot be a result of neat interaction of buggy
 behavior...

It is always possible that things work just right because two bugs
cancel each other.

 I'm not sure this is at all possible, but I know for sure it will make
  debugging the possible failures harder.

Surely if some other patch introduces a failure, nobody will try to
debug it by looking only at the failures of this test.

 Even with the current approach of checking the stats after every
  isolated case, it sometimes takes quite a bit more than a glance
 to verify correctness due to side-effects of rollback (ins/upd/del
 counters are still updated), and the way stats are affecting the dead
 tuples counter.

Honestly I think pg_regress is not the right tool to test stat counter
updates.  It kind-of works today, but only because we don't stress it
too much.  If you want to create a new test framework for pgstats, and
include some tests for truncate, be my guest.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WALWriter active during recovery

2014-12-16 Thread Andres Freund
On 2014-12-16 16:12:40 +0200, Heikki Linnakangas wrote:
 On 12/15/2014 08:51 PM, Simon Riggs wrote:
 Currently, WALReceiver writes and fsyncs data it receives. Clearly,
 while we are waiting for an fsync we aren't doing any other useful
 work.
 
 Following patch starts WALWriter during recovery and makes it
 responsible for fsyncing data, allowing WALReceiver to progress other
 useful actions.
 
 What other useful actions can WAL receiver do while it's waiting? It doesn't
 do much else than receive WAL, and fsync it to disk.

It can actually receive further data from the network and write it to
disk? On a relatively low latency network the buffers aren't that
large. Right now we generate quite a bursty IO pattern with the disks
alternating between idle and fully busy.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes

2014-12-16 Thread Michael Paquier
On Tue, Dec 16, 2014 at 11:24 PM, Alvaro Herrera alvhe...@2ndquadrant.com
wrote:

 Michael Paquier wrote:

  And here are attached fresh patches reducing the WAL record size to what
 it
  is in head when the compression switch is off. Looking at the logic in
  xlogrecord.h, the block header stores the hole length and hole offset. I
  changed that a bit to store the length of raw block, with hole or
  compressed as the 1st uint16. The second uint16 is used to store the hole
  offset, same as HEAD when compression switch is off. When compression is
  on, a special value 0xFFFF is saved (actually only filling 1 in the 16th
  bit is fine...). Note that this forces to fill in the hole with zeros and
  to compress always BLCKSZ worth of data.

 Why do we compress the hole?  This seems pointless, considering that we
 know it's all zeroes.  Is it possible to compress the head and tail of
 page separately?

This would take 2 additional bytes at minimum in the block header,
resulting in 8 additional bytes in record each time a FPW shows up. IMO it
is important to check the length of things obtained when replaying WAL,
that's something the current code of HEAD does quite well.
-- 
Michael


Re: [alvhe...@2ndquadrant.com: Re: [HACKERS] no test programs in contrib]

2014-12-16 Thread Alvaro Herrera
Hi Andrew,

Did you have a chance to review this?

Alvaro Herrera wrote:
 Andrew Dunstan wrote:
  
  On 11/29/2014 10:09 PM, Alvaro Herrera wrote:

  Anyway I just pushed this src/test/modules/ patch, which has
  implications for buildfarm: these new test modules are not invoked
  except explicitly.  How would we go about getting members to run cd
  src/test/modules ; make check ; make installcheck?  I imagine it's
  impossible to do it unless each member maintainer update the buildfarm
  client script, right?
  
  Yes.
  
  Why are we going to run both check and installcheck? And what output files
  are created? The buildfarm will need to know.
 
 Well, initially the patch moved test_decoding to src/test/modules, which
 requires make check, but I left that in contrib due to complaints, and
 all remaining modules are happy to use make installcheck.  Attached is a
 patch to run_build.pl that adds src/test/modules build, install and
 check.  I also added the vcregress call (just copied it from the contrib
 one, really), but of course that doesn't work at all yet since MSVC
 doesn't build it.  Would you give it a look?  I would like to have
 buildfarm doing this before moving on with more stuff.



 diff --git a/run_build.pl b/run_build.pl
 index a358d9c..77fcf62 100755
 --- a/run_build.pl
 +++ b/run_build.pl
 @@ -665,6 +665,8 @@ make_bin_check();
  # contrib is built under standard build step for msvc
  make_contrib() unless ($using_msvc);
  
 +make_testmodules();
 +
  make_doc() if (check_optional_step('build_docs'));
  
  make_install();
 @@ -672,6 +674,8 @@ make_install();
  # contrib is installed under standard install for msvc
  make_contrib_install() unless ($using_msvc);
  
 +make_testmodules_install();
 +
  process_module_hooks('configure');
  
  process_module_hooks('build');
 @@ -753,6 +757,19 @@ foreach my $locale (@locales)
  make_contrib_install_check($locale);
  }
  
 + if (step_wanted('testmodules-install-check'))
 +{
 +    print time_str(),"restarting db ($locale)...\n" if $verbose;
 +
 + stop_db($locale);
 + start_db($locale);
 +
 +    print time_str(),"running make test-modules installcheck ($locale)...\n"
 +      if $verbose;
 +
 +make_testmodules_install_check($locale);
 +}
 +
  print time_str(),"stopping db ($locale)...\n" if $verbose;
  
  stop_db($locale);
 @@ -1062,6 +1079,22 @@ sub make_contrib
  $steps_completed .= " Contrib";
  }
  
 +sub make_testmodules
 +{
 + return unless step_wanted('testmodules');
 +    print time_str(),"running make src/test/modules ...\n" if $verbose;
 +
 +    my $make_cmd = $make;
 +    $make_cmd = "$make -j $make_jobs"
 +      if ($make_jobs > 1 && ($branch eq 'HEAD' || $branch ge 'REL9_1'));
 +    my @makeout = `cd $pgsql/src/test/modules && $make_cmd 2>&1`;
 +    my $status = $? >> 8;
 +    writelog('make-testmodules',\@makeout);
 +    print "=== make testmodules log ===\n",@makeout if ($verbose > 1);
 +    send_result('TestModules',$status,\@makeout) if $status;
 +    $steps_completed .= " TestModules";
 +}
 +
  sub make_contrib_install
  {
  return
 @@ -1081,6 +1114,23 @@ sub make_contrib_install
  $steps_completed .= " ContribInstall";
  }
  
 +sub make_testmodules_install
 +{
 + return
 +   unless (step_wanted('testmodules')
 + and step_wanted('install'));
 +    print time_str(),"running make testmodules install ...\n"
 +      if $verbose;
 +
 +    my @makeout = `cd $pgsql/src/test/modules && $make install 2>&1`;
 +    my $status = $? >> 8;
 +    writelog('install-testmodules',\@makeout);
 +    print "=== make testmodules install log ===\n",@makeout
 +      if ($verbose > 1);
 +    send_result('TestModulesInstall',$status,\@makeout) if $status;
 +    $steps_completed .= " TestModulesInstall";
 +}
 +
  sub initdb
  {
  my $locale = shift;
 @@ -1317,6 +1367,50 @@ sub make_contrib_install_check
  $steps_completed .= " ContribCheck-$locale";
  }
  
 +sub make_testmodules_install_check
 +{
 + my $locale = shift;
 +    return unless step_wanted('testmodules-install-check');
 +    my @checklog;
 +    unless ($using_msvc)
 +    {
 +        @checklog =
 +          `cd $pgsql/src/test/modules && $make USE_MODULE_DB=1 installcheck 2>&1`;
 +    }
 +    else
 +    {
 +        chdir "$pgsql/src/tools/msvc";
 +        @checklog = `perl vcregress.pl modulescheck 2>&1`;
 +        chdir $branch_root;
 +    }
 +    my $status = $? >> 8;
 +    my @logs = glob("$pgsql/src/test/modules/*/regression.diffs");
 +    push(@logs,"$installdir/logfile");
 +    foreach my $logfile (@logs)
 +    {
 +        next unless (-e $logfile);
 +        push(@checklog,"\n\n================== $logfile ==================\n");
 +        my $handle;
 +        open($handle,$logfile);
 +        while(<$handle>)
 +        {
 +            push(@checklog,$_);
 +        }
 +        close($handle);
 +    }
 +if ($status)
 +{
 +my @trace =
 +  

Re: [HACKERS] Comment typo in typedef struct BrinTuple

2014-12-16 Thread Heikki Linnakangas

On 12/15/2014 09:04 AM, Amit Langote wrote:


Hi,

Find attached a patch that does:

-* mt_info is laid out in the following fashion:
+* bt_info is laid out in the following fashion:
snip-comment
 uint8   bt_info;
} BrinTuple;


Thanks. Fixed along with a bunch of other misc comment typos I've bumped 
into.


- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Commitfest problems

2014-12-16 Thread Mark Cave-Ayland
On 16/12/14 13:37, Claudio Freire wrote:

 For the second project, I can skim through my inbox daily picking up
 specific areas I work on/are interested in, hit reply to add a couple of
 lines of inline comments to the patch and send feedback to the
 author/list in just a few minutes.
 
 Notice though that on the second project, the review is made in the
 air. You didn't build, nor ran tests, you're guessing how the code
 performs rather than seeing it perform, so the review is light
 compared to the first.

I think this is doing developers in general a little injustice here. The
general rule for submitting a patch to *any* project is: i) does it
apply to the current source tree and ii) does it build and pass the
regression tests?

Maybe there is a greater incidence of this happening in PostgreSQL
compared to other projects? But in any case the project has every right
to refuse further review if these 2 simple criteria are not met. Also
with a submission from git, you can near 100% guarantee that the author
has actually built and run the code before submission. I have seen
occasions where a patch has been submitted, and a committer will send a
quick email along the lines of "The patch looks good, but I've just
applied a patch that will conflict with this, so please rebase and
resubmit".

You mention about performance, but again I've seen from this list that
good developers can generally pick up on a lot of potential regressions
by eye, e.g. lookup times go from O(N) to O(N^2) without having to
resort to building and testing the code. And by breaking into small
chunks people can focus their time on reviewing the areas they are good at.

Occasionally sometimes people do get a patch ready for commit but it
fails build/test on one particular platform. In that case, the committer
simply posts the build/test failure to the list and requests that the
patchset be fixed ready for re-test. This is a very rare occurrence though.

Also you mentioned a light review but I wouldn't call it that. I
see it more as with each iteration the magnifying glass used to look at
the code gets bigger and bigger.

A lot of initial comments for the second project are along the lines of
"this doesn't look right - won't this overflow on values >2G?" or
"you're assuming here that A occurs before B, whereas if you run with
-foo this likely won't work". And this is the area which I believe will
have the greatest benefit for PostgreSQL - making it easier to catch and
rework the obvious flaws in the patch in a timely manner before it gets
to the committer.


ATB,

Mark.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes

2014-12-16 Thread Michael Paquier
On Tue, Dec 16, 2014 at 11:30 PM, Michael Paquier michael.paqu...@gmail.com
 wrote:



 On Tue, Dec 16, 2014 at 11:24 PM, Alvaro Herrera alvhe...@2ndquadrant.com
  wrote:

 Michael Paquier wrote:

  And here are attached fresh patches reducing the WAL record size to
 what it
  is in head when the compression switch is off. Looking at the logic in
  xlogrecord.h, the block header stores the hole length and hole offset. I
  changed that a bit to store the length of raw block, with hole or
  compressed as the 1st uint16. The second uint16 is used to store the
 hole
  offset, same as HEAD when compression switch is off. When compression is
   on, a special value 0xFFFF is saved (actually only filling 1 in the 16th
  bit is fine...). Note that this forces to fill in the hole with zeros
 and
  to compress always BLCKSZ worth of data.

 Why do we compress the hole?  This seems pointless, considering that we
 know it's all zeroes.  Is it possible to compress the head and tail of
 page separately?

 This would take 2 additional bytes at minimum in the block header,
 resulting in 8 additional bytes in record each time a FPW shows up. IMO it
 is important to check the length of things obtained when replaying WAL,
 that's something the current code of HEAD does quite well.

Actually, the original length of the compressed block is saved in
PGLZ_Header, so if we are fine with not checking the size of the block
decompressed when decoding WAL, we can do without the hole filled with
zeros, and use only 1 bit to see if the block is compressed or not.
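
For instance, at replay the header itself can carry the length check (a
sketch against the current pglz API; "bkpb_data" stands in for wherever
the compressed image sits in the decoded record):

    PGLZ_Header *hdr = (PGLZ_Header *) bkpb_data;

    if (PGLZ_RAW_SIZE(hdr) != BLCKSZ)
        elog(ERROR, "invalid compressed block image: raw size %d, expected %d",
             PGLZ_RAW_SIZE(hdr), BLCKSZ);
    pglz_decompress(hdr, page);
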
-- 
Michael


Re: [HACKERS] Commitfest problems

2014-12-16 Thread Mark Cave-Ayland
On 16/12/14 13:44, David Fetter wrote:

 On Tue, Dec 16, 2014 at 11:09:34AM +, Mark Cave-Ayland wrote:
 On 16/12/14 07:33, David Rowley wrote:

  On 16 December 2014 at 18:18, Josh Berkus j...@agliodbs.com wrote:

  Man. You're equating stuff that's not the same. You didn't get your 
 way
   (and I'm tentatively on your side on that one) and take that to imply
  that we don't want more reviewers.

 During that thread a couple people said that novice reviewers added no
 value to the review process, and nobody argued with them then.  I've
 also been told this to my face at pgCon, and when I've tried organizing
 patch review events.  I got the message, which is why I stopped trying
 to get new reviewers.

 And frankly: if we're opposed to giving credit to patch reviewers, we're
 opposed to having them.



 I'd just like to add something which might be flying below the radar of
  more senior people. There are people out there (like me) working on
 PostgreSQL more for the challenge and perhaps the love of the product,
 who make absolutely zero money out of it. For these people getting
 credit where it's due is very important. I'm pretty happy with this at
 the moment and I can't imagine any situation where not crediting
 reviewers would be beneficial to anyone.

 This is exactly where I am at the moment, having previously been more
 involved with the development side of PostgreSQL during the past.

 Personally having a credit as a patch reviewer isn't particularly
 important to me, since mail archives are good enough these days that if
 people do query my contributions towards projects then I can point them
 towards any reasonable search engine.

 The biggest constraint on my ability to contribute is *time*.

 Imagine the situation as a reviewer that I am currently on the mailing
 list for two well-known open source projects and I also have a day job
 and a home life to contend with.

 For the spare time that I have for review, one of these projects
 requires me to download attachment(s), apply them to a git tree
 (hopefully it still applies), run a complete make check regression
 series, try and analyse a patch which will often reference parts to
 which I have no understanding, and then write up a coherent email and
 submit it to the mailing list. Realistically to do all this and provide
 a review that is going to be of use to a committer is going to take a
 minimum of 1-2 hours, and even then there's a good chance that I've
 easily missed obvious bugs in the parts of the system I don't understand
 well.

 For the second project, I can skim through my inbox daily picking up
 specific areas I work on/are interested in, hit reply to add a couple of
 lines of inline comments to the patch and send feedback to the
 author/list in just a few minutes.
 
  With utmost respect, you've missed something really important that the
  first has and the second lacks, and frankly it isn't terribly onerous: does an
 actual system produce working code?  In the PostgreSQL case, you can
 stop as soon as you discover that the patch doesn't apply to master or
 that ./configure doesn't work, or that the code doesn't compile:
 elapsed time = 5 minutes.  Or you can keep moving until you have made
 progress for the time you've allotted.

As per my previous email, it's generally quite rare for a developer to
post non-working code to a list (in some cases someone will send a quick
reply pointing out that this needs to be rebased because it appears to
reference an old API). From what I see in PostgreSQL this mostly happens
because of bitrot between the time the patch was posted to the list and
the actual commitfest itself - so in one way the commitfest system
exacerbates this particular problem further.

 But the bigger issue, as others have pointed out, has never been a
 technical one.  It's motivating people whose time is already much in
 demand to spend some of it on reviewing.
 
 I wasn't discouraged by the preliminary patch review process or any
 feedback I got.  My absence lately has more to do with other demands
 on my time.

I am completely in agreement with you here. My approach is more along
the lines of that I know access to long periods of time to review
patches is often impractical, and so how can the review process be made
more time-efficient in order to allow you, me and other people in
similar time-limited environments to be able to participate more?


ATB,

Mark.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes

2014-12-16 Thread Merlin Moncure
On Mon, Dec 15, 2014 at 5:37 PM, Michael Paquier
michael.paqu...@gmail.com wrote:
 On Tue, Dec 16, 2014 at 5:14 AM, Merlin Moncure mmonc...@gmail.com wrote:
 OTOH, our built-in compressor as we all know is a complete dog in
 terms of cpu when stacked up against some more modern implementations.
 All that said, as long as there is a clean path to migrating to
 another compression alg should one materialize, that problem can be
 nicely decoupled from this patch as Robert pointed out.
 I am curious to see some numbers about that. Has anyone done such
 comparison measurements?

I don't, but I can make some.  There are some numbers on the web but
it's better to make some new ones because IIRC some light optimization
had gone into pglz of late.

Compressing *one* file with lz4 and a quick/n/dirty pglz I hacked out
of the source (borrowing heavily from
https://github.com/maropu/pglz_bench/blob/master/pglz_bench.cpp),  I
tested the results:

lz4 real time:  0m0.032s
pglz real time: 0m0.281s

mmoncure@mernix2 ~/src/lz4/lz4-r125 $ ls -lh test.*
-rw-r--r-- 1 mmoncure mmoncure 2.7M Dec 16 09:04 test.lz4
-rw-r--r-- 1 mmoncure mmoncure 2.5M Dec 16 09:01 test.pglz

A better test would examine all manner of different xlog files in a
fashion closer to how your patch would need to compress them but the
numbers here tell a fairly compelling story: similar compression
results for around 9x the cpu usage.  Be advised that compression alg
selection is one of those types of discussions that tends to spin off
into outer space; that's not something you have to solve today.  Just
try and make things so that they can be switched out if things
change
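
One cheap way to keep that door open is to reserve an algorithm id next
to each compressed image from day one. A sketch (the names and values
here are invented, not from the patch):

    #define BKPIMAGE_COMPRESS_NONE  0x00
    #define BKPIMAGE_COMPRESS_PGLZ  0x01
    #define BKPIMAGE_COMPRESS_LZ4   0x02    /* reserved, not implemented */

    /* at replay: */
    switch (compress_id)
    {
        case BKPIMAGE_COMPRESS_NONE:
            memcpy(page, image, BLCKSZ);
            break;
        case BKPIMAGE_COMPRESS_PGLZ:
            pglz_decompress((const PGLZ_Header *) image, page);
            break;
        default:
            elog(ERROR, "unknown block compression algorithm %d", compress_id);
    }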

merlin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] On partitioning

2014-12-16 Thread Robert Haas
On Mon, Dec 15, 2014 at 6:55 PM, Amit Langote
langote_amit...@lab.ntt.co.jp wrote:
 Robert wrote:
 On Sun, Dec 14, 2014 at 9:12 PM, Amit Langote
 langote_amit...@lab.ntt.co.jp wrote:
  This means if a user puts arbitrary expressions in a partition definition, 
  say,
 
  ... FOR VALUES  extract(month from current_date) TO extract(month from
 current_date + interval '3 months'),
 
  we make sure that those expressions are pre-computed to literal values.

 I would expect that to fail, just as it would fail if you tried to
 build an index using a volatile expression.

 Oops, wrong example, sorry. In case of an otherwise good expression?

I'm not really sure what you are getting at here.  An otherwise-good
expression basically means a constant.  Index expressions have to be
things that always produce the same result given the same input,
because otherwise you might get a different result when searching the
index than you did when building it, and then you would fail to find
keys that are actually present.  In the same way, partition boundaries
also need to be constants.  Maybe you could allow expressions that can
be constant-folded, but that's about it.  If you allow anything else,
then the partition boundary might move once it's been established
and then some of the data will be in the wrong partition.

What possible use case is there for defining partitions with
non-constant boundaries?
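
To illustrate the constant-folding option, a check along these lines
might suffice (a sketch: eval_const_expressions() is the real planner
facility, but the surrounding function is invented):

    static Const *
    validate_partition_bound(Node *expr)
    {
        /* fold immutable expressions down to a literal where possible */
        expr = eval_const_expressions(NULL, expr);

        if (!IsA(expr, Const))
            ereport(ERROR,
                    (errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
                     errmsg("partition bound must be a constant expression")));

        return (Const *) expr;
    }

That would accept something like extract(month from date '2014-01-01'),
but reject anything depending on current_date.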

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] NUMERIC private methods?

2014-12-16 Thread Tom Lane
Heikki Linnakangas hlinnakan...@vmware.com writes:
 On 12/16/2014 08:34 AM, David Fetter wrote:
 While noodling with some weighted statistics
 https://github.com/davidfetter/weighted_stats, I noticed I was
 having to jump through a lot of hoops because of all the private
 methods in numeric.c, especially NumericVar.  Would there be some
 major objection to exposing NumericVar as an opaque blob?

 Hmm. You'd want to make add_var, mul_var etc. non-static?

-1 for that.

 Looking at the weighed_stats code, this probably illustrates the hoops 
 you had to jump through:

 /* sqrt((n/(n-1)) * ((s0*s2 - s1*s1)/(s0*s0)) */

If you're concerned about arithmetic performance, there is a very obvious
fix here: use double.  Is there some utterly compelling reason to use
numeric, despite the fact that it's certain to be orders of magnitude
slower?

(It would still be orders of magnitude slower, no matter how much we
were willing to destroy numeric.c's modularity boundary.)

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] On partitioning

2014-12-16 Thread Claudio Freire
On Tue, Dec 16, 2014 at 12:15 PM, Robert Haas robertmh...@gmail.com wrote:
 langote_amit...@lab.ntt.co.jp wrote:
 Robert wrote:
 On Sun, Dec 14, 2014 at 9:12 PM, Amit Langote
 langote_amit...@lab.ntt.co.jp wrote:
  This means if a user puts arbitrary expressions in a partition 
  definition, say,
 
  ... FOR VALUES  extract(month from current_date) TO extract(month from
 current_date + interval '3 months'),
 
  we make sure that those expressions are pre-computed to literal values.

 I would expect that to fail, just as it would fail if you tried to
 build an index using a volatile expression.

 Oops, wrong example, sorry. In case of an otherwise good expression?

 I'm not really sure what you are getting here.  An otherwise-good
 expression basically means a constant.  Index expressions have to be
 things that always produce the same result given the same input,
 because otherwise you might get a different result when searching the
 index than you did when building it, and then you would fail to find
 keys that are actually present.

I think the point is partitioning based on the result of an expression
over row columns. Or if it's not, it should be made anyway:

PARTITION BY LIST (extract(month from date_created)) VALUES (1, 3, 6, 9, 12);

Or something like that.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Commitfest problems

2014-12-16 Thread Claudio Freire
On Tue, Dec 16, 2014 at 11:47 AM, Mark Cave-Ayland
mark.cave-ayl...@ilande.co.uk wrote:
 On 16/12/14 13:37, Claudio Freire wrote:

 For the second project, I can skim through my inbox daily picking up
 specific areas I work on/are interested in, hit reply to add a couple of
 lines of inline comments to the patch and send feedback to the
 author/list in just a few minutes.

 Notice though that on the second project, the review is made in the
 air. You didn't build, nor ran tests, you're guessing how the code
 performs rather than seeing it perform, so the review is light
 compared to the first.

 I think this doing developers in general a little injustice here. The
 general rule for a submitting patch to *any* project is: i) does it
 apply to the current source tree and ii) does it build and pass the
 regression tests?

That it's a policy doesn't guarantee that people submitting patches
will adhere. They could be failing to adhere even unknowingly (ie:
bitrot - there's quite some time between first submission and
first review).

 Maybe there is a greater incidence of this happening in PostgreSQL
 compared to other projects?

Perhaps, but it's because the reviewers take care to test things
before they hit the build farm.

That basic testing really is something for a CI tool, like the build
farm itself, not reviewers. But you cannot wait until after committing
to let the build farm tell you something's broken: you need CI results
*during* review.

 But in any case the project has every right
 to refuse further review if these 2 simple criteria are not met.

Nobody said otherwise

 Also
 with a submission from git, you can near 100% guarantee that the author
 has actually built and run the code before submission.

I don't see how. Forks don't have travis ci unless you add it, or am I mistaken?

 You mention about performance, but again I've seen from this list that
 good developers can generally pick up on a lot of potential regressions
 by eye, e.g. lookup times go from O(N) to O(N^2) without having to
 resort to building and testing the code. And by breaking into small
 chunks people can focus their time on reviewing the areas they are good at.

I meant performance as in running, not really performance. Sorry for
the confusion.

I meant, you're looking at the code and guessing how it runs, but you
don't really know. It's easy to make assumptions that are false while
reviewing, quickly disproved by actually running tests.

A light review without ever building can fail to detect those issues.
A heavier review patching up and building manually is too much manual
work that could really be automated. The optimum is somewhere between
those two extremes.

 Occasionally sometimes people do get a patch ready for commit but it
 fails build/test on one particular platform.

Again, pre-review CI can take care of this.

 In that case, the committer
 simply posts the build/test failure to the list and requests that the
 patchset be fixed ready for re-test. This is a very rare occurrence though.

It shouldn't need to reach the repo, don't you think?

 Also you mentioned a light review but I wouldn't call it that.

I've made my fair share of light reviews (not for pg though) and can
recognize the description. It is a light review what you described.

Surely not all reviews with inline comments are light reviews, but
what you described was indeed a light review.

 A lot of initial comments for the second project are along the lines of
 "this doesn't look right - won't this overflow on values >2G?" or
 "you're assuming here that A occurs before B, whereas if you run with
 -foo this likely won't work". And this is the area which I believe will
 have the greatest benefit for PostgreSQL - making it easier to catch and
 rework the obvious flaws in the patch in a timely manner before it gets
 to the committer.

If you read carefully my reply, I'm not at all opposed to that. I'm
just pointing out, that easier reviews need not result in better
reviews.

Better, easier reviews are those where the trivial reviews are
automated, as is done in the project I linked.

Formatting, whether it still applies, and whether it builds and checks
pass, all those are automatable.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Postgres TR for missing chunk

2014-12-16 Thread Robert Haas
On Tue, Dec 16, 2014 at 1:02 AM, M Tarkeshwar Rao
m.tarkeshwar@ericsson.com wrote:
 Can you please tell me the how can I track the which bugs are fixed in which
 release and when they will be fixed,

We don't have a bug tracker, but you can look at the mailing list
threads and the source code repository.

 BUG #9187: corrupt toast tables

 http://www.postgresql.org/message-id/30154.1392153...@sss.pgh.pa.us
 http://www.postgresql.org/message-id/cafj8praufpttn5+ohfqpbcd1jzkersck51uakhcwd8nt4os...@mail.gmail.com
 http://www.postgresql.org/message-id/20140211162408.2713.81...@wrigleys.postgresql.org

There's not enough information in that bug report for us to be certain
whether it's a hardware issue or a bug, so I wouldn't expect any
further action on that particular report.

 BUG #7819: missing chunk number 0 for toast value 1235919 in pg_toast_35328

 http://www.postgresql.org/message-id/C62EC84B2D3CF847899CCF4B589CCF70B20AA08F@BBMBX.backbone.local

I don't think this has been fixed, but it's probably unlikely to
account for your issue.  It seems that errors due to this problem
would be transient, and you'd have to be doing something strange, like
increasing vacuum_defer_cleanup_age on the fly.  If you're seeing the
same error on the same TOAST table and TOAST value repeatedly, there's
something else going on.  More details might help us figure out what.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Commitfest problems

2014-12-16 Thread Mark Cave-Ayland
On 16/12/14 15:42, Claudio Freire wrote:

 Also
 with a submission from git, you can near 100% guarantee that the author
 has actually built and run the code before submission.
 
 I don't see how. Forks don't have travis ci unless you add it, or am I 
 mistaken?

Well as I mentioned in my last email, practically all developers will
rebase and run make check on their patched tree before submitting to
the list. Hopefully there aren't hordes of developers deliberately
creating and submitting broken patches ;)

 You mention about performance, but again I've seen from this list that
 good developers can generally pick up on a lot of potential regressions
 by eye, e.g. lookup times go from O(N) to O(N^2) without having to
 resort to building and testing the code. And by breaking into small
 chunks people can focus their time on reviewing the areas they are good at.
 
 I meant performance as in running, not really performance. Sorry for
 the confusion.

Okay no worries.

 I meant, you're looking at the code and guessing how it runs, but you
 don't really know. It's easy to make assumptions that are false while
 reviewing, quickly disproved by actually running tests.
 
 A light review without ever building can fail to detect those issues.
 A heavier review patching up and building manually is too much manual
 work that could really be automated. The optimum is somewhere between
 those two extremes.

Correct. My analogy here was that people with varying abilities review
the patches at their experience level, and once low-hanging/obvious
design issues have been resolved then more experienced developers will
start to review the patch at a deeper level.

 Occasionally sometimes people do get a patch ready for commit but it
 fails build/test on one particular platform.
 
 Again, pre-review CI can take care of this.

Yes - see my next reply...

 In that case, the committer
 simply posts the build/test failure to the list and requests that the
 patchset be fixed ready for re-test. This is a very rare occurrence though.
 
 It shouldn't need to reach the repo, don't you think?

When I say repo, I mean the local repo of the tree maintainer. Currently
the tree maintainer pulls each merge request into his local tree,
performs a buildfarm test and only pushes the merge upstream once this
has passed. I guess the PostgreSQL equivalent of this would be having a
merge-pending branch on the buildfarm rather than just a post-commit
build of git master (which we see from reports to the list periodically
fails in this way) and only pushing a set of patches when the buildfarm
comes back green.

  Also you mentioned a light review but I wouldn't call it that.
 
 I've made my fair share of light reviews (not for pg though) and can
 recognize the description. It is a light review what you described.
 
 Surely not all reviews with inline comments are light reviews, but
 what you described was indeed a light review.
 
 A lot of initial comments for the second project are along the lines of
  "this doesn't look right - won't this overflow on values >2G?" or
  "you're assuming here that A occurs before B, whereas if you run with
  -foo this likely won't work". And this is the area which I believe will
 have the greatest benefit for PostgreSQL - making it easier to catch and
 rework the obvious flaws in the patch in a timely manner before it gets
 to the committer.
 
 If you read carefully my reply, I'm not at all opposed to that. I'm
 just pointing out, that easier reviews need not result in better
 reviews.
 
 Better, easier reviews are those where the trivial reviews are
 automated, as is done in the project I linked.

Yes I can definitely agree with that. See below again:

 Formatting, whether it still applies, and whether it builds and checks
 pass, all those are automatable.

I should add that QEMU provides a branch of checkpatch.pl in the source
tree which submitters are requested to run on their patchset before
submission to the list. This catches patches that don't meet the project
style guidelines including casing, braces, trailing whitespace,
tab/space issues etc. and that's before the patch is even submitted to
the list.


ATB,

Mark.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [alvhe...@2ndquadrant.com: Re: [HACKERS] no test programs in contrib]

2014-12-16 Thread Andrew Dunstan




On 12/16/2014 09:31 AM, Alvaro Herrera wrote:

Hi Andrew,

Did you have a chance to review this?




Oh, darn, not yet. I will try to take a look today.


cheers

andrew


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] speedup tidbitmap patch: cache page

2014-12-16 Thread Teodor Sigaev

I've been having a look at this and I'm wondering about a certain scenario:

In tbm_add_tuples, if tbm_page_is_lossy() returns true for a given block, and on
the next iteration of the loop we have the same block again, have you
benchmarked any caching code to store if tbm_page_is_lossy() returned true for
that block on the previous iteration of the loop? This would save from having to
call tbm_page_is_lossy() again for the same block. Or are you just expecting
that tbm_page_is_lossy() returns true so rarely that you'll end up caching the
page most of the time, and gain on skipping both hash lookups on the next loop,
since page will be set in this case?
I believe that if we fall into lossy pages then tidbitmap will not have a 
significant impact on performance, because postgres will spend a lot of time on 
tuple rechecking on the page. If work_mem is too small to keep an exact tidbitmap 
then postgres will slow down significantly. I implemented it (v2.1 in the
attachments), but I don't think that is an improvement, at least not a
significant one.
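
For reference, the caching idea under discussion amounts to something
like this inside tbm_add_tuples (a sketch; the variable names are mine):

    BlockNumber cached_blk = InvalidBlockNumber;
    bool        cached_lossy = false;
    int         i;

    for (i = 0; i < ntids; i++)
    {
        BlockNumber blk = ItemPointerGetBlockNumber(tids + i);

        /* one hash probe per run of TIDs on the same page */
        if (blk != cached_blk)
        {
            cached_blk = blk;
            cached_lossy = tbm_page_is_lossy(tbm, blk);
        }
        if (cached_lossy)
            continue;           /* whole page already marked lossy */

        /* ... normal exact insertion path ... */
    }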



It would be nice to see a comment to explain why it might be a good idea to
cache the page lookup. Perhaps something like:



added, see attachment (v2)



I also wondered if there might be a small slowdown in the case where the index
only finds 1 matching tuple. So I tried the following:
avg.  2372.4456   2381.909    99.6%
med.  2371.224    2359.494   100.5%

It appears that if it does, then it's not very much.


I believe that's unmeasurable, because the standard deviation of your tests is
about 2%, which is greater than the difference between the patched and master
versions.


--
Teodor Sigaev   E-mail: teo...@sigaev.ru
   WWW: http://www.sigaev.ru/


tbm_cachepage-2.patch.gz
Description: GNU Zip compressed data


tbm_cachepage-2.1.patch.gz
Description: GNU Zip compressed data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_basebackup vs. Windows and tablespaces

2014-12-16 Thread Andrew Dunstan


On 12/16/2014 04:34 AM, Amit Kapila wrote:
On Tue, Dec 16, 2014 at 12:58 PM, Amit Kapila amit.kapil...@gmail.com wrote:


 On Sun, Dec 14, 2014 at 11:54 AM, Amit Kapila amit.kapil...@gmail.com wrote:
  On Sat, Dec 13, 2014 at 10:48 PM, Tom Lane t...@sss.pgh.pa.us wrote:
   Sorry, I was not paying very close attention to this thread and 
missed

   the request for comments.  A few such:
  
   1. The patch is completely naive about what might be in the symlink
   path string; eg embedded spaces in the path would break it.  On at
   least some platforms, newlines could be in the path as well.  
I'm not
   sure about how to guard against this while maintaining human 
readability

   of the file.
  
 
  I will look into this and see what best can be done.
 

 I have chosen #3 (Make pg_basebackup check for and fail on symlinks
 containing characters (currently newline only) it can't handle) from the
 different options suggested by Tom.  This keeps the format same as
 previous and human readable.


Actually, here instead of an error a warning is issued, and that particular
path (containing a newline) will be skipped.  This is similar to what
is already done for the cases when there is any problem in reading
link paths.




I'm not clear why human readability is the major criterion here. As for 
that, it will be quite difficult for a human to distinguish a name with 
a space at the end from one without. I really think a simple encoding 
scheme would be much the best. For normal cases it will preserve 
readability completely, and for special cases it will preserve lack of 
any ambiguity.


cheers

andrew


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Commitfest problems

2014-12-16 Thread Robert Haas
On Tue, Dec 16, 2014 at 2:33 AM, David Rowley dgrowle...@gmail.com wrote:
 I'd just like to add something which might be flying below the radar of more
 senior people. There are people out there (like me) working on PostgreSQL
 more for the challenge and perhaps the love of the product, who make
 absolutely zero money out of it. For these people getting credit where it's
 due is very important. I'm pretty happy with this at the moment and I can't
 imagine any situation where not crediting reviewers would be beneficial to
 anyone.

We routinely and regularly credit reviewers in the commit logs for
precisely this reason.  I don't think anyone is opposed to that.
There is some opposition to crediting them in the release notes
because the one time Bruce tried it, it made for an enormous volume of
additional text in the release notes, and there were cases where
people's names were mentioned on relatively equal footing when their
contributions were very much unequal.  For example, let's take a look
at the commit message for Hot Standby:

Simon Riggs, with significant and lengthy review by Heikki
Linnakangas, including streamlined redesign of snapshot creation and
two-phase commit.

Important contributions from Florian Pflug, Mark Kirkwood, Merlin
Moncure, Greg Stark, Gianni Ciolli, Gabriele Bartolini, Hannu Krosing,
Robert Haas, Tatsuo Ishii, Hiroyuki Yamada plus support and feedback
from many other community members.

The release note ended up looking like this:

Allow a standby server to accept read-only queries (Simon Riggs,
Heikki Linnakangas)

Including all of the other names of people who made important
contributions, many of which consisted of reviewing, would make that
release note item - and many others - really, really long, so I'm not
in favor of that.  Crediting reviewers is important, but so is having
the release notes be readable.

It has been proposed that we do a general list of people at the bottom
of the release notes who helped review during that cycle.  That would
be less intrusive and possibly a good idea, but would we credit the
people who did a TON of reviewing?  Everyone who reviewed even one
patch?  Somewhere in between? Would committers be excluded because we
just expect them to help or included because credit is important to
established community members too?  To what extent would this be
duplicative of http://www.postgresql.org/community/contributors/ ?
I'm not necessarily averse to doing something here, but the reason why
nothing has happened has much more to do with the fact that it's hard
to figure out exactly what the best thing would be than any idea that
we don't want to credit reviewers.  We do want to credit reviewers,
AND WE DO, as a quick look at 'git log' will speedily reveal.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_basebackup vs. Windows and tablespaces

2014-12-16 Thread Heikki Linnakangas

On 12/16/2014 06:30 PM, Andrew Dunstan wrote:

I'm not clear why human readability is the major criterion here. As for
that, it will be quite difficult for a human to distinguish a name with
a space at the end from one without. I really think a simple encoding
scheme would be much the best. For normal cases it will preserve
readability completely, and for special cases it will preserve lack of
any ambiguity.


Agreed. Besides, this:

16387 E:\\Program\ Files\\PostgreSQL\\tbs

is almost as human-readable as this:

16387 E:\Program Files\PostgreSQL\tbs

It's obvious how the escaping works, just by looking at the file.
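
A minimal writer for such a scheme might look like this (a sketch only, 
assuming we escape just backslash, space and newline; the function name is 
made up):

#include "lib/stringinfo.h"

/*
 * Escape a tablespace path for the map file: prefix backslash and space
 * with a backslash, and emit a newline as the two characters \n.
 */
static void
append_escaped_path(StringInfo buf, const char *path)
{
    const char *p;

    for (p = path; *p; p++)
    {
        if (*p == '\\' || *p == ' ' || *p == '\n')
            appendStringInfoChar(buf, '\\');
        appendStringInfoChar(buf, (*p == '\n') ? 'n' : *p);
    }
}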

- Heikki



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Commitfest problems

2014-12-16 Thread Robert Haas
On Tue, Dec 16, 2014 at 12:18 AM, Josh Berkus j...@agliodbs.com wrote:
 On 12/15/2014 07:34 PM, Andres Freund wrote:
 On 2014-12-15 16:14:30 -0800, Josh Berkus wrote:
 Read the thread on this list where I suggested crediting reviewers in
 the release notes.

 Man. You're equating stuff that's not the same. You didn't get your way
 (and I'm tentatively on your side onthat one) and take that to imply
 that we don't want more reviewers.

 During that thread a couple people said that novice reviewers added no
 value to the review process, and nobody argued with them then.  I've
 also been told this to my face at pgCon, and when I've tried organizing
 patch review events.  I got the message, which is why I stopped trying
 to get new reviewers.

I think what was said is that it isn't very useful to have reviewers
who just report that the patch applies and passes make check.  But any
review that does any study of the code, or finds a bug in the
functionality, or reports that the patch does NOT apply and/or pass
make check, or suggests a way that the documentation could be
improved, or fixes a typo in a comment, or diagnoses a whitespace
error is useful.  Summarizing that as "novice reviewers added no value
to the review process" is like summarizing the plot of Superman as
"alien invades earth."

 And frankly: if we're opposed to giving credit to patch reviewers, we're
 opposed to having them.

This is also an incredibly misleading summary of what the real issue
was, as I just said in my previous email.  We do credit reviewers,
routinely, and have for years.  We have not reached an agreement on
whether or exactly how to credit them in the release notes.  You may
think that there's no value in having your name show up in a commit
log message and that the release notes are the only thing that counts,
but I don't think everyone feels that way.  I *still* get a kick out
of it every time somebody types my name into one of those messages.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Commitfest problems

2014-12-16 Thread Andy Colson

On 12/16/2014 4:32 AM, Simon Riggs wrote:

On 15 December 2014 at 19:52, Josh Berkus j...@agliodbs.com wrote:

On 12/15/2014 11:27 AM, Robert Haas wrote:

I feel like we used to be better at encouraging people to participate
in the CF even if they were not experts, and to do the best they can
based on what they did know.  That was a helpful dynamic.  Sure, the
reviews weren't perfect, but more people helped, and reviewing some of
the patch well and some of it in a more cursory manner is way better
than reviewing none of it at all.


Well, it was strongly expressed to me by a number of senior contributors
on this list and at the developer meeting that inexpert reviews were not
really wanted, needed or helpful.  As such, I stopped recruiting new
reviewers (and, for that matter, doing them myself).  I don't know if
the same goes for anyone else.


I don't remember saying that, hearing it or agreeing with it and I
don't agree with it now.



As a reviewer from long long ago, I can tell you that I wasn't sure I 
was even helpful.


Things got busy at work, and I may not have been useful, and I annoyed 
some people on pg-general, so I walked away for a while.


I have no knowledge of community-building so this might be a bad idea:

Perhaps levels (or titles) of reviewer's would be helpful.  A freshman 
reviewer is not expected to do anything useful, is expected to make 
mistakes, and is there to learn.


The community votes and promotes them to junior (or whatever).  They 
know they are on the right track.  A junior review is useful but maybe 
not as complete as a senior reviewer.  Maybe I'll aspire to work harder, 
and maybe not, but at least I know what I'm doing is useful.  If I never 
get promoted, then I know, as well.


Freshmen know its ok to make mistakes.  They know who they can contact 
for help.


I think I like belts better (yellow, green, red, black).

I think this gives me two things:
  1) permission to mess up
  2) ability to measure myself

-Andy


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Commitfest problems

2014-12-16 Thread Mike Blackwell
On Tue, Dec 16, 2014 at 10:15 AM, Mark Cave-Ayland 
mark.cave-ayl...@ilande.co.uk wrote:


 Well as I mentioned in my last email, practically all developers will
 rebase and run make check on their patched tree before submitting to
 the list.


Even when this is true (and with people new to the project submitting
patches, I'm not sure it can be assumed), time passes and things can change
between submission and review.  I've seen a fair number of "Needs rebase"
comments on patches, through no fault of the original submitter.


Re: [HACKERS] [Bug] Inconsistent result for inheritance and FOR UPDATE.

2014-12-16 Thread Robert Haas
On Tue, Dec 16, 2014 at 12:21 AM, Kyotaro HORIGUCHI
horiguchi.kyot...@lab.ntt.co.jp wrote:
 Hello, I have a favor to ask of you committers.

 I confirmed that this issue and the other one have been fixed in the
 repository, thank you.

 Then, could you please tell me when the next release, 9.2.10, is
 to come?

 The bugs were found in a system under development, which uses PG 9.2
 and wants to adopt the new release.

We seem not to have had a new release of 9.2 since July, which is an
awfully long time ago.  So, hopefully soon?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Postgres TR for missing chunk

2014-12-16 Thread Tom Lane
Jaime Casanova ja...@2ndquadrant.com writes:
  You know, that toast table name rang a bell.
  Look at this thread; maybe this is your problem, and if it is, then it is
  already fixed and you should update.
 http://www.postgresql.org/message-id/12138.1336019...@sss.pgh.pa.us

That was about transient failures though, not persistent ones,
which is what the OP seems to be claiming he's getting.

  Btw, when giving a bug report you should start by saying your PostgreSQL
  version and explaining what you did based on Google's wisdom

Yeah.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [Bug] Inconsistent result for inheritance and FOR UPDATE.

2014-12-16 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Tue, Dec 16, 2014 at 12:21 AM, Kyotaro HORIGUCHI
 horiguchi.kyot...@lab.ntt.co.jp wrote:
  Then, could you please tell me when the next release, 9.2.10, is
  to come?

 We seem not to have had a new release of 9.2 since July, which is an
 awfully long time ago.  So, hopefully soon?

Nothing's likely to happen during the holidays, so probably mid-January
is the earliest feasible target.

I agree we're a bit overdue.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] speedup tidbitmap patch: hash BlockNumber

2014-12-16 Thread Teodor Sigaev

I think this suggestion is misguided, and the patch itself needs
rethinking.  Instead of doing this, let's hack dynahash.c itself
to substitute a shim like this when it's told function == tag_hash and
keysize == sizeof(uint32).  Then we can remove any similar shims that
already exist, and possibly end up with a net savings of code rather than
adding more.

Done. Actually, I found only the oid_hash shim.
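
For the archives, the dispatch in dynahash.c's hash_create() could be as 
simple as this (a sketch; uint32_hash is an assumed name for the 4-byte shim):

if (flags & HASH_FUNCTION)
    hashp->hash = info->hash;
else
    hashp->hash = string_hash;  /* default */

/* substitute a cheaper shim for the very common tag_hash-on-uint32 case */
if (hashp->hash == tag_hash && hashp->keysize == sizeof(uint32))
    hashp->hash = uint32_hash;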

--
Teodor Sigaev   E-mail: teo...@sigaev.ru
   WWW: http://www.sigaev.ru/


tbm_blocknumber-3.patch.gz
Description: GNU Zip compressed data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WALWriter active during recovery

2014-12-16 Thread Simon Riggs
On 16 December 2014 at 14:12, Heikki Linnakangas
hlinnakan...@vmware.com wrote:
 On 12/15/2014 08:51 PM, Simon Riggs wrote:

 Currently, WALReceiver writes and fsyncs data it receives. Clearly,
 while we are waiting for an fsync we aren't doing any other useful
 work.

 Following patch starts WALWriter during recovery and makes it
 responsible for fsyncing data, allowing WALReceiver to progress other
 useful actions.


  What other useful actions can WAL receiver do while it's waiting? It doesn't
  do much other than receive WAL and fsync it to disk.

So now it will only need to do one of those two things.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.5 release scheduling (was Re: logical column ordering)

2014-12-16 Thread Stephen Frost
* Robert Haas (robertmh...@gmail.com) wrote:
 2. It's not clear that we're going to have a particularly-impressive
 list of major features for 9.5.  So far we've got RLS and BRIN. I
 expect that GROUPING SETS is far enough along that it should be
 possible to get it in before development ends, and there are a few
 performance patches pending (Andres's lwlock scalability patches,
 Rahila's work on compressing full-page writes) that I think will
 probably make the grade.  But after that it seems to me that it gets
 pretty thin on the ground.  

When it comes to a list of major features for 9.5, it'd be pretty
terrible, imv, if we manage to screw up and not get UPSERT taken care
of (again..).  BDR will be great to have too, but we lose out far more
often for lack of what those outside the community perceive as a simple
and obvious feature that nearly every other system they deal with has.

 Now, against all that, if we don't get back on our usual release
 schedule then (a) it will look like we're losing momentum, which I'm
 actually afraid may be true rather than merely a perception, and (b)
 people whose stuff did get in will have to wait longer to see it
 released.  So, I'm not sure waiting is any better.  But there are
 certainly some things not to like about where we are.

I agree with both of these.  It doesn't help that we have non-committers
working on major patches for years and aren't able to see the fruits of
their labors.  I'm as much at fault for that happening at times as
anyone and I don't have any silver bullets but I certainly don't like
it.

Thanks,

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] compress method for spgist - 2

2014-12-16 Thread Teodor Sigaev

 For some datatypes, the compress method might be useful even if the leaf
 type is the same as the column type. For example, you could allow
 indexing text datums larger than the page size, with a compress function
 that just truncates the input.

Agreed, and the patch allows using the compress method in this case; see the 
beginning of spgdoinsert().



 Could you find some use for this in one of the built-in or contrib
 types? Just to have something that exercises it as part of the
 regression suite. How about creating an opclass for the built-in polygon
 type that stores the bounding box, like the PostGIS guys are doing?

Will try, but I don't have a nice idea. A polygon opclass will have awful 
performance until the PostGIS guys show the tree structure.


 The documentation needs to be updated.
Added.

--
Teodor Sigaev   E-mail: teo...@sigaev.ru
   WWW: http://www.sigaev.ru/


spgist_compress_method-3.patch.gz
Description: GNU Zip compressed data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Commitfest problems

2014-12-16 Thread Josh Berkus
On 12/16/2014 08:48 AM, Mike Blackwell wrote:
 On Tue, Dec 16, 2014 at 10:15 AM, Mark Cave-Ayland
 mark.cave-ayl...@ilande.co.uk wrote:
 
 
 Well as I mentioned in my last email, practically all developers will
 rebase and run make check on their patched tree before submitting to
 the list. 
 
 
 Even when this is true (and with people new to the project submitting
 patches, I'm not sure it can be assumed), time passes and things can change
 between submission and review.  I've seen a fair number of "Needs rebase"
 comments on patches, through no fault of the original submitter.

This really should be taken care of by automation.  Magnus's new system
will be a significant step forward in enabling that.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Commitfest problems

2014-12-16 Thread Stephen Frost
* Craig Ringer (cr...@2ndquadrant.com) wrote:
 It's not like development on a patch series is difficult. You commit
 small fixes and changes, then you 'git rebase -i' and reorder them to
 apply to the appropriate changesets. Or you can do a 'rebase -i' and in
 'e'dit mode make amendments to individual commits. Or you can commit
 'fixup!'s that get auto-squashed.

I don't have a problem with this, but it's an independent consideration
for how a patch might be developed vs. what the main git repo looks
like.  I don't think it makes sense to commit to the main repo a new
catalog table without any commands to manage it, nor to have a catalog
table which can be managed through the grammar but doesn't actually do
anything due to lacking the backend code for it.

Now, I fully agree that we want to continue to build features on top of
other features, but, in general, those need to be independently valuable
features when they are committed to the main repo (eg: WITH CHECK for
security barrier views went in first as it was independently useful, and
then RLS simply built on top of it).

 This is part of my grumbling about the use of git like it's still CVS.

Our git repo is actually readable and reasonably easy to follow and, for
my part, that's a great thing we have that most other projects don't.
That isn't to say that we shouldn't develop with smaller pieces, but I
tend to agree with Tom that it's actually easier (for me, at least) to
review a single, complete, patch than a bunch of independent ones which
build on each other.  Perhaps that's my failing or a fault of my
processes, but I will actually sit and read through a patch in my email
client quite often.  In general, I expect the code to compile and pass
'make check', those are necessary, of course, but detecting compile-time
problems is something the compiler is good at and I'm not.  Thinking
about the impact of a new data structure, a given code block, or making
sure that all of the pieces of a new catalog row are set correctly are
things that the compiler isn't good at and is where a reviewer is most
valuable.

Pulling the code in and testing it by hand is useful, as is getting into
gdb and looking at the structures as they are manipulated, but I find it
extremely important to also actually review the *code*, which means
reading it and considering "are there places this patch needs to be
touching that it isn't?", "how will this other bit of code react to these
changes?", "does this code still look sane and like one person thought
through the whole code path?", "do the comments explain why things are
happening or do they just repeat what the code already says?", etc.

This goes back to the excellent point elsewhere on this thread that the
current documentation for reviewers is far too focused on the mechanical
bits which could basically be automated (in fact, I think Peter's
already got most of it automated..).

Thanks,

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] pgaudit - an auditing extension for PostgreSQL

2014-12-16 Thread Stephen Frost
* Michael Paquier (michael.paqu...@gmail.com) wrote:
 Do you have real numbers about the performance impact that this module
 has when plugged in the server? If yes, it would be good to get an
 idea of how much audit costs and if it has a clear performance impact
 this should be documented to warn the user. Some users may really need
 audit features as you mention, but the performance drop could be an
 obstacle for others so they may prefer continue betting on performance
 instead of pgaudit.

While these performance numbers would be interesting, I don't feel it's
necessary to consider the performance of this module as part of the
question about if it should go into contrib or not.

 Where are we on this? This patch has no activity for the last two
 months... So marking it as returned with feedback. It would be good to
 see actual numbers about the performance impact this involves.

What I was hoping to see were the changes that I had suggested
up-thread, but I discovered about a week or two ago that there was a
misunderstanding between at least Abhijit and myself about what was
being suggested.

For the sake of the archives, the idea I had been trying to propose is
to use a role's permissions as a mechanism to define what should be
audited.  An example is:

The "magic" audit role has SELECT rights on a given table.  When any
user does a SELECT against that table, ExecCheckRTPerms is called and
there's a hook there which the module can use to say "ok, does the audit
role have any permissions here?" and, if the result is yes, then the
command is audited.  Note that this role, from core PG's perspective,
wouldn't be special at all; it would just be that pgaudit would use the
role's permissions as a way to figure out if a given command should be
audited or not.

That's not to say that we couldn't commit pgaudit without this
capability and add it later, but I had been expecting to see progress
along these lines prior to reviewing it.

Thanks,

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] Commitfest problems

2014-12-16 Thread Stephen Frost
David,

* David Rowley (dgrowle...@gmail.com) wrote:
 I'd just like to add something which might be flying below the radar of
 more senior people. There are people out there  (ike me)  working on
 PostgreSQL more for the challenge and perhaps the love of the product, who
 make absolutely zero money out of it. For these people getting credit where
 it's due is very important. I'm pretty happy with this at the moment and I
 can't imagine any situation where not crediting reviewers would be
 beneficial to anyone.

Thanks for this.  One question which has been brought up in the past is
the specific level of credit.  That you're happy with the credit
you've been given thus far is great as it happens to support the side
that I'm on. :D

However, could you quantify what, exactly, you feel is appropriate
credit for reviewers and authors..?  I'll intentionally omit the options
that have been presented in the past to try and avoid influencing your
response.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] Commitfest problems

2014-12-16 Thread Atri Sharma
On Wed, Dec 17, 2014 at 12:03 AM, Stephen Frost sfr...@snowman.net wrote:

 David,

 * David Rowley (dgrowle...@gmail.com) wrote:
  I'd just like to add something which might be flying below the radar of
  more senior people. There are people out there (like me) working on
  PostgreSQL more for the challenge and perhaps the love of the product,
 who
  make absolutely zero money out of it. For these people getting credit
 where
  it's due is very important. I'm pretty happy with this at the moment and
 I
  can't imagine any situation where not crediting reviewers would be
  beneficial to anyone.

 Thanks for this.  One question which has been brought up in the past is
 the specific level of credit.  That you're happy with the credit
 you've been given thus far is great as it happens to support the side
 that I'm on. :D

  However, could you quantify what, exactly, you feel is appropriate
 credit for reviewers and authors..?  I'll intentionally omit the options
 that have been presented in the past to try and avoid influencing your
 response.



Mentioning them in list of contributors, for one.


Re: [HACKERS] Commitfest problems

2014-12-16 Thread Stephen Frost
* Robert Haas (robertmh...@gmail.com) wrote:
 Including all of the other names of people who made important
 contributions, many of which consisted of reviewing, would make that
 release note item - and many others - really, really long, so I'm not
 in favor of that.  Crediting reviewers is important, but so is having
 the release notes be readable.

Agreed.

 It has been proposed that we do a general list of people at the bottom
 of the release notes who helped review during that cycle.  That would
 be less intrusive and possibly a good idea, but would we credit the
 people who did a TON of reviewing?  Everyone who reviewed even one
 patch?  Somewhere in between? Would committers be excluded because we
 just expect them to help or included because credit is important to
 established community members too?  To what extent would this be
 duplicative of http://www.postgresql.org/community/contributors/ ?

I don't particularly like this idea.

 I'm not necessarily averse to doing something here, but the reason why
 nothing has happened has much more to do with the fact that it's hard
 to figure out exactly what the best thing would be than any idea that
 we don't want to credit reviewers.  We do want to credit reviewers,
 AND WE DO, as a quick look at 'git log' will speedily reveal.

Agreed.

Thanks,

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] On partitioning

2014-12-16 Thread Josh Berkus
On 12/15/2014 10:55 AM, Robert Haas wrote:
 This means if a user puts arbitrary expressions in a partition definition, 
 say,
 
  ... FOR VALUES  extract(month from current_date) TO extract(month from 
  current_date + interval '3 months'),
 
  we make sure that those expressions are pre-computed to literal values.
 I would expect that to fail, just as it would fail if you tried to
 build an index using a volatile expression.

Yes, I wasn't saying that expressions should be used when *creating* the
partitions, which strikes me as a bad idea for several reasons.
Expressions should be usable when SELECTing data from the partitions.
Right now, they aren't, because the planner picks partitions well before
the rewrite phase which would reduce extract (month from current_date)
to a constant.

Right now, even if you partition by an integer ID and do:

SELECT * FROM partitioned_table WHERE ID = ( 3 + 4 )

... postgres will scan all partitions because ( 3 + 4 ) is an expression
and isn't evaluated until after CE is done.
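
To contrast the two cases (hypothetical table, same as above):

SELECT * FROM partitioned_table WHERE ID = 7;        -- CE prunes to one partition
SELECT * FROM partitioned_table WHERE ID = ( 3 + 4 ); -- scans all partitions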

I don't think there's an easy way to do the expression rewrite while
we're still in planning, is there?

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] INSERT ... ON CONFLICT {UPDATE | IGNORE}

2014-12-16 Thread Jeff Janes
On Mon, Dec 15, 2014 at 5:05 PM, Peter Geoghegan p...@heroku.com wrote:

 
  Maybe IGNORE is defined as a macro in MinGW?
  Try s/IGNORE/IGNORE_P/g throughout the patch.

 BTW, the gcc -E flag does this. So figure out what exact arguments
 MinGW's gcc is passed in the ordinary course of compiling gram.c, and
 prepend -E to the list of existing flags while manually executing
 gcc -- that should let you know exactly what's happening here.


Yep, I tried that trick and had decided it didn't work in MinGW.  But I
think it was a user error--I must have somehow broken the build tree and
'make' didn't detect the problem.  Now I see that IGNORE is getting turned
into 0.
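
A quick standalone check of the clash, for the record (a sketch for a MinGW 
toolchain; windows.h pulls in winbase.h):

#include <stdio.h>
#include <windows.h>            /* winbase.h defines IGNORE as 0 */

int
main(void)
{
#ifdef IGNORE
    printf("IGNORE is a macro here, expanding to %d\n", IGNORE);
#else
    printf("IGNORE is free for use as a token name\n");
#endif
    return 0;
}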

Your new version 1.7 of the patches fixes that issue, as well as the OID
conflict.

Thanks,

Jeff


Re: [HACKERS] Streaming replication and WAL archive interactions

2014-12-16 Thread Heikki Linnakangas

On 12/16/2014 10:24 AM, Borodin Vladimir wrote:

12 дек. 2014 г., в 16:46, Heikki Linnakangas
hlinnakan...@vmware.com написал(а):


There have been a few threads on the behavior of WAL archiving,
after a standby server is promoted [1] [2]. In short, it doesn't
work as you might expect. The standby will start archiving after
it's promoted, but it will not archive files that were replicated
from the old master via streaming replication. If those files were
not already archived in the master before the promotion, they are
not archived at all. That's not good if you wanted to restore from
a base backup + the WAL archive later.

The basic setup is a master server, a standby, a WAL archive that's
shared by both, and streaming replication between the master and
standby. This should be a very common setup in the field, so how
are people doing it in practice? Just live with the risk that you
might miss some files in the archive if you promote? Don't even
realize there's a problem? Something else?


Yes, I do live like that (with streaming replication and shared
archive between master and replicas) and don’t even realize there’s a
problem :( And I think I’m not the only one. Maybe at least a note
should be added to the documentation?


Let's try to figure out a way to fix this in master, but yeah, a note in 
the documentation is in order.



And how would we like it to work?


Here's a plan:

Have a mechanism in the standby, to track how far the master has 
archived its WAL, and don't throw away WAL in the standby that hasn't 
been archived in the master yet. This is similar to the physical 
replication slots, which prevent the master from recycling WAL that a 
standby hasn't received yet, but in reverse. I think we can use the 
.done and .ready files for this. Whenever a file is streamed 
(completely) from the master, create a .ready file for it. When we get 
an acknowledgement from the master that it has archived it, create a 
.done file for it. To get the information from the master, add the last 
archived WAL segment e.g. in the streaming replication keep-alive 
message, or invent a new message type for it.
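
In walreceiver terms, the bookkeeping might look roughly like this (a 
sketch; the variable names are assumed, but XLogArchiveNotifySeg() and 
XLogArchiveForceDone() are the existing helpers that create the status 
files):

char        xlogfname[MAXFNAMELEN];

/* a segment has been fully streamed from the master: flag it .ready */
XLogArchiveNotifySeg(completed_segno);

/* a keep-alive says the master has archived it: flag it .done */
if (master_last_archived_segno >= completed_segno)
{
    XLogFileName(xlogfname, tli, completed_segno);
    XLogArchiveForceDone(xlogfname);
}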


At promotion, archive all the WAL from the old timeline that the master 
hadn't already archived. While doing this, the archive_command can be 
called for files that have in fact already been archived in the master, 
so the command needs to return success if it's asked to archive a file 
and an identical file already exists in the archive. That's a bit 
difficult to write into a one-liner, but hopefully we can still provide 
an example of this. Or have another command, e.g. 
promotion_archive_command, which can just assume that everything is OK 
if the file already exists.
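
As a strawman, something along these lines (a sketch that ignores races and 
partially-written files; the archive path is made up):

archive_command = 'test -f /mnt/archive/%f && cmp -s %p /mnt/archive/%f || (test ! -f /mnt/archive/%f && cp %p /mnt/archive/%f)'

That is: succeed silently if an identical copy is already archived, fail if 
a different file with the same name exists, and copy otherwise.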


To enable this new mode, let's add a third option to archive_mode, 
besides on/off. Or just make this the default; I'm not sure if anyone 
would want the old behavior.



There was some discussion in August on enabling WAL archiving in
the standby, always [3]. That's a related idea, but it assumes that
you have a separate archive in the master and the standby. The
problem at promotion happens when you have a shared archive between
the master and standby.


AFAIK most people use the scheme with shared archive.


Yeah. Anyway, we can support both scenarios.

- Heikki



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] REVIEW: Track TRUNCATE via pgstat

2014-12-16 Thread Alex Shulgin

Alvaro Herrera alvhe...@2ndquadrant.com writes:

 Alex Shulgin wrote:

 Alvaro Herrera alvhe...@2ndquadrant.com writes:
  
  Another idea would be exposing pgstat_report_stat(true) at SQL level.
   That would eliminate the need for an explicit pg_sleep(=0.5), but we'll
  still need the wait_for_... call to make sure the collector has picked
  it up.
 
  We already have a stats test that sleeps.  Why not add this stuff there,
  to avoid making another test slow?
 
 Well, if we could come up with a set of statements to test that would
 produce the end result unambiguously, so that we can be certain the stats
 we check at the end cannot be a result of a neat interaction of buggy
 behavior...

 It is always possible that things work just right because two bugs
 cancel each other.

 I'm not sure this is at all possible, but I know for sure it will make
 debugging the possible fails harder.

 Surely if some other patch introduces a failure, nobody will try to
 debug it by looking only at the failures of this test.

 Even with the current approach of checking the stats after every
 isolated case, it sometimes takes quite a bit more than a glance
 to verify correctness due to side-effects of rollback (ins/upd/del
 counters are still updated), and the way stats are affecting the dead
 tuples counter.

 Honestly I think pg_regress is not the right tool to test stat counter
 updates.  It kind-of works today, but only because we don't stress it
 too much.  If you want to create a new test framework for pgstats, and
 include some tests for truncate, be my guest.

Yes, these are all good points.  Actually, moving the test to stats.sql
helped me realize the current patch behavior is not strictly correct:
there's a corner case when you insert/update before truncate in
transaction, which is later aborted.  I need to take a closer look.

--
Alex
 


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Doing better at HINTing an appropriate column within errorMissingColumn()

2014-12-16 Thread Stephen Frost
* Tom Lane (t...@sss.pgh.pa.us) wrote:
 So my two cents is that when considering a qualified name, this patch
 should take levenshtein distance across the two components equally.
 There's no good reason to suppose that typos will attack one name
 component more (nor less) than the other.

Agreed (since it seems like folks are curious for the opinions of
mostly bystanders).

+1 to the above for my part.

Thanks,

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] Commitfest problems

2014-12-16 Thread Andrew Dunstan


On 12/16/2014 01:38 PM, Stephen Frost wrote:

* Robert Haas (robertmh...@gmail.com) wrote:

Including all of the other names of people who made important
contributions, many of which consisted of reviewing, would make that
release note item - and many others - really, really long, so I'm not
in favor of that.  Crediting reviewers is important, but so is having
the release notes be readable.

Agreed.


It has been proposed that we do a general list of people at the bottom
of the release notes who helped review during that cycle.  That would
be less intrusive and possibly a good idea, but would we credit the
people who did a TON of reviewing?  Everyone who reviewed even one
patch?  Somewhere in between? Would committers be excluded because we
just expect them to help or included because credit is important to
established community members too?  To what extent would this be
duplicative of http://www.postgresql.org/community/contributors/ ?

I don't particularly like this idea.



I do. I think it's an eminently sensible idea that gives credit without 
disturbing the readability of the release notes.


As for where we draw the line, I would rather be more than less 
inclusive. Anyone who gets a review credit in the git log should be 
mentioned. I don't care that much whether or not committers are mentioned.





I'm not necessarily averse to doing something here, but the reason why
nothing has happened has much more to do with the fact that it's hard
to figure out exactly what the best thing would be than any idea that
we don't want to credit reviewers.  We do want to credit reviewers,
AND WE DO, as a quick look at 'git log' will speedily reveal.

Agreed.





I can't believe how much we are tying ourselves up in knots over this. 
It's not a good sign. Surely we trust the committers and the preparers 
of the release notes to use some judgement, once we agree on general 
guidelines.


cheers

andrew


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] advance local xmin more aggressively

2014-12-16 Thread Jeff Janes
On Wed, Dec 10, 2014 at 3:46 PM, Robert Haas robertmh...@gmail.com wrote:

 On Wed, Dec 10, 2014 at 3:28 PM, Heikki Linnakangas
 hlinnakan...@vmware.com wrote:
  Care to code it up?
 
  Here you are.

 That was quick.

 You need to add a semicolon to the end of line 20 in pairingheap.c.


In addition to the semicolon, it doesn't build under cassert.  There are
some pairingheap_empty that need to be pairingheap_is_empty, and snapmgr.c
needs an address-of operator near line 355, and something is wrong
in snapmgr.c near line 811.

Cheers,

Jeff


Re: [alvhe...@2ndquadrant.com: Re: [HACKERS] no test programs in contrib]

2014-12-16 Thread Andrew Dunstan


On 12/16/2014 11:22 AM, Andrew Dunstan wrote:




On 12/16/2014 09:31 AM, Alvaro Herrera wrote:

Hi Andrew,

Did you have a chance to review this?




Oh, darn, not yet. I will try to take a look today.



I have pushed this change, and crake will be running the code. See 
https://github.com/PGBuildFarm/client-code/commit/d656c1c3ce46f290791c5ba5ede2f8ac8dfa342e 
Any brave buildfarm owners on *nix can try it by replacing their copy of 
run_build.pl with the bleeding edge version. We can't put it in a client 
release until we fix up the MSVC side of things.


cheers

andrew





--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pgaudit - an auditing extension for PostgreSQL

2014-12-16 Thread Simon Riggs
On 16 December 2014 at 18:28, Stephen Frost sfr...@snowman.net wrote:

 For the sake of the archives, the idea I had been trying to propose is
 to use a role's permissions as a mechanism to define what should be
 audited.  An example is:

My understanding is that was done.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [alvhe...@2ndquadrant.com: Re: [HACKERS] no test programs in contrib]

2014-12-16 Thread Alvaro Herrera
Andrew Dunstan wrote:

 I have pushed this change, and crake will be running the code. See 
 https://github.com/PGBuildFarm/client-code/commit/d656c1c3ce46f290791c5ba5ede2f8ac8dfa342e

Crake just uploaded its first test results with the testmodules stuff
working:
http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=crake&dt=2014-12-16%2020%3A46%3A04

Thanks for setting it up.

 Any brave buildfarm owners on *nix can try it by replacing their copy of
 run_build.pl with the bleeding edge version. We can't put it in a client
 release until we fix up the MSVC side of things.

I guess I will have to find someone to assist me in architecting a
solution for the MSVC stuff.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [alvhe...@2ndquadrant.com: Re: [HACKERS] no test programs in contrib]

2014-12-16 Thread Andrew Dunstan


On 12/16/2014 04:02 PM, Alvaro Herrera wrote:

Andrew Dunstan wrote:


I have pushed this change, and crake will be running the code. See 
https://github.com/PGBuildFarm/client-code/commit/d656c1c3ce46f290791c5ba5ede2f8ac8dfa342e

Crake just uploaded its first test results with the testmodules stuff
working:
http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=crake&dt=2014-12-16%2020%3A46%3A04

Thanks for setting it up.


Any brave buildfarm owners on *nix can try it by replacing their copy of
run_build.pl with the bleeding edge version. We can't put it in a client
release until we fix up the MSVC side of things.

I guess I will have to find someone to assist me in architecting a
solution for the MSVC stuff.



I might be able to help in a couple of days.

cheers

andrew


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] INSERT ... ON CONFLICT {UPDATE | IGNORE}

2014-12-16 Thread Peter Geoghegan
On Tue, Dec 16, 2014 at 11:08 AM, Jeff Janes jeff.ja...@gmail.com wrote:
 Your new version 1.7 of the patches fixes that issue, as well as the OID
 conflict.

Good.

You're probably aware that I maintain a stress testing suite for the
patch here: https://github.com/petergeoghegan/upsert

In the past, you've had a lot of success with coming up with stress
tests that find bugs. Maybe you can come up with some improvements to
the suite, if you'd care to test the patch. I can authorize your
Github account to push code to that repo, if you're interested.
-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] POLA violation with \c service=

2014-12-16 Thread David Fetter
Folks,

I've noticed that psql's \c function handles service= requests in a
way that I can only characterize as broken.

This came up in the context of connecting to a cloud hosting service
named after warriors or a river or something, whose default hostnames
are long, confusing, and easy to typo, so I suspect that service= may
come up more often going forward than it has until now.

For example, when I try to use

\c service=foo

It will correctly figure out which database I'm trying to connect to,
but fail to notice that it's on a different host, port, etc., and
hence fail to connect with a somewhat unhelpful error message.
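
For reference, the connection parameters come from the service file, e.g. 
~/.pg_service.conf (a made-up example):

[foo]
host=long-confusing-hostname.cloud.example.com
port=5432
dbname=mydb

so \c service=foo picks up dbname but, per the above, silently keeps the 
old host and port.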

I can think of a few approaches for fixing this:

0.  Leave it broken.
1.  Disable service= requests entirely in \c context, and error out if 
attempted.
2.  Ensure that \c actually uses all of the available information.

Is there another one I missed?

If not, which of the approaches seems reasonable?

Cheers,
David.
-- 
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Logical Decoding follows timelines

2014-12-16 Thread Simon Riggs
On 16 December 2014 at 14:25, Heikki Linnakangas
hlinnakan...@vmware.com wrote:
 On 12/15/2014 08:54 PM, Simon Riggs wrote:

 Currently, it doesn't.

 This patch is a WIP version of doing that, but only currently attempts
 to do this in the WALSender.

 Objective is to allow cascaded logical replication.

 Very WIP, but here for comments.


 With the patch, XLogSendLogical uses the same logic to calculate SendRqstPtr
 that XLogSendPhysical does. It would be good to refactor that into a common
 function, rather than copy-paste.

Some of the logic is similar, but not all.

 SendRqstPtr isn't actually used for anything in XLogSendLogical.

It exists to allow the call which resets TLI.

I'll see if I can make it exactly identical; I didn't think so when I
first looked, will look again.

Thanks

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [alvhe...@2ndquadrant.com: Re: [HACKERS] no test programs in contrib]

2014-12-16 Thread Tom Lane
Andrew Dunstan and...@dunslane.net writes:
 I have pushed this change, and crake will be running the code. See 
 https://github.com/PGBuildFarm/client-code/commit/d656c1c3ce46f290791c5ba5ede2f8ac8dfa342e
  
 Any brave buildfarm owners on *nix can try it by replacing their copy of 
 run_build.pl with the bleeding edge version. We can't put it in a client 
 release until we fix up the MSVC side of things.

I've put this in dromedary as well (though the HEAD build that's running
right this moment is still using the 4.13 script).  I take it I don't need
to adjust the configuration file?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pgaudit - an auditing extension for PostgreSQL

2014-12-16 Thread Stephen Frost
* Simon Riggs (si...@2ndquadrant.com) wrote:
 On 16 December 2014 at 18:28, Stephen Frost sfr...@snowman.net wrote:
 
  For the sake of the archives, the idea I had been trying to propose is
  to use a role's permissions as a mechanism to define what should be
  audited.  An example is:
 
 My understanding is that was done.

Based on the discussion I had w/ Abhijit on IRC, and what I saw him
commit, it's not the same.  I've been trying to catch up with him on IRC
to get clarification but havn't managed to yet.

Abhijit, could you comment on the above (or, well, my earlier email
which had the details)?  It's entirely possible that I've completely
misunderstood as I haven't dug into the code yet but rather focused on
the documentation.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] INSERT ... ON CONFLICT {UPDATE | IGNORE}

2014-12-16 Thread Gavin Flower

On 17/12/14 10:11, Peter Geoghegan wrote:

On Tue, Dec 16, 2014 at 11:08 AM, Jeff Janes jeff.ja...@gmail.com wrote:

Your new version 1.7 of the patches fixes that issue, as well as the OID
conflict.

Good.

You're probably aware that I maintain a stress testing suite for the
patch here: https://github.com/petergeoghegan/upsert

In the past, you've had a lot of success with coming up with stress
tests that find bugs. Maybe you can come up with some improvements to
the suite, if you'd care to test the patch. I can authorize your
Github account to push code to that repo, if you're interested.

Yeah!

I have just released some prototype software (not related to pg) to its 
users: I'm going to tell them to treat it with extreme suspicion, no matter 
how much they may respect the developer (me)!


Though, like pg, it is critical that it records data reliably. 
Also, both need testing to try and detect intermittent errors (I already 
found one myself in the prototype - fortunately, not so critical that it 
needs to be fixed in the prototype, but it would have to be eliminated from 
the production version!).


So I think it's really great to encourage people to come up with demanding 
tests, especially automated stress testing, for pg.



Cheers,
Gavin

(Who wishes he had the time  experience to contribute to pg.)


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [alvhe...@2ndquadrant.com: Re: [HACKERS] no test programs in contrib]

2014-12-16 Thread Andrew Dunstan


On 12/16/2014 04:44 PM, Tom Lane wrote:

Andrew Dunstan and...@dunslane.net writes:

I have pushed this change, and crake will be running the code. See
https://github.com/PGBuildFarm/client-code/commit/d656c1c3ce46f290791c5ba5ede2f8ac8dfa342e
Any brave buildfarm owners on *nix can try it by replacing their copy of
run_build.pl with the bleeding edge version. We can't put it in a client
release until we fix up the MSVC side of things.

I've put this in dromedary as well (though the HEAD build that's running
right this moment is still using the 4.13 script).  I take it I don't need
to adjust the configuration file?


Nope, no config changes required.

cheers

andrew


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Commitfest problems

2014-12-16 Thread Stephen Frost
* Andrew Dunstan (and...@dunslane.net) wrote:
 On 12/16/2014 01:38 PM, Stephen Frost wrote:
 * Robert Haas (robertmh...@gmail.com) wrote:
 It has been proposed that we do a general list of people at the bottom
 of the release notes who helped review during that cycle.  That would
 be less intrusive and possibly a good idea, but would we credit the
 people who did a TON of reviewing?  Everyone who reviewed even one
 patch?  Somewhere in between? Would committers be excluded because we
 just expect them to help or included because credit is important to
 established community members too?  To what extent would this be
 duplicative of http://www.postgresql.org/community/contributors/ ?
 I don't particularly like this idea.
 
 I do. I think it's an emminently sensible idea that gives credit
 without disturbing the readability of the release notes.

So, when I was first getting started with PG however many years ago, I
was ecstatic to see my name show up in a commit message.  Hugely
increasing our release notes to include a bunch of names all shoved
together without any indication of what was done by each individual
doesn't feel, to me at least, like it would change that feeling in
either direction.

On the flip side, I would be strongly against *not* including authors
and reviewers in the commit messages, regardless of some big list in the
release notes.

Basically, I see the value of giving credit in the commit history and
the mailing lists as huge while having a long list of names in the
release notes really isn't valuable.

 I'm not necessarily averse to doing something here, but the reason why
 nothing has happened has much more to do with the fact that it's hard
 to figure out exactly what the best thing would be than any idea that
 we don't want to credit reviewers.  We do want to credit reviewers,
 AND WE DO, as a quick look at 'git log' will speedily reveal.
 Agreed.
 
 I can't believe how much we are tying ourselves up in knots over
 this. It's not a good sign. Surely we trust the committers and the
 preparers of the release notes to use some judgement, once we agree
 on general guidelines.

I agree with this.

Thanks,

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] Commitfest problems

2014-12-16 Thread Joshua D. Drake


This whole conversation reminds me of an interview I just read:

https://opensource.com/business/14/12/interview-jono-bacon-xprize-director-community

--
Command Prompt, Inc. - http://www.commandprompt.com/  503-667-4564
PostgreSQL Support, Training, Professional Services and Development
High Availability, Oracle Conversion, @cmdpromptinc
If we send our children to Caesar for their education, we should
 not be surprised when they come back as Romans.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Commitfest problems

2014-12-16 Thread Tom Lane
Stephen Frost sfr...@snowman.net writes:
 So, when I was first getting started with PG however many years ago, I
 was ecstatic to see my name show up in a commit message.  Hugely
 increasing our release notes to include a bunch of names all shoved
 together without any indication of what was done by each individual
  doesn't feel, to me at least, like it would change that feeling in
 either direction.

 On the flip side, I would be strongly against *not* including authors
 and reviewers in the commit messages, regardless of some big list in the
 release notes.

 Basically, I see the value of giving credit in the commit history and
 the mailing lists as huge while having a long list of names in the
 release notes really isn't valuable.

We'd have to continue the practice of crediting people in individual
commit messages in any case, because the commit log is the raw material
from which the release notes are made.

I have no strong feelings either way about whether we should change the
current practice of crediting authors but not reviewers in the release
notes.  I don't feel that it's broken as-is, but I'm open to change if
enough people want to.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] PrivateRefCount patch has got issues

2014-12-16 Thread Tom Lane
I just happened to look into bufmgr.c for the first time in awhile, and
noticed the privaterefcount-is-no-longer-a-simple-array stuff.  It doesn't
look too well thought out to me.  In particular, PinBuffer_Locked calls
GetPrivateRefCountEntry while holding a buffer-header spinlock.  That
seems completely unacceptable.  It's certainly a huge violation of our
design principle that spinlocks should be held for only a few
instructions; and I rather suspect that a palloc failure down inside the
hashtable entry-allocation code would leave things in a bad state.  It's
also depressing that the very common code path ReleaseBuffer -> UnpinBuffer
results in a double search of the array/hashtable; that should be
refactored to avoid that.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Commitfest problems

2014-12-16 Thread Jaime Casanova
On Tue, Dec 16, 2014 at 11:32 AM, Robert Haas robertmh...@gmail.com wrote:

 It has been proposed that we do a general list of people at the bottom
 of the release notes who helped review during that cycle.  That would
 be less intrusive and possibly a good idea, but would we credit the
 people who did a TON of reviewing?  Everyone who reviewed even one
 patch?  Somewhere in between? Would committers be excluded because we
 just expect them to help or included because credit is important to
 established community members too?  To what extent would this be
 duplicative of http://www.postgresql.org/community/contributors/ ?

Not much really. I tried to get my name on that list a couple of years
ago, when I reviewed more than I do now, and never got it.
And while my name is in a couple of commit messages, that is a lot harder
to show to people...

You know, it's kind of frustrating when some not-yet customers ask for
certified engineers and there isn't any official (as in, from the
community) certificate, so you need to prove you're a contributor by
pointing at these random commit messages...

-- 
Jaime Casanova www.2ndQuadrant.com
Professional PostgreSQL: Soporte 24x7 y capacitación
Phone: +593 4 5107566 Cell: +593 987171157




Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes

2014-12-16 Thread Michael Paquier
On Wed, Dec 17, 2014 at 12:12 AM, Merlin Moncure mmonc...@gmail.com wrote:

 Compressing *one* file with lz4 and a quick-and-dirty pglz harness I
 hacked out of the source (borrowing heavily from
 https://github.com/maropu/pglz_bench/blob/master/pglz_bench.cpp),  I
 tested the results:

 lz4 real time:  0m0.032s
 pglz real time: 0m0.281s

 mmoncure@mernix2 ~/src/lz4/lz4-r125 $ ls -lh test.*
 -rw-r--r-- 1 mmoncure mmoncure 2.7M Dec 16 09:04 test.lz4
 -rw-r--r-- 1 mmoncure mmoncure 2.5M Dec 16 09:01 test.pglz


 A better test would examine all manner of different xlog files in a
 fashion closer to how your patch would need to compress them, but the
 numbers here tell a fairly compelling story: similar compression
 results for around 9x the cpu usage.
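
(For reference, the lz4 half of such a comparison can be reproduced with
a short standalone program like the sketch below - assuming a recent
liblz4, which provides LZ4_compress_default; the pglz half lives inside
the PostgreSQL sources and is omitted here.)

#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <lz4.h>            /* link with -llz4 */

int
main(int argc, char **argv)
{
    FILE   *f;
    char   *src, *dst;
    long    srclen;
    int     dstcap, dstlen;
    clock_t start;

    if (argc != 2)
    {
        fprintf(stderr, "usage: %s xlog-segment-file\n", argv[0]);
        return 1;
    }
    if ((f = fopen(argv[1], "rb")) == NULL)
        return 1;
    fseek(f, 0, SEEK_END);
    srclen = ftell(f);
    rewind(f);
    src = malloc(srclen);
    if (src == NULL || fread(src, 1, srclen, f) != (size_t) srclen)
        return 1;
    fclose(f);

    dstcap = LZ4_compressBound((int) srclen);
    dst = malloc(dstcap);
    if (dst == NULL)
        return 1;

    start = clock();
    dstlen = LZ4_compress_default(src, dst, (int) srclen, dstcap);
    printf("compressed %ld -> %d bytes in %.3f seconds\n", srclen, dstlen,
           (double) (clock() - start) / CLOCKS_PER_SEC);

    free(src);
    free(dst);
    return 0;
}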

Yes, that could be better... Thanks for the real numbers; that's really
informative.

 Be advised that compression alg selection is one of those types of
 discussions that tends to spin off into outer space; that's not
 something you have to solve today.  Just try and make things so that
 they can be switched out if things change...

One way to get around that would be a set of hooks to allow people to set
up the compression algorithm they want:
- One for buffer compression
- One for buffer decompression
- Perhaps one to set the size of the scratch buffer used for
compression/decompression. In the case of pglz, for example, this needs
to be PGLZ_MAX_OUTPUT(buffer_size). The actually tricky part is that, as
xlogreader.c is used by pg_xlogdump, we cannot directly use a hook in it,
but we may as well be able to live with a fixed maximum size like
BLCKSZ * 2 by default; that would leave more than enough room for all
kinds of compression algos IMO... (see the sketch below)
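
A minimal sketch of what those hook points could look like - hypothetical
names throughout; none of this is an actual PostgreSQL API:

#include <stddef.h>

#define BLCKSZ 8192     /* PostgreSQL's default block size, for illustration */

/*
 * Hypothetical hook signatures, loosely modeled on pglz: each returns the
 * number of bytes written to dest, or -1 on failure (for example, when the
 * data turns out to be incompressible).
 */
typedef int (*xlog_compress_hook_type) (const char *source, int source_len,
                                        char *dest, int dest_capacity);
typedef int (*xlog_decompress_hook_type) (const char *source, int source_len,
                                          char *dest, int raw_len);

/* NULL means "fall back to the built-in pglz implementation". */
static xlog_compress_hook_type xlog_compress_hook = NULL;
static xlog_decompress_hook_type xlog_decompress_hook = NULL;

/*
 * Fixed scratch-buffer ceiling, as suggested above: generous enough for a
 * reasonable algorithm's worst case on a block, and usable from
 * xlogreader.c without needing a hook there.
 */
#define XLOG_COMPRESSION_SCRATCH_SIZE (BLCKSZ * 2)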
Regards,
-- 
Michael


Re: [HACKERS] POLA violation with \c service=

2014-12-16 Thread David G Johnston
I do indeed see this behavior in some very quick testing using 9.3.


David Fetter wrote
 I've noticed that psql's  \c function handles service= requests in a
 way that I can only characterize as broken.

Looking at the docs, the fact that it attempts to treat service=foo as
anything other than a database name is broken...


 I can think of a few approaches for fixing this:
 
 0.  Leave it broken.
 1.  Disable service= requests entirely in \c context, and error out if
 attempted.
 2.  Ensure that \c actually uses all of the available information.
 
 Is there another one I missed?
 
 If not, which of the approaches seems reasonable?

#2 has a few possible final implementations to consider, given that both
\c and service= can be incompletely specified - what happens if both a
\c host and a service host, for instance, are specified? - but I'm not in
a position to reason out the various possibilities right now.  Regardless,
the ability to specify a service name is valuable (if one presumes \c is
valuable), so the tasks are finding an implementer and, depending on that
outcome, deciding how to handle back-branches.

I don't think the status quo is safe enough to leave as-is, so for head
either #1 or #2 gets my vote.  Leaving it broken in back branches is not
appealing, but maybe we can selectively break it if we cannot get a #2
implementation that can be back-patched.
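
For what it's worth, the detection half of #2 looks tractable; here is a
sketch of a hypothetical helper (this is not psql's actual code) that
would let service=foo and other conninfo strings reach libpq intact:

#include <stdbool.h>
#include <string.h>

/*
 * Treat the \c argument as a libpq conninfo string when it contains an
 * '=' or looks like a connection URI; otherwise take it as a plain
 * database name, as today.
 */
static bool
looks_like_conninfo(const char *arg)
{
    return strchr(arg, '=') != NULL ||
        strncmp(arg, "postgresql://", strlen("postgresql://")) == 0 ||
        strncmp(arg, "postgres://", strlen("postgres://")) == 0;
}

The harder half is deciding how parameters from \c and from the service
file should be merged when both are present, which is the open question
above.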

An aside - from the docs: "If there is no previous connection [...]" - how
is this possible when issuing \c?

David J.








Re: [HACKERS] On partitioning

2014-12-16 Thread Amit Langote
On 17-12-2014 AM 12:15, Robert Haas wrote:
 On Mon, Dec 15, 2014 at 6:55 PM, Amit Langote
 langote_amit...@lab.ntt.co.jp wrote:
 Robert wrote:
 I would expect that to fail, just as it would fail if you tried to
 build an index using a volatile expression.

 Oops, wrong example, sorry. In case of an otherwise good expression?
 
 I'm not really sure what you are getting at here.  An otherwise-good
 expression basically means a constant.  Index expressions have to be
 things that always produce the same result given the same input,
 because otherwise you might get a different result when searching the
 index than you did when building it, and then you would fail to find
 keys that are actually present.  In the same way, partition boundaries
 also need to be constants.  Maybe you could allow expressions that can
 be constant-folded, but that's about it.  

Yeah, this is what I meant: expressions that can be constant-folded.
Sorry, the example I chose was pretty lame. I was just thinking about the
kind of stuff for which something like pg_node_tree would be a good
choice as the on-disk representation of partition values. It definitely
wouldn't be for storing arbitrary expressions that evaluate to different
values at different times.

Thanks,
Amit





Re: [HACKERS] On partitioning

2014-12-16 Thread Amit Langote
On 17-12-2014 AM 12:28, Claudio Freire wrote:
 On Tue, Dec 16, 2014 at 12:15 PM, Robert Haas robertmh...@gmail.com wrote:
 I'm not really sure what you are getting at here.  An otherwise-good
 expression basically means a constant.  Index expressions have to be
 things that always produce the same result given the same input,
 because otherwise you might get a different result when searching the
 index than you did when building it, and then you would fail to find
 keys that are actually present.
 
 I think the point is partitioning based on the result of an expression
 over row columns. 

Actually, in this case, I was thinking about a partition definition, not
a partition-key definition - that is, using an expression as a partition
value, which has the problems that I see.

 Or if it's not, it should be made anyway:
 
 PARTITION BY LIST (extract(month from date_created)) VALUES (1, 3, 6, 9, 12);
 
 Or something like that.
 

Such a thing seems very desirable, though there are some tradeoffs
compared to having the partitioning key be just attribute numbers
(attnums). Or at least we could start with attnums.

An arbitrary expression as the partitioning key means that we have to
recompute the expression for each input row. Think how inefficient that
may be when bulk-loading into a partitioned table with, say, COPY.
Though there may be ways to mitigate that.
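
One such mitigation, sketched very loosely below (a toy program; nothing
here is a PostgreSQL API): prepare the partition-key expression once per
bulk load and reuse the prepared form, so only the cheap per-row
evaluation remains in the hot path.

#include <stdio.h>

/* Toy stand-ins for a tuple and a prepared partition-key expression. */
typedef struct { int month; } TupleStub;
typedef int (*PreparedKeyExpr) (const TupleStub *);

static int
extract_month(const TupleStub *t)
{
    return t->month;
}

/*
 * "Prepare" once: a real implementation would build executor state for
 * the expression here, not just pick a function pointer.
 */
static PreparedKeyExpr
prepare_key_expr(void)
{
    return extract_month;
}

static void
copy_into_partitioned(const TupleStub *tuples, int ntuples)
{
    PreparedKeyExpr key = prepare_key_expr();   /* hoisted out of the loop */
    int i;

    for (i = 0; i < ntuples; i++)
        printf("tuple %d -> list partition for month %d\n",
               i, key(&tuples[i]));
}

int
main(void)
{
    TupleStub tuples[] = {{1}, {6}, {12}};

    copy_into_partitioned(tuples, 3);
    return 0;
}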

Thanks,
Amit





Re: [HACKERS] On partitioning

2014-12-16 Thread Robert Haas
On Tue, Dec 16, 2014 at 1:45 PM, Josh Berkus j...@agliodbs.com wrote:
 Yes, I wasn't saying that expressions should be used when *creating* the
 partitions, which strikes me as a bad idea for several reasons.
 Expressions should be usable when SELECTing data from the partitions.
 Right now, they aren't, because the planner picks partitions well before
 the rewrite phase, which would reduce extract(month from current_date)
 to a constant.

 Right now, if you partition by an integer ID even, and do:

 SELECT * FROM partitioned_table WHERE ID = ( 3 + 4 )

 ... postgres will scan all partitions because ( 3 + 4 ) is an expression
 and isn't evaluated until after CE is done.

Well, actually, that case works fine:

rhaas=# create table partitioned_table (id integer, data text);
CREATE TABLE
rhaas=# create table child1 (check (id < 1000)) inherits (partitioned_table);
CREATE TABLE
rhaas=# create table child2 (check (id >= 1000)) inherits (partitioned_table);
CREATE TABLE
rhaas=# explain select * from partitioned_table where id = (3 + 4);
                               QUERY PLAN
-------------------------------------------------------------------------
 Append  (cost=0.00..25.38 rows=7 width=36)
   ->  Seq Scan on partitioned_table  (cost=0.00..0.00 rows=1 width=36)
         Filter: (id = 7)
   ->  Seq Scan on child1  (cost=0.00..25.38 rows=6 width=36)
         Filter: (id = 7)
(5 rows)

The reason is that 3 + 4 gets constant-folded pretty early on in the process.

But in a more complicated case where the value there isn't known until
runtime, yeah, it scans everything.  I'm not sure what the best way to
fix that is.  If the partition bounds were stored in a structured way,
as we've been discussing, then the Append or Merge Append node could,
when initialized, check which partition the id = X qual routes to and
ignore the rest.  But that's more iffy with the current
representation, I think.
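
To make that concrete, here is an illustrative sketch (not the eventual
design) of how structured range bounds would let an Append node route an
equality qual with one binary search at node-initialization time:

/*
 * Illustrative only: range partitions described by a sorted array of
 * exclusive upper bounds, rather than by arbitrary CHECK expressions.
 */
typedef struct RangeBound
{
    int upper_bound;        /* exclusive upper bound of this partition */
    int partition_id;       /* stand-in for the child relation */
} RangeBound;

/*
 * Return the partition that an "id = value" qual routes to.  Assumes the
 * last partition's bound acts as +infinity, catching everything else.
 */
static int
route_equality_qual(const RangeBound *bounds, int nparts, int value)
{
    int lo = 0;
    int hi = nparts - 1;

    while (lo < hi)
    {
        int mid = (lo + hi) / 2;

        if (value < bounds[mid].upper_bound)
            hi = mid;
        else
            lo = mid + 1;
    }
    return bounds[lo].partition_id;
}

With arbitrary CHECK expressions, by contrast, excluding a partition means
running the planner's predicate-proof machinery against each child's
constraint, which is both slower and confined to plan time.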

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] On partitioning

2014-12-16 Thread Josh Berkus
On 12/16/2014 05:52 PM, Robert Haas wrote:
 But in a more complicated case where the value there isn't known until
 runtime, yeah, it scans everything.  I'm not sure what the best way to
 fix that is.  If the partition bounds were stored in a structured way,
 as we've been discussing, then the Append or Merge Append node could,
 when initialized, check which partition the id = X qual routes to and
 ignore the rest.  But that's more iffy with the current
 representation, I think.

Huh.  I was just testing:

WHERE event_time BETWEEN timestamptz '2014-12-01' and ( timestamptz
'2014-12-01' + interval '1 month')

In that case, the expression above got folded to constants by the time
Postgres did the index scans, but it scanned all partitions.  So somehow
(timestamptz + interval) doesn't get constant-folded until after
planning, at least not on 9.3.

And of course this leaves out common patterns like now() - interval '30
days' or to_timestamp('20141201','YYYYMMDD')

Anyway, what I'm saying is that I personally regard the inability to
handle even moderately complex expressions as a major failing of our
existing partitioning scheme (possibly its worst single failing), and I
would regard any new partitioning feature which didn't address that
issue as suspect.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



