Re: [HACKERS] checkpointer continuous flushing

2015-06-19 Thread Fabien COELHO


Hello Andres,


- Move fsync as early as possible, suggested by Andres Freund?

My opinion is that this should be left out for the nonce.


"for the nonce" - what does that mean?


 Nonce \Nonce\ (n[o^]ns), n. [For the nonce, OE. for the nones, ...
 {for the nonce}, i. e. for the present time.


I'm doubtful that it's a good idea to separate this out, if you did.


Actually I did, because, as explained in another mail, the fsync time 
reported in the logs is essentially null when the other options are 
activated, so it would not bring significant improvements on these runs, 
and also the patch changes enough things as it is.

So this is an evidence-based decision.

I also agree that it seems interesting on principle and should be 
beneficial in some cases, but I would rather keep that on a TODO list, 
together with trying to do better things in the bgwriter, and try to focus 
on the current proposal, which already significantly changes the 
checkpointer throttling logic.



 - as version 2: checkpoint buffer sorting based on a 2007 patch by
   Takahiro Itagaki, but with a smaller and static buffer allocated once.
   Also, sorting is done in chunks of 131072 pages in the current version,
   with a GUC to change this value.


I think it's a really bad idea to do this in chunks.


The small problem I see is that for a very large setting there could be 
several seconds or even minutes of sorting, which may or may not be 
desirable, so having some control over that seems a good idea.


Another argument is that Tom said he wanted that:-)

In practice the value can be set high enough that sorting is nearly always 
done in one go. Maybe the value "0" could be made special, used to trigger 
this behavior systematically, and be the default.



That'll mean we'll frequently uselessly cause repetitive random IO,


This is not an issue if the chunks are large enough, and anyway the GUC 
allows changing the behavior as desired. As I said, keeping some control 
seems a good idea, and the "full sorting" behavior can be made the 
default.


often interleaved. That pattern is horrible for SSDs too. We should 
always try to do this at once, and only fall back to using less memory 
if we couldn't allocate everything.


The memory is needed anyway in order to avoid a double, or significantly 
heavier, implementation of the throttling loop. It is allocated once, on 
the first checkpoint. The allocation could be moved to checkpointer 
initialization if this is a concern. The memory needed is one int per 
buffer, which is smaller than in the 2007 patch.
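
As a rough illustration of the scale involved (my arithmetic, assuming the 
default 8kB block size): shared_buffers=1GB is 131072 buffers, so the sort 
array is 131072 * 4 bytes = 512kB; even shared_buffers=64GB would only 
need 32MB.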



 . tiny: scale=10 shared_buffers=1GB checkpoint_timeout=30s time=6400s


It'd be interesting to see numbers for tiny, without the overly small
checkpoint timeout value. 30s is below the OS's writeback time.


The point of tiny was to trigger a lot of checkpoints. The size is pretty 
ridiculous anyway, as "tiny" implies. I think I did some tests with other 
versions of the patch and a longer checkpoint_timeout on a pretty small 
database, which showed smaller benefits from the options, as one would 
expect. I'll try to re-run some.



So you've not run things at more serious concurrency, that'd be
interesting to see.


I do not have a box available for "serious concurrency".


I'd also like to see concurrent workloads with synchronous_commit=off -
I've seen absolutely horrible latency behaviour for that, and I'm hoping
this will help. It's also a good way to simulate faster hardware than
you have.



It's also curious that sorting is detrimental for full speed 'tiny'.


Yep.


With SSDs, both options would probably have limited benefit.


I doubt that. Small random writes have bad consequences for wear
leveling. You might not notice that with a short test - again, I doubt 
it - but it'll definitely become visible over time.


Possibly. Testing such effects does not seem easy, though. At least I have 
not seen "write stalls" on SSD, which is my primary concern.


--
Fabien.




Re: [HACKERS] pretty bad n_distinct estimate, causing HashAgg OOM on TPC-H

2015-06-19 Thread Feng Tian
On Wed, Jun 17, 2015 at 10:52 AM, Tomas Vondra  wrote:

> Hi,
>
> I'm currently running some tests on a 3TB TPC-H data set, and I tripped
> over a pretty bad n_distinct underestimate, causing OOM in HashAgg (which
> somehow illustrates the importance of the memory-bounded hashagg patch Jeff
> Davis is working on).
>
> The problem is Q18, particularly this simple subquery:
>
> select l_orderkey
> from lineitem
> group by l_orderkey
> having sum(l_quantity) > 313;
>
> which is planned like this:
>
>QUERY PLAN
>
> -----------------------------------------------------------------------
>  HashAggregate  (cost=598510163.92..598515393.93 rows=418401 width=12)
>Group Key: l_orderkey
>Filter: (sum(l_quantity) > '313'::double precision)
>->  Seq Scan on lineitem  (cost=0.00..508509923.28 rows=1848128
> width=12)
> (4 rows)
>
> but sadly, in reality the l_orderkey cardinality looks like this:
>
> tpch=# select count(distinct l_orderkey) from lineitem;
>    count
> ------------
>  4500000000
> (1 row)
>
> That's a helluva difference - not the usual one or two orders of
> magnitude, but a 10000x underestimate.
>
> The usual thing to do in this case is increasing statistics target, and
> while this improves the estimate, the improvement is rather small:
>
>    statistics target     estimate      difference
>    -----------------------------------------------
>    100                   429491        10000
>    1000                  4240418       1000
>    10000                 42913759      100
>
> I find the pattern rather strange - every time the statistics target
> increases 10x, the difference decreases 10x - maybe that's natural, but the
> perfect proportionality is suspicious IMHO.
>
> Also, this is a quite large dataset - the table has ~18 billion rows, and
> even with target=10000 we're sampling only 3M rows, which is ~0.02%. That's
> a tiny sample, so inaccuracy is naturally expected, but OTOH the TPC-H
> dataset is damn uniform - there's pretty much no skew in the distributions
> AFAIK. So I'd expect a slightly better result.
>
> With target=10000 the plan switches to GroupAggregate, because the
> estimate becomes sufficient to exceed work_mem (2GB). But it's still way off,
> and it's mostly just a lucky coincidence.
>
> So I'm wondering if there's some bug because of the dataset size (an
> integer overflow or something like that), so I added a bunch of logging into the
> estimator, logging all the parameters computed:
>
> target=100 (samplerows=30000)
> -----------------------------
> WARNING:  attnum=1 attname=l_orderkey f1=27976 ndistinct=28977
> nmultiple=1001 toowide_cnt=0 d=28977 numer=869310000.00
> denom=2024.046627 stadistinct=429491.094029
> WARNING:  ndistinct estimate attnum=1 attname=l_orderkey current=429491.09
> adaptive=443730.00
>
> target=1000 (samplerows=300000)
> -------------------------------
> WARNING:  attnum=1 attname=l_orderkey f1=279513 ndistinct=289644
> nmultiple=10131 toowide_cnt=0 d=289644 numer=86893200000.00
> denom=20491.658538 stadistinct=4240418.111618
> WARNING:  ndistinct estimate attnum=1 attname=l_orderkey
> current=4240418.11 adaptive=4375171.00
>
> target=10000 (samplerows=3000000)
> ---------------------------------
> WARNING:  attnum=1 attname=l_orderkey f1=2797888 ndistinct=2897799
> nmultiple=99911 toowide_cnt=0 d=2897799 numer=8693397000000.00
> denom=202578.313396 stadistinct=42913759.396282
> WARNING:  ndistinct estimate attnum=1 attname=l_orderkey
> current=42913759.40 adaptive=9882.00
>
> It's totalrows=1849031 in all cases. The logs also show estimate
> produced by the adaptive estimate (discussed in a separate thread), but
> apparently that does not change the estimates much :-(
>
> Any ideas?
>
> --
> Tomas Vondra   http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>
>


While better samples/stats are important for choosing a good plan, in this
query hash agg really is the right plan.  If a sort agg is chosen, the
performance will be really, really bad.  The patch that Jeff is working on
is critical for a decent TPC-H number (unless you have an unlimited amount
of memory).

Thanks,


Re: [HACKERS] anole: assorted stability problems

2015-06-19 Thread Alvaro Herrera
Alvaro Herrera wrote:
> Tom Lane wrote:
> > Andres Freund  writes:
> 
> > > Uh. I'm pretty sure there were some back when that patch went in. And
> > > there definitely used to be a couple earlier. I guess itanium really is
> > > dying (mixed bag: it's a horrible architecture, but more coverage would
> > > still be good).
> > 
> > Since that machine is run by EDB, maybe we could persuade them to set up
> > a second critter on it that uses gcc.  That would at least help narrow
> > down whether it's a compiler-specific issue.
> 
> I pinged EDB about this several days ago, and they have now set up
> buildfarm member gharial running on the same machine, using gcc.

FWIW gharial does not show any problem whatsoever.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




[HACKERS] castoroides spinlock failure on test_shm_mq

2015-06-19 Thread Alvaro Herrera
Has anybody noticed the way castoroides is randomly failing?

  SELECT test_shm_mq_pipelined(16384, (select 
string_agg(chr(32+(random()*95)::int), '') from generate_series(1,270000)), 
200, 3);
! PANIC:  stuck spinlock (100cb92f4) detected at atomics.c:30
! server closed the connection unexpectedly
!   This probably means the server terminated abnormally
!   before or while processing the request.
! connection to server was lost


-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] pretty bad n_distinct estimate, causing HashAgg OOM on TPC-H

2015-06-19 Thread Jeff Janes
On Fri, Jun 19, 2015 at 1:39 PM, Tomas Vondra wrote:

> On 06/19/2015 09:48 PM, Jeff Janes wrote:
>
>> On Fri, Jun 19, 2015 at 12:27 PM, Tomas Vondra
>> <tomas.von...@2ndquadrant.com> wrote:
>>
>> But I think you might be on to something, because I manually
>> collected a random sample with 30k rows (by explicitly generating
>> 30k random TIDs), and I get this:
>>
>> tpch=# select cnt, count(*) from (select l_orderkey, count(*) AS cnt
>> from lineitem_sample group by 1) foo group by 1;
>>
>>   cnt | count
>> -----+-------
>> 1 | 29998
>> 2 | 1
>> (2 rows)
>>
>>
>> That's quite different compared to what analyze gets, which
>> effectively looks something like this (this is derived from the
>> logs, so not perfectly accurate - I only have f1, ndistinct,
>> nmultiple):
>>
>>   cnt | count
>> -----+-------
>> 1 | 27976
>> 2 |   976
>> 3 |24
>>
>> Am I wrong or is the sample not that random?
>>
>>
>> The sample is not truly random.  The two-stage sampling method causes
>> too few blocks to have exactly one row chosen from them, and too many to
>> have either 0 or 2+ rows chosen from them.
>>
>> When values in the same block are likely to be equal, then it finds too
>> many duplicates because it too often picks two rows from a single block.
>>
>
> Yeah, I came to the same conclusion after a bit of experimenting. I've
> logged the block numbers for all the 30k sampled tuples (target=100) and I
> get these statistics for the number of repetitions:
>
>  cnt | count
> -----+-------
>1 | 11020
>2 |  5637
>3 |  1800
>4 |   450
>5 |94
>6 | 6
>
> so 11020 blocks have exactly 1 tuple sampled from them, 5637 blocks have 2
> tuples sampled etc.
>
> With truly random sampling (just generating 30k random numbers between 0
> and 328509442, the number of pages of this particular table), I get this:
>
> test=# select cnt, count(*) from (select (328509442 * random())::int AS
> blockno, count(*) AS cnt from blocks group by 1) foo group by 1 order by 1;
>
>  cnt | count
> -----+-------
>1 | 29994
>2 | 3
>
> So yeah, not really random.
>
>  See analysis here:
>>
>>
>> http://www.postgresql.org/message-id/CAMkU=1wrh_jopycayukbdqy4dwhsx1-1e2s0vvgfrryfxde...@mail.gmail.com
>>
>
> Thanks.
>
>  If we assume all the blocks have the same tuple density, then it is
>> easy to correct this. But without that assumption of constant tuple
>> density, I don't know how to get a truly random sample while still
>> skipping most of the table.
>>
>
> Hmmm, that's probably true. OTOH correlated columns are not all that
> uncommon (e.g. table storing time-series data etc.), and this blowup is
> quite bad ...
>

True, but we don't know how big the density-skew problem might be (since
the current algorithm isn't sensitive to it).  It might be just as big a
problem.  Tom mentioned some ancient history in the above-mentioned thread
that made me think the density skew was enough of a problem to motivate
the current system.


>
> I don't think we really need to assume the density to be constant; maybe
> we can verify that while collecting the sample? I mean, we're already
> reading the pages, so we can track the density, and either do the
> correction or not.
>

Maybe.  I don't know how that would work.  We would have to keep two
samples, and dynamically decide which to use. And what if the decision is
that both density skew and value clustering are problems?

I wonder if the n_distinct could be tweaked so that it counted any given
value only once for each block it finds it in?  So instead of asking "how
many times was this value sampled", ask "in how many blocks was this value
sampled at least once"?
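
Something along these lines, as a SQL sketch (assuming the sample is
materialized like the lineitem_sample above, plus a hypothetical blkno
column recording each sampled tuple's source block):

  SELECT cnt, count(*)
  FROM (SELECT l_orderkey, count(DISTINCT blkno) AS cnt
        FROM lineitem_sample
        GROUP BY 1) foo
  GROUP BY 1 ORDER BY 1;

so cnt means "blocks the value was seen in" rather than "times the value
was sampled", which is insensitive to drawing two rows from one block.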


> Also, doesn't Vitter make pretty much the same assumption implicitly?
> Otherwise it couldn't skip some of the blocks.
>

Vitter samples an unstructured stream in a single pass, and is unbiased.
The explicit block sampling is not part of Vitter, it is something we
bolted on top of it.

My solution was to just unbolt the block sampling from the top, and let it
sample the rows (still 300 * stats_target of them) from the whole table
rather than a random 300 * 100 blocks of the table.  On tables of the size
I was worried about, block sampling was not very helpful anyway.  Reading
30,000 blocks out of 250,000 blocks led to no meaningful IO advantage on
my hardware. Any advantage from skipped blocks was made up for (and
sometimes exceeded) by fouling up the read-ahead mechanisms.

With 1000 times more blocks, that probably won't work for you.

But I do wonder, how much time does it take to read a random 1/10, 1/100,
1/1000, 1/10000 of a table of your size, just from an IO perspective?  How
much are we gaining by doing the block sample?
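
One rough way to measure that on 9.5, since TABLESAMPLE SYSTEM picks whole
blocks at random (the argument is a percentage, so 0.1 reads about 1/1000
of the blocks):

  \timing on
  SELECT count(*) FROM lineitem TABLESAMPLE SYSTEM (0.1);   -- ~1/1000
  SELECT count(*) FROM lineitem TABLESAMPLE SYSTEM (0.01);  -- ~1/10000

Only a sketch, of course - the first run is mostly IO, repeats hit cache.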

Cheers,

Jeff


[HACKERS] Insufficient locking for ALTER DEFAULT PRIVILEGES

2015-06-19 Thread Vik Fearing
I came across the following bug this week:

Session 0:
begin;
create schema bug;
alter default privileges in schema bug grant all on tables to postgres;
commit;

Session 1:
begin;
alter default privileges in schema bug grant all on tables to postgres;

Session 2:
alter default privileges in schema bug grant all on tables to postgres;


Session 1:
commit;

Session 2:
ERROR:  tuple concurrently updated
-- 
Vik Fearing  +33 6 46 75 15 36
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support




Re: [HACKERS] The real reason why TAP testing isn't ready for prime time

2015-06-19 Thread Michael Paquier
On Fri, Jun 19, 2015 at 11:45 PM, Tom Lane  wrote:
> Michael Paquier  writes:
>> Attached is a patch fixing those problems and improving the log
>> facility as it really helped me out with those issues. The simplest
>> fix would be to include the -w switch missing in the tests of
>> pg_rewind and pg_ctl though.
>
> I agree with adding the -w switch and will go do that.  As far as the
> rest of this patch goes, it seems like it could be made less invasive
> if the logs got dumped into a subdirectory of tmp_check rather than
> adding another top-level directory that has to be cleaned?

tmp_check remains unused since dcae5fac (missing something perhaps?),
but I guess we could use it again to save the logs in it, and have
pg_rewind do the same.
-- 
Michael




Re: [HACKERS] Need Multixact Freezing Docs

2015-06-19 Thread Alvaro Herrera
Jim Nasby wrote:
> On 6/14/15 9:50 AM, Alvaro Herrera wrote:
> >+values[0] = MultiXactState->oldestMultiXactId;
> 
> What about oldestOffset and offsetStopLimit? Seems those would be useful
> too. Looks good other than that.

Yeah, that's what I was trying to say.  How about this?

I realized that pg_get_multixact_members() was not documented, so I
added a blurb about it too.  I guess I could backpatch that part to 9.3
because it's been present all along.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 9a2a1f6..1925c6c 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -14874,6 +14874,57 @@ SELECT collation for ('foo' COLLATE "de_DE");
 For example 10:20:10,14,15 means
 xmin=10, xmax=20, xip_list=10, 14, 15.

+
+   
+pg_get_multixact_members
+   
+
+   
+pg_get_multixact_range
+   
+
+   
+The functions shown in 
+obtain information relative to multixacts in the system.
+   
+
+   
+Multixact Functions
+
+ 
+  Name Return Type Description
+ 
+
+ 
+  
+   pg_get_multixact_range()
+   
+oldestmultixid,
+nextmultixid,
+stopmultixid,
+oldestmemberxid,
+nextmemberxid,
+stopmemberxid
+   
+   
+get current oldest, next and stop limits for multixact IDs,
+and oldest, next and stop limit for multixact members
+   
+  
+
+  
+   pg_get_multixact_members(xid)
+   
+xidxid,
+modetext
+   
+   
+get Xid and mode of the members of the given multixact
+   
+  
+ 
+
+   
   
 
   
diff --git a/doc/src/sgml/maintenance.sgml b/doc/src/sgml/maintenance.sgml
index 4383862..db4d457 100644
--- a/doc/src/sgml/maintenance.sgml
+++ b/doc/src/sgml/maintenance.sgml
@@ -662,6 +662,14 @@ HINT:  Stop the postmaster and use a standalone backend to VACUUM in "mydb".
  Both of these kinds of whole-table scans will occur even if autovacuum is
  nominally disabled.
 
+
+
+ To know the limits of the live range of both the multixact ID counter and
+ of the members storage area, and the limits which would cause further
+ multixact creation to be rejected, see
+ .
+
+

   
 
diff --git a/src/backend/access/transam/multixact.c b/src/backend/access/transam/multixact.c
index 5c25c2f..296c85b 100644
--- a/src/backend/access/transam/multixact.c
+++ b/src/backend/access/transam/multixact.c
@@ -66,6 +66,7 @@
  */
 #include "postgres.h"
 
+#include "access/htup_details.h"
 #include "access/multixact.h"
 #include "access/slru.h"
 #include "access/transam.h"
@@ -3308,3 +3309,38 @@ pg_get_multixact_members(PG_FUNCTION_ARGS)
 
 	SRF_RETURN_DONE(funccxt);
 }
+
+Datum
+pg_get_multixact_range(PG_FUNCTION_ARGS)
+{
+	TupleDesc		tupdesc;
+	HeapTuple		htup;
+	Datum			values[6];
+	bool			nulls[6];
+
+	/* Build a tuple descriptor for our result type */
+	if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
+		elog(ERROR, "return type must be a row type");
+	tupdesc = BlessTupleDesc(tupdesc);
+
+	MemSet(nulls, 0, sizeof(nulls));
+
+	LWLockAcquire(MultiXactGenLock, LW_SHARED);
+	values[0] = MultiXactState->oldestMultiXactId;
+	values[1] = MultiXactState->nextMXact;
+	values[2] = MultiXactState->multiStopLimit;
+	if (MultiXactState->oldestOffsetKnown)
+		values[3] = MultiXactState->oldestOffset;
+	else
+		nulls[3] = true;
+	values[4] = MultiXactState->nextOffset;
+	if (MultiXactState->offsetStopLimitKnown)
+		values[5] = MultiXactState->offsetStopLimit;
+	else
+		nulls[5] = true;
+	LWLockRelease(MultiXactGenLock);
+
+	htup = heap_form_tuple(tupdesc, values, nulls);
+
+	PG_RETURN_DATUM(HeapTupleGetDatum(htup));
+}
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index 92af36d..8241732 100644
--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
@@ -2911,6 +2911,8 @@ DATA(insert OID = 1065 (  pg_prepared_xact PGNSP PGUID 12 1 1000 0 0 f f f f t t
 DESCR("view two-phase transactions");
 DATA(insert OID = 3819 (  pg_get_multixact_members PGNSP PGUID 12 1 1000 0 0 f f f f t t v 1 0 2249 "28" "{28,28,25}" "{i,o,o}" "{multixid,xid,mode}" _null_ pg_get_multixact_members _null_ _null_ _null_ ));
 DESCR("view members of a multixactid");
+DATA(insert OID = 3846 (  pg_get_multixact_range   PGNSP PGUID 12 1 0 0 0 f f f f t f s 0 0 2249 "" "{28,28,28,28,28,28}" "{o,o,o,o,o,o}" "{oldestmulti,nextmulti,stopmulti,oldestmember,nextmember,stopmember}" _null_ pg_get_multixact_range _null_ _null_ _null_ ));
+DESCR("get range and limits of multixacts");
 
 DATA(insert OID = 3537 (  pg_describe_object		PGNSP PGUID 12 1 0 0 0 f f f f t f s 3 0 25 "26 26 23" _null_ _null_ _null_ _null_ pg_describe_object _null_ _null_ _null_ ));
 DESCR("get identification of SQL object");
diff --git a/src/include/utils/builtins.h b/src/i
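
A usage sketch, assuming the patch above is applied (the multixact id fed
to pg_get_multixact_members is just an illustrative value):

  SELECT * FROM pg_get_multixact_range();
  SELECT * FROM pg_get_multixact_members('3406'::xid);

The first returns a single row with oldestmulti/nextmulti/stopmulti and
their member-space counterparts (oldestmember and stopmember can be NULL
when the corresponding limit is not known, per the nulls[] handling above);
the second returns one (xid, mode) row per member of the given multixact.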

Re: [HACKERS] The real reason why TAP testing isn't ready for prime time

2015-06-19 Thread Michael Paquier
On Sat, Jun 20, 2015 at 12:44 AM, Tom Lane  wrote:
> Andres Freund  writes:
>> On 2015-06-19 11:16:18 -0400, Robert Haas wrote:
>>> On Fri, Jun 19, 2015 at 11:07 AM, Tom Lane  wrote:
 I wonder whether it's such a good idea for the postmaster to give
 up waiting before all children are gone (postmaster.c:1722 in HEAD).
>
>>> I doubt it.
>
>> Seconded. It's pretty bad to possibly not be able to start again after
>> stopping it. I don't see the benefit in not waiting - sure, the
>> poastmaster exits faster, but postgres hasn't shut down at that point...
>
> Yeah.  I'll see about fixing this.  Hard to be sure if it will fix
> hamster's issue though.

I have re-enabled the TAP tests in hamster. What happens in the next
couple of days will be interesting to see.
-- 
Michael




Re: [HACKERS] The real reason why TAP testing isn't ready for prime time

2015-06-19 Thread Michael Paquier
On Sat, Jun 20, 2015 at 12:07 AM, Tom Lane  wrote:
> Michael Paquier  writes:
>> Now if we look at RewindTest.pm, there is the following code:
>> if ($test_master_datadir)
>> {
>> system
>>   "pg_ctl -D $test_master_datadir -s -m immediate stop
>> 2> /dev/null";
>> }
>> if ($test_standby_datadir)
>> {
>> system
>>   "pg_ctl -D $test_standby_datadir -s -m immediate
>> stop 2> /dev/null";
>> }
>> And I think that the problem is triggered because we are missing a -w
>> switch here, meaning that we do not wait until the confirmation that
>> the server has stopped, and visibly if stop is slow enough the next
>> server to use cannot start because the port is already taken by the
>> server currently stopping.
>
> After I woke up a bit more, I remembered that -w is already the default
> for "pg_ctl stop", so your diagnosis here is incorrect.

Ah right. I forgot that. Perhaps I just got lucky in my runs.

> I suspect that the real problem is the arbitrary decision to use -m
> immediate.  The postmaster would ordinarily wait for its children to
> die, but on a slow machine we could perhaps reach the end of that
> 5-second timeout, whereupon the postmaster would SIGKILL its children
> *and exit immediately*.  I'm not sure how instantaneous SIGKILL is,
> but it seems possible that we could end up trying to start the new
> postmaster before all the children of the old one are dead.  If the
> shmem interlock is working properly that ought to fail.
>
> I wonder whether it's such a good idea for the postmaster to give
> up waiting before all children are gone (postmaster.c:1722 in HEAD).

I don't think so either.
-- 
Michael




Re: [HACKERS] pretty bad n_distinct estimate, causing HashAgg OOM on TPC-H

2015-06-19 Thread Tomas Vondra

On 06/19/2015 09:48 PM, Jeff Janes wrote:

On Fri, Jun 19, 2015 at 12:27 PM, Tomas Vondra
<tomas.von...@2ndquadrant.com> wrote:

But I think you might be on to something, because I manually
collected a random sample with 30k rows (by explicitly generating
30k random TIDs), and I get this:

tpch=# select cnt, count(*) from (select l_orderkey, count(*) AS cnt
from lineitem_sample group by 1) foo group by 1;

  cnt | count
-----+-------
1 | 29998
2 | 1
(2 rows)


That's quite different compared to what analyze gets, which
effectively looks something like this (this is derived from the
logs, so not perfectly accurate - I only have f1, ndistinct, nmultiple):

  cnt | count
-----+-------
1 | 27976
2 |   976
3 |24

Am I wrong or is the sample not that random?


The sample is not truly random.  The two-stage sampling method causes
too few blocks to have exactly one row chosen from them, and too many to
have either 0 or 2+ rows chosen from them.

When values in the same block are likely to be equal, then it finds too
many duplicates because it too often picks two rows from a single block.


Yeah, I came to the same conclusion after a bit of experimenting. I've 
logged the block numbers for all the 30k sampled tuples (target=100) and 
I get these statistics for the number of repetitions:


 cnt | count
-----+-------
   1 | 11020
   2 |  5637
   3 |  1800
   4 |   450
   5 |94
   6 | 6

so 11020 blocks have exactly 1 tuple sampled from them, 5637 blocks have 
2 tuples sampled etc.


With truly random sampling (just generating 30k random numbers between 0 
and 328509442, the number of pages of this particular table), I get this:


test=# select cnt, count(*) from (select (328509442 * random())::int AS 
blockno, count(*) AS cnt from blocks group by 1) foo group by 1 order by 1;


 cnt | count
-----+-------
   1 | 29994
   2 | 3

So yeah, not really random.


See analysis here:

http://www.postgresql.org/message-id/CAMkU=1wrh_jopycayukbdqy4dwhsx1-1e2s0vvgfrryfxde...@mail.gmail.com


Thanks.


If we assume all the blocks have the same tuple density, then it is
easy to correct this. But without that assumption of constant tuple
density, I don't know how to get a truly random sample while still
skipping most of the table.


Hmmm, that's probably true. OTOH correlated columns are not all that 
uncommon (e.g. table storing time-series data etc.), and this blowup is 
quite bad ...


I don't think we really need to assume the density to be constant; maybe 
we can verify that while collecting the sample? I mean, we're already 
reading the pages, so we can track the density, and either do the 
correction or not.


Also, doesn't Vitter make pretty much the same assumption implicitly? 
Otherwise it couldn't skip some of the blocks.


regards
Tomas

--
Tomas Vondra   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Tab completion for TABLESAMPLE

2015-06-19 Thread Brendan Jurd
On Fri, 19 Jun 2015 at 21:05 Petr Jelinek  wrote:

> On 2015-06-19 09:08, Brendan Jurd wrote:
> > I
> > think it would be convenient and user-friendly to complete the opening
> > bracket -- it would make it perfectly clear that an argument is required
> > for the syntax to be valid.
> >
>
> Agreed, here is updated version which does that.
>

Thanks Petr, I have tested your v2 and it works well.  This is ready for
committer.

Cheers,
BJ
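
(For reference, the completion targets statements like the following - 9.5
syntax, table name arbitrary:

  SELECT * FROM lineitem TABLESAMPLE SYSTEM (0.1);
  SELECT * FROM lineitem TABLESAMPLE BERNOULLI (5) REPEATABLE (42);

after the sampling method, the opening bracket and a percentage argument
are always required, which is what makes completing the bracket helpful.)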


Re: [HACKERS] 9.5 release notes

2015-06-19 Thread Andres Freund
Hi,

On 2015-06-11 00:15:21 -0400, Bruce Momjian wrote:
> I have committed the first draft of the 9.5 release notes.  You can view
> the output here:

I'm looking through all the commits, checking which I think should
possibly be mentioned additionally:
- 9f03ca91 - Speed up CREATE INDEX by avoiding copying index tuples
- 9da86753 - Reject duplicate column names in foreign key referenced-columns 
lists.
  behavioural change, should be documented as intentional
- 680513ab - Break out OpenSSL-specific code to separate files.
  should perhaps be mentioned in the developer section.
- 0709b7ee7 - Change the spinlock primitives to function as compiler barriers.
  Significant behavioural change for developers.
- 94028691 - Advance backend's advertised xmin more aggressively.
  Pretty good feature imo.
- 5028f22f6 - Switch to CRC-32C in WAL and other places.
  This might have compability impact in some external tools.
- 4f1b890b1 -
  Unsure if this should be mentioned, at least it's a fairly
  large change.
- 27846f02 - Optimize locking a tuple already locked by another subxact
  Several users ran into this slowness, so I think we should mention the
  optimization.

Regards,

Andres




Re: [HACKERS] pretty bad n_distinct estimate, causing HashAgg OOM on TPC-H

2015-06-19 Thread Jeff Janes
On Fri, Jun 19, 2015 at 12:27 PM, Tomas Vondra  wrote:


> But I think you might be on to something, because I manually collected a
> random sample with 30k rows (by explicitly generating 30k random TIDs), and
> I get this:
>
> tpch=# select cnt, count(*) from (select l_orderkey, count(*) AS cnt from
> lineitem_sample group by 1) foo group by 1;
>
>  cnt | count
> -----+-------
>1 | 29998
>2 | 1
> (2 rows)
>
>
> That's quite different compared to what analyze gets, which effectively
> looks something like this (this is derived from the logs, so not perfectly
> accurate - I only have f1, ndistinct, nmultiple):
>
>  cnt | count
> -----+-------
>1 | 27976
>2 |   976
>3 |24
>
> Am I wrong or is the sample not that random?


The sample is not truly random.  The two-stage sampling method causes too
few blocks to have exactly one row chosen from them, and too many to have
either 0 or 2+ rows chosen from them.

When values in the same block are likely to be equal, then it finds too
many duplicates because it too often picks two rows from a single block.

See analysis here:

http://www.postgresql.org/message-id/CAMkU=1wrh_jopycayukbdqy4dwhsx1-1e2s0vvgfrryfxde...@mail.gmail.com

If we assume all the blocks have the same tuple density, then it is easy to
correct this.  But without that assumption of constant tuple density, I
don't know how to get a truly random sample while still skipping most of
the table.

Cheers,

Jeff


Re: [HACKERS] pretty bad n_distinct estimate, causing HashAgg OOM on TPC-H

2015-06-19 Thread Tomas Vondra



On 06/19/2015 08:32 PM, Jeff Janes wrote:

On Wed, Jun 17, 2015 at 10:52 AM, Tomas Vondra
<tomas.von...@2ndquadrant.com> wrote:

Hi,

I'm currently running some tests on a 3TB TPC-H data set, and I
tripped over a pretty bad n_distinct underestimate, causing OOM in
HashAgg (which somehow illustrates the importance of the
memory-bounded hashagg patch Jeff Davis is working on).

The problem is Q18, particularly this simple subquery:

 select l_orderkey
 from lineitem
 group by l_orderkey
 having sum(l_quantity) > 313;

which is planned like this:

QUERY PLAN

-
  HashAggregate  (cost=598510163.92..598515393.93 rows=418401 width=12)
Group Key: l_orderkey
Filter: (sum(l_quantity) > '313'::double precision)
->  Seq Scan on lineitem  (cost=0.00..508509923.28
rows=1848128 width=12)
(4 rows)

but sadly, in reality the l_orderkey cardinality looks like this:

 tpch=# select count(distinct l_orderkey) from lineitem;
    count
 ------------
  4500000000
 (1 row)

That's a helluva difference - not the usual one or two orders of
magnitude, but a 10000x underestimate.



Is the row order in the table correlated with the value l_orderkey?


The statistics look like this:

 attnum |   attname   | n_distinct  | correlation
--------+-------------+-------------+-------------
      1 | l_orderkey  | 4.27349e+07 |    0.988631

so yes, it's pretty much perfectly correlated.


Could you create a copy of the table ordered at random, and see if it
exhibits the same estimation issue?


Sadly no - this is a 2.5TB table, so it's not really easy to do. But I don't 
really see why that should matter, because the sample is supposed to be 
random.


But I think you might be on to something, because I manually collected a 
random sample with 30k rows (by explicitly generating 30k random TIDs), 
and I get this:


tpch=# select cnt, count(*) from (select l_orderkey, count(*) AS cnt 
from lineitem_sample group by 1) foo group by 1;


 cnt | count
-----+-------
   1 | 29998
   2 | 1
(2 rows)


That's quite different compared to what analyze gets, which effectively 
looks something like this (this is derived from the logs, so not 
perfectly accurate - I only have f1, ndistinct, nmultiple):


 cnt | count
-----+-------
   1 | 27976
   2 |   976
   3 |24

Am I wrong or is the sample not that random?
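
For concreteness, the TIDs can be generated along these lines (the offset
cap of 60 tuples per page is an assumption for rows this narrow; a TID
pointing into a hole simply fetches nothing):

  SELECT format('(%s,%s)', (random()*328509441)::int,
                1 + (random()*59)::int) AS tid_literal
  FROM generate_series(1, 30000);

feeding each value back as a constant, e.g. WHERE ctid = '(12345678,7)'.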

--
Tomas Vondra   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] pgbench - allow backslash-continuations in custom scripts

2015-06-19 Thread Josh Berkus
On 06/19/2015 02:51 PM, Tom Lane wrote:
> Josh Berkus  writes:
>> On 05/14/2015 12:10 PM, Fabien COELHO wrote:
>>> Add backslash continuations to pgbench custom scripts.
> 
>> I don't personally agree.  I believe that it is worth breaking backwards
>> compatibility to support line breaks in pgbench statements, and that if
>> we're not going to do that, supporting \ continuations is of little value.
> 
> I tend to agree on that bottom line; having this be inconsistent with psql
> does not seem like a win.
> 
>> I'm not clear on why we'd need a full SQL lexer.
> 
> So you don't get fooled by semicolons embedded in string literals or
> comments.

I take it we ignore those now?  I mean, personally, it wouldn't break
anything for me, but since some other benchmarks involve random text
generators ...
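
(For instance, a naive scanner that just splits at the next semicolon would
get this statement wrong:

  SELECT 'one; two' AS parts,        -- ; inside a string literal
         1 /* sneaky ; comment */ + 1 AS total;

hence the need for something smarter than a plain semicolon search.)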


-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] The real reason why TAP testing isn't ready for prime time

2015-06-19 Thread Alvaro Herrera
Tom Lane wrote:
> Andres Freund  writes:
> > On 2015-06-19 13:56:21 -0300, Alvaro Herrera wrote:
> >> We discussed this when that patch got in (82233ce7ea42d6b).  The reason
> >> for not waiting, it was argued, is that the most likely reason for those
> >> processes not to have already gone away by the time we send SIGKILL was
> >> that they are stuck somewhere in the kernel, and so we might not be able
> >> to actually get them to go away with the SIGKILL.  As I recall, that was
> >> the actual problem that MauMau was trying to get fixed.
> 
> > How does exiting before they're dead help? They're still going to be
> > attached to shared memory and thus prevent a restart. I don't think
> > hiding the problem by exiting the postmaster helps at all.
> 
> My thought exactly.

I guess you have a point there.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] pg_regress not waiting for postmaster to stop

2015-06-19 Thread Andrew Dunstan


On 06/19/2015 02:33 AM, Michael Paquier wrote:

Hi all,

In pg_regress.c, there is the following code:
static void
stop_postmaster(void)
{
 if (postmaster_running)
 {
 /* We use pg_ctl to issue the kill and wait for stop */
        char        buf[MAXPGPATH * 2];
 int r;

 /* On Windows, system() seems not to force fflush, so... */
 fflush(stdout);
 fflush(stderr);

 snprintf(buf, sizeof(buf),
  "\"%s%spg_ctl\" stop -D \"%s/data\"
-w -s -m fast",
  bindir ? bindir : "",
  bindir ? "/" : "",
  temp_instance);

Except if I am missing something, the comment mentions that we should
wait for the stop but there is no -w switch in the command of pg_ctl.
Doesn't the patch attached make sense?




-w is the default for stop.

So the patch shouldn't make any difference.

cheers

andrew





Re: [HACKERS] pgbench - allow backslash-continuations in custom scripts

2015-06-19 Thread Tom Lane
Josh Berkus  writes:
> On 05/14/2015 12:10 PM, Fabien COELHO wrote:
>> Add backslash continuations to pgbench custom scripts.

> I don't personally agree.  I believe that it is worth breaking backwards
> compatibility to support line breaks in pgbench statements, and that if
> we're not going to do that, supporting \ continuations is of little value.

I tend to agree on that bottom line; having this be inconsistent with psql
does not seem like a win.

> I'm not clear on why we'd need a full SQL lexer.

So you don't get fooled by semicolons embedded in string literals or
comments.

regards, tom lane




Re: [HACKERS] pretty bad n_distinct estimate, causing HashAgg OOM on TPC-H

2015-06-19 Thread Jeff Janes
On Wed, Jun 17, 2015 at 10:52 AM, Tomas Vondra  wrote:

> Hi,
>
> I'm currently running some tests on a 3TB TPC-H data set, and I tripped
> over a pretty bad n_distinct underestimate, causing OOM in HashAgg (which
> somehow illustrates the importance of the memory-bounded hashagg patch Jeff
> Davis is working on).
>
> The problem is Q18, particularly this simple subquery:
>
> select l_orderkey
> from lineitem
> group by l_orderkey
> having sum(l_quantity) > 313;
>
> which is planned like this:
>
>QUERY PLAN
>
> -----------------------------------------------------------------------
>  HashAggregate  (cost=598510163.92..598515393.93 rows=418401 width=12)
>Group Key: l_orderkey
>Filter: (sum(l_quantity) > '313'::double precision)
>->  Seq Scan on lineitem  (cost=0.00..508509923.28 rows=1848128
> width=12)
> (4 rows)
>
> but sadly, in reality the l_orderkey cardinality looks like this:
>
> tpch=# select count(distinct l_orderkey) from lineitem;
>    count
> ------------
>  4500000000
> (1 row)
>
> That's a helluva difference - not the usual one or two orders of
> magnitude, but a 10000x underestimate.
>


Is the row order in the table correlated with the value l_orderkey?

Could you create a copy of the table ordered at random, and see if it
exhibits the same estimation issue?

Cheers,

Jeff


Re: [HACKERS] pgbench - allow backslash-continuations in custom scripts

2015-06-19 Thread Josh Berkus
On 05/14/2015 12:10 PM, Fabien COELHO wrote:
> 
> Add backslash continuations to pgbench custom scripts.
> 
> The benefit of this approach is that it is upward compatible, and it is
> also pretty simple to implement. The downside is that backslash
> continuation is not the best syntax ever invented, but then you do not
> have to use it if you do not like it.
> 
> The alternative would be to make semi-colon a mandatory end-of-line
> marker, which would introduce an incompatibility and requires more
> efforts to implement, including some kind of SQL-compatible lexer.
> 
> IMHO this approach is the best compromise.

I don't personally agree.  I believe that it is worth breaking backwards
compatibility to support line breaks in pgbench statements, and that if
we're not going to do that, supporting \ continuations is of little value.

As someone who actively uses pgbench to write custom benchmarks, I need
to write queries which I can test.  \ continuation does NOT work on the
psql command line, so that's useless for testing my queries; I still
have to reformat and troubleshoot.  If we added \ continuation, I
wouldn't use it.

I think we should support line breaks, and require semicolons for
end-of-statement.  Backwards compatibility in custom pgbench scripts is
not critical; pgbench scripts are neither used in production, nor much
used in automated systems that I know of.
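
Under that rule a custom script would read like plain SQL - a sketch,
using the usual pgbench variable syntax:

  \setrandom aid 1 100000
  SELECT abalance
    FROM pgbench_accounts
   WHERE aid = :aid;

where the semicolon, not the line end, terminates the statement.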

I'm not clear on why we'd need a full SQL lexer.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] The real reason why TAP testing isn't ready for prime time

2015-06-19 Thread Tom Lane
Andres Freund  writes:
> On 2015-06-19 13:56:21 -0300, Alvaro Herrera wrote:
>> We discussed this when that patch got in (82233ce7ea42d6b).  The reason
>> for not waiting, it was argued, is that the most likely reason for those
>> processes not to have already gone away by the time we send SIGKILL was
>> that they are stuck somewhere in the kernel, and so we might not be able
>> to actually get them to go away with the SIGKILL.  As I recall, that was
>> the actual problem that MauMau was trying to get fixed.

> How does exiting before they're dead help? They're still going to be
> attached to shared memory and thus prevent a restart. I don't think
> hiding the problem by exiting the postmaster helps at all.

My thought exactly.

regards, tom lane




Re: [HACKERS] 9.5 release notes

2015-06-19 Thread Andres Freund
Hi,

On 2015-06-11 00:15:21 -0400, Bruce Momjian wrote:
> I have committed the first draft of the 9.5 release notes.  You can view
> the output here:

So, I did a pass through master's state:

>  
>   
>Add Block Range Indexes (BRIN)
>(Álvaro Herrera, Heikki Linnakangas, Emre Hasegeli)
>   
>
>   
>BRIN indexes are very compact and store the min/max
>values for a range of heap blocks.
>   
>  

Maybe we should mention that they're cheap to maintain?

>  
>   
>Improve in-memory hash performance (Tomas Vondra, Robert Haas)
>   
>  

"hash performance" is pretty general, there's lot of places where we use
hashing that aren't affected.

>  
>   
>Improve concurrency of shared
>buffer replacement (Robert Haas, Amit Kapila)
>   
>  

I think in the end that patch was enhanced to a significant degree by
making it lockless in d72731a70450b. I think the three (?) involved
patches should just be combined under one entry.

>  
>   
>Improve concurrent locking and buffer scan performance (Andres
>Freund, Kevin Grittner)
>   
>  

If this is ab5194e6f, I don't think it makes sense to mention "buffer
scan" - it's just any lwlock, and buffer locks aren't the primary
benefit (ProcArrayLock, buffer mapping lock probably are that). I also
don't think Kevin was involved?

I think ed127002d8 and 4b4b680c should be mentioned in this section as
well. 4b4b680c will considerably reduce the per backend memory usage for
servers with large shared buffers.

>
> Server Settings
>
> 
>
>  
>   
>Replace checkpoint_segments with min_wal_size and
>max_wal_size
>(Heikki Linnakangas)
>   
>
>   
>This allows the allocation of a large number of WAL
>files without keeping them if they are not needed.
>   
>  

Hm. This affects performance significantly, should we also list it there?

>  
>   
>Add GUC wal_compression to
>enable compression of full page images stored in WAL
>(Rahila Syed, Michael Paquier)
>   
>  

Also rather performance relevant?
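
(For anyone trying it out, enabling it is the usual GUC dance - a sketch:

  ALTER SYSTEM SET wal_compression = on;
  SELECT pg_reload_conf();

trading some CPU for smaller full-page images in WAL.)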

>  
>   
>Archive WAL files with suffix .partial
>during standby promotion (Heikki Linnakangas)
>   
>  

This should be expanded; will mention to Heikki. It possibly also needs
to be mentioned in the backward-incompat section.

>  
>   
>Allow the labeling
>of the origin of logical replication changes (Andres Freund)
>   
>
>   
>This helps with change tracking.
>   
>  

I think it should be 'origin and progress'. The explanation should
probably rather be 'This is helpful when implementing replication
solutions" or something like it.


>  
>   
>Allow control of table WAL logging after table creation
>with ALTER TABLE .. SET
>LOGGED / UNLOGGED (Fabrízio de Royes Mello)
>   
>  

This sounds a bit confusing. Maybe "Allow to convert a WAL logged table
to an UNLOGGED one, and the other way round"?
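
(That is, with a hypothetical table name:

  ALTER TABLE measurements SET UNLOGGED;
  ALTER TABLE measurements SET LOGGED;

where, as far as I understand, either direction rewrites the table.)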


>
> System Information Functions and Views

I wonder if

>  
>   
>Report the backends holding replication slots in
>pg_replication_slots
>(Craig Ringer)
>   
>
>   
>The new output column is active_pid.
>   
>  

shouldn't be moved here?

>  
>   
>Allow pg_dump to share a snapshot taken by another
>session using --snapshot (Simon Riggs, Michael Paquier)
>   
>
>   
>The remote snapshot must have been exported by
>pg_export_snapshot() or been defined when creating
>a logical replication slot.

'or exported by logical replication slot creation'?

>This can be used by parallel
>pg_dump to use a consistent snapshot across
>pg_dump processes.
>   

What do you mean by this comment? Parallel pg_dump internally does all
the snapshot stuff already, and it's independent of this option.
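
For context, sharing a snapshot works roughly like this (snapshot name
illustrative):

  -- session 1, kept open for the whole dump:
  BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ;
  SELECT pg_export_snapshot();    -- returns something like '000003A1-1'
  -- then, from a shell, while session 1 stays open:
  --   pg_dump --snapshot=000003A1-1 mydb > mydb.sql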

>  
>   
>Change pg_ctl default shutdown mode from
>smart to fast (Bruce Momjian)
>   
>  

Wonder if this should be listed as an incompatibility. This does have
an impact on existing setups/scripts.

>  
>   
>Add basic atomics API support (Andres Freund, Oskari
>Saarenmaa)
>   
>  

Out of fairness I think either Oskari shouldn't be listed, or Amit
should as well. Amit surely has spent more time on the patch than
Oskari.

>  
>   
>Add native compiler and memory barriers for Solaris
>Studio (Oskari Saarenmaa)
>   
>
>   
>IS THIS PART OF ATOMICS?
>   
>  

Not really, barriers are independent. But I guess we could just combine
it nonetheless.

>   
>Additional Modules
>
>
>
> 
> 

Re: [HACKERS] The real reason why TAP testing isn't ready for prime time

2015-06-19 Thread Alvaro Herrera
Alvaro Herrera wrote:

> We discussed this when that patch got in (82233ce7ea42d6b).  The reason
> for not waiting, it was argued, is that the most likely reason for those
> processes not to have already gone away by the time we send SIGKILL was
> that they are stuck somewhere in the kernel, and so we might not be able
> to actually get them to go away with the SIGKILL.  As I recall, that was
> the actual problem that MauMau was trying to get fixed.

Hm, I reviewed the discussion and actually I'm wrong: I came up with
the unkillable process argument on my own.  MauMau's problem was a
deadlocked process because of trying to malloc() on its way out.
I still think that unkillable processes are an issue, but perhaps it's
sane to have the postmaster wait for some time instead of just going away
immediately.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] The real reason why TAP testing isn't ready for prime time

2015-06-19 Thread Andres Freund
On 2015-06-19 13:56:21 -0300, Alvaro Herrera wrote:
> We discussed this when that patch got in (82233ce7ea42d6b).  The reason
> for not waiting, it was argued, is that the most likely reason for those
> processes not to have already gone away by the time we send SIGKILL was
> that they are stuck somewhere in the kernel, and so we might not be able
> to actually get them to go away with the SIGKILL.  As I recall, that was
> the actual problem that MauMau was trying to get fixed.

How does exiting before they're dead help? They're still going to be
attached to shared memory and thus prevent a restart. I don't think
hiding the problem by exiting the postmaster helps at all.

I don't think it's our job to protect against processes stuck in the
kernel. If that happens, something has gone seriously wrong, and we
shouldn't just continue without making that visible.




Re: [HACKERS] Auto-vacuum is not running in 9.1.12

2015-06-19 Thread Andres Freund
On 2015-06-17 18:10:42 -0300, Alvaro Herrera wrote:
> Yeah, the case is pretty weird and I'm not really sure that the server
> ought to be expected to behave.  But if this is actually the only part
> of the server that misbehaves because of sudden gigantic time jumps, I
> think it's fair to patch it.  Here's a proposed patch.

We probably should go through the server and look at the various sleeps
and make sure they all have an upper limit. I doubt this is the only
location without one.

Greetings,

Andres Freund




Re: [HACKERS] The real reason why TAP testing isn't ready for prime time

2015-06-19 Thread Alvaro Herrera
Tom Lane wrote:
> Andres Freund  writes:
> > On 2015-06-19 11:16:18 -0400, Robert Haas wrote:
> >> On Fri, Jun 19, 2015 at 11:07 AM, Tom Lane  wrote:
> >>> I wonder whether it's such a good idea for the postmaster to give
> >>> up waiting before all children are gone (postmaster.c:1722 in HEAD).
> 
> >> I doubt it.
> 
> > Seconded. It's pretty bad to possibly not be able to start again after
> > stopping it. I don't see the benefit in not waiting - sure, the
> > postmaster exits faster, but postgres hasn't shut down at that point...
> 
> Yeah.  I'll see about fixing this.  Hard to be sure if it will fix
> hamster's issue though.

We discussed this when that patch got in (82233ce7ea42d6b).  The reason
for not waiting, it was argued, is that the most likely reason for those
processes not to have already gone away by the time we send SIGKILL was
that they are stuck somewhere in the kernel, and so we might not be able
to actually get them to go away with the SIGKILL.  As I recall, that was
the actual problem that MauMau was trying to get fixed.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




[HACKERS] [PATCH] pg_upgrade fails when postgres/template1 isn't in default tablespace

2015-06-19 Thread Marti Raudsepp
Hi list

One of my databases failed to upgrade successfully and produced this error
in the copying phase:

error while copying relation "pg_catalog.pg_largeobject"
("/srv/ssd/PG_9.3_201306121/1/12023" to "/PG_9.4_201409291/1/12130"): No
such file or directory

Turns out this happens when either the "postgres" or "template1" databases
have been moved to a non-default tablespace. pg_dumpall does not dump
attributes (such as tablespace) for these databases; pg_upgrade queries the
new cluster about the tablespace for these relations and builds a broken
destination path for the copy/link operation.

The least bad solution seems to be generating ALTER DATABASE SET TABLESPACE
commands for these from pg_dumpall. Previously a --globals-only dump didn't
generate psql \connect commands, but you can't run SET TABLESPACE from
within the same database you're connected to. So to move "postgres", it
needs to connect to "template1" and vice versa. That seems fine for the
purpose of pg_upgrade which can assume a freshly created cluster with both
databases intact.
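
A sketch of what the generated section could look like (tablespace name
hypothetical):

    \connect template1
    ALTER DATABASE postgres SET TABLESPACE ssd_tblspc;

    \connect postgres
    ALTER DATABASE template1 SET TABLESPACE ssd_tblspc;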

I have implemented a proof of concept patch for this. Currently I'm only
tackling the binary upgrade failure and not general pg_dumpall.

Alternatively, we could allow SET TABLESPACE in the current database, which
seems less ugly to me. A code comment says "Obviously can't move the tables
of my own database", but it's not obvious to me why. If I'm the only
connected backend, it seems that any caches and open files could be
invalidated. But I don't know how big of an undertaking that would be.



While working on this, I also noticed other things that pg_dumpall (and
this pg_upgrade) misses:
* Nothing at all is dumped for the template0 database, although ACLs,
settings and the tablespace can be changed by the user
* template1 & postgres databases retain ACLs and settings, but not
attributes like TABLESPACE or CONNECTION LIMIT. Other attributes like
LC_COLLATE and LC_CTYPE can't be changed without recreating the DB, but
those don't matter for the pg_upgrade case anyway.

It seems those would be good material for another patch?

Regards,
Marti Raudsepp


0001-Fix-pg_upgrade-when-postgres-template1-aren-t-in-def.patch
Description: binary/octet-stream



Re: [HACKERS] pg_receivexlog --create-slot-if-not-exists

2015-06-19 Thread Cédric Villemain


Le 19/06/2015 17:21, Andres Freund a écrit :
> On 2015-06-19 11:19:23 -0400, Magnus Hagander wrote:
>> On Fri, Jun 19, 2015 at 11:17 AM, Stephen Frost  wrote:
>>
>>> * Andres Freund (and...@anarazel.de) wrote:
 To make slot usage in pg_receivexlog easier, should we add
 --create-slot-if-not-exists? That'd mean you could run the same command
 the first and later invocation.
>>>
>>> Yes, please.
>>
>> +1.
> 
> Arguably we could do that for 9.5 - the whole slot stuff for receivexlog
> is new, and this is just adjusting the API slightly. I won't fight much
> for that opinion though.

+1 for it in 9.5

-- 
Cédric Villemain +33 (0)6 20 30 22 52
http://2ndQuadrant.fr/
PostgreSQL: Support 24x7 - Développement, Expertise et Formation


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Auto-vacuum is not running in 9.1.12

2015-06-19 Thread Alvaro Herrera
Tom Lane wrote:
> Alvaro Herrera  writes:
> > Yeah, the case is pretty weird and I'm not really sure that the server
> > ought to be expected to behave.  But if this is actually the only part
> > of the server that misbehaves because of sudden gigantic time jumps, I
> > think it's fair to patch it.  Here's a proposed patch.
> 
> I think the parenthetical bit in the comment should read "plus any
> fractional second, for simplicity".  Also, maybe replace "300" by
> a #define constant MAX_AUTOVAC_SLEEPTIME for readability?

Okay, done.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




[HACKERS] backend depends on libpgport but Make doesn't know

2015-06-19 Thread Alvaro Herrera
I think we're missing a dependency declaration from the backend to
libpgport in the makefiles somewhere.  I just git updated on 9.1 on a
tree that was already built, ran make, and got this:

../../src/port/libpgport_srv.a(dirmod_srv.o): In function `rmtree':
/pgsql/source/REL9_1_STABLE/src/port/dirmod.c:546: undefined reference to `snprintf_throw_on_fail'
../../src/port/libpgport_srv.a(exec_srv.o): In function `pclose_check':
/pgsql/source/REL9_1_STABLE/src/port/exec.c:535: undefined reference to `snprintf_throw_on_fail'
../../src/port/libpgport_srv.a(exec_srv.o): In function `find_other_exec':
/pgsql/source/REL9_1_STABLE/src/port/exec.c:319: undefined reference to `snprintf_throw_on_fail'
/pgsql/source/REL9_1_STABLE/src/port/exec.c:325: undefined reference to `snprintf_throw_on_fail'
../../src/port/libpgport_srv.a(exec_srv.o): In function `pipe_read_line':
/pgsql/source/REL9_1_STABLE/src/port/exec.c:366: undefined reference to `fprintf_throw_on_fail'
../../src/port/libpgport_srv.a(exec_srv.o): In function `set_pglocale_pgservice':
/pgsql/source/REL9_1_STABLE/src/port/exec.c:609: undefined reference to `snprintf_throw_on_fail'
/pgsql/source/REL9_1_STABLE/src/port/exec.c:598: undefined reference to `snprintf_throw_on_fail'
collect2: error: ld returned 1 exit status
make[2]: *** [postgres] Error 1

The solution is obvious, just "make clean" on src/port -- or
alternatively as Tom says, just clean the build every time before git
pull'ing.  But since most cases with --enable-depend actually work fine
without either of those things, it's annoying that this small part
doesn't.

Does anybody have a clue how to fix this, so that libpgport is rebuilt
when its files have changed?  I guess we would have to make the backend
depend on each individual file in src/port ...?

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




[HACKERS] outstanding multixact bugs

2015-06-19 Thread Robert Haas
During the Developer Unconference at PGCon, we made this wiki page to
track the MultiXact bugs that are outstanding:

https://wiki.postgresql.org/wiki/MultiXact_Bugs

Let's update this page as we find or fix things, and let's release
when we've fixed enough stuff.

Even if you are not involved in the fixing, you might want to read
over the list of issues.

Thanks,

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] The real reason why TAP testing isn't ready for prime time

2015-06-19 Thread Tom Lane
Andres Freund  writes:
> On 2015-06-19 11:16:18 -0400, Robert Haas wrote:
>> On Fri, Jun 19, 2015 at 11:07 AM, Tom Lane  wrote:
>>> I wonder whether it's such a good idea for the postmaster to give
>>> up waiting before all children are gone (postmaster.c:1722 in HEAD).

>> I doubt it.

> Seconded. It's pretty bad to possibly not be able to start again after
> stopping it. I don't see the benefit in not waiting - sure, the
> postmaster exits faster, but postgres hasn't shut down at that point...

Yeah.  I'll see about fixing this.  Hard to be sure if it will fix
hamster's issue though.

regards, tom lane




Re: [HACKERS] Inheritance planner CPU and memory usage change since 9.3.2

2015-06-19 Thread Tom Lane
Petr Jelinek  writes:
> On 2015-06-19 01:04, Petr Jelinek wrote:
>> On 2015-06-19 00:38, Petr Jelinek wrote:
>>> On 2015-06-18 22:04, Tom Lane wrote:
 By the by, the tablesample additions to range_table_mutator are
 obviously broken.

> Apparently it's not a good idea to do this at 1AM after a long day :/

> The previous diff included some garbage in tests from my local 
> experimentations.

Pushed with minor adjustments.

regards, tom lane




Re: [HACKERS] checkpointer continuous flushing

2015-06-19 Thread Andres Freund
Hi,

On 2015-06-17 08:24:38 +0200, Fabien COELHO wrote:
> Here is version 3, including many performance tests with various settings,
> representing about 100 hours of pgbench run. This patch aims at improving
> checkpoint I/O behavior so that tps throughput is improved, late
> transactions are less frequent, and overall performance is more stable.

First off: This is pretty impressive stuff. Being at pgcon, I don't have
time to look into this in detail, but I do plan to comment more
extensively.

> >- Move fsync as early as possible, suggested by Andres Freund?
> >
> >My opinion is that this should be left out for the nonce.

"for the nonce" - what does that mean?

> I did that.

I'm doubtful that it's a good idea to separate this out, if you did.

>  - as version 2: checkpoint buffer sorting based on a 2007 patch by
>Takahiro Itagaki but with a smaller and static buffer allocated once.
>Also, sorting is done by chunks of 131072 pages in the current version,
>with a guc to change this value.

I think it's a really bad idea to do this in chunks. That'll mean we'll
frequently uselessly cause repetitive random IO, often interleaved. That
pattern is horrible for SSDs too. We should always try to do this at
once, and only fall back to using less memory if we couldn't allocate
everything.
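
To illustrate that fallback (a sketch only; the item type and the halving
policy are made up):

CkptSortItem *items = NULL;		/* hypothetical per-buffer sort entry */
int			nitems = NBuffers;

/* try to allocate the whole array at once; degrade under pressure */
while (items == NULL && nitems > 0)
{
	items = malloc((size_t) nitems * sizeof(CkptSortItem));
	if (items == NULL)
		nitems /= 2;			/* retry with half as many entries */
}

That way the common case sorts everything in one pass, and only a
genuinely memory-starved system degrades toward chunked behaviour.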

> * PERFORMANCE TESTS
> 
> Impacts on "pgbench -M prepared -N -P 1 ..." (simple update test, mostly
> random write activity on one table), checkpoint_completion_target=0.8, with
> different settings on a 16GB 8-core host:
> 
>  . tiny: scale=10 shared_buffers=1GB checkpoint_timeout=30s time=6400s
>  . small: scale=120 shared_buffers=2GB checkpoint_timeout=300s time=4000s
>  . medium: scale=250 shared_buffers=4GB checkpoint_timeout=15min time=4000s
>  . large: scale=1000 shared_buffers=4GB checkpoint_timeout=40min time=7500s

It'd be interesting to see numbers for tiny, without the overly small
checkpoint timeout value. 30s is below the OS's writeback time.

> Note: figures noted with a star (*) had various issues during their run, so
> pgbench progress figures were more or less incorrect, thus the standard
> deviation computation is not to be trusted beyond "pretty bad".
> 
> Caveat: these are only benches on one host at a particular time and
> location, which may or may not be reproducible nor be representative
> as such of any other load.  The good news is that all these tests tell
> the same thing.
> 
> - full-speed 1-client
> 
>  options   | tps performance over per second data
>   flush | sort |tiny|small |   medium |large
> off |  off | 687 +- 231 | 163 +- 280 * | 191 +- 626 * | 37.7 +- 25.6
> off |   on | 699 +- 223 | 457 +- 315   | 479 +- 319   | 48.4 +- 28.8
>  on |  off | 740 +- 125 | 143 +- 387 * | 179 +- 501 * | 37.3 +- 13.3
>  on |   on | 722 +- 119 | 550 +- 140   | 549 +- 180   | 47.2 +- 16.8
> 
> - full speed 4-clients
> 
>   options  | tps performance over per second data
>   flush | sort |tiny | small |medium
> off |  off | 2006 +- 748 | 193 +- 1898 * | 205 +- 2465 *
> off |   on | 2086 +- 673 | 819 +-  905 * | 807 +- 1029 *
>  on |  off | 2212 +- 451 | 169 +- 1269 * | 160 +-  502 *
>  on |   on | 2073 +- 437 | 743 +-  413   | 822 +-  467
> 
> - 100-tps 1-client max 100-ms latency
> 
>  options   | percent of late transactions
>   flush | sort |  tiny | small | medium
> off |  off |  6.31 | 29.44 | 30.74
> off |   on |  6.23 |  8.93 |  7.12
>  on |  off |  0.44 |  7.01 |  8.14
>  on |   on |  0.59 |  0.83 |  1.84
> 
> - 200-tps 1-client max 100-ms latency
> 
>  options   | percent of late transactions
>   flush | sort |  tiny | small | medium
> off |  off | 10.00 | 50.61 | 45.51
> off |   on |  8.82 | 12.75 | 12.89
>  on |  off |  0.59 | 40.48 | 42.64
>  on |   on |  0.53 |  1.76 |  2.59
> 
> - 400-tps 1-client (or 4 for medium) max 100-ms latency
> 
>  options   | percent of late transactions
>   flush | sort | tiny | small | medium
> off |  off | 12.0 | 64.28 | 68.6
> off |   on | 11.3 | 22.05 | 22.6
>  on |  off |  1.1 | 67.93 | 67.9
>  on |   on |  0.6 |  3.24 |  3.1
> 

So you've not run things at more serious concurrency, that'd be
interesting to see.

I'd also like to see concurrent workloads with synchronous_commit=off -
I've seen absolutely horrible latency behaviour for that, and I'm hoping
this will help. It's also a good way to simulate faster hardware than
you have.

It's also curious that sorting is detrimental for full speed 'tiny'.

> * CONCLUSION :
> 
> For most of these HDD tests, when both options are activated the tps
> throughput is improved (+3 to +300%), late transactions are reduced (by 91%
> to 97%) and overall the performance is more stable (tps standard deviation
> is typically halved).
> 
> The option effects are somewhat orthogonal:
> 
>  - latency is essentially limited by flushing, although sorting also
>contributes.
> 
>

Re: [HACKERS] pg_receivexlog --create-slot-if-not-exists

2015-06-19 Thread Andres Freund
On 2015-06-19 11:19:23 -0400, Magnus Hagander wrote:
> On Fri, Jun 19, 2015 at 11:17 AM, Stephen Frost  wrote:
>
> > * Andres Freund (and...@anarazel.de) wrote:
> > > To make slot usage in pg_receivexlog easier, should we add
> > > --create-slot-if-not-exists? That'd mean you could run the same command
> > > the first and later invocation.
> >
> > Yes, please.
>
> +1.

Arguably we could do that for 9.5 - the whole slot stuff for receivexlog
is new, and this is just adjusting the API slightly. I won't fight much
for that opinion though.




Re: [HACKERS] pg_receivexlog --create-slot-if-not-exists

2015-06-19 Thread Cédric Villemain
> To make slot usage in pg_receivexlog easier, should we add
> --create-slot-if-not-exists? That'd mean you could run the same command
> the first and later invocation.

+1 (with a shorter name please, if you can find one... )

-- 
Cédric Villemain +33 (0)6 20 30 22 52
http://2ndQuadrant.fr/
PostgreSQL: Support 24x7 - Development, Expertise and Training




Re: [HACKERS] The real reason why TAP testing isn't ready for prime time

2015-06-19 Thread Andres Freund
On 2015-06-19 11:16:18 -0400, Robert Haas wrote:
> On Fri, Jun 19, 2015 at 11:07 AM, Tom Lane  wrote:
> > I wonder whether it's such a good idea for the postmaster to give
> > up waiting before all children are gone (postmaster.c:1722 in HEAD).
> 
> I doubt it.

Seconded. It's pretty bad to possibly not be able to start again after
stopping it. I don't see the benefit in not waiting - sure, the
postmaster exits faster, but postgres hasn't shut down at that point...




Re: [HACKERS] pg_receivexlog --create-slot-if-not-exists

2015-06-19 Thread Magnus Hagander
On Fri, Jun 19, 2015 at 11:17 AM, Stephen Frost  wrote:

> * Andres Freund (and...@anarazel.de) wrote:
> > To make slot usage in pg_receivexlog easier, should we add
> > --create-slot-if-not-exists? That'd mean you could run the same command
> > the first and later invocation.
>
> Yes, please.
>


+1.

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: [HACKERS] pg_receivexlog --create-slot-if-not-exists

2015-06-19 Thread Stephen Frost
* Andres Freund (and...@anarazel.de) wrote:
> To make slot usage in pg_receivexlog easier, should we add
> --create-slot-if-not-exists? That'd mean you could run the same command
> the first and later invocation.

Yes, please.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] The real reason why TAP testing isn't ready for prime time

2015-06-19 Thread Robert Haas
On Fri, Jun 19, 2015 at 11:07 AM, Tom Lane  wrote:
> I wonder whether it's such a good idea for the postmaster to give
> up waiting before all children are gone (postmaster.c:1722 in HEAD).

I doubt it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Missing tab-complete for PASSWORD word in CREATE ROLE syntax

2015-06-19 Thread Robert Haas
On Fri, Jun 19, 2015 at 5:52 AM, Jeevan Chalke
 wrote:
> I have observed that we are not tab-completing the word PASSWORD in the
> following syntaxes:
>
> 1.
> CREATE|ALTER ROLE|USER rolname
>
> 2.
> CREATE|ALTER ROLE|USER rolname WITH
>
> PASSWORD is used many times and should be in the tab-complete list.
>
> Was there any reason we have deliberately kept this out?
> If yes, please ignore my patch.
> Attached is a patch to add those missing tab-completions.

Thanks, committed.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] The real reason why TAP testing isn't ready for prime time

2015-06-19 Thread Tom Lane
Michael Paquier  writes:
> Now if we look at RewindTest.pm, there is the following code:
> if ($test_master_datadir)
> {
> 	system "pg_ctl -D $test_master_datadir -s -m immediate stop 2> /dev/null";
> }
> if ($test_standby_datadir)
> {
> 	system "pg_ctl -D $test_standby_datadir -s -m immediate stop 2> /dev/null";
> }
> And I think that the problem is triggered because we are missing a -w
> switch here, meaning that we do not wait for confirmation that the
> server has stopped; apparently, if the stop is slow enough, the next
> server cannot start because the port is still taken by the server
> that is currently stopping.

After I woke up a bit more, I remembered that -w is already the default
for "pg_ctl stop", so your diagnosis here is incorrect.

I suspect that the real problem is the arbitrary decision to use -m
immediate.  The postmaster would ordinarily wait for its children to
die, but on a slow machine we could perhaps reach the end of that
5-second timeout, whereupon the postmaster would SIGKILL its children
*and exit immediately*.  I'm not sure how instantaneous SIGKILL is,
but it seems possible that we could end up trying to start the new
postmaster before all the children of the old one are dead.  If the
shmem interlock is working properly that ought to fail.

I wonder whether it's such a good idea for the postmaster to give
up waiting before all children are gone (postmaster.c:1722 in HEAD).
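
The logic in question has roughly this shape (illustrative pseudo-code
only — the helper names are made up, this is not the actual postmaster.c
source):

int			i;

/* wait up to ~5 seconds for children to exit after SIGQUIT */
for (i = 0; i < 5; i++)
{
	if (CountChildren() == 0)	/* hypothetical helper */
		return;					/* every child is gone; safe to exit */
	pg_usleep(1000000L);		/* one second */
}

/* timeout reached: SIGKILL the stragglers and exit without waiting */
SignalChildren(SIGKILL);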

regards, tom lane




Re: [HACKERS] pg_regress not waiting for postmaster to stop

2015-06-19 Thread Tom Lane
Michael Paquier  writes:
> Except if I am missing something, the comment mentions that we should
> wait for the stop but there is no -w switch in the command of pg_ctl.
> Doesn't the patch attached make sense?

Per pg_ctl's help output:

(The default is to wait for shutdown, but not for start or restart.)

So -w is already the default in a stop call.  Perhaps it would be worth
adding it for documentation's sake, but there's no bug here AFAICS.

regards, tom lane




Re: [HACKERS] pg_regress not waiting for postmaster to stop

2015-06-19 Thread Robert Haas
On Fri, Jun 19, 2015 at 2:33 AM, Michael Paquier
 wrote:
> In pg_regress.c, there is the following code:
> static void
> stop_postmaster(void)
> {
> if (postmaster_running)
> {
> /* We use pg_ctl to issue the kill and wait for stop */
> char        buf[MAXPGPATH * 2];
> int r;
>
> /* On Windows, system() seems not to force fflush, so... */
> fflush(stdout);
> fflush(stderr);
>
> snprintf(buf, sizeof(buf),
>  "\"%s%spg_ctl\" stop -D \"%s/data\" -s -m fast",
>  bindir ? bindir : "",
>  bindir ? "/" : "",
>  temp_instance);
>
> Except if I am missing something, the comment mentions that we should
> wait for the stop but there is no -w switch in the command of pg_ctl.
> Doesn't the patch attached make sense?

Do we need to wait, or is failing to wait harmless?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




[HACKERS] pg_receivexlog --create-slot-if-not-exists

2015-06-19 Thread Andres Freund
Hi,

To make slot usage in pg_receivexlog easier, should we add
--create-slot-if-not-exists? That'd mean you could run the same command
the first and later invocation.

Greetings,

Andres Freund




Re: [HACKERS] The real reason why TAP testing isn't ready for prime time

2015-06-19 Thread Tom Lane
Michael Paquier  writes:
> Attached is a patch fixing those problems and improving the log
> facility as it really helped me out with those issues. The simplest
> fix would be to include the -w switch missing in the tests of
> pg_rewind and pg_ctl though.

I agree with adding the -w switch and will go do that.  As far as the
rest of this patch goes, it seems like it could be made less invasive
if the logs got dumped into a subdirectory of tmp_check rather than
adding another top-level directory that has to be cleaned?

regards, tom lane




Re: [HACKERS] Time to get rid of PQnoPasswordSupplied?

2015-06-19 Thread Tom Lane
Robert Haas  writes:
> On Sun, Jun 14, 2015 at 11:34 PM, Craig Ringer  wrote:
>> I frequently see users confused by one of our more common and less
>> clear error messages:
>> 
>> fe_sendauth: no password supplied

> I've never seen this error message, but I'm not opposed to improving
> it in some way.

The reason you don't see it is that psql and other clients take special
action (ie, prompt you for a password) instead of actually showing you
that error string when it's returned by libpq.

The reason the string is considered part of libpq's external API is that
once upon a time, strcmp() to that particular string value was the only
way for clients to realize that prompt-for-a-password was the appropriate
response.  Nowadays you're supposed to use PQconnectionNeedsPassword()
instead.
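
For anyone maintaining such a client, a minimal sketch of the modern
idiom (the conninfo string is illustrative):

#include <stdio.h>
#include <libpq-fe.h>

int
main(void)
{
	PGconn	   *conn = PQconnectdb("dbname=postgres");

	if (PQstatus(conn) == CONNECTION_BAD)
	{
		if (PQconnectionNeedsPassword(conn))
			fprintf(stderr, "password required: prompt the user and retry\n");
		else
			fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));
	}
	PQfinish(conn);
	return 0;
}

This is essentially what psql does before deciding to prompt.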

Apparently, Craig is responsible for some client-side code that has never
heard of either convention and just fails instead of prompting for a
password.  So improving that might be something to put on somebody's
to-do list.

However, the question for us is whether there are any clients that still
use strcmp() to the PQnoPasswordSupplied literal, and so would be broken
if we change it.  My guess is yes, but I'd still be on board with
replacing the string with something more message-style-guide-compliant
(and localizable).  PQnoPasswordSupplied has been marked as deprecated
since 2007.

On the other hand, you could argue that improving the string is going
to break clients that do the right thing (even if klugily) in order
to help clients that are doing the wrong thing (ie, failing without
offering the opportunity to enter a password).  Ideally no client app
would ever show this message to users and so its readability would not
matter.

regards, tom lane




Re: [HACKERS] Time to get rid of PQnoPasswordSupplied?

2015-06-19 Thread Robert Haas
On Sun, Jun 14, 2015 at 11:34 PM, Craig Ringer  wrote:
> I frequently see users confused by one of our more common and less
> clear error messages:
>
> fe_sendauth: no password supplied

I've never seen this error message, but I'm not opposed to improving
it in some way.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] GIN function of pageinspect has inconsistency data type.

2015-06-19 Thread Sawada Masahiko
On Wed, Jun 17, 2015 at 4:11 PM, Jim Nasby  wrote:
> On 6/16/15 8:26 AM, Sawada Masahiko wrote:
>>
>> I noticed while using the GIN functions of pageinspect that there are
>> some inconsistent data types.
>> For example, the data type of GinMetaPageData.head and tail is
>> BlockNumber, i.e., uint32.
>> But in ginfunc.c, we get that value as int64.
>
>
> You can't put a uint32 into a signed int32 (which is what the SQL int is).
> That's why it's returning a bigint.
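
Concretely (an illustrative sketch; the block number is made up):

Datum		value;
BlockNumber blkno = 4000000000U;	/* fits in uint32, would overflow int32 */

/* widening to int64 preserves the full uint32 range losslessly */
value = Int64GetDatum((int64) blkno);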

I understood why it's returning a bigint.
Thanks!

Regards,

--
Sawada Masahiko




Re: [HACKERS] Is it possible to have a "fast-write" Index?

2015-06-19 Thread Merlin Moncure
On Thu, Jun 11, 2015 at 9:32 PM, Qingqing Zhou
 wrote:
> On Fri, Jun 5, 2015 at 10:59 AM, Tom Lane  wrote:
>> So I really doubt that anyone would have any enthusiasm for saddling btree
>> with a similar mechanism.  It's complicated (and has been the cause of
>> multiple bugs); it's hard to figure out when is the optimal time to flush
>> the pending insertions; and it slows down searches in favor of making
>> inserts cheap, which is generally not the way to bet --- if that's the
>> tradeoff you want, why not drop the index altogether?
>
> Hint bits update is also painful in above case, but it is out of the topic 
> here.

Are your records spread out across many transactions, or do you tend to
have large batches all with the same xid?

merlin




Re: [HACKERS] Is it possible to have a "fast-write" Index?

2015-06-19 Thread Simon Riggs
On 19 June 2015 at 14:30, Robert Haas  wrote:

> > So I really doubt that anyone would have any enthusiasm for saddling
> > btree with a similar mechanism.  It's complicated (and has been the cause of
> > multiple bugs); it's hard to figure out when is the optimal time to flush
> > the pending insertions; and it slows down searches in favor of making
> > inserts cheap, which is generally not the way to bet --- if that's the
> > tradeoff you want, why not drop the index altogether?
>
> I'm not sure you're right about that.  MySQL has a feature called
> secondary index buffering:
>
> https://dev.mysql.com/doc/refman/5.0/en/innodb-insert-buffering.html
>
> Now that might not be exactly what we want to do for one reason or
> another, but I think it would be silly to think that they implemented
> that for any reason other than performance, so there may be some
> performance to be gained there.
>
> Consider that on a table with multiple indexes, we've got to insert
> into all of them.  If it turns out that the first leaf page we need
> isn't in shared buffers, we'll wait for it to be read in.  We won't
> start the second index insertion until we've completed the first one,
> and so on.  So the whole thing is serial.  In the system MySQL has
> implemented, the foreground process would proceed unimpeded and any
> indexes whose pages were not in the buffer pool would get updated in
> the background.
>
> Ignoring for the moment the complexities of whether they've got the
> right design and how to implement it, that's sort of cool.
>

Interesting.

Reading that URL it shows that they would need to write WAL to insert into
the buffer and then again to insert into the index. You might get away with
skipping WAL logs on the index buffer if you had a special WAL record to
record the event "all indexes updated for xid ", but since that would
be written lazily it would significantly complicate the lazy update
mechanism to track that.

It doesn't say anything about their being only one index buffer per table,
nor do I think it would make sense to do it that way. So ISTM that the
foreground process still has to insert serially into N index buffers, with
each insert being WAL logged.

So the only saving for the foreground process is the random I/O from
inserting into the indexes, which means the value of the technique is in
the case where you have many very large secondary indexes - which is now
covered by BRIN.

-- 
Simon Riggs                http://www.2ndQuadrant.com/

PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: [HACKERS] Is it possible to have a "fast-write" Index?

2015-06-19 Thread Robert Haas
> So I really doubt that anyone would have any enthusiasm for saddling btree
> with a similar mechanism.  It's complicated (and has been the cause of
> multiple bugs); it's hard to figure out when is the optimal time to flush
> the pending insertions; and it slows down searches in favor of making
> inserts cheap, which is generally not the way to bet --- if that's the
> tradeoff you want, why not drop the index altogether?

I'm not sure you're right about that.  MySQL has a feature called
secondary index buffering:

https://dev.mysql.com/doc/refman/5.0/en/innodb-insert-buffering.html

Now that might not be exactly what we want to do for one reason or
another, but I think it would be silly to think that they implemented
that for any reason other than performance, so there may be some
performance to be gained there.

Consider that on a table with multiple indexes, we've got to insert
into all of them.  If it turns out that the first leaf page we need
isn't in shared buffers, we'll wait for it to be read in.  We won't
start the second index insertion until we've completed the first one,
and so on.  So the whole thing is serial.  In the system MySQL has
implemented, the foreground process would proceed unimpeded and any
indexes whose pages were not in the buffer pool would get updated in
the background.
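
In caricature, the serial pattern looks like this (an illustrative
sketch, not the actual executor code):

int			i;

for (i = 0; i < nindexes; i++)
{
	/* may stall here waiting for this index's leaf page to be read in */
	index_insert(indexRelations[i], values, isnull,
				 &(heapTuple->t_self), heapRelation, checkUnique);
}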

Ignoring for the moment the complexities of whether they've got the
right design and how to implement it, that's sort of cool.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] pg_rewind failure by file deletion in source server

2015-06-19 Thread Robert Haas
On Fri, Jun 19, 2015 at 8:18 AM, Robert Haas  wrote:
> On Fri, Jun 19, 2015 at 12:14 AM, Michael Paquier
>  wrote:
>>> Listing the directories with pg_ls_dir() has the same problem.
>>
>> (After some discussion on IM with Heikki on this one).
>> This is actually more tricky because pg_ls_dir() does not return '.'
>> or '..' that we could use to identify that the directory actually
>> exists or not when it is empty. Hence I think that we should add two
>> options to pg_ls_dir:
>> - include_self, default to false. If set to true, '.' is added in the
>> list of items.
>> - if_not_exists, to bypass error that a folder does not exist, default
>> at false. If if_not_exists = true and include_self = true, returning
>> only '.' would mean that the folder exists but is empty. If
>> if_not_exists = true and include_self = false, no rows are returned.
>> We could also ERROR if both options are set like that. I am
>> fine with any of them as long as the behavior is properly documented.
>
> Including '.' to distinguish between an empty directory and a
> nonexistent one seems like an unnecessarily complicated and
> non-obvious API.  How about just one additional parameter bool
> *exists.  If NULL and no directory, ERROR, else on return set *exists
> to true or false.

Err, wait.  You're talking about an SQL function, heh heh.  So that
won't work.  Maybe what you proposed is the best we can do, then,
although I would suggest that you rename the include_self parameter to
something like include_dot_dirs and return both "." and "..".
Returning only "." seems like it will seem weird to people.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] pg_rewind failure by file deletion in source server

2015-06-19 Thread Robert Haas
On Fri, Jun 19, 2015 at 12:14 AM, Michael Paquier
 wrote:
>> Listing the directories with pg_ls_dir() has the same problem.
>
> (After some discussion on IM with Heikki on this one).
> This is actually more tricky because pg_ls_dir() does not return '.'
> or '..' that we could use to identify that the directory actually
> exists or not when it is empty. Hence I think that we should add two
> options to pg_ls_dir:
> - include_self, default to false. If set to true, '.' is added in the
> list of items.
> - if_not_exists, to bypass error that a folder does not exist, default
> at false. If if_not_exists = true and include_self = true, returning
> only '.' would mean that the folder exists but is empty. If
> if_not_exists = true and include_self = false, no rows are returned.
> We could also ERROR if both options are set like that. I am
> fine with any of them as long as the behavior is properly documented.

Including '.' to distinguish between an empty directory and a
nonexistent one seems like an unnecessarily complicated and
non-obvious API.  How about just one additional parameter bool
*exists.  If NULL and no directory, ERROR, else on return set *exists
to true or false.
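
In C terms, something like this hypothetical helper:

static DIR *
open_dir_maybe(const char *path, bool *exists)
{
	DIR		   *dir = opendir(path);

	if (exists != NULL)
		*exists = (dir != NULL);	/* caller decides what absence means */
	else if (dir == NULL)
		elog(ERROR, "could not open directory \"%s\": %m", path);

	return dir;
}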

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Tab completion for TABLESAMPLE

2015-06-19 Thread Petr Jelinek

On 2015-06-19 09:08, Brendan Jurd wrote:
> On Sun, 14 Jun 2015 at 20:44 Petr Jelinek <p...@2ndquadrant.com> wrote:
>> looks like I omitted psql tab completion from the TABLESAMPLE patch. The
>> attached patch adds it.
>
> Hi Petr,
>
> I'm doing an initial review of this patch.

Thanks

> With the patch applied, psql correctly offers a list of existing
> tablesample methods for completion whenever the previous word is
> 'TABLESAMPLE'.  I did notice that the syntax of the TABLESAMPLE clause
> requires brackets after the method name, but the tab completion does not
> complete the opening bracket.  Whereas, say, ALTER FUNCTION does.  I
> think it would be convenient and user-friendly to complete the opening
> bracket -- it would make it perfectly clear that an argument is required
> for the syntax to be valid.

Agreed, here is an updated version which does that.

--
 Petr Jelinek  http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index b9f5acc..c295ffa 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -737,6 +737,11 @@ static const SchemaQuery Query_for_list_of_matviews = {
 "   FROM pg_catalog.pg_event_trigger "\
 "  WHERE substring(pg_catalog.quote_ident(evtname),1,%d)='%s'"
 
+#define Query_for_list_of_tablesample_methods \
+" SELECT pg_catalog.quote_ident(tsmname) "\
+"   FROM pg_catalog.pg_tablesample_method "\
+"  WHERE substring(pg_catalog.quote_ident(tsmname),1,%d)='%s'"
+
 /*
  * This is a list of all "things" in Pgsql, which can show up after CREATE or
  * DROP; and there is also a query to get a list of them.
@@ -3578,6 +3583,13 @@ psql_completion(const char *text, int start, int end)
 			 prev2_wd[0] == '\0')
 		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_relations, NULL);
 
+/* TABLESAMPLE */
+	else if (pg_strcasecmp(prev_wd, "TABLESAMPLE") == 0)
+		COMPLETE_WITH_QUERY(Query_for_list_of_tablesample_methods);
+
+	else if (pg_strcasecmp(prev2_wd, "TABLESAMPLE") == 0)
+		COMPLETE_WITH_CONST("(");
+
 /* TRUNCATE */
 	else if (pg_strcasecmp(prev_wd, "TRUNCATE") == 0)
 		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_tables, NULL);



Re: [HACKERS] The real reason why TAP testing isn't ready for prime time

2015-06-19 Thread Michael Paquier
On Thu, Jun 18, 2015 at 3:52 PM, Michael Paquier wrote:
> I think that it would be useful as well to improve the buildfarm
> output. Thoughts?

And after running the tests more or less 6~7 times in a row on a PI, I
have been able to trigger the problem and I think that I have found
its origin. First, the error has been triggered by the tests of
pg_rewind:
t/002_databases.pl ...
1..4
Bailout called.  Further testing stopped:  run pg_ctl failed: 256
Bail out!  run pg_ctl failed: 256
FAILED--Further testing stopped: run pg_ctl failed: 256
Makefile:51: recipe for target 'check' failed
make[1]: *** [check] Error 255

And by looking at the logs obtained thanks to the previous patch I
could see the following (log attached for tests 1 and 2):
$ tail -n5 regress_log/regress_log_002_databases
waiting for server to start stopped waiting
pg_ctl: could not start server
Examine the log output.
LOG:  received immediate shutdown request
LOG:  received immediate shutdown request

pg_ctl should be able to start the server and should not fail here.
This is confirmed by the fact that the first test has not stopped the
servers. On a clean run, the immediate shutdown request is received
and done:
waiting for server to shut downLOG:  received immediate shutdown request
LOG:  unexpected EOF on standby connection
 done

But in the case of the failure this does not happen:
LOG:  received immediate shutdown request
LOG:  unexpected EOF on standby connection
LOG:  received immediate shutdown request
See the "done" is not here.

Now if we look at RewindTest.pm, there is the following code:
if ($test_master_datadir)
{
	system "pg_ctl -D $test_master_datadir -s -m immediate stop 2> /dev/null";
}
if ($test_standby_datadir)
{
	system "pg_ctl -D $test_standby_datadir -s -m immediate stop 2> /dev/null";
}
And I think that the problem is triggered because we are missing a -w
switch here, meaning that we do not wait for confirmation that the
server has stopped; apparently, if the stop is slow enough, the next
server cannot start because the port is still taken by the server
that is currently stopping.

Note as well that the last command of pg_ctl stop in
pg_ctl/t/002_status.pl does not use -w, so we have the same problem
there.

Attached is a patch fixing those problems and improving the log
facility as it really helped me out with those issues. The simplest
fix would be to include the -w switch missing in the tests of
pg_rewind and pg_ctl though.

It would be good to get that fixed, then I would be able to re-enable
the TAP tests on hamster. I have run the tests a dozen times again
with this patch, and I could not trigger the failure anymore.
Regards,
-- 
Michael


regress_log_001_basic
Description: Binary data


regress_log_002_databases
Description: Binary data
From 50ea935f3873df7c206e6ff3890a3d2219698098 Mon Sep 17 00:00:00 2001
From: Michael Paquier 
Date: Fri, 19 Jun 2015 14:12:09 +0900
Subject: [PATCH] Improve log capture of TAP tests and fix race conditions

All tests have their logs stored as regress_log/$TEST_NAME, with content
captured from the many commands run during the tests.

Some pg_ctl commands were lacking a -w switch, meaning that in rather
slow environments this could lead to a server being unable to start if
the previously stopped server had not yet acknowledged the stop request.
---
 src/Makefile.global.in |  1 +
 src/bin/pg_basebackup/.gitignore   |  1 +
 src/bin/pg_basebackup/Makefile |  2 +-
 src/bin/pg_basebackup/t/010_pg_basebackup.pl   |  4 +-
 src/bin/pg_controldata/.gitignore  |  1 +
 src/bin/pg_controldata/Makefile|  2 +-
 src/bin/pg_controldata/t/001_pg_controldata.pl |  2 +-
 src/bin/pg_ctl/.gitignore  |  1 +
 src/bin/pg_ctl/Makefile|  2 +-
 src/bin/pg_ctl/t/001_start_stop.pl |  2 +-
 src/bin/pg_ctl/t/002_status.pl |  6 +--
 src/bin/pg_rewind/RewindTest.pm| 68 +-
 src/bin/scripts/.gitignore |  1 +
 src/bin/scripts/Makefile   |  2 +-
 src/test/perl/TestLib.pm   | 48 +-
 src/test/ssl/.gitignore|  1 +
 src/test/ssl/Makefile  |  3 ++
 src/test/ssl/ServerSetup.pm|  6 +--
 18 files changed, 90 insertions(+), 63 deletions(-)

diff --git a/src/Makefile.global.in b/src/Makefile.global.in
index c583b44..4b1de8c 100644
--- a/src/Makefile.global.in
+++ b/src/Makefile.global.in
@@ -336,6 +336,7 @@ cd $(srcdir) && TESTDIR='$(CURDIR)' PATH="$(bindir):$$PATH" PGPORT='6$(DEF_PGPOR
 endef
 
 define prove_check
+rm -rf $(srcdir)/regress_log
 cd $(srcdir) && TESTDIR='$(CURDIR)' $(with_temp_install) PGPORT='6$(DEF_PGPORT)' top_builddir='$(CURDIR)/$(top_builddir)' $(PROVE) $(PG_PROVE_

[HACKERS] Missing tab-complete for PASSWORD word in CREATE ROLE syntax

2015-06-19 Thread Jeevan Chalke
Hi,

I have observed that we are not tab-completing the word PASSWORD in the
following syntaxes:

1.
CREATE|ALTER ROLE|USER rolname

2.
CREATE|ALTER ROLE|USER rolname WITH

PASSWORD is used many times and should be in the tab-complete list.

Was there any reason we have deliberately kept this out?
If yes, please ignore my patch.
Attached is a patch to add those missing tab-completions.

Thanks

-- 
Jeevan B Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company


tab_complete_password_in_create_alter_role_v1.patch
Description: application/download



Re: [HACKERS] [Proposal] Progress bar for pg_dump/pg_restore

2015-06-19 Thread Taiki Kondo
Hi, andres

Thank you for your comment, and sorry for the late response.

> The question is how to actually get useful estimates. As there's no
> progress report for indvidiual COPY and CREATE INDEX commands you'll, in
> many cases, have very irregular progress updates. In many many cases
> most of the time is spent on a very small subset of the total objects.

When dumping, I think the number of tuples can be obtained from
pg_class.reltuples, so I want pg_dump to run "SELECT reltuples" to get it;
pg_dump can then estimate the remaining time for the "COPY TO" command as
the tuples are fetched.
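
Something like this sketch is what I have in mind (function and variable
names are illustrative only):

#include <time.h>

/* assume roughly constant throughput: remaining work / observed rate */
static double
estimate_seconds_left(double tuples_done, double tuples_total,
					  time_t started, time_t now)
{
	double		elapsed = difftime(now, started);

	if (tuples_done <= 0.0 || tuples_total <= tuples_done)
		return 0.0;
	return elapsed * (tuples_total - tuples_done) / tuples_done;
}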

For restoring, I think it's better to record the above information (the
number of tuples) in the pg_dump file, to estimate the time needed to
restore the tables.

I also understood your concern about "CREATE INDEX", but we have no way
to get progress information for "CREATE INDEX".
At present, I think it may be good to reuse the estimated time of the
corresponding "COPY FROM" as a stand-in.
It would be better, though, to get this information from pg_stat_activity,
as proposed by Anzai-san in the following thread:

http://www.postgresql.org/message-id/116262cf971c844fb6e793f8809b51c6ea6...@bpxm02gp.gisp.nec.co.jp

What do you think?


regards,
--
Taiki Kondo


-Original Message-
From: pgsql-hackers-ow...@postgresql.org 
[mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Andres Freund
Sent: Friday, June 12, 2015 10:48 PM
To: Taiki Kondo
Cc: pgsql-hackers@postgresql.org; Akio Iwaasa
Subject: Re: [HACKERS] [Proposal] Progress bar for pg_dump/pg_restore

Hi,

On 2015-06-12 12:45:50 +, Taiki Kondo wrote:
> Design & API
> 
> When pg_dump / pg_restore is running, progress bar and estimated time to 
> finish is shown on screen like following. 
> 
> 
> =>   (50%)  15:50
> 
> The bar ("=>" in above) and percentage value ("50%" in above) show percentage 
> of progress, and the time ("15:50" in above) shows estimated time to finish.
> (This percentage is the ratio for the whole processing.)
> 
> Percentage and time are calculated and shown for every 1 second.
> 
> In pg_dump, the information, which is required for calculating percentage and 
> time, is from pg_class.
> 
> In pg_restore, to calculate the same things, I want to record total amount of 
> command lines into pg_dump file, thus I would like to add a new element to 
> "Archive" structure.
> (This means that version number of archive format is changed.)

The question is how to actually get useful estimates. As there's no progress 
report for indvidiual COPY and CREATE INDEX commands you'll, in many cases, 
have very irregular progress updates. In many many cases most of the time is 
spent on a very small subset of the total objects.

Greetings,

Andres Freund






Re: [HACKERS] Tab completion for TABLESAMPLE

2015-06-19 Thread Brendan Jurd
On Sun, 14 Jun 2015 at 20:44 Petr Jelinek  wrote:

> looks like I omitted psql tab completion from the TABLESAMPLE patch. The
> attached patch adds it.
>

Hi Petr,

I'm doing an initial review of this patch.

It applies and compiles cleanly.  Code style is consistent with its
surroundings.

With the patch applied, psql correctly offers a list of existing
tablesample methods for completion whenever the previous word is
'TABLESAMPLE'.  I did notice that the syntax of the TABLESAMPLE clause
requires brackets after the method name, but the tab completion does not
complete the opening bracket.  Whereas, say, ALTER FUNCTION does.  I think
it would be convenient and user-friendly to complete the opening bracket --
it would make it perfectly clear that an argument is required for the
syntax to be valid.

Otherwise, no issues with the patch.

Cheers,
BJ