Re: [HACKERS] multivariate statistics v14

2016-03-12 Thread Jeff Janes
On Wed, Mar 9, 2016 at 9:21 AM, Tomas Vondra
 wrote:
> Hi,
>
> On Wed, 2016-03-09 at 08:45 -0800, Jeff Janes wrote:
>> On Wed, Mar 9, 2016 at 7:02 AM, Tomas Vondra
>>  wrote:
>> > Hi,
>> >
>> > thanks for the feedback. Attached is v14 of the patch series, fixing
>> > most of the points you've raised.
>>
>>
>> Hi Tomas,
>>
>> Applied to aa09cd242fa7e3a694a31f, I still get the seg faults in make
>> check if I configure without --enable-cassert.
>
> Ah, after disabling asserts I can reproduce it too. And the reason why
> it fails is quite simple - clauselist_selectivity modifies the original
> list of clauses, which then confuses cost_qual_eval.
>
> Could you check whether the attached patch fixes the issue? I'll need to
> rework a bit more of the code, but let's see if it fixes the issue on
> your machine too.

That patch on top of v14 did fix the original problem.  But I got
another segfault:

jjanes=# create table foo as select x, floor(x/(1000/500))::int as
y  from generate_series(1,1000) f(x);
jjanes=# create index on foo (x,y);
jjanes=# create index on foo (y,x);
jjanes=# create statistics jjj on foo (x,y) with (dependencies,histogram);
jjanes=# analyze ;
server closed the connection unexpectedly

#0  multi_sort_add_dimension (mss=mss@entry=0x7f45dafc7c88,
sortdim=sortdim@entry=0, dim=dim@entry=0,
vacattrstats=vacattrstats@entry=0x16f0dd0) at common.c:436
#1  0x007d022a in update_bucket_ndistinct (attrs=0x166fdf8,
stats=0x16f0dd0, bucket=) at histogram.c:1384
#2  0x007d09aa in create_initial_mv_bucket (stats=0x16f0dd0,
attrs=0x166fdf8, rows=0x17cda20, numrows=3) at histogram.c:880
#3  build_mv_histogram (numrows=3, rows=rows@entry=0x170ecf0,
attrs=attrs@entry=0x166fdf8, stats=stats@entry=0x16f0dd0,
numrows_total=numrows_total@entry=3)
at histogram.c:156
#4  0x007ced19 in build_mv_stats
(onerel=onerel@entry=0x7f45e797d040, totalrows=985,
numrows=numrows@entry=3, rows=rows@entry=0x170ecf0,
natts=natts@entry=2,
vacattrstats=vacattrstats@entry=0x166efa0) at common.c:106
#5  0x0055ff6b in do_analyze_rel
(onerel=onerel@entry=0x7f45e797d040, options=options@entry=2,
va_cols=va_cols@entry=0x0, acquirefunc=,
relpages=44248,
inh=inh@entry=0 '\000', in_outer_xact=in_outer_xact@entry=0
'\000', elevel=elevel@entry=13, params=0x7ffcbe382a30) at
analyze.c:585
#6  0x00560ced in analyze_rel (relid=relid@entry=16441,
relation=relation@entry=0x16bc9d0, options=options@entry=2,
params=params@entry=0x7ffcbe382a30,
va_cols=va_cols@entry=0x0, in_outer_xact=,
bstrategy=0x16640f0) at analyze.c:262
#7  0x005b70fd in vacuum (options=2, relation=0x16bc9d0,
relid=relid@entry=0, params=params@entry=0x7ffcbe382a30, va_cols=0x0,
bstrategy=,
bstrategy@entry=0x0, isTopLevel=isTopLevel@entry=1 '\001') at vacuum.c:313
#8  0x005b748e in ExecVacuum (vacstmt=vacstmt@entry=0x16bca20,
isTopLevel=isTopLevel@entry=1 '\001') at vacuum.c:121
#9  0x006c90f3 in standard_ProcessUtility
(parsetree=0x16bca20, queryString=0x16bbfc0 "analyze foo ;",
context=, params=0x0, dest=0x16bcd60,
completionTag=0x7ffcbe382fa0 "") at utility.c:654
#10 0x7f45e413b1d1 in pgss_ProcessUtility (parsetree=0x16bca20,
queryString=0x16bbfc0 "analyze foo ;",
context=PROCESS_UTILITY_TOPLEVEL, params=0x0, dest=0x16bcd60,
completionTag=0x7ffcbe382fa0 "") at pg_stat_statements.c:986
#11 0x006c6841 in PortalRunUtility (portal=0x16f7700,
utilityStmt=0x16bca20, isTopLevel=, dest=0x16bcd60,
completionTag=0x7ffcbe382fa0 "") at pquery.c:1175
#12 0x006c73c5 in PortalRunMulti
(portal=portal@entry=0x16f7700, isTopLevel=isTopLevel@entry=1 '\001',
dest=dest@entry=0x16bcd60, altdest=altdest@entry=0x16bcd60,
completionTag=completionTag@entry=0x7ffcbe382fa0 "") at pquery.c:1306
#13 0x006c7dd9 in PortalRun (portal=portal@entry=0x16f7700,
count=count@entry=9223372036854775807, isTopLevel=isTopLevel@entry=1
'\001', dest=dest@entry=0x16bcd60,
altdest=altdest@entry=0x16bcd60,
completionTag=completionTag@entry=0x7ffcbe382fa0 "") at pquery.c:813
#14 0x006c5c98 in exec_simple_query (query_string=0x16bbfc0
"analyze foo ;") at postgres.c:1094
#15 PostgresMain (argc=, argv=argv@entry=0x164baf8,
dbname=0x164b9a8 "jjanes", username=) at
postgres.c:4021
#16 0x0047cb1e in BackendRun (port=0x1669d40) at postmaster.c:4258
#17 BackendStartup (port=0x1669d40) at postmaster.c:3932
#18 ServerLoop () at postmaster.c:1690
#19 0x0066ff27 in PostmasterMain (argc=argc@entry=1,
argv=argv@entry=0x164aa10) at postmaster.c:1298
#20 0x0047d35e in main (argc=1, argv=0x164aa10) at main.c:228

Cheers,

Jeff


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-03-12 Thread Pavel Stehule
2016-03-11 14:49 GMT+01:00 Robert Haas :

> On Thu, Feb 18, 2016 at 9:23 AM, Daniel Verite 
> wrote:
> > Dean Rasheed wrote:
> >
> >> If I want to sort the rows coming out of a query, my first thought
> >> is always going to be to add/adjust the query's ORDER BY clause, not
> >> use some weird +/- psql syntax.
> >
> > About the vertical sort, I agree on all your points.
> > It's best to rely on ORDER BY for all the reasons mentioned,
> > as opposed to a separate sort in a second step.
> >
> > But you're considering the case when a user is designing
> > or adapting a query for the purpose of crosstab
> > viewing. As mentioned in my previous reply (about the
> > methods to achieve horizontal sort), that scenario is not really
> > what motivates the feature in the first place.
> >
> > If removing that sort option is required to move forward
> > with the patch because it's controversial, so be it,
> > but overall I don't see this as a benefit for the end user,
> > it's just an option.
>
> Discussion on this patch seems to have died off.  I'm probably not
> going to win any popularity contests for saying this, but I think we
> should reject this patch.  I don't feel like this is really a psql
> feature: it's a powerful data visualization tool which we're proposing
> to jam into psql.  I don't think that's psql's purpose.  I also think
> it's quite possible that there could be an unbounded number of
> slightly different things that people want here, and if we take this
> one and a couple more, the code for these individual features could
> come to be larger than all of psql, even though probably 95% of psql
> users would never use any of those.
>

crosstabview really is a visualization tool. **But right now there is no
other such tool available from the terminal.** So it can be a significant
help to everyone who wants to use this functionality.

psql has a lot of features that serve maybe 5% of users. It is famous not
as "bloated software" but as the most comfortable SQL console in the
world. The implementation of crosstabview is not complex, and with
Daniel's latest modification it is simpler still.

crosstabview is not 100% equivalent to the ANSI SQL PIVOT clause. The
ANSI SQL command is much more rigid (it is a one-stage statement with
predefined columns), so the argument that this duplicates an existing
feature does not hold. And we would probably not implement a non-ANSI-SQL
feature on the server side.

Regards

Pavel


>
> Now, that having been said, if other people want this feature to go in
> and are willing to do the work to get it in, I've said my piece and
> won't complain further.  There are a couple of committers who have
> taken positive interest in this thread, so that's good.  However,
> there are also a couple of committers who have expressed doubts
> similar to mine, so that's not so good.  But worse than either of
> those things, there is no real agreement on what the overall design of
> this feature should be.  Everybody wants something a little different,
> for different reasons.  If we can't come to an agreement, more or less
> immediately, on what to try to get into 9.6, then this can't go into
> this release.  Whether it should go into a future release is a
> question we can leave for another time.
>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>


Re: [HACKERS] building on windows using VC 2008

2016-03-12 Thread Michael Paquier
On Sun, Mar 13, 2016 at 2:07 AM, Dave Cramer  wrote:
> Getting lots of POSTGRESQL_TRACE_... undefined.
>
> Any hints ?

Those are coming from probes.h, which is automatically generated in a
VC build by running psed. My guess is that you are using ActivePerl
5.22, which no longer includes psed in its distribution by default. You
will need to partially patch the MSVC scripts in src/tools/msvc to fix
that; one way is the 0001 patch that I sent here:
http://www.postgresql.org/message-id/CAB7nPqT5oGZEZawKZVjxuT=c5qek_eufc1j7knpcbv5rctr...@mail.gmail.com
Another, if you have sed in your PATH, is to replace psed with sed.
-- 
Michael




Re: [HACKERS] Is there a way around function search_path killing SQL function inlining? - and backup / restore issue

2016-03-12 Thread Regina Obe

>> On 3/10/16 3:29 PM, Regina Obe wrote:
>> Take for example: I have tiger geocoder, which relies on fuzzystrmatch. I
>> have no idea where someone installs fuzzystrmatch, so I can't schema-qualify
>> those calls. I use those dependent functions to build indexes on
>> tables.

> This is something I've thought about as well, and I think the real problem is 
> search_path just isn't the right way to handle this. I think there needs to 
> be some way to definitively reference something that's part of an extension; 
> a method 
> that doesn't depend on whatever schema the extension happens to be installed 
> in.
--
> Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
> Experts in Analytics, Data Architecture and PostgreSQL
> Data in Trouble? Get it in Treble! http://BlueTreble.com

I like that idea a lot, though it sounds like something that requires much
more work. In the long run it would be good, especially since I expect
more and more extensions to rely on each other.

I have similar concerns with pgRouting, whose dev team I am also a member
of. pgRouting can't schema-qualify any of its PostGIS calls because it
has no idea where PostGIS is installed, and the extension model as it
stands has no provision for referencing a dependent extension's location.
That hasn't been a major issue yet, since pgRouting doesn't build
functions that wrap PostGIS for indexing and the like. It is, however,
more of a future concern, and a present concern for people who build
materialized views using pgRouting functions, since all of those use
PostGIS heavily.

Even if we do that, there is still the case of people building their own
functions on top of other things. I guess that one is less of a concern,
since they would generally know where their dependent functions are
installed and can schema-qualify.

Thanks,
Regina






Re: [HACKERS] Explain [Analyze] produces parallel scan for select Into table statements.

2016-03-12 Thread Amit Kapila
On Sat, Mar 12, 2016 at 7:11 PM, Mithun Cy 
wrote:
>
>
>
> On Sat, Mar 12, 2016 at 2:32 PM, Amit Kapila 
wrote:
> > With force_parallel_mode=on, I could see many other failures as well.
> > I think it is better to have a test which tests this functionality
> > with force_parallel_mode=regress.
>
> As per the user manual: setting this value to regress has all of the
> same effects as setting it to on, plus some additional effects that are
> intended to facilitate automated regression testing.
>

Yes, that is the only reason I mentioned that it is better to have a test
which can be checked in an automated way. I understand that the test you
have written using EXPLAIN won't work in an automated way, so I'm not
sure it is a good idea to add such a test.


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] amcheck (B-Tree integrity checking tool)

2016-03-12 Thread Matt Kelly
>
> On Fri, 2016-03-11 at 17:24 +0100, Michael Paquier wrote:
> > On Fri, Mar 11, 2016 at 5:19 PM, Anastasia Lubennikova wrote:
> > >
> > > BTW, if you know a good way to corrupt index (and do it
> > > reproducible) I'd be
> > > very glad to see it.
> > You can use for example dd in non-truncate mode to corrupt on-disk
> > page data, say that for example:
> > dd if=/dev/random bs=8192 count=1 \
> > seek=$BLOCK_ID of=base/$DBOID/$RELFILENODE \
> > conv=notrunc
> I guess /dev/random is not very compatible with the "reproducible"
> requirement. I mean, it will reproducibly break the page, but pretty
> much completely, which is mostly what checksums are for.


You can actually produce a test case pretty easily by setting up streaming
replication between servers running two different versions of glibc.

I actually wrote a tool that spins up a pair of VMs using Vagrant and then
sets them up as streaming replicas using Ansible. It provides a nice
one-liner to get a streaming-replica test environment going, and it will
easily provide the cross-glibc test case. Technically, though, it belongs
to Trip because I wrote it on company time. Let me see if I can open
source a version of it later this week so you can use it for testing.

- Matt K.


Re: [HACKERS] MinGW versus _strtoui64() ?

2016-03-12 Thread Tom Lane
Petr Jelinek  writes:
> On 13/03/16 04:24, Tom Lane wrote:
>> So you're thinking "#ifdef _MSC_VER"?  Or something else?

> Sorry for brevity; yes, that should work. AFAICS MinGW has strtoull, so
> the code should compile fine with that ifdef. Also, I just checked to
> make sure: _strtoui64 was added in VS 7.0 (the version before VS 2003),
> so it should work well enough compatibility-wise.

Pushed that way.  Thanks for the help!

regards, tom lane




Re: [HACKERS] MinGW versus _strtoui64() ?

2016-03-12 Thread Petr Jelinek

On 13/03/16 04:24, Tom Lane wrote:

Petr Jelinek  writes:

On 13/03/16 03:30, Tom Lane wrote:

Per a comment from Petr Jelinek, I added this in commit 23a27b039d94ba35:

#ifdef WIN32
return _strtoui64(str, endptr, base);
#else ...

Several of the Windows buildfarm members are good with that, but
narwhal is not:



Looks like it has to be limited to MSVC not WIN32, mingw does not have it.


So you're thinking "#ifdef _MSC_VER"?  Or something else?



Sorry for brevity; yes, that should work. AFAICS MinGW has strtoull, so
the code should compile fine with that ifdef. Also, I just checked to
make sure: _strtoui64 was added in VS 7.0 (the version before VS 2003),
so it should work well enough compatibility-wise.


--
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] MinGW versus _strtoui64() ?

2016-03-12 Thread Tom Lane
Petr Jelinek  writes:
> On 13/03/16 03:30, Tom Lane wrote:
>> Per a comment from Petr Jelinek, I added this in commit 23a27b039d94ba35:
>> 
>> #ifdef WIN32
>> return _strtoui64(str, endptr, base);
>> #else ...
>> 
>> Several of the Windows buildfarm members are good with that, but
>> narwhal is not:

> Looks like it has to be limited to MSVC not WIN32, mingw does not have it.

So you're thinking "#ifdef _MSC_VER"?  Or something else?

regards, tom lane




Re: [HACKERS] MinGW versus _strtoui64() ?

2016-03-12 Thread Petr Jelinek

On 13/03/16 03:30, Tom Lane wrote:

Per a comment from Petr Jelinek, I added this in commit 23a27b039d94ba35:

#ifdef WIN32
return _strtoui64(str, endptr, base);
#else ...

Several of the Windows buildfarm members are good with that, but
narwhal is not:

numutils.c: In function `pg_strtouint64':
numutils.c:406: warning: implicit declaration of function `_strtoui64'
...
Creating library file: libpostgres.a
utils/adt/numutils.o(.text+0x67b): In function `pg_strtouint64':
C:/msys/1.0/local/pgbuildfarm/buildroot/HEAD/pgsql.build/src/backend/utils/adt/numutils.c:406:
 undefined reference to `_strtoui64'
collect2: ld returned 1 exit status



Looks like it has to be limited to MSVC not WIN32, mingw does not have it.


--
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] WIP: Upper planner pathification

2016-03-12 Thread Tom Lane
Andres Freund  writes:
> On 2016-03-12 12:22:01 -0500, Tom Lane wrote:
>> I wonder whether that's pathification per se.

> If you're interested enough, I've uploaded a dump of the schema relevant
> table to http://anarazel.de/t/lineitem_95_96_plan.dump.gz

I haven't dug into it, but I'll bet this is a case of add_path deciding
that the GroupAgg plan is fuzzily the same cost and better sorted (ie,
it produces *some* sort order, versus none for the hash), so it kicks
the hash plan out.  Again, that would not have happened with the old
hard-wired cost comparisons in grouping_planner, because they considered
no factors other than an exact cost comparison.

> I've not yet looked deep enough to determine the root cause; I did
> however notice that set enable_sort = false; yields a cheaper plan than
> the default one, within the fuzz range (137.91..137.93 vs 138.43..139.02).

Yeah, you're just forcing it to choose the hash plan again.  But that's
within the cost fuzz range, so it's a legitimate choice.

regards, tom lane




[HACKERS] MinGW versus _strtoui64() ?

2016-03-12 Thread Tom Lane
Per a comment from Petr Jelinek, I added this in commit 23a27b039d94ba35:

#ifdef WIN32
   return _strtoui64(str, endptr, base);
#else ...

Several of the Windows buildfarm members are good with that, but
narwhal is not:

numutils.c: In function `pg_strtouint64':
numutils.c:406: warning: implicit declaration of function `_strtoui64'
...
Creating library file: libpostgres.a
utils/adt/numutils.o(.text+0x67b): In function `pg_strtouint64':
C:/msys/1.0/local/pgbuildfarm/buildroot/HEAD/pgsql.build/src/backend/utils/adt/numutils.c:406:
 undefined reference to `_strtoui64'
collect2: ld returned 1 exit status

Any thoughts how to fix that?

regards, tom lane




Re: [HACKERS] [COMMITTERS] pgsql: Provide much better wait information in pg_stat_activity.

2016-03-12 Thread Joel Jacobson
On Sun, Mar 13, 2016 at 12:40 AM, Tom Lane  wrote:
> In short: we've already been over this territory, at length,
> and I am not excited by people trying to bikeshed it again
> after the fact, especially when no new arguments are being
> presented.  Can we call the discussion closed, please?

Closed, at least from my side.

I'm grateful to have learned at least a bit more about when it's OK
to sacrifice backwards-compatibility.

Sorry for spamming this thread on that topic,
I'll instead wade through the archives to see what more I can learn
to hopefully become less confused.

Thanks.




Re: [HACKERS] checkpointer continuous flushing - V18

2016-03-12 Thread Jeff Janes
On Thu, Mar 10, 2016 at 11:25 PM, Peter Geoghegan  wrote:
> On Thu, Mar 10, 2016 at 11:18 PM, Fabien COELHO  wrote:
>> I can only concur!
>>
>> The "Performance Tips" chapter (II.14) is more user/query oriented. The
>> "Server Administration" book (III) does not discuss this much.
>
> That's definitely one area in which the docs are lacking -- I've heard
> several complaints about this myself. I think we've been hesitant to
> do more in part because the docs must always be categorically correct,
> and must not use weasel words. I think it's hard to talk about
> performance while maintaining the general tone of the documentation. I
> don't know what can be done about that.

Would the wiki be a good place for such tips?  Not as formal as the
documentation, and more centralized (and editable) than a collection
of blog posts.

Cheers,

Jeff




[HACKERS] building on windows using VC 2008

2016-03-12 Thread Dave Cramer
Getting lots of POSTGRESQL_TRACE_... undefined.

Any hints ?
Dave Cramer


Re: [HACKERS] WIP: Upper planner pathification

2016-03-12 Thread Andres Freund
On 2016-03-12 12:22:01 -0500, Tom Lane wrote:
> Andres Freund  writes:
> > On 2016-03-10 23:38:14 -0500, Tom Lane wrote:
> >> I'll do it ... just send me the list.
> 
> > After exporting make_agg, make_limit, make_sort_from_sortclauses and
> > making some trivial adjustments due to the pull_var_clause changes
> > change, Citus' tests pass on 9.6, bar some noise.
> 
> OK, done.

Thanks.

> > Pathification made
> > some plans switch from hash-agg to sort-agg, and the other way round;
> > but that's obviously ok.
> 
> I wonder whether that's pathification per se.

If you're interested enough, I've uploaded a dump of the schema relevant
table to http://anarazel.de/t/lineitem_95_96_plan.dump.gz

the affected query is (after ANALYZE lineitem_102009)

EXPLAIN SELECT l_quantity, count(*) AS count, sum(l_extendedprice) AS avg, 
count(l_extendedprice) AS avg, array_agg(l_orderkey) AS array_agg
FROM lineitem_102009 lineitem
WHERE ((l_quantity < '5'::numeric) AND (l_orderkey > 5500) AND (l_orderkey < 
9500)) GROUP BY l_quantity;

=# SELECT version();
┌──────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│                                                 version                                                  │
├──────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ PostgreSQL 9.5.1 on x86_64-pc-linux-gnu, compiled by gcc-5.real (Debian 5.3.1-10) 5.3.1 20160224, 64-bit │
└──────────────────────────────────────────────────────────────────────────────────────────────────────────┘
(1 row)

=# SELECT l_quantity, count(*) AS count, sum(l_extendedprice) AS avg, 
count(l_extendedprice) AS avg, array_agg(l_orderkey) AS array_agg FROM 
lineitem_102009 lineitem WHERE ((l_quantity < '5'::numeric) AND (l_orderkey > 
5500) AND (l_orderkey < 9500)) GROUP BY l_quantity;
┌────────────┬───────┬──────────┬─────┬────────────────────────────────────────────────┐
│ l_quantity │ count │   avg    │ avg │                   array_agg                    │
├────────────┼───────┼──────────┼─────┼────────────────────────────────────────────────┤
│       1.00 │     9 │ 13044.06 │   9 │ {8997,9026,9158,9184,9220,9222,9348,9383,9476} │
│       4.00 │     7 │ 40868.84 │   7 │ {9091,9120,9281,9347,9382,9440,9473}           │
│       2.00 │     8 │ 26072.02 │   8 │ {9030,9058,9123,9124,9188,9344,9441,9476}      │
│       3.00 │     9 │ 39925.32 │   9 │ {9124,9157,9184,9223,9254,9349,9414,9475,9477} │
└────────────┴───────┴──────────┴─────┴────────────────────────────────────────────────┘
(4 rows)

Time: 0.906 ms
=# EXPLAIN SELECT l_quantity, count(*) AS count, sum(l_extendedprice) AS avg, 
count(l_extendedprice) AS avg, array_agg(l_orderkey) AS array_agg FROM 
lineitem_102009 lineitem WHERE ((l_quantity < '5'::numeric) AND (l_orderkey > 
5500) AND (l_orderkey < 9500)) GROUP BY l_quantity;
┌────────────────────────────────────────────────────────────────────────────────────────────┐
│                                         QUERY PLAN                                         │
├────────────────────────────────────────────────────────────────────────────────────────────┤
│ HashAggregate  (cost=137.91..137.93 rows=1 width=21)                                       │
│   Group Key: l_quantity                                                                    │
│   ->  Bitmap Heap Scan on lineitem_102009 lineitem  (cost=13.07..137.44 rows=38 width=21)  │
│         Recheck Cond: ((l_orderkey > 5500) AND (l_orderkey < 9500))                        │
│         Filter: (l_quantity < '5'::numeric)                                                │
│         ->  Bitmap Index Scan on lineitem_pkey_102009  (cost=0.00..13.06 rows=478 width=0) │
│               Index Cond: ((l_orderkey > 5500) AND (l_orderkey < 9500))                    │
└────────────────────────────────────────────────────────────────────────────────────────────┘

vs.


=# SELECT version();
┌───
│ version
├───
│ PostgreSQL 9.6devel on x86_64-pc-linux-gnu, compiled by gcc-6.real (Debian 6-20160228-1) 6.0.0 20160228 (experimental) [trunk revision
└───
(1 row)

=# SELECT l_quantity, count(*) AS count, sum(l_extendedprice) AS avg, 
count(l_extendedprice) AS avg, array_agg(l_orderkey) AS array_agg FROM 
lineitem_102009 lineitem WHERE ((l_quantity < '5'::numeric) AND (l_orderkey > 
5500) AND (l_orderkey < 9500)) GROUP BY l_quantity;
┌┬───┬──┬─┬─

Re: [HACKERS] Refactoring speculative insertion with unique indexes a little

2016-03-12 Thread Peter Geoghegan
On Sat, Mar 12, 2016 at 2:53 PM, Peter Geoghegan  wrote:
> I said "basically uncontroversial", not "uncontroversial". That is a
> perfectly accurate characterization of the patch, and if you disagree
> then I suggest you re-read the thread.

In particular, note that Alvaro eventually sided with me against the
thing that Heikki argued for:

http://www.postgresql.org/message-id/20160118195643.GA117199@alvherre.pgsql

Describing what happened that way is unfair on Heikki, because I don't
think he was at all firm in what he said about making the new
UNIQUE_CHECK_SPECULATIVE "like CHECK_UNIQUE_YES, but return FALSE
instead of throwing an error on conflict". We were working through the
design, and it didn't actually come to any kind of impasse.

It's surprising and disappointing to me that this supposed
disagreement has been blown out of all proportion.

-- 
Peter Geoghegan




Re: [HACKERS] [GENERAL] OS X 10.11.3, psql, bus error 10, 9.5.1

2016-03-12 Thread Chris Ruprecht
Unfortunately, I have to admit, to my disgrace, that I'm still no C
programmer after all these decades of dabbling in writing code. I just
used the flags because someone at some point told me they were a good
idea; turns out they're not [always]. I shall rebuild 9.5.1 without the
-fno-common flag and see if that fixes things.

Thanks, Tom, for spending part of your weekend on this.

Chris.


> On Mar 12, 2016, at 17:58, Tom Lane  wrote:
> 
> I wrote:
>> That's confusing because it implies that -fno-common is the default,
>> which it evidently is not.  But anyway, my diagnosis is that you're
>> breaking something about the linker's behavior with that switch.
> 
> Oh!  Looking closer, the core dump happens here:
> 
> const printTextFormat pg_utf8format;
> 
>   printTextFormat *popt = (printTextFormat *) &pg_utf8format;
> 
> -->   popt->name = "unicode";
> 
> So apparently, the relevant property of "-fno-common" is that it
> causes "const" variables to actually get placed in read-only data.
> 
> I think this code is new in 9.5, which'd explain why you didn't see
> the failure with older PG versions.  It's surely busted though.
> 
> I shall get rid of the const-ness, as well as the lame casting away
> of it, and I think I will also go make buildfarm member longfin use
> "-fno-common".  It is truly sad that we apparently have no test
> machine that enforces that const means const ...
> 
>   regards, tom lane





Re: [HACKERS] Refectoring of receivelog.c

2016-03-12 Thread Tomas Vondra

On 03/11/2016 11:15 AM, Magnus Hagander wrote:



...



Pushed with updated comments and a named struct.


Pretty sure this MemSet call in pg_basebackup.c is incorrect, as it
passes its parameters in the wrong order:


MemSet(&stream, sizeof(stream), 0);

It seems benign, because we're setting all the fields explicitly, but 
gcc is nagging about it.


regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] [GENERAL] OS X 10.11.3, psql, bus error 10, 9.5.1

2016-03-12 Thread Tom Lane
I wrote:
> That's confusing because it implies that -fno-common is the default,
> which it evidently is not.  But anyway, my diagnosis is that you're
> breaking something about the linker's behavior with that switch.

Oh!  Looking closer, the core dump happens here:

const printTextFormat pg_utf8format;

printTextFormat *popt = (printTextFormat *) &pg_utf8format;

--> popt->name = "unicode";

So apparently, the relevant property of "-fno-common" is that it
causes "const" variables to actually get placed in read-only data.

I think this code is new in 9.5, which'd explain why you didn't see
the failure with older PG versions.  It's surely busted though.

I shall get rid of the const-ness, as well as the lame casting away
of it, and I think I will also go make buildfarm member longfin use
"-fno-common".  It is truly sad that we apparently have no test
machine that enforces that const means const ...

regards, tom lane




Re: [HACKERS] Refactoring speculative insertion with unique indexes a little

2016-03-12 Thread Peter Geoghegan
On Sat, Mar 12, 2016 at 2:43 PM, Michael Paquier
 wrote:
> Only one version of this patch has been sent at the beginning of this
> thread, and Heikki has clearly expressed his disagreement about at
> least a portion of it at the beginning of this thread, so I find it
> hard to define it as an "uncontroversial" thing and something that is
> clear to have as things stand. Seeing a new version soon would be a
> good next step I guess.

What is the point in saying this, Michael? What purpose does it serve?

I said "basically uncontroversial", not "uncontroversial". That is a
perfectly accurate characterization of the patch, and if you disagree
then I suggest you re-read the thread. Andres and Heikki were both in
favor of this patch. Heikki and I discussed one particular aspect of
it, and then it trailed off. The only thing that Heikki categorically
stated was that he disliked one narrow aspect of the style of one
thing in one function. I've already said I'm happy to do that.

As things stand, the documentation for amcanunique methods, and the
way they are described internally, is fairly misleading.

-- 
Peter Geoghegan




Re: [HACKERS] Refactoring speculative insertion with unique indexes a little

2016-03-12 Thread Michael Paquier
On Sat, Mar 12, 2016 at 11:24 PM, Peter Geoghegan  wrote:
> On Fri, Mar 11, 2016 at 5:26 AM, Robert Haas  wrote:
>> This patch was reviewed during CF 2016-01 and has not been updated for
>> CF 2016-03.  I think we should mark it Returned with Feedback.
>
> I have a full plate at the moment, Robert, both as a reviewer and as a
> patch author. This patch is basically uncontroversial, and is built to
> make the AM interface clearer, and the design of speculative insertion
> easier to understand. It's clear we should have it. I'll get around to
> revising it before too long.

Only one version of this patch has been sent at the beginning of this
thread, and Heikki has clearly expressed his disagreement about at
least a portion of it at the beginning of this thread, so I find it
hard to define it as an "uncontroversial" thing and something that is
clear to have as things stand. Seeing a new version soon would be a
good next step I guess.
-- 
Michael




Re: [HACKERS] Refactoring speculative insertion with unique indexes a little

2016-03-12 Thread Peter Geoghegan
On Fri, Mar 11, 2016 at 5:26 AM, Robert Haas  wrote:
> This patch was reviewed during CF 2016-01 and has not been updated for
> CF 2016-03.  I think we should mark it Returned with Feedback.

I have a full plate at the moment, Robert, both as a reviewer and as a
patch author. This patch is basically uncontroversial, and is built to
make the AM interface clearer, and the design of speculative insertion
easier to understand. It's clear we should have it. I'll get around to
revising it before too long.

-- 
Peter Geoghegan




Re: [HACKERS] pam auth - add rhost item

2016-03-12 Thread Haribabu Kommi
On Sun, Mar 13, 2016 at 8:07 AM, Grzegorz Sampolski  wrote:
> Hi.
> Thank you for improving the documentation, and yes, I'm fine with these changes.

Thanks. I have changed the patch status to ready for committer.

Regards,
Hari Babu
Fujitsu Australia




Re: [HACKERS] pl/pgSQL, get diagnostics and big data

2016-03-12 Thread Tom Lane
I wrote:
> 2. As I was just complaining to -hackers, plpython plperl and pltcl
> all now contain attempts to pass uint64 values (from SPI_processed)
> into language-specific functions.  We need to figure out whether
> that will overflow and whether it's worth doing something about.

I fixed this along the lines suggested by Salvador Fandino, viz convert
to the language's equivalent of "double" if it wouldn't fit in int
or long respectively.  Tcl turns out to have a native int64 type
("WideInt") so that was slightly less messy.

I've pushed this so we can get some buildfarm testing, but it wouldn't
be a bad idea for someone to review the committed patch.  Chasing all
the dependencies was tedious and I'm still not real sure I found them
all.

regards, tom lane




Re: [HACKERS] pam auth - add rhost item

2016-03-12 Thread Grzegorz Sampolski
Hi.
Thank you for improving the documentation, and yes, I'm fine with these changes.

Regards.
Grzegorz.

On 03/12/2016 01:17 PM, Haribabu Kommi wrote:
> On Fri, Mar 11, 2016 at 12:11 AM, Grzegorz Sampolski  wrote:
>> Hi.
>> Attached is a new patch with updated documentation and a small change
>> to the coding style, as you suggested.
> 
> 
> Thanks for the update. I have attached an updated patch with additional
> documentation changes. If you are fine with the changes, I will mark the
> patch as ready for committer.
> 
> 
> Regards,
> Hari Babu
> Fujitsu Australia
> 




Re: [HACKERS] amcheck (B-Tree integrity checking tool)

2016-03-12 Thread Peter Geoghegan
On Fri, Mar 11, 2016 at 6:33 PM, Tomas Vondra
 wrote:
> Right, but isn't there a difference between the two functions in this
> respect? Once you find corruption involving relationship between
> multiple pages, then I agree it's complicated to do any reasoning about
> what additional checks are safe.
>
> But does that problem also apply to bt_index_check, which checks pages
> independently?

I think so, yes.

> Admittedly, this also depends on the use case. If all we need to do is
> answer the question "Is the index corrupted?" then sure, bailing out
> on the first error is perfectly appropriate.
>
> But perhaps this would be useful for some recovery/forensics tasks?

Maybe, but I feel like making it possible to change the CORRUPTION
elevel macro was the right trade-off. I don't want to have to reason
about the universe of possible problems that could occur when the tool
must limp on in the event of corruption. For example, I don't want to
have to deal with infinite loops. In practice, an expert would
probably be comfortable changing the constant themselves if they needed to.

Indexes can always be rebuilt. The tool is for identifying and
diagnosing corruption, but if you want to diagnose a faulty opclass or
something, then I think you need to get out pageinspect. You need
human judgement for that.

> From time to time we need to investigate corruption in a database, i.e.
> see how much of the data is actually corrupted, list pages that need to
> be zeroed to get the cluster up to salvage as much as possible, etc.
> Currently this is tedious because we essentially find/fix the pages one
> by one. It'd be very useful to list all broken pages in one go and then
> fix all of them.
>
> Obviously, that's about heapam checks, but perhaps it would be useful
> for an index too?

Only insofar as it helps diagnose the underlying issue, when it is a
more subtle issue. Actually fixing the index is almost certainly a
REINDEX. Once you're into the messy business of diagnosing a
problematic opclass, you have to be an expert, and tweaking amcheck
for your requirements (i.e. rebuilding from source) becomes
reasonable. Part of the reason that the code is so heavily commented
is to make it hackable, because I do not feel optimistic that I can
get an expert-orientated interface right, but I still want to make the
tool as useful as possible to experts.

-- 
Peter Geoghegan




Re: [HACKERS] Performance improvement for joins where outer side is unique

2016-03-12 Thread David G. Johnston
On Saturday, March 12, 2016, David G. Johnston 
wrote:

> On Saturday, March 12, 2016, Tom Lane  wrote:
>
>> "David G. Johnston"  writes:
>> > Don't the semantics of a SEMI JOIN also state that the output columns
>> only
>> > come from the outer relation? i.e., the inner relation doesn't
>> contribute
>> > either rows or columns to the final result?  Or is that simply
>> > an implementation artifact of the fact that the only current way to
>> perform
>> > a semi-join explicitly is via exists/in?
>>
>> I think it's an artifact.  What nodes.h actually says about it is you get
>> the values of one randomly-selected matching inner row, which seems like
>> a fine definition for the purposes we plan to put it to.
>>
>>
> But is it a definition that actually materializes anywhere presently?
>
> I'm not sure what we consider an authoritative source but relational
> algebra does define the results of semi and anti joins as only containing
> rows from the main relation.
>
> https://en.m.wikipedia.org/wiki/Relational_algebra
>

Pondering it more, calling these optimizations "semi" joins further
distances us from the meaning of "semi" as used in relational algebra.  The
half that "semi" refers to IS that only one of the two tables contributes
columns to the result.  That you get only a single output row per outer
row, regardless of multiple potential matches, is simply a consequence of
this and general set theory.
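
To make that concrete, here is a minimal illustration of the
relational-algebra semantics (table and column names are hypothetical):

```sql
-- Hypothetical illustration: a true semi join returns only outer
-- columns, and at most one output row per outer row, no matter how
-- many inner rows match.
CREATE TABLE outer_t (id int);
CREATE TABLE inner_t (ref int);
INSERT INTO outer_t VALUES (1), (2);
INSERT INTO inner_t VALUES (1), (1), (1);

SELECT o.* FROM outer_t o
WHERE EXISTS (SELECT 1 FROM inner_t i WHERE i.ref = o.id);
-- yields the single row (1), despite three matching inner rows
```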

In short, "semi" communicates a semantic meaning as to the intended output
of the query, irrespective of the data upon which it is executed.  We are
now hijacking the term and calling something "semi" if, by some chance, the
data the query is operating against happens to accommodate some particular
optimization.

This seems wrong on definitional and cleanliness grounds.

So while I'm still liking the idea of introducing specializations of outer
and inner joins I think calling them "semi" joins adds a definitional
inconsistency we are better off avoiding.

This came about because calling something "outer semi join" struck me as
odd.

Something like "outer only join" and "inner only join" comes to mind.
Consider the parallel between this and "index only scan".  Learning that
"only" means "join the outer row to the (at most for outer) one and only
row in the inner relation" doesn't seem too much of a challenge.

David J.


[HACKERS] Re: [COMMITTERS] pgsql: Only try to push down foreign joins if the user mapping OIDs mat

2016-03-12 Thread Andres Freund
Hi,

On 2016-03-12 11:56:24 -0500, Robert Haas wrote:
> On Fri, Mar 11, 2016 at 10:15 PM, Andres Freund  wrote:
> > On 2016-01-28 19:09:01 +, Robert Haas wrote:
> >> Only try to push down foreign joins if the user mapping OIDs match.
> >>
> >> Previously, the foreign join pushdown infrastructure left the question
> >> of security entirely up to individual FDWs, but it would be easy for
> >> a foreign data wrapper to inadvertently open up subtle security holes
> >> that way.  So, make it the core code's job to determine which user
> >> mapping OID is relevant, and don't attempt join pushdown unless it's
> >> the same for all relevant relations.
> >>
> >> Per a suggestion from Tom Lane.  Shigeru Hanada and Ashutosh Bapat,
> >> reviewed by Etsuro Fujita and KaiGai Kohei, with some further
> >> changes by me.
> >
> > I noticed that this breaks some citus regression tests in a minor
> > manner. Namely previously file_fdw worked without a user mapping, now it
> > doesn't appear to anymore.
> >
> > This is easy enough to fix, and it's perfectly ok for us to fix this,
> > but I do wonder if that's not going to cause trouble for others.
> 
> Hmm, I didn't intend to change that.  If that commit broke something,
> there's obviously a hole in our regression test coverage.

CREATE EXTENSION file_fdw;
CREATE SERVER file_server FOREIGN DATA WRAPPER file_fdw;

CREATE FOREIGN TABLE agg_csv (
a   int2,
b   float4
) SERVER file_server
OPTIONS (format 'csv', filename 
'/home/andres/src/postgresql/contrib/file_fdw/data/agg.csv', header 'true', 
delimiter ';', quote '@', escape '"', null '');

SELECT * FROM agg_csv;


worked in 9.5, but doesn't in master. The difference appears to be the
check that's now in build_simple_rel() - there was nothing hitting the
user mapping code before for file_fdw.
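
A sketch of the fix on the Citus side, assuming the server name from the
reproduction above:

```sql
-- Hypothetical fix for the reproduction above: file_fdw needs no
-- credentials, so an empty user mapping is enough to satisfy the
-- user-mapping lookup that master now performs in build_simple_rel().
CREATE USER MAPPING FOR PUBLIC SERVER file_server;

-- The foreign scan should now plan and execute again:
SELECT * FROM agg_csv;
```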

- Andres




Re: [HACKERS] Performance improvement for joins where outer side is unique

2016-03-12 Thread David G. Johnston
On Saturday, March 12, 2016, Tom Lane  wrote:

> "David G. Johnston"  writes:
> > Don't the semantics of a SEMI JOIN also state that the output columns
> only
> > come from the outer relation? i.e., the inner relation doesn't contribute
> > either rows or columns to the final result?  Or is that simply
> > an implementation artifact of the fact that the only current way to
> perform
> > a semi-join explicitly is via exists/in?
>
> I think it's an artifact.  What nodes.h actually says about it is you get
> the values of one randomly-selected matching inner row, which seems like
> a fine definition for the purposes we plan to put it to.
>
>
But is it a definition that actually materializes anywhere presently?

I'm not sure what we consider an authoritative source but relational
algebra does define the results of semi and anti joins as only containing
rows from the main relation.

https://en.m.wikipedia.org/wiki/Relational_algebra

Given that this is largely internals (aside from the plan explanations
themselves), I guess we can punt for now, but calling an inner or outer
join a semijoin in this case relies on a non-standard definition of
semijoin, namely that it is an optimized variation of the other joins
instead of a join type in its own right.  This is complicated further in
that we do implement a true semijoin (using exists), while we also allow
an anti join to be expressed non-standardly using "left join ... is null"
instead of via "not exists".
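
For reference, the two anti-join spellings mentioned here look like this
(table and column names are hypothetical):

```sql
-- Standard form:
SELECT o.* FROM outer_t o
WHERE NOT EXISTS (SELECT 1 FROM inner_t i WHERE i.ref = o.id);

-- Non-standard form the planner also recognizes as an anti join:
SELECT o.* FROM outer_t o
LEFT JOIN inner_t i ON i.ref = o.id
WHERE i.ref IS NULL;
```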

Calling these optimizations outer/inner+semi/anti preserves the ability to
distinguish these versions from the standard definitions.  I do like the
idea of it being exposed and encapsulated as a distinct join type instead
of being an attribute.

David J.


Re: [HACKERS] pl/pgSQL, get diagnostics and big data

2016-03-12 Thread Tom Lane
Petr Jelinek  writes:
> On 12/03/16 04:30, Tom Lane wrote:
>> 1. I found two places (marked XXX in this patch) that are using strtoul()
>> to parse a tuple count back out of a command tag.  That won't do anymore.
>> pg_stat_statements has a messy hack for the same problem (look for
>> HAVE_STRTOULL), which is probably what we want to do, but not by
>> copy-and-pasting #ifdef HAVE_STRTOULL into multiple places.  I'd be
>> inclined to provide a utility function "pg_strtouint64" or some such
>> to encapsulate that.  (numutils.c might be a good place for it.)

> Hmm, I thought that solution is not really portable for 64-bit numbers 
> and is only allowed in pg_stat_statements because, worst case, it will cut 
> the number to a 32-bit int and misreport, but won't break anything there. 
> For example, Windows IIRC needs _strtoui64 for this.

OK, we can use _strtoui64() on Windows.

> I once wrote (well, copy-pasted from BSD plus some #define wrappers) a 
> portable version of that; see 0004 and the bottom of 0003 in 
> http://www.postgresql.org/message-id/557d9ded.2080...@2ndquadrant.com (I 
> think at minimum what 0003 does in c.h is needed).

Meh.  That seems like pretty substantial overkill.  Given that we assume
platforms have working 64-bit support these days, what's the probability
that they don't have an appropriate strtoXXX function?  Or on one that
doesn't, that anyone will ever try to run >4G-tuple results through the
relevant code paths?

For the moment I'm just going to do this:

uint64
pg_strtouint64(const char *str, char **endptr, int base)
{
#ifdef WIN32
return _strtoui64(str, endptr, base);
#elif defined(HAVE_STRTOULL) && SIZEOF_LONG < 8
return strtoull(str, endptr, base);
#else
return strtoul(str, endptr, base);
#endif
}

If there ever seems to be any practical value in improving it, we
can do that later.

regards, tom lane




Re: [HACKERS] [COMMITTERS] pgsql: Provide much better wait information in pg_stat_activity.

2016-03-12 Thread Tom Lane
Robert Haas  writes:
> You could also argue that's a compatibility break, because people may
> have logic that assumes that a wait is always a heavyweight lock wait.
> If we keep the column but change the meaning, people who need to
> update their scripts may fail to notice.  Hard breaks aren't that fun,
> but at least you don't fail to notice that something needs to be
> changed.

Yes.  My recollection of the argument for the earlier renames of
pg_stat_activity columns is that it was basically the same thing:
we changed the semantics of these columns, you are very likely to
need to adjust your queries, so we'll change the column names to
make sure you notice.  There's always a tradeoff there.  Maybe you
won't need to adjust your queries, but maybe they'll break silently.

In this case I agree with the feeling that people probably took
waiting == true as an indication that there was a matching entry
in pg_locks, so the odds of subtle breakage if we keep the name
the same while changing the semantics are pretty high.  Or we
could keep the semantics the same (waiting is true only for
heavyweight-lock waits) but that was mighty ugly too.

In short: we've already been over this territory, at length,
and I am not excited by people trying to bikeshed it again
after the fact, especially when no new arguments are being
presented.  Can we call the discussion closed, please?

regards, tom lane




Re: [HACKERS] Background Processes and reporting

2016-03-12 Thread Vladimir Borodin

> On 12 March 2016, at 13:59, Amit Kapila  wrote:
> 
> On Sat, Mar 12, 2016 at 3:10 AM, Andres Freund  wrote:
> >
> >
> > > Similarly for the wait event stuff - checkpointer, wal writer,
> > > background writer are in many cases processes that very often are
> > > blocked on locks, IO and such.  Thus restricting the facility to
> > > database connected processes seems like a loss.
> >
> > I think one way to address this would be to not only report
> > PgBackendStatus type processes in pg_stat_activity. While that'd
> > obviously be a compatibility break, I think it'd be an improvement.
> >
> 
> I think another point here which needs more thought is that many of the 
> pg_stat_activity fields are not relevant for background processes. Of course 
> one can say that we can keep those fields as NULL, but still I think that 
> indicates it is not the most suitable way to expose such information.
> 
> Another way could be to have a new view like pg_stat_background_activity with 
> only the relevant fields, or try to expose it via individual views like 
> pg_stat_bgwriter.

From the DBA point of view it is much more convenient to see all wait events in 
one view. I don’t know if it is right to break compatibility even more, but IMHO 
exposing this data in different views is a bad plan.

> 
> Do you intend to get this done for 9.6 considering an add-on patch for wait 
> event information displayed in pg_stat_activity?
> 
> 
> With Regards,
> Amit Kapila.
> EnterpriseDB: http://www.enterprisedb.com 

--
May the force be with you…
https://simply.name



Re: [HACKERS] Background Processes and reporting

2016-03-12 Thread Vladimir Borodin

> On 12 March 2016, at 2:45, Andres Freund  wrote:
> 
> On 2016-03-12 02:24:33 +0300, Alexander Korotkov wrote:
>> Idea of individual time measurement of every wait event met criticism
>> because it might have high overhead [1].
> 
> Right. And that's actually one of the points I meant by "didn't
> listen to criticism". There've been a lot of examples, on and off list,
> where taking timings triggers significant slowdowns.  Yes, in some
> bare-metal environments, with a coherent TSC, the overhead can be
> low. But that doesn't make it ok to have a high overhead on a lot of
> other systems.

That’s why the proposal included a GUC for that, with a default that turns 
timing measurement off. I don’t remember any objections against that.

And I’m absolutely sure that a real high-load production system (which of course 
doesn’t use virtualization or Windows) can’t exist without measuring timings. 
The Oracle guys have written several chapters (!) about that [0]. Long story 
short, sampling doesn’t give enough precision. I have shown the overhead [1] on 
bare-metal Linux with a heavily stressed lwlocks workload. BTW, Oracle doesn’t 
give you any way to turn timing measurement off, even with hidden parameters. 
All other commercial databases have wait monitoring with timing measurement. 
Let’s do it and turn it off by default so that other platforms don’t suffer 
from it.

[0] 
http://www.amazon.com/Optimizing-Oracle-Performance-Cary-Millsap/dp/059600527X
[1] 
http://www.postgresql.org/message-id/eee78e40-0e48-411a-9f90-cf9339da9...@simply.name

> 
> Just claiming that that's not a problem will only lead to your position
> not being taken serious.
> 
> 
>> This is really so at least for Windows [2].
> 
> Measuring timing overhead for a simplistic workload on a single system
> doesn't mean that.  Try doing such a test on a vmware esx virtualized
> windows machine, on a multi-socket server; in a lot of instances you'll
> see two-three orders of magnitude longer average times; with peaks going
> into 4-5 orders of magnitude.  And, as sad it is, realistically most
> postgres instances will run in virtualized environments.
> 
> 
>> But accessing only current values wouldn't be very useful.  We
>> anyway need to gather some statistics.  Gathering it by sampling would be
>> both more expensive and less accurate for majority of systems.  This is why
>> I proposed hooks to make possible platform dependent extensions.  Robert
>> rejects hook because he is "not a big fan of hooks as a way of resolving
>> disagreements about the design" [3].
> 
> I think I agree with Robert here. Providing hooks into very low level
> places tends to lead to problems in my experience; tight control over
> what happens is often important - I certainly don't want any external
> code to run while we're waiting for an lwlock.
> 
> 
>> Besides that is actually not design issues but platform issues...
> 
> I don't see how that's the case.
> 
> 
>> Another question is wait parameters.  We want to expose wait event with
>> some parameters.  Robert rejects that because it *might* add additional
>> overhead [3]. When I proposed to fit something useful into hard-won
>> 4 bytes, Robert claims that it is "too clever" [4].
> 
> I think ceasing to treat this as "Robert/EDB vs. pgpro" would be a good
> first step toward making progress here.
> 
> 
> It seems entirely possible to extend the current API in an incremental
> fashion, either allowing to disable the individual pieces, or providing
> sufficient measurements that it's not needed.
> 
> 
>> So, situation looks like dead-end.  I have no idea how to convince Robert
>> about any kind of advanced functionality of wait monitoring to PostgreSQL.
>> I'm thinking about implementing a sampling extension over the current
>> infrastructure just to make the community see that it sucks. Andres, it would
>> be very nice if you have any idea how to move this situation forward.
> 
> I've had my share of conflicts with Robert. But if I were in his shoes,
> targeted by this kind of rhetoric, I'd be very tempted to just ignore
> any further arguments from the origin.  So I think the way forward is
> for everyone to cool off, and to see how we can incrementally make
> progress from here on.
> 
> 
>> Another aspect is that EnterpriseDB offers waits monitoring in proprietary
>> fork [5].
> 
> So?
> 
> Greetings,
> 
> Andres Freund
> 
> 


--
May the force be with you…
https://simply.name



Re: [HACKERS] Performance improvement for joins where outer side is unique

2016-03-12 Thread Tom Lane
Robert Haas  writes:
> The new join pushdown code in postgres_fdw does not grok SEMI and ANTI
> joins because there is no straightforward way of reducing those back
> to SQL.  They can originate in multiple ways and not all of those can
> be represented easily.  I think it would be nice to do something to
> fix this.  For example, if a LEFT join WHERE outer_column IS NULL
> turns into an ANTI join, it would be nice if that were marked in some
> way so that postgres_fdw could conveniently emit it in the original
> form.

> Maybe the people who have been working on that patch just haven't been
> creative enough in thinking about how to solve this problem, but it
> makes me greet the idea of more join types that don't map directly
> back to SQL with somewhat mixed feelings.

I can't summon a whole lot of sympathy for that objection.  These cases
won't arise with postgres_fdw as it stands because we'd never be able to
prove uniqueness on a foreign table.  When and if someone tries to improve
that, we can think about how the whole thing ought to map to FDWs.

Having said that, your point does add a bit of weight to David's
suggestion of inventing two new join-type codes rather than overloading
JOIN_SEMI.  I'd still be a bit inclined to display JOIN_INNER_UNIQUE or
whatever we call it as a "Semi" join in EXPLAIN, though, just to minimize
the amount of newness there.
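
For context, a minimal sketch of the case under discussion (table names
are hypothetical, and JOIN_INNER_UNIQUE is only a naming suggestion from
this thread):

```sql
-- Hypothetical setup: with a unique constraint on inner_t(id), the
-- planner can prove each outer row matches at most one inner row, so
-- the executor may skip to the next outer row after the first match,
-- just as a semi join does.
CREATE TABLE outer_t (id int);
CREATE TABLE inner_t (id int PRIMARY KEY, val text);

EXPLAIN SELECT * FROM outer_t o JOIN inner_t i ON i.id = o.id;
```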

regards, tom lane




Re: [HACKERS] WIP: Upper planner pathification

2016-03-12 Thread Tom Lane
Andres Freund  writes:
> On 2016-03-10 23:38:14 -0500, Tom Lane wrote:
>> I'll do it ... just send me the list.

> After exporting make_agg, make_limit, make_sort_from_sortclauses and
> making some trivial adjustments due to the pull_var_clause changes
> change, Citus' tests pass on 9.6, bar some noise.

OK, done.

> Pathification made
> some plans switch from hash-agg to sort-agg, and the other way round;
> but that's obviously ok.

I wonder whether that's pathification per se.  Of the three core
regression test EXPLAINs that changed in the pathification commit,
two actually were a case of finding better plans.  The other one
was a random-looking swap between two plans with near-identical
costs.  When I looked into it, I found that the reason the planner
liked the new plan better was that it was parallel-safe; add_path()
saw the costs as fuzzily equal and allowed parallel-safe to be the
determining factor in the choice.  The old code hadn't done that
because the hard-wired cost comparisons in grouping_planner() never
took parallel-safety into account.  But I'd call that a parallelism
change, not a pathification change; it would certainly have appeared
to be that if the patches had gone in in the opposite order.

regards, tom lane




Re: [HACKERS] eXtensible Transaction Manager API (v2)

2016-03-12 Thread Robert Haas
On Sat, Mar 12, 2016 at 11:06 AM, Tom Lane  wrote:
> Michael Paquier  writes:
>> On Fri, Mar 11, 2016 at 9:35 PM, Tom Lane  wrote:
>>> IMO this is not committable as-is, and I don't think that it's something
>>> that will become committable during this 'fest.  I think we'd be well
>>> advised to boot it to the 2016-09 CF and focus our efforts on other stuff
>>> that has a better chance of getting finished this month.
>
>> Yeah, I would believe that a good first step would be to discuss
>> deeply about that directly at PGCon for folks that will be there and
>> interested in the subject. It seems like a good timing to brainstorm
>> things F2F at the developer unconference for example, a couple of
>> months before the 1st CF of 9.7. We may perhaps (or not) get to
>> a cleaner picture of what kind of things are wanted in this area.
>
> Yeah, the whole area seems like a great topic for some unconference
> sessions.

I agree.  I think this is a problem we really need to solve, and I
think talking about it will help us figure out the best solution.  I'd
also be interested in hearing Kevin Grittner's thoughts about
serializability in a distributed environment, since he's obviously
thought about the topic of serializability quite a bit.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] [COMMITTERS] pgsql: Provide much better wait information in pg_stat_activity.

2016-03-12 Thread Robert Haas
On Fri, Mar 11, 2016 at 6:31 PM, Jim Nasby  wrote:
> On 3/10/16 8:36 PM, Robert Haas wrote:
>> 1. We make it true only for heavyweight lock waits, and false for
>> other kinds of waits.  That's pretty strange.
>> 2. We make it true for all kinds of waits that we now know how to
>> report.  That still breaks compatibility.
>
>
> I would absolutely vote for 2 here. You could even argue that it's a bug
> fix, since those were waits we technically should have been indicating.

You could also argue that's a compatibility break, because people may
have logic that assumes that a wait is always a heavyweight lock wait.
If we keep the column but change the meaning, people who need to
update their scripts may fail to notice.  Hard breaks aren't that fun,
but at least you don't fail to notice that something needs to be
changed.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Performance improvement for joins where outer side is unique

2016-03-12 Thread Robert Haas
On Fri, Mar 11, 2016 at 4:32 PM, Tom Lane  wrote:
> So I started re-reading this thread in preparation for looking at the
> patch, and this bit in your initial message jumped out at me:
>
>> In all of our join algorithms in the executor, if the join type is SEMI,
>> we skip to the next outer row once we find a matching inner row. This is
>> because we don't want to allow duplicate rows in the inner side to
>> duplicate outer rows in the result set. Obviously this is required per SQL
>> spec. I believe we can also skip to the next outer row in this case when
>> we've managed to prove that no other row can possibly exist that matches
>> the current outer row, due to a unique index or group by/distinct clause
>> (for subqueries).
>
> I wondered why, instead of inventing an extra semantics-modifying flag,
> we couldn't just change the jointype to *be* JOIN_SEMI when we've
> discovered that the inner side is unique.
>
> Now of course this only works if the join type was INNER to start with.
> If it was a LEFT join, you'd need an "outer semi join" jointype which
> we haven't got at the moment.  But I wonder whether inventing that
> jointype wouldn't let us arrive at a less messy handling of things in
> the executor and EXPLAIN.  I'm not very enamored of plastering this
> "match_first_tuple_only" flag on every join, in part because it doesn't
> appear to have sensible semantics for other jointypes such as JOIN_RIGHT.
> And I'd really be happier to see the information reflected by join type
> than a new line in EXPLAIN, also.

The new join pushdown code in postgres_fdw does not grok SEMI and ANTI
joins because there is no straightforward way of reducing those back
to SQL.  They can originate in multiple ways and not all of those can
be represented easily.  I think it would be nice to do something to
fix this.  For example, if a LEFT join WHERE outer_column IS NULL
turns into an ANTI join, it would be nice if that were marked in some
way so that postgres_fdw could conveniently emit it in the original
form.

Maybe the people who have been working on that patch just haven't been
creative enough in thinking about how to solve this problem, but it
makes me greet the idea of more join types that don't map directly
back to SQL with somewhat mixed feelings.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] [PROPOSAL] VACUUM Progress Checker.

2016-03-12 Thread Jim Nasby

On 3/10/16 7:48 AM, Robert Haas wrote:

> I think the problem is that you can't show the name of a non-global
> SQL object (such as a relation) unless the object is in the current
> database.  Many of the views in the first group are database-local
> views, while things like pg_locks span all databases.  We can show the
> datid/relid always, but if we have a relname column it will have to be
> NULL unless the datid is our database.


I would prefer that if the object is in another database we at least 
display the OID. That way, if you're logging this info you can go back 
later and figure out what was going on.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com




[HACKERS] Re: [COMMITTERS] pgsql: Only try to push down foreign joins if the user mapping OIDs mat

2016-03-12 Thread Robert Haas
On Fri, Mar 11, 2016 at 10:15 PM, Andres Freund  wrote:
> On 2016-01-28 19:09:01 +, Robert Haas wrote:
>> Only try to push down foreign joins if the user mapping OIDs match.
>>
>> Previously, the foreign join pushdown infrastructure left the question
>> of security entirely up to individual FDWs, but it would be easy for
>> a foreign data wrapper to inadvertently open up subtle security holes
>> that way.  So, make it the core code's job to determine which user
>> mapping OID is relevant, and don't attempt join pushdown unless it's
>> the same for all relevant relations.
>>
>> Per a suggestion from Tom Lane.  Shigeru Hanada and Ashutosh Bapat,
>> reviewed by Etsuro Fujita and KaiGai Kohei, with some further
>> changes by me.
>
> I noticed that this breaks some citus regression tests in a minor
> manner. Namely previously file_fdw worked without a user mapping, now it
> doesn't appear to anymore.
>
> This is easy enough to fix, and it's perfectly ok for us to fix this,
> but I do wonder if that's not going to cause trouble for others.

Hmm, I didn't intend to change that.  If that commit broke something,
there's obviously a hole in our regression test coverage.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Perl's newSViv() versus 64-bit ints?

2016-03-12 Thread Tom Lane
Salvador Fandiño  writes:
> On 03/12/2016 04:47 PM, Tom Lane wrote:
>> How much of a user-visible change would that be, if the "processed"
>> field of a spi_exec_query() result started coming back as an NV not
>> an IV?  I'm not sure how much that would affect semantics in typical
>> Perl code.

> At the Perl level, IVs and NVs are mostly indistinguishable, and Perl 
> does promote values internally from IVs to NVs to avoid overflows 
> automatically.

Sounds good.  I notice that the manual discourages people from using
spi_exec_query() for "big" results, which means we could possibly get
away without doing anything here; but I'll feel better about it if we
can push the upper limit to 2^53 or so.

Given that text in the manual, I think it might be worth the code space
to do it like this:

   (SPI_processed > INT_MAX) ? newSVnv(SPI_processed) : newSViv(SPI_processed)

since the NV code path is presumably a bit slower and it's very likely
that users would never actually need it.

I wonder whether this idea has analogues for python and tcl ...

regards, tom lane




Re: [HACKERS] Performance improvement for joins where outer side is unique

2016-03-12 Thread Tom Lane
David Rowley  writes:
> On 12 March 2016 at 11:43, Tom Lane  wrote:
>>> I wondered why, instead of inventing an extra semantics-modifying flag,
>>> we couldn't just change the jointype to *be* JOIN_SEMI when we've
>>> discovered that the inner side is unique.

> The thing that might matter is that, this;

> explain (costs off) select * from t1 inner join t2 on t1.id=t2.id
>  QUERY PLAN
> --
>  Hash Join
>Hash Cond: (t1.id = t2.id)
>->  Seq Scan on t1
>->  Hash
>  ->  Seq Scan on t2

> could become;

>   QUERY PLAN
> --
>  Hash Semi Join
>Hash Cond: (t1.id = t2.id)
>->  Seq Scan on t1
>->  Hash
>  ->  Seq Scan on t2

> Wouldn't that cause quite a bit of confusion?

Well, no more than was introduced when we invented semi joins at all.

> Now, we could get around that by
> adding JOIN_SEMI_INNER I guess, and just displaying that as a normal
> inner join, yet it'll behave exactly like JOIN_SEMI!

I'm not that thrilled with having EXPLAIN hide real differences in the
plan from you; if I was, I'd have just lobbied to drop the "unique inner"
annotation from EXPLAIN output altogether.

(I think at one point we'd discussed displaying this in EXPLAIN output
as a different join type, and I'd been against it at the time.  What
changed my thinking was realizing that it could be mapped on to the
existing jointype "semi join".  We still need one new concept,
"outer semi join" or whatever we decide to call it, but it's less of
a stretch than I'd supposed originally.)

regards, tom lane




Re: [HACKERS] Perl's newSViv() versus 64-bit ints?

2016-03-12 Thread Salvador Fandiño

On 03/12/2016 04:47 PM, Tom Lane wrote:

Salvador Fandiño  writes:

Another possibility is to just use newSVnv(), but NVs are not
able to represent all the uint64 range precisely (IIRC, they can
represent integers up to 48bits?).


[ looks... ]  Oh, NV is a "double", which I think would be a perfectly
reasonable choice: it'd be exact up to about 2^53, on most machines,
which should be plenty for a long time to come.

How much of a user-visible change would that be, if the "processed"
field of a spi_exec_query() result started coming back as an NV not
an IV?  I'm not sure how much that would affect semantics in typical
Perl code.


At the Perl level, IVs and NVs are mostly indistinguishable, and Perl 
does promote values internally from IVs to NVs to avoid overflows 
automatically.


There are some operations that cause an implicit coercion from NV to IV 
(or UV) under the hood, and in those cases, big values would get 
mangled. For instance, bit operations as <<, >> or ~, or calling pack or 
printf with some integer template do that.


Those don't look like common operations for "processed".







Re: [HACKERS] [PATCH] Use MemoryContextAlloc() in the MemoryContextAllocZero() and MemoryContextAllocZeroAligned()

2016-03-12 Thread Tom Lane
Alexander Kuleshov  writes:
> Attached patch simplifies the MemoryContextAllocZero() and
> MemoryContextAllocZeroAligned().

What this does is to de-inline those functions, resulting in an
extra level of function call per allocation.  We had intentionally
inlined them on performance grounds: those things are hot spots in
most workloads.  Do you have any evidence demonstrating that this
doesn't cause a performance hit?

regards, tom lane




Re: [HACKERS] eXtensible Transaction Manager API (v2)

2016-03-12 Thread Tom Lane
Michael Paquier  writes:
> On Fri, Mar 11, 2016 at 9:35 PM, Tom Lane  wrote:
>> IMO this is not committable as-is, and I don't think that it's something
>> that will become committable during this 'fest.  I think we'd be well
>> advised to boot it to the 2016-09 CF and focus our efforts on other stuff
>> that has a better chance of getting finished this month.

> Yeah, I would believe that a good first step would be to discuss
> deeply about that directly at PGCon for folks that will be there and
> interested in the subject. It seems like a good timing to brainstorm
> things F2F at the developer unconference for example, a couple of
> months before the 1st CF of 9.7. We may perhaps (or not) get to
> cleaner picture of what kind of things are wanted in this area.

Yeah, the whole area seems like a great topic for some unconference
sessions.

regards, tom lane




Re: [HACKERS] Performance improvement for joins where outer side is unique

2016-03-12 Thread Tom Lane
"David G. Johnston"  writes:
> Don't the semantics of a SEMI JOIN also state that the output columns only
> come from the outer relation? i.e., the inner relation doesn't contribute
> either rows or columns to the final result?  Or is that simply
> an implementation artifact of the fact that the only current way to perform
> a semi-join explicitly is via exists/in?

I think it's an artifact.  What nodes.h actually says about it is you get
the values of one randomly-selected matching inner row, which seems like
a fine definition for the purposes we plan to put it to.

regards, tom lane




Re: [HACKERS] Perl's newSViv() versus 64-bit ints?

2016-03-12 Thread Tom Lane
Salvador Fandiño  writes:
> Another possibility is to just use newSVnv(), but NVs are not 
> able to represent all the uint64 range precisely (IIRC, they can 
> represent integers up to 48bits?).

[ looks... ]  Oh, NV is a "double", which I think would be a perfectly
reasonable choice: it'd be exact up to about 2^53, on most machines,
which should be plenty for a long time to come.

How much of a user-visible change would that be, if the "processed"
field of a spi_exec_query() result started coming back as an NV not
an IV?  I'm not sure how much that would affect semantics in typical
Perl code.

regards, tom lane




Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-03-12 Thread Daniel Verite
Robert Haas wrote:

> But worse than either of those things, there is no real
> agreement on what the overall design of this feature
> should be.

The part in the design that raised concerns upthread is
essentially how header sorting is exposed to the user and
implemented.

As suggested in [1], I've made some drastic changes in the
attached patch to take the comments (from Dean R., Tom L.)
into account. The idea is to limit to the bare minimum
the involvement of psql in sorting:

- the +/- syntax goes away

- the possibility of post-sorting the values through a backdoor
  query goes away too, for both headers.

- the vertical order of the crosstab view is now driven solely by the
  order in the query

- the order of the horizontal header can be optionally specified
  by a column expected to contain an integer, with the syntax
  \crosstabview colv colh:scolh [other cols]
  which means "colh" will be sorted by "scolh".
  It still defaults to whatever order "colh" comes in from the results

  Concerning the optional "scolh", there are cases where it might pre-exist
  naturally, such as a month number paired with a month name.
  In other cases, a user may add it as a kind of "synthetic column"
  by way of a window function, for example:
SELECT ...other columns...,
   row_number() over (order by something [order options]) as scolh
   FROM...
   Only the relative order of scolh values is taken into account; the value
   itself has no meaning for crosstabview.

- also NULLs are no longer excluded from headers, per Peter E.
  comment in [2].


[1]
http://www.postgresql.org/message-id/3d513263-104b-41e3-b1c7-4ad4bd99c491@mm

[2] http://www.postgresql.org/message-id/56c4e344.6070...@gmx.net


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite
diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index 8a85804..da0621b 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -990,6 +990,113 @@ testdb=>
   
 
   
+\crosstabview [
+colV
+colH
+[:scolH]
+[colG1[,colG2...]]
+] 
+
+
Executes the current query buffer (like \g) and shows
+the results inside a crosstab grid.
+The output column colV
+becomes a vertical header
+and the output column colH
+becomes a horizontal header, optionally sorted by ranking data obtained
+from scolH.
+
+colG1[,colG2...]
+is the list of output columns to project into the grid.
+By default, all output columns of the query except 
+colV and
+colH
+are included in this list.
+
+
+
All columns can be referred to by their position (starting at 1), or by
their name. Normal case folding and quoting rules apply to column
names. By default,
+colV corresponds to column 1
+and colH to column 2.
+A query having only one output column cannot be viewed in crosstab, and
+colH must differ from
+colV.
+
+
+
+The vertical header, displayed as the leftmost column,
+contains the deduplicated values found in
+column colV, in the same
+order as in the query results.
+
+
+The horizontal header, displayed as the first row,
+contains the deduplicated values found in
+column colH, in
+the order of appearance in the query results.
+If specified, the optional scolH
+argument refers to a column whose values should be integer numbers
+by which colH will be sorted
+to be positioned in the horizontal header.
+
+
+
+Inside the crosstab grid,
+given a query output with N columns
+(including colV and
+colH),
+for each distinct value x of
+colH
+and each distinct value y of
+colV,
+the contents of a cell located at the intersection
+(x,y) is determined by these rules:
+
+
+
+ if there is no corresponding row in the query results such that the
+ value for colH
+ is x and the value
+ for colV
+ is y, the cell is empty.
+
+
+
+
+
+ if there is exactly one row such that the value
+ for colH
+ is x and the value
+ for colV
+ is y, then the N-2 other
+ columns or the columns listed in
+ colG1[,colG2...]
 are displayed in the cell, separated from each other by
+ a space character if needed.
+
+ If N=2, the letter X is displayed
+ in the cell as if a virtual third column contained that character.
+
+
+
+
+
+ if there are several corresponding rows, the behavior is identical to
+ 

Re: [HACKERS] Performance improvement for joins where outer side is unique

2016-03-12 Thread David G. Johnston
On Saturday, March 12, 2016, David Rowley wrote:

> On 12 March 2016 at 11:43, Tom Lane wrote:
> > I wrote:
> >> I wondered why, instead of inventing an extra semantics-modifying flag,
> >> we couldn't just change the jointype to *be* JOIN_SEMI when we've
> >> discovered that the inner side is unique.
> >
> > BTW, to clarify: I'm not imagining that we'd make this change in the
> > query jointree, as for example prepjointree.c might do.  That would
> appear
> > to add join order constraints, which we don't want.  But I'm thinking
> that
> > at the instant where we form a join Path, we could change the Path's
> > jointype to be JOIN_SEMI or JOIN_SEMI_OUTER if we're able to prove the
> > inner side unique, rather than annotating the Path with a separate flag.
> > Then that representation is what propagates forward.
>
> Thanks for looking at this.
>
> Yes that might work, since we'd just be changing the jointype in the
> JoinPath, if that path was discarded in favour of, say, the commutative
> variant, which was not "unique inner", then it shouldn't matter, as
> the join type for that path would be the original one.
>
> The thing that might matter is that, this;
>
>
> explain (costs off) select * from t1 inner join t2 on t1.id=t2.id
>  QUERY PLAN
> --
>  Hash Join
>Hash Cond: (t1.id = t2.id)
>->  Seq Scan on t1
>->  Hash
>  ->  Seq Scan on t2
>
> could become;
>
>   QUERY PLAN
> --
>  Hash Semi Join
>Hash Cond: (t1.id = t2.id)
>->  Seq Scan on t1
>->  Hash
>  ->  Seq Scan on t2
>
> Wouldn't that cause quite a bit of confusion? People browsing EXPLAIN
> output might be a bit confused at the lack of EXISTS/IN clause in a
> query which is showing a Semi Join. Now, we could get around that by
> adding JOIN_SEMI_INNER I guess, and just displaying that as a normal
> inner join, yet it'll behave exactly like JOIN_SEMI!
>
>
Don't the semantics of a SEMI JOIN also state that the output columns only
come from the outer relation? i.e., the inner relation doesn't contribute
either rows or columns to the final result?  Or is that simply
an implementation artifact of the fact that the only current way to perform
a semi-join explicitly is via exists/in?

David J.


Re: [HACKERS] auto_explain sample rate

2016-03-12 Thread Julien Rouhaud
On 11/03/2016 17:55, Robert Haas wrote:
> On Fri, Mar 11, 2016 at 11:33 AM, Tomas Vondra
>  wrote:
>> A bit late, but I think we should rename the GUC variable to
>> "sampling_rate" (instead of sample_ratio) as that's what pgbench uses
>> for the same thing. That'd be more consistent.
> 
> I like that idea.  It seems like slightly better terminology.
> 

I like it too. I also just noticed that I duplicated the var type by
mistake in the documentation :/

Attached patch fixes both.

-- 
Julien Rouhaud
http://dalibo.com - http://dalibo.org
diff --git a/contrib/auto_explain/auto_explain.c b/contrib/auto_explain/auto_explain.c
index 76d1831..55529af 100644
--- a/contrib/auto_explain/auto_explain.c
+++ b/contrib/auto_explain/auto_explain.c
@@ -29,7 +29,7 @@ static bool auto_explain_log_triggers = false;
 static bool auto_explain_log_timing = true;
 static int auto_explain_log_format = EXPLAIN_FORMAT_TEXT;
 static bool auto_explain_log_nested_statements = false;
-static double auto_explain_sample_ratio = 1;
+static double auto_explain_sample_rate = 1;
 
 static const struct config_enum_entry format_options[] = {
{"text", EXPLAIN_FORMAT_TEXT, false},
@@ -163,10 +163,10 @@ _PG_init(void)
 NULL,
 NULL);
 
-   DefineCustomRealVariable("auto_explain.sample_ratio",
+   DefineCustomRealVariable("auto_explain.sample_rate",
 "Fraction of queries 
to process.",
NULL,
-   
&auto_explain_sample_ratio,
+   
&auto_explain_sample_rate,
1.0,
0.0,
1.0,
@@ -209,11 +209,11 @@ static void
 explain_ExecutorStart(QueryDesc *queryDesc, int eflags)
 {
/*
-* For ratio sampling, randomly choose top-level statement. Either
+* For rate sampling, randomly choose top-level statement. Either
 * all nested statements will be explained or none will.
 */
if (auto_explain_log_min_duration >= 0 && nesting_level == 0)
-   current_query_sampled = (random() < auto_explain_sample_ratio *
+   current_query_sampled = (random() < auto_explain_sample_rate *
MAX_RANDOM_VALUE);
 
if (auto_explain_enabled() && current_query_sampled)
diff --git a/doc/src/sgml/auto-explain.sgml b/doc/src/sgml/auto-explain.sgml
index 6f1bde0..38e6f50 100644
--- a/doc/src/sgml/auto-explain.sgml
+++ b/doc/src/sgml/auto-explain.sgml
@@ -206,18 +206,17 @@ LOAD 'auto_explain';
 

 
- auto_explain.sample_ratio (real)
+ auto_explain.sample_rate (real)
  
-  auto_explain.sample_ratio configuration 
parameter
+  auto_explain.sample_rate configuration 
parameter
  
 
 
  
-  auto_explain.sample_ratio (floating 
point)
-  causes auto_explain to only explain a fraction of the statements in each
-  session.  The default is 1, meaning explain all the queries.  In case
-  of nested statements, either all will be explained or none. Only
-  superusers can change this setting.
+  auto_explain.sample_rate causes auto_explain to only
+  explain a fraction of the statements in each session.  The default is 1,
+  meaning explain all the queries.  In case of nested statements, either 
all
+  will be explained or none. Only superusers can change this setting.
  
 




Re: [HACKERS] Explain [Analyze] produces parallel scan for select Into table statements.

2016-03-12 Thread Mithun Cy
Sorry, there was an issue with my mail settings; the same mail got sent
more than once.
-- 
Thanks and Regards
Mithun C Y
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] Explain [Analyze] produces parallel scan for select Into table statements.

2016-03-12 Thread Mithun Cy
On Sat, Mar 12, 2016 at 2:32 PM, Amit Kapila 
wrote:
>With force_parallel_mode=on, I could see many other failures as well.  I
>think it is better to have a test which tests this functionality with
>force_parallel_mode=regress

As per the user manual:
"Setting this value to regress has all of the same effects as setting it to
on plus some additional effects that are intended to facilitate automated
regression testing. Normally, messages from a parallel worker are prefixed
with a context line, but a setting of regress suppresses this to guarantee
reproducible results. Also, the Gather nodes added to plans by this
setting are hidden from the EXPLAIN output so that the output matches what
would be obtained if this setting were turned off."

And my test is for EXPLAIN statements. I think under regress mode it will
never fail even if a parallel scan is used, as per the above.
-- 
Thanks and Regards
Mithun C Y
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] [PROPOSAL] VACUUM Progress Checker.

2016-03-12 Thread Amit Langote
On Fri, Mar 11, 2016 at 2:31 PM, Amit Langote
 wrote:
> On 2016/03/11 13:16, Robert Haas wrote:
>> On Thu, Mar 10, 2016 at 9:04 PM, Amit Langote
>>  wrote:
>>> So, from what I understand here, we should not put total count of index
>>> pages into st_progress_param; rather, have the client (reading
>>> pg_stat_progress_vacuum) derive it using pg_indexes_size() (?), as and
>>> when necessary.  However, only server is able to tell the current position
>>> within an index vacuuming round (or how many pages into a given index
>>> vacuuming round), so report that using some not-yet-existent mechanism.
>>
>> Isn't that mechanism what you are trying to create in 0003?
>
> Right, 0003 should hopefully become that mechanism.

About 0003:

Earlier, it was trying to report the vacuumed index block count using the
lazy_tid_reaped() callback, for which I had added an index_blkno
argument to IndexBulkDeleteCallback. Turns out it's not such a good
place to do what we are trying to do.  This callback is called for
every heap pointer in an index. Not all index pages contain heap
pointers, which means the existing callback does not allow counting
all the index blocks that an AM would read to finish a given index
vacuum run.

Instead, the attached patch adds a IndexBulkDeleteProgressCallback
which AMs should call for every block that's read (say, right before a
call to ReadBufferExtended) as part of a given vacuum run. The
callback with help of some bookkeeping state can count each block and
report to pgstat_progress API. Now, I am not sure if all AMs read 1..N
blocks for every vacuum or if it's possible that some blocks are read
more than once in a single vacuum, etc.  IOW, some AM's processing may
be non-linear and counting blocks 1..N (where N is reported total
index blocks) may not be possible.  However, this is the best I could
think of as doing what we are trying to do here. Maybe index AM
experts can chime in on that.

Thoughts?

Thanks,
Amit


0001-WIP-Implement-progress-reporting-for-VACUUM-command-v11.patch
Description: Binary data



Re: [HACKERS] pam auth - add rhost item

2016-03-12 Thread Haribabu Kommi
On Fri, Mar 11, 2016 at 12:11 AM, Grzegorz Sampolski  wrote:
> Hi.
> In the attachment is a new patch with updated documentation and a small
> change to coding style as you suggested.


Thanks for the update. Here I attached an updated patch with additional
documentation changes. If you are fine with the changes, I will mark the
patch as ready for committer.


Regards,
Hari Babu
Fujitsu Australia


pam_auth_updated.patch
Description: Binary data



Re: [HACKERS] Saving SRF context

2016-03-12 Thread Salvador Fandiño

On 03/11/2016 09:19 PM, Salvador Fandiño wrote:

Hi,

I have implemented a SRF[*] that returns rows one by one (using
ExprMultipleResult). But now I need to save somewhere some context
information between calls pertaining to the same result set and I am
unable to find a proper place for that.

I have seen that fcinfo->flinfo->fn_extra is available, but if I
understand the mechanics of FmgrInfo correctly, that structure is not
unique per call frame. It could be shared between several fcinfo
structures and would be clobbered on recursion (I plan to use SPI from my
function).

So, is that right? should I build a stack there? is there any other
place where to store context information relative to a SRF call frame?


Replying to myself, pretty much I wanted to know is documented here:

http://www.postgresql.org/docs/devel/static/xfunc-c.html






Re: [HACKERS] Background Processes and reporting

2016-03-12 Thread Amit Kapila
On Sat, Mar 12, 2016 at 2:38 PM, Alexander Korotkov <
a.korot...@postgrespro.ru> wrote:
>
> On Sat, Mar 12, 2016 at 2:45 AM, Andres Freund  wrote:
>>
>>
>> I think I agree with Robert here. Providing hooks into very low level
>> places tends to lead to problems in my experience; tight control over
>> what happens is often important - I certainly don't want any external
>> code to run while we're waiting for an lwlock.
>
>
> So, I get following.
>
> 1) Detailed wait monitoring might cause high overhead on some systems.
> 2) We want wait monitoring to be always on. And we don't want options to
enable additional features of wait monitoring.
>

I am not able to see how any of the above comments indicate that wait
monitoring needs to be always on. Why can't we consider it to be off by
default, especially for events like timing calculations where we suspect
some performance penalty?  If, during development, it is proven that none
of the additional wait events cause any overhead, then we can keep them on
by default.

> 3) We don't want hook of wait events to be exposed.
>
> Can I conclude that we reject detailed wait monitoring by design?
>

I don't think so.


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] Background Processes and reporting

2016-03-12 Thread Oleg Bartunov
On Sat, Mar 12, 2016 at 12:45 AM, Andres Freund  wrote:

> On 2016-03-12 02:24:33 +0300, Alexander Korotkov wrote:
>
>

> > So, situation looks like dead-end.  I have no idea how to convince Robert
> > about any kind of advanced functionality of wait monitoring to
> PostgreSQL.
> > I'm thinking about implementing sampling extension over current
> > infrastructure just to make community see that it sucks. Andres, it would
> > be very nice if you have any idea how to move this situation forward.
>
> I've had my share of conflicts with Robert. But if I were in his shoes,
> targeted by this kind of rhetoric, I'd be very tempted to just ignore
> any further arguments from the origin.  So I think the way forward is
> for everyone to cool off, and to see how we can incrementally make
> progress from here on.
>
>
We all are very different people from different cultures, so online
discussion on ill-defined topics wouldn't work. Let's get back to work.


> > Another aspect is that EnterpriseDB offers waits monitoring in
> proprietary
> > fork [5].
>
>
So?
>

So, Robert already has experience with the subject; probably he has had a
bad experience with the EDB implementation and he'd like to see something
better in the community version. That's fair and I accept his position.

Wait monitoring is one of the popular requirements of Russian companies who
migrated from Oracle. The overwhelming majority of them use Linux, so I
suggest having a configure flag for including wait monitoring at compile
time (default is no wait monitoring), or a GUC variable which is also off
by default, so we have zero to minimal overhead of monitoring. That way
we'll satisfy many enterprises and help them choose Postgres, get feedback
from production use, and have time to improve the feature.



>
> Greetings,
>
> Andres Freund
>
>
>


Re: [HACKERS] Background Processes and reporting

2016-03-12 Thread Amit Kapila
On Sat, Mar 12, 2016 at 3:10 AM, Andres Freund  wrote:
>
>
> > Similarly for the wait event stuff - checkpointer, wal writer,
> > background writer are in many cases processes that very often are
> > blocked on locks, IO and such.  Thus restricting the facility to
> > database connected processes seems like a loss.
>
> I think one way to address this would be to not only report
> PgBackendStatus type processes in pg_stat_activity. While that'd
> obviously be a compatibility break, I think it'd be an improvement.
>

I think another point here which needs more thought is that many of the
pg_stat_activity fields are not relevant for background processes. Of
course, one can say that we can keep those fields as NULL, but still I
think that indicates it is not the most suitable way to expose such
information.

Another way could be to have a new view like pg_stat_background_activity
with only the relevant fields, or to try to expose it via individual views
like pg_stat_bgwriter.

Do you intend to get this done for 9.6 considering an add-on patch for wait
event information displayed in pg_stat_activity?


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] Performance improvement for joins where outer side is unique

2016-03-12 Thread David Rowley
On 12 March 2016 at 11:43, Tom Lane  wrote:
> I wrote:
>> I wondered why, instead of inventing an extra semantics-modifying flag,
>> we couldn't just change the jointype to *be* JOIN_SEMI when we've
>> discovered that the inner side is unique.
>
> BTW, to clarify: I'm not imagining that we'd make this change in the
> query jointree, as for example prepjointree.c might do.  That would appear
> to add join order constraints, which we don't want.  But I'm thinking that
> at the instant where we form a join Path, we could change the Path's
> jointype to be JOIN_SEMI or JOIN_SEMI_OUTER if we're able to prove the
> inner side unique, rather than annotating the Path with a separate flag.
> Then that representation is what propagates forward.

Thanks for looking at this.

Yes, that might work, since we'd just be changing the jointype in the
JoinPath. If that path was discarded in favour of, say, the commutative
variant, which was not "unique inner", then it shouldn't matter, as
the join type for that path would be the original one.

The thing that might matter is that this:


explain (costs off) select * from t1 inner join t2 on t1.id=t2.id
          QUERY PLAN
------------------------------
 Hash Join
   Hash Cond: (t1.id = t2.id)
   ->  Seq Scan on t1
   ->  Hash
         ->  Seq Scan on t2

could become;

          QUERY PLAN
------------------------------
 Hash Semi Join
   Hash Cond: (t1.id = t2.id)
   ->  Seq Scan on t1
   ->  Hash
         ->  Seq Scan on t2

Wouldn't that cause quite a bit of confusion? People browsing EXPLAIN
output might be a bit confused by the lack of an EXISTS/IN clause in a
query which is showing a Semi Join. Now, we could get around that by
adding JOIN_SEMI_INNER, I guess, and just displaying that as a normal
inner join, yet it'll behave exactly like JOIN_SEMI!

And if we do that, then the join node code changes from:

if (node->js.match_first_tuple_only)
  node->nl_NeedNewOuter = true;

to:

if (node->js.jointype == JOIN_SEMI || node->js.jointype ==
JOIN_SEMI_INNER || node->js.jointype == JOIN_SEMI_LEFT)
  node->nl_NeedNewOuter = true;

which is kinda horrid and adds a few cycles and code without a real
need for it. We could keep match_first_tuple_only and set it at node
startup, similar to how it is now, or change JoinType to be some
bitmask that had a bit for each piece of a Venn diagram, plus an
extra one for the unique inner... but I'm not so sure I like that idea...

To me it seems neater just to keep match_first_tuple_only and just
update the planner stuff to use the new jointypes.

Thoughts?

>
> It seems like the major intellectual complexity here is to figure out
> how to detect inner-side-unique at reasonable cost.  I see that for
> LEFT joins you're caching that in the SpecialJoinInfos, which is probably
> fine.  But for INNER joins it looks like you're just doing it over again
> for every candidate join, and that seems mighty expensive.

Hmm, yeah, perhaps something can be done to cache that. I'm just not
all that sure how much use the cache would get, since the Paths for the
nestloop/merge/hash join methods are all generated at once, once
unique_inner has been determined for that inner rel and the restrict
info for whatever the outer rel(s) are. I didn't expect that to happen
twice, so I'm not all that sure a cache would get reused...

... thinks more ...

Perhaps if rel x was found to be unique with the join conditions of
rel y, then we can be sure that x is also unique against {y, z}, as that
could only possibly add more to the join condition, never less. Is
this what you meant?

... I'll look into that.

The other thing I thought of was to add a dedicated list of unique
indexes to RelOptInfo. This would also allow
rel_supports_distinctness() to do something a bit smarter than just
returning false if there are no indexes. That might not buy us much,
though, since relations tend to have very few unique indexes, even
when they have lots of indexes.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Proposal: RETURNING primary_key()

2016-03-12 Thread Gavin Flower

On 12/03/16 23:27, Michael Paquier wrote:

> On Fri, Mar 11, 2016 at 5:28 AM, Craig Ringer  wrote:
>> On 11 March 2016 at 03:07, Igal @ Lucee.org  wrote:
>>> I noticed that you usually don't put html in the emails here, but I think
>>> that it's appropriate here to show the information in a clear way (also,
>>> according to my computer it's 2016).
>>
>> Pretty sure we have at least one person here using mailreader software
>> that's old enough to vote in most countries, but I tend to share the
>> sentiment. At least when there's actually a functional reason like this :)
>
> This one made me smile. Extracting a quote from another project's
> slogan: All mail clients s**k. Some of them just s**k less.


s**k == sulk???





Re: [HACKERS] Proposal: RETURNING primary_key()

2016-03-12 Thread Michael Paquier
On Fri, Mar 11, 2016 at 5:28 AM, Craig Ringer  wrote:
> On 11 March 2016 at 03:07, Igal @ Lucee.org  wrote:
>> I noticed that you usually don't put html in the emails here, but I think 
>> that it's appropriate here to show the information in a clear way (also, 
>> according to my computer it's 2016).
>
> Pretty sure we have at least one person here using mailreader software that's 
> old enough to vote in most countries, but I tend to share the sentiment. At 
> least when there's actually a functional reason like this :)

This one made me smile. Extracting a quote from another project's
slogan: All mail clients s**k. Some of them just s**k less.
-- 
Michael




Re: [HACKERS] eXtensible Transaction Manager API (v2)

2016-03-12 Thread Michael Paquier
On Fri, Mar 11, 2016 at 9:35 PM, Tom Lane  wrote:
> IMO this is not committable as-is, and I don't think that it's something
> that will become committable during this 'fest.  I think we'd be well
> advised to boot it to the 2016-09 CF and focus our efforts on other stuff
> that has a better chance of getting finished this month.

Yeah, I believe a good first step would be to discuss that in depth
directly at PGCon, for folks who will be there and are interested in the
subject. The developer unconference, a couple of months before the first
CF of 9.7, seems like good timing to brainstorm things face-to-face. We
may perhaps (or not) get a clearer picture of what kind of things are
wanted in this area.
-- 
Michael




Re: [HACKERS] Background Processes and reporting

2016-03-12 Thread Alexander Korotkov
On Sat, Mar 12, 2016 at 2:45 AM, Andres Freund  wrote:

> On 2016-03-12 02:24:33 +0300, Alexander Korotkov wrote:
> > The idea of individual time measurement of every wait event met criticism
> > because it might have high overhead [1].
>
> Right. And that's actually one of the points I meant with "didn't
> listen to criticism". There've been a lot of examples, on and off list,
> where taking timings triggers significant slowdowns.  Yes, in some
> bare-metal environments, with a coherent TSC, the overhead can be
> low. But that doesn't make it ok to have a high overhead on a lot of
> other systems.
>
> Just claiming that that's not a problem will only lead to your position
> not being taken seriously.
>
>
> > This is really so at least for Windows [2].
>
> Measuring timing overhead for a simplistic workload on a single system
> doesn't mean that.  Try doing such a test on a VMware ESX virtualized
> Windows machine, on a multi-socket server; in a lot of instances you'll
> see two to three orders of magnitude longer average times, with peaks going
> into 4-5 orders of magnitude.  And, as sad as it is, realistically most
> Postgres instances will run in virtualized environments.
>
>
> > But accessing only current values wouldn't be very useful.  We
> > anyway need to gather some statistics.  Gathering them by sampling would be
> > both more expensive and less accurate for the majority of systems.  This is
> > why I proposed hooks to make platform-dependent extensions possible.  Robert
> > rejects hooks because he is "not a big fan of hooks as a way of resolving
> > disagreements about the design" [3].
>
> I think I agree with Robert here. Providing hooks into very low level
> places tends to lead to problems in my experience; tight control over
> what happens is often important - I certainly don't want any external
> code to run while we're waiting for an lwlock.
>

So, I get the following.

1) Detailed wait monitoring might cause high overhead on some systems.
2) We want wait monitoring to be always on, and we don't want options to
enable additional features of wait monitoring.
3) We don't want a hook for wait events to be exposed.

Can I conclude that we reject detailed wait monitoring by design?
If so, and not only Robert thinks so, then let's just admit it and add
it to the FAQ, etc.

> > Besides, that is actually not a design issue but a platform issue...
>
> I don't see how that's the case.
>
>
> > Another question is wait parameters.  We want to expose wait events with
> > some parameters.  Robert rejects that because it *might* add additional
> > overhead [3]. When I proposed to fit something useful into the hard-won
> > 4 bytes, Robert claims that it is "too clever" [4].
>
> I think stopping to treat this as "Robert/EDB vs. pgpro" would be a good
> first step to make progress here.
>
>
> It seems entirely possible to extend the current API in an incremental
> fashion, either allowing to disable the individual pieces, or providing
> sufficient measurements that it's not needed.
>
>
> > So, the situation looks like a dead end.  I have no idea how to convince
> > Robert about any kind of advanced functionality of wait monitoring for
> > PostgreSQL.  I'm thinking about implementing a sampling extension over the
> > current infrastructure just to make the community see that it sucks.
> > Andres, it would be very nice if you have any idea how to move this
> > situation forward.
>
> I've had my share of conflicts with Robert. But if I were in his shoes,
> targeted by this kind of rhetoric, I'd be very tempted to just ignore
> any further arguments from the origin.  So I think the way forward is
> for everyone to cool off, and to see how we can incrementally make
> progress from here on.
>
>
> > Another aspect is that EnterpriseDB offers wait monitoring in a
> > proprietary fork [5].
>

> So?


So, we'll end up with every company providing a fork with detailed wait
monitoring, while community PostgreSQL refrains from providing such
functionality.

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: [HACKERS] Explain [Analyze] produces parallel scan for select Into table statements.

2016-03-12 Thread Amit Kapila
On Sat, Mar 12, 2016 at 2:02 PM, Mithun Cy 
wrote:
>
>
>
> On Sat, Mar 12, 2016 at 12:28 PM, Amit Kapila 
wrote
> > I don't see how this test will fail with force_parallel_mode=regress and
> > max_parallel_degree > 0 even without the patch proposed to fix the issue in
> > hand.  In short, I don't think this test would have caught the issue, so I
> > don't see much advantage in adding such a test.  Even if we want to add
> > such a test case, I think as proposed this will substantially increase the
> > timing for the "Select Into" test, which might not be an acceptable test
> > case addition, especially for testing one corner case.
>
>
> Without the above patch, make installcheck fails for select_into.sql with
> the diff below, when:
>
> force_parallel_mode = on
> max_parallel_degree = 3
>

With force_parallel_mode=on, I could see many other failures as well.  I
think it is better to have a test which tests this functionality with
force_parallel_mode=regress.


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] [PATCH] Use MemoryContextAlloc() in the MemoryContextAllocZero() and MemoryContextAllocZeroAligned()

2016-03-12 Thread Michael Paquier
On Sat, Mar 12, 2016 at 8:09 AM, Alexander Kuleshov
 wrote:
> Hello all,
>
> The attached patch simplifies MemoryContextAllocZero() and
> MemoryContextAllocZeroAligned(). These functions do almost the same
> thing as MemoryContextAlloc(); additionally, they fill the allocated
> memory with zeros via MemSetAligned() and MemSetLoop(). Let's call
> MemoryContextAlloc() in these functions instead of setting isReset to
> false, calling the alloc() callback of the context, etc., to prevent
> code duplication.

This code duplication is on purpose. This is a very hot code path and
we want to avoid the overhead of an extra function call.
-- 
Michael




Re: [HACKERS] Explain [Analyze] produces parallel scan for select Into table statements.

2016-03-12 Thread Mithun Cy
On Sat, Mar 12, 2016 at 12:28 PM, Amit Kapila 
wrote
> I don't see how this test will fail with force_parallel_mode=regress and
> max_parallel_degree > 0 even without the patch proposed to fix the issue in
> hand.  In short, I don't think this test would have caught the issue, so I
> don't see much advantage in adding such a test.  Even if we want to add
> such a test case, I think as proposed this will substantially increase the
> timing for the "Select Into" test, which might not be an acceptable test
> case addition, especially for testing one corner case.


Without the above patch, make installcheck fails for select_into.sql with
the diff below, when:

force_parallel_mode = on
max_parallel_degree = 3

diff results/select_into.out expected/select_into.out

104,110c104,107
<       QUERY PLAN
< 
<  Gather
<    Number of Workers: 1
<    Single Copy: true
<    ->  Seq Scan on mt1
<  (4 rows)
---
>    QUERY PLAN
> -
>  Seq Scan on mt1
>  (1 row)


Again, with non-default postgresql.conf settings:

force_parallel_mode = on
max_parallel_degree = 3
parallel_tuple_cost = 0

[mithun@localhost regress]$ diff results/select_into.out
expected/select_into.out

104,109c104,107
<       QUERY PLAN
< 
<  Gather
<    Number of Workers: 3
<    ->  Parallel Seq Scan on mt1
<  (3 rows)
---
>    QUERY PLAN
> -
>  Seq Scan on mt1
>  (1 row)

To reduce the execution time I can set the generate_series parameter to
500, which is fast on my machine and still fails with the above diff, but
this time only one worker is assigned in the plan.

-- 
Thanks and Regards
Mithun C Y
EnterpriseDB: http://www.enterprisedb.com