Re: [HACKERS] Assignment of valid collation for SET operations on queries with UNKNOWN types.

2016-12-06 Thread Michael Paquier
On Wed, Dec 7, 2016 at 4:24 PM, Michael Paquier
 wrote:
> On Tue, Dec 6, 2016 at 10:42 PM, Rahila Syed  wrote:
>> Thank you for suggestion. Attached is a patch which resolves the columns
>> with literal constants as TEXT for view creation.

You may also want to add your patch to a CF so that it does not fall
through the cracks.
-- 
Michael




Re: [HACKERS] Assignment of valid collation for SET operations on queries with UNKNOWN types.

2016-12-06 Thread Michael Paquier
On Tue, Dec 6, 2016 at 10:42 PM, Rahila Syed  wrote:
> Hello,
>
>>And ideally fix things so
>>that the type of the view column will be resolved as text, so that you
>>don't hit this condition in the first place; but there is no good that
>>comes out of allowing a view to be created like this
>
> Thank you for suggestion. Attached is a patch which resolves the columns
> with literal constants as TEXT for view creation.
>
> Following views can be created with literal columns resolved as TEXT.
>
> postgres=# create view v as select 'abc' a;
> CREATE VIEW
> postgres=# create view v1 as select 'def' a;
> CREATE VIEW

There is a similar code pattern for materialized views, see
create_ctas_nodata() where the attribute list is built. Even with your
patch, I get that:
=# create materialized view m as select 'abc' a;
WARNING:  42P16: column "a" has type "unknown"
DETAIL:  Proceeding with relation creation anyway.
LOCATION:  CheckAttributeType, heap.c:499
SELECT 1
Time: 6.566 ms
=# select * from m order by 1;
ERROR:  XX000: failed to find conversion function from unknown to text

Your patch has no regression tests; surely you want some to stress
this code path. And actually, shouldn't this just be a plain error?
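
For illustration, a minimal test of the intended behavior might look
like this (my sketch; object names made up):

create view v_unknown as select 'abc' as a;
select pg_typeof(a) from v_unknown;  -- should now report "text"
drop view v_unknown;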
-- 
Michael




Re: [HACKERS] Quorum commit for multiple synchronous replication.

2016-12-06 Thread Michael Paquier
On Wed, Dec 7, 2016 at 2:49 PM, Kyotaro HORIGUCHI
 wrote:
> Aside from measurement of the two sorting methods, I'd like to
> point out that quorum commit basically doesn't need
> sorting. Counting conforming standbys while scanning the
> walsender (receiver) LSN list, comparing with the target LSN, is
> O(n). A small refactoring of SyncRepGetOldestSyncRecPtr would be
> enough to do that.

Indeed, I hadn't thought about that, and it's a no-brainer. That
would remove the need to allocate and sort each array; all that is
needed is to track the number of times a newest value has been found.
So what this processing would do is update the write/flush/apply
values for the first k loops if the new value is *older* than the
current one, where k is the quorum number, and between k+1 and N the
value gets updated only if the compared value is newer. There would
also be no need to hold the mutex lock for long. By the way, the patch
now conflicts with HEAD; it needs a refresh.
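
A rough sketch of that single-pass idea (my illustration, not code from
the patch; XLogRecPtr stood in by a plain 64-bit integer, and n >= k is
assumed):

#include <stdint.h>

typedef uint64_t XLogRecPtr;            /* stand-in for the real typedef */

/* Return the k-th newest (largest) of n LSNs in one pass, O(n * k);
 * cheap for the small quorum sizes discussed here. */
static XLogRecPtr
nth_newest_lsn(const XLogRecPtr *lsns, int n, int k)
{
    XLogRecPtr top[k];                  /* C99 VLA, kept sorted newest-first */
    int        have = 0;

    for (int i = 0; i < n; i++)
    {
        XLogRecPtr v = lsns[i];
        int        j;

        if (have == k && v <= top[k - 1])
            continue;                   /* not among the k newest so far */
        if (have < k)
            have++;
        for (j = have - 1; j > 0 && top[j - 1] < v; j--)
            top[j] = top[j - 1];        /* shift older entries down */
        top[j] = v;
    }
    return top[k - 1];
}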
-- 
Michael




Re: [HACKERS] tzdata 2016j

2016-12-06 Thread Vladimir Borodin

> On 6 Dec 2016, at 19:19, David Fetter  wrote:
> 
> On Tue, Dec 06, 2016 at 10:52:47AM -0500, Tom Lane wrote:
>> Vladimir Borodin  writes:
>>> Any chance to get tzdata 2016j in supported branches?
>> 
>> When the next scheduled releases come around (February), we'll update
>> to whatever tzdata is current at that time.
> 
> I'm guessing that request came through because Vladimir is actually
> affected by the change.
> 
> Apparently, you can replace the tzdata file and restart the server per
> https://wiki.postgresql.org/wiki/FAQ#Will_PostgreSQL_handle_recent_daylight_saving_time_changes_in_various_countries.3F

Yep, both packages for RHEL and Ubuntu are built with the
--with-system-tzdata=/usr/share/zoneinfo configure option, so updating
the system tzdata package and restarting postgres is sufficient. Thanks!


--
May the force be with you…
https://simply.name



Re: [HACKERS] Partitionning: support for Truncate Table WHERE

2016-12-06 Thread Amit Langote
On 2016/12/07 15:26, Craig Ringer wrote:
> On 7 December 2016 at 07:29, legrand legrand
>  wrote:
> 
>> Working in a DSS environment, we often need to truncate table partitions
>> regarding a WHERE condition and have to
>> [...]
>> Would be pleased to hear your feedback regarding this.
> 
> It sounds like something that'd be useful to do on top of declarative
> partitioning, once that is merged. Perhaps you could start by reading
> and testing the declarative partitioning patch. That'll give you a
> better idea of the practicalities of doing what you propose on top of
> it, and give you an opportunity to suggest changes to the declarative
> partitioning scheme that might make conditional truncate easier later.

Agreed.

If I understand the request correctly, TRUNCATE on the parent table (a
partitioned table), which currently recurses to *all* child tables
(partitions), should have a restricting WHERE clause, right?  It would
become possible to implement something like that with the new declarative
partitioned tables.  As Craig mentioned, you can take a look at the
discussion about declarative partitioning in the emails linked to at the
following page: https://commitfest.postgresql.org/12/611/

Thanks,
Amit






Re: [HACKERS] WIP: Faster Expression Processing and Tuple Deforming (including JIT)

2016-12-06 Thread Craig Ringer
On 7 December 2016 at 14:39, Craig Ringer  wrote:
> On 7 December 2016 at 04:13, Robert Haas  wrote:
>
>> I wonder how feasible it would be to make this a run-time dependency
>> rather than a compile option.
>
> Or something that's compiled with the server, but produces a separate
> .so that's the only thing that links to LLVM. So packagers can avoid a
> dependency on LLVM for postgres.

Ahem, next time I'll finish the thread first. Nevermind.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Separate connection handling from backends

2016-12-06 Thread Jim Nasby

On 12/6/16 10:34 PM, Craig Ringer wrote:

> In other words, we could start to separate session state from executor
> state in a limited manner. That'd definitely be valuable, IMO; it's a
> real shame that Pg's architecture so closely couples the two.
>
> So - is just doing "PgInCoreBouncer" a good idea? No, I don't think
> so. But there are potentially good things to be done in the area.

Right.

> What I don't see here is a patch, or a vague proposal for a patch, so
> I'm not sure how this can go past the hot-air stage.


Yeah, I brought it up because I think there's potential tie-in with 
other things that have been discussed (notably async transactions, but 
maybe BG workers and parallel query could benefit too). Maybe it would 
make sense as part of one of those efforts.


Though, this is something that's asked about often enough that it'd 
probably be possible to round up a few companies to fund it.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)




Re: [HACKERS] WIP: Faster Expression Processing and Tuple Deforming (including JIT)

2016-12-06 Thread Craig Ringer
On 7 December 2016 at 04:13, Robert Haas  wrote:

> I wonder how feasible it would be to make this a run-time dependency
> rather than a compile option.

Or something that's compiled with the server, but produces a separate
.so that's the only thing that links to LLVM. So packagers can avoid a
dependency on LLVM for postgres.

I suspect it wouldn't be worth the complexity, the added indirection
necessary, etc. If you're using packages then pulling in LLVM isn't a
big deal. If you're not, then don't use --with-llvm.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Push down more full joins in postgres_fdw

2016-12-06 Thread Ashutosh Bapat
>
>
> Ashutosh proposed this to do the comparison:
>
> On 2016/11/22 18:28, Ashutosh Bapat wrote:
>>
>> I guess, the reason why you are doing it this way, is SELECT clause for
>> the
>> outermost query gets deparsed before FROM clause. For later we call
>> deparseRangeTblRef(), which builds the tlist. So, while deparsing SELECT
>> clause, we do not have tlist to build from. In that case, I guess, we have
>> to
>> build the tlist in get_subselect_alias_id() if it's not available and
>> stick it
>> in fpinfo. Subsequent calls to get_subselect_alias_id() should find tlist
>> there. Then in deparseRangeTblRef() assert that there's a tlist in fpinfo
>> and use it to build the SELECT clause of subquery. That way, we don't
>> build
>> tlist unless it's needed and also use the same tlist for all searches.
>> Please
>> use tlist_member() to search into the tlist.
>
>
> This would probably work, but seems to me a bit complicated.  Instead, I'd
> like to propose that we build the tlist for each relation being deparsed as
> a subquery in a given join tree, right before deparsing the SELECT clause in
> deparseSelectStmtForRel, if is_subquery is false and lower_subquery_rels
> isn't NULL, and store the tlist into the relation's fpinfo.  That would
> allow us to build the tlist only when we need it, and to use tlist_member
> for the exact comparison.  I think it would be much easier to implement
> that.
>

IIRC, this is in line with my original proposal of creating tlists
before deparsing anything. Along with that, I also proposed to bubble
this array of tlists up the join hierarchy to avoid recursion ([1],
point #15, further elaborated in [2]).

[1] 
https://www.postgresql.org/message-id/ad449b25-66ee-4c06-568c-0eb2e1bef9f9%40lab.ntt.co.jp
[2] 
https://www.postgresql.org/message-id/CAFjFpRcn7%3DDNOXm-PJ_jVDqAmghKVf6tApQC%2B_RgMZ8tOPExcA%40mail.gmail.com

We seem to be moving towards that solution, but as discussed we have
left it to the committer's judgement.
--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company




Re: [HACKERS] Separate connection handling from backends

2016-12-06 Thread Jim Nasby

On 12/6/16 6:19 PM, Tom Lane wrote:

>> I'm kind of mystified how a simple code restructuring could solve the
>> fundamental problems with a large number of backends. It sounds like
>> what you're describing would just push the problem around, you would
>> end up with some other maximum instead, max_backends, or
>> max_active_backends, or something like that with the same problems.
>
> What it sounds like to me is building a connection pooler into the
> backend.  I'm not really convinced we ought to go there.


The way I'm picturing it, backends would no longer be directly tied to
connections. The code that directly handles connections would grab an
available backend when a statement actually came in (and certainly it'd
need to worry about transactions and session GUCs).


So in a way it's like a pooler, except it'd be able to do things that 
poolers simply can't (like safely switch the user the backend is using).


I think there might be other uses as well, since there are several other
places where we need something that's kind-of like a backend, but if
Heikki's work radically shifts the expense of running many thousands of
backends then it's probably not worth doing.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)




Re: [HACKERS] Separate connection handling from backends

2016-12-06 Thread Craig Ringer
On 7 December 2016 at 10:19, Tom Lane  wrote:

> What it sounds like to me is building a connection pooler into the
> backend.  I'm not really convinced we ought to go there.

If we do, it probably needs to be able to offer things that
out-of-tree ones can't.

The main things I see that you can't do sensibly with an out-of-tree pooler are:

* Re-use a backend for different session users. You can SET SESSION
AUTHORIZATION, but once you hand the connection off to the client they
can just do it again, or RESET SESSION AUTHORIZATION, and whammo,
they're a superuser (see the example after this list). The same issue
applies to SET ROLE and RESET ROLE.

* Cope with session-level state when transaction pooling. We probably
can't do anything much about WITH HOLD cursors, advisory locks, etc,
but we could save and restore GUC state and a few other things, and we
could detect whether or not we can save and restore state so we could
switch transparently between session and transaction pooling.

* Know, conclusively, whether a query is safe to reroute to a
read-only standby, without hard coded lists of allowed functions, iffy
SQL parsers, etc. Or conversely, transparently re-route queries from
standbys to a read/write master.
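
To illustrate the first point, in plain SQL (my illustration; role name
made up, and the pooler's own login role is assumed to be a superuser):

SET SESSION AUTHORIZATION alice;   -- pooler scopes the session down
RESET SESSION AUTHORIZATION;       -- client simply undoes it
SELECT current_user;               -- the pooler's superuser again, not alice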

In other words, we could start to separate session state from executor
state in a limited manner. That'd definitely be valuable, IMO; it's a
real shame that Pg's architecture so closely couples the two.

So - is just doing "PgInCoreBouncer" a good idea? No, I don't think
so. But there are potentially good things to be done in the area.

What I don't see here is a patch, or a vague proposal for a patch, so
I'm not sure how this can go past the hot-air stage.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Push down more full joins in postgres_fdw

2016-12-06 Thread Ashutosh Bapat
On Wed, Dec 7, 2016 at 1:57 AM, Robert Haas  wrote:
> On Mon, Dec 5, 2016 at 6:20 AM, Ashutosh Bapat
>  wrote:
>> 4. I am still not happy with this change
>> +/*
>> + * Since (1) the expressions in foreignrel's reltarget doesn't 
>> contain
>> + * any PHVs and (2) foreignrel's local_conds is empty, the tlist
>> + * created by build_tlist_to_deparse must be one-to-one with the
>> + * expressions.
>> + */
>> +Assert(list_length(tlist) ==
>> list_length(foreignrel->reltarget->exprs));
>> the assertion only checks that the number of elements in both the lists are
>> same but does not check whether those lists are same i.e. they contain the 
>> same
>> elements in the same order. This equality is crucial to deparsing logic. If
>> somehow build_tlist_to_deparse() breaks that assumption in future, we have no
>> way to detect it, unless a regression test fails.
>
> If there's an easy way to do a more exact comparison, great.  But if
> we can't get an awesome Assert(), a helpful Assert() is still better
> than a kick in the head.

The assert is not a problem in itself; the problem is the reason we
have to add the assert at all. That is explained in [1], point #9.

[1] 
https://www.postgresql.org/message-id/CAFjFpRfwoSsJr9418b2jA7g0nwagjZSWhPQPUmK9M6z5XSOAqQ%40mail.gmail.com
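
For what it's worth, an exact element-by-element check is cheap to
sketch (my illustration, not from the patch; it assumes the stock
forboth()/equal() list and node helpers):

#ifdef USE_ASSERT_CHECKING
    {
        ListCell   *lc1;
        ListCell   *lc2;

        /* hypothetical: verify the tlist pairs off against reltarget */
        forboth(lc1, tlist, lc2, foreignrel->reltarget->exprs)
        {
            TargetEntry *tle = (TargetEntry *) lfirst(lc1);

            Assert(equal(tle->expr, lfirst(lc2)));
        }
    }
#endif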


-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company




Re: [HACKERS] Separate connection handling from backends

2016-12-06 Thread Jim Nasby

On 12/6/16 1:46 PM, Adam Brusselback wrote:

BTW, it just occurred to me that having this separation would make
it relatively easy to support re-directing DML queries from a
replica to the master; if the backend throws the error indicating
you tried to write data, the connection layer could re-route that.


This also sounds like it would potentially allow re-routing the other
way where you know the replica contains up-to-date data, couldn't you
potentially re-direct read only queries to your replicas?


That's a lot more complicated, so I don't see that happening anytime soon.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)




Re: [HACKERS] Partitionning: support for Truncate Table WHERE

2016-12-06 Thread Craig Ringer
On 7 December 2016 at 07:29, legrand legrand
 wrote:

> Working in a DSS environment, we often need to truncate table partitions
> regarding a WHERE condition and have to
> [...]
> Would be pleased to hear your feedback regarding this.

It sounds like something that'd be useful to do on top of declarative
partitioning, once that is merged. Perhaps you could start by reading
and testing the declarative partitioning patch. That'll give you a
better idea of the practicalities of doing what you propose on top of
it, and give you an opportunity to suggest changes to the declarative
partitioning scheme that might make conditional truncate easier later.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Back-patch use of unnamed POSIX semaphores for Linux?

2016-12-06 Thread Craig Ringer
On 7 December 2016 at 10:53, Tom Lane  wrote:
> Just saw another report of what's probably systemd killing off Postgres'
> SysV semaphores, as we've discussed previously at, eg,
> https://www.postgresql.org/message-id/flat/57828C31.5060409%40gmail.com
> Since the systemd people are generally impervious to suggestions that
> they might be mistaken, I do not expect this problem to go away.
>
> I think we should give serious consideration to back-patching commit
> ecb0d20a9, which changed the default semaphore type to unnamed-POSIX
> on Linux.  We've seen no problems in the buildfarm in the two months
> that that's been in HEAD.  If we don't do this, we can expect to
> continue seeing complaints of this sort until pre-v10 PG releases
> fall out of use ... and I don't want to wait that long.
>
> Commit ecb0d20a9 also changed the default for FreeBSD.  I'm not convinced
> we should back-patch that part, because (a) unnamed-POSIX semas have
> only been there since FreeBSD 9.0, which isn't that long ago, and (b)
> the argument for switching is "it'll perform better" not "your server
> will fail randomly without this change".

That's a huge change to make for something that doesn't risk data
corruption, and that won't happen when using postgres with distro
packages.

As I understand it, it's only a problem if you're running postgres as
a normal user, not a "system user", which systemd defines at
compile time as a uid below 500 or 1000 depending on the distro's
default login.conf. So it'll only affect people who're not using
their distro's packages and service mechanism, and are instead running
Pg under some other user, likely started manually with pg_ctl.

I see quite a few people who compile their own Pg rather than using
packages, and some who even fail to use the init system and instead
use custom scripts to launch Pg. But pretty much everything I've seen
uses a 'postgres' system-user. Clearly there are exceptions out there
in the wild, but I don't think it makes sense to backpatch this to
satisfy people who are, IMO, doing it wrong in the first place.

Especially since those people can reconfigure systemd not to do this
with the RemoveIPC and KillUserProcesses directives if they insist on
using a non-system user.
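
For reference, those directives go in /etc/systemd/logind.conf (a sketch
of the relevant stock settings):

[Login]
RemoveIPC=no
KillUserProcesses=no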

If they defined a systemd service to start postgres they'd be fine...
and isn't it pretty much basic sysadmin 101 to use your init system to
start services?

Don't get me wrong, I think systemd's behaviour is pretty stupid.
Mostly in terms of its magic definition of a "system user", which
shouldn't be something determined by a uid threshold at compile time.
But I don't think we should double down on it by backpatching a big
change that hasn't even seen in-the-wild loads from real world use
yet, just to make it easier on people who're doing things backwards in
the first place.

If it were possible to detect that systemd was about to clobber us and
log something informative, _that_ would be very nice to backpatch. I
don't see how that's possible though.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




[HACKERS] [sqlsmith] Crash in tsquery_rewrite/QTNBinary

2016-12-06 Thread Andreas Seltenreich
Hi,

the following query crashes master as of 4212cb7.

select ts_rewrite(
  tsquery_phrase(
 tsquery $$'sanct' & 'peter'$$,
 tsquery $$'5' <-> '6'$$,
 42),
  tsquery $$'5' <-> '6'$$,
  plainto_tsquery('I') );

Backtrace below.

regards,
Andreas

Program terminated with signal SIGSEGV, Segmentation fault.
#0  QTNBinary (in=0x0) at tsquery_util.c:256
#1  0x559debd68643 in QTNBinary (in=0x559ded7cc998) at tsquery_util.c:260
#2  0x559debd68643 in QTNBinary (in=in@entry=0x559ded7cd068) at tsquery_util.c:260
#3  0x559debd67df5 in tsquery_rewrite (fcinfo=0x559ded72c040) at tsquery_rewrite.c:453
#4  0x559debb754f4 in ExecMakeFunctionResultNoSets (fcache=0x559ded72bfd0, econtext=0x559ded72bda8, isNull=0x559ded72d350 "", isDone=) at execQual.c:2046
#5  0x559debb7ba1e in ExecTargetList (tupdesc=, isDone=0x7ffce180da6c, itemIsDone=0x559ded72d490, isnull=0x559ded72d350 "", values=0x559ded72d330, econtext=0x559ded72bda8, targetlist=0x559ded72d458) at execQual.c:5486
#6  ExecProject (projInfo=, isDone=isDone@entry=0x7ffce180da6c) at execQual.c:5710
#7  0x559debb92c79 in ExecResult (node=node@entry=0x559ded72bc90) at nodeResult.c:155
#8  0x559debb74478 in ExecProcNode (node=node@entry=0x559ded72bc90) at execProcnode.c:392
#9  0x559debb702fe in ExecutePlan (dest=0x559ded7c8b98, direction=, numberTuples=0, sendTuples=, operation=CMD_SELECT, use_parallel_mode=, planstate=0x559ded72bc90, estate=0x559ded72bb78) at execMain.c:1568
#10 standard_ExecutorRun (queryDesc=0x559ded727a18, direction=, count=0) at execMain.c:338
#11 0x559debc9c238 in PortalRunSelect (portal=portal@entry=0x559ded71f958, forward=forward@entry=1 '\001', count=0, count@entry=9223372036854775807, dest=dest@entry=0x559ded7c8b98) at pquery.c:946
#12 0x559debc9d89e in PortalRun (portal=portal@entry=0x559ded71f958, count=count@entry=9223372036854775807, isTopLevel=isTopLevel@entry=1 '\001', dest=dest@entry=0x559ded7c8b98, altdest=altdest@entry=0x559ded7c8b98, completionTag=completionTag@entry=0x7ffce180dee0 "") at pquery.c:787
#13 0x559debc9af42 in exec_simple_query (query_string=0x559ded795048 "...") at postgres.c:1094
#14 PostgresMain (argc=, argv=argv@entry=0x559ded7390b0, dbname=, username=) at postgres.c:4069
#15 0x559deb9ee2f8 in BackendRun (port=0x559ded726ef0) at postmaster.c:4274
#16 BackendStartup (port=0x559ded726ef0) at postmaster.c:3946
#17 ServerLoop () at postmaster.c:1704
#18 0x559debc2ebb4 in PostmasterMain (argc=3, argv=0x559ded7004a0) at postmaster.c:1312
#19 0x559deb9ef68d in main (argc=3, argv=0x559ded7004a0) at main.c:228




Re: [HACKERS] Quorum commit for multiple synchronous replication.

2016-12-06 Thread Kyotaro HORIGUCHI
At Wed, 7 Dec 2016 13:26:38 +0900, Michael Paquier  
wrote in 
> On Wed, Dec 7, 2016 at 12:32 PM, Fujii Masao  wrote:
> > So, isn't it better to compare the performance of some algorithms and
> > confirm which is the best for quorum commit? Since this code is hot, i.e.,
> > can be very frequently executed, I'd like to avoid wasted cycles as much
> > as possible.
> 
> It seems to me that it would be simple enough to write a script to do
> that to avoid any other noise: allocate an array with N random
> elements, and fetch the M-th element from it after applying a sort
> method. I highly doubt that you'd see much difference with a low
> number of elements, now if you scale at a thousand standbys in a
> quorum set you may surely see something :*)
> Anybody willing to try out?


Aside from measurement of the two sorting methods, I'd like to
point out that quorum commit basically doesn't need
sorting. Counting conforming standbys while scanning the
walsender (receiver) LSN list, comparing with the target LSN, is
O(n). A small refactoring of SyncRepGetOldestSyncRecPtr would be
enough to do that.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center






Short reads in hash indexes (was: [HACKERS] [sqlsmith] Failed assertion in _hash_splitbucket_guts)

2016-12-06 Thread Andreas Seltenreich
Amit Kapila writes:

> On Sat, Dec 3, 2016 at 3:44 PM, Andreas Seltenreich  
> wrote:
>> Amit Kapila writes:
>>
>>> [2. text/x-diff; fix_hash_bucketsplit_sqlsmith_v1.patch]
>> Ok, I'll do testing with the patch applied.

Good news: the assertion hasn't fired since the patch is in.

However, these are still getting logged:

smith=# select * from state_report where sqlstate = 'XX001';
-[ RECORD 1 ]-------------------------------------------------------------------
count    | 10
sqlstate | XX001
sample   | ERROR:  could not read block 1173 in file "base/16384/17256": read only 0 of 8192 bytes
hosts    | {airbisquit,frell,gorgo,marbit,pillcrow,quakken}

> Hmm, I am not sure if this is related to previous problem, but it
> could be.  Is it possible to get the operation and or callstack for
> above failure?

Ok, will turn the elog into an assertion to get at the backtraces.

regards,
Andreas




Re: [HACKERS] Back-patch use of unnamed POSIX semaphores for Linux?

2016-12-06 Thread Tatsuo Ishii
> Potential risks involving minor upgrades are far higher than the risks
> posed by systemd, so -1 for a back-patch from here.

As long as we have a compile-time switch to enable/disable the
back-patched feature, it seems it would be acceptable. In the worst
case the back-patching could bring disasters, but then packagers could
turn off the switch and ship updated versions of the packages.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp




Re: [HACKERS] Back-patch use of unnamed POSIX semaphores for Linux?

2016-12-06 Thread Michael Paquier
On Wed, Dec 7, 2016 at 1:43 PM, Robert Haas  wrote:
> Urk.  That sounds like a scary thing to back-patch.  The fact that the
> buildfarm has reported no problems is good as far as it goes, but user
> environments can be expected to be considerably more diverse than the
> buildfarm.  I wouldn't mind giving users the option to select unnamed
> POSIX semas, but I don't think there's any guarantee that that's 100%
> certain to work every place where the current implementation works -
> and if not, then people will upgrade to the latest minor release and
> everything will completely stop working.

Potential risks involving minor upgrades are far higher than the risks
posed by systemd, so -1 for a back-patch from here.
-- 
Michael




Re: [HACKERS] Back-patch use of unnamed POSIX semaphores for Linux?

2016-12-06 Thread Tom Lane
Robert Haas  writes:
> On Tue, Dec 6, 2016 at 9:53 PM, Tom Lane  wrote:
>> I think we should give serious consideration to back-patching commit
>> ecb0d20a9, which changed the default semaphore type to unnamed-POSIX
>> on Linux.

> Urk.  That sounds like a scary thing to back-patch.

I don't deny that it's scary, but the alternative seems to be to be
rather badly broken on systemd-using distros for years to come.
That's pretty scary too.

> ... Granted, that might not
> happen, because maybe unnamed POSIX semas are one of those really
> awesome operating system primitives that never has problems on any
> system anywhere ever.  But I think it's pretty hard to be certain of
> that.

You're attacking a straw man.  I didn't propose changing our behavior
anywhere but Linux.  AFAIK, on that platform unnamed POSIX semaphores
are futexes, which have been a stable feature since 2003 according to
https://en.wikipedia.org/wiki/Futex#History.  Anybody who did need
to compile PG for use with a pre-2.6 kernel could override the default,
anyway.
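
For readers unfamiliar with the primitive: an unnamed POSIX semaphore
lives in memory the processes share, not in a kernel IPC namespace that
systemd can sweep. An illustrative sketch (error handling omitted; this
is not PostgreSQL code):

#include <semaphore.h>
#include <sys/mman.h>

static sem_t *
make_shared_sem(void)
{
    sem_t *sem = mmap(NULL, sizeof(sem_t), PROT_READ | PROT_WRITE,
                      MAP_SHARED | MAP_ANONYMOUS, -1, 0);

    sem_init(sem, 1, 0);    /* pshared = 1: usable across processes */
    return sem;
}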

Now, I did think of a problem we'd have to deal with, which is how
to avoid breaking ABI by changing sizeof(PGSemaphoreData).  I think
that's soluble though.

regards, tom lane




Re: [HACKERS] Back-patch use of unnamed POSIX semaphores for Linux?

2016-12-06 Thread Robert Haas
On Tue, Dec 6, 2016 at 9:53 PM, Tom Lane  wrote:
> Just saw another report of what's probably systemd killing off Postgres'
> SysV semaphores, as we've discussed previously at, eg,
> https://www.postgresql.org/message-id/flat/57828C31.5060409%40gmail.com
> Since the systemd people are generally impervious to suggestions that
> they might be mistaken, I do not expect this problem to go away.
>
> I think we should give serious consideration to back-patching commit
> ecb0d20a9, which changed the default semaphore type to unnamed-POSIX
> on Linux.  We've seen no problems in the buildfarm in the two months
> that that's been in HEAD.  If we don't do this, we can expect to
> continue seeing complaints of this sort until pre-v10 PG releases
> fall out of use ... and I don't want to wait that long.
>
> Commit ecb0d20a9 also changed the default for FreeBSD.  I'm not convinced
> we should back-patch that part, because (a) unnamed-POSIX semas have
> only been there since FreeBSD 9.0, which isn't that long ago, and (b)
> the argument for switching is "it'll perform better" not "your server
> will fail randomly without this change".
>
> Comments?

Urk.  That sounds like a scary thing to back-patch.  The fact that the
buildfarm has reported no problems is good as far as it goes, but user
environments can be expected to be considerably more diverse than the
buildfarm.  I wouldn't mind giving users the option to select unnamed
POSIX semas, but I don't think there's any guarantee that that's 100%
certain to work every place where the current implementation works -
and if not, then people will upgrade to the latest minor release and
everything will completely stop working.  Granted, that might not
happen, because maybe unnamed POSIX semas are one of those really
awesome operating system primitives that never has problems on any
system anywhere ever.  But I think it's pretty hard to be certain of
that.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Declarative partitioning - another take

2016-12-06 Thread Robert Haas
On Wed, Nov 30, 2016 at 10:56 AM, Amit Langote  wrote:
> The latest patch I posted earlier today has this implementation.

I decided to try out these patches today with #define
CLOBBER_CACHE_ALWAYS 1 in pg_config_manual.h, which found a couple of
problems:

1. RelationClearRelation() wasn't preserving the rd_partkey, even
though there's plenty of code that relies on it not changing while we
hold a lock on the relation - in particular, transformPartitionBound.

2. partition_bounds_equal() was using the comparator and collation for
partitioning column 0 to compare the datums for all partitioning
columns.  It's amazing this passed the regression tests.

The attached incremental patch fixes those things and some cosmetic
issues I found along the way.

3. RelationGetPartitionDispatchInfo() is badly broken:

1010 pd[i] = (PartitionDispatch) palloc(sizeof(PartitionDispatchData));
1011 pd[i]->relid = RelationGetRelid(partrel);
1012 pd[i]->key = RelationGetPartitionKey(partrel);
1013 pd[i]->keystate = NIL;
1014 pd[i]->partdesc = partdesc;
1015 pd[i]->indexes = (int *) palloc(partdesc->nparts * sizeof(int));
1016 heap_close(partrel, NoLock);
1017
1018 m = 0;
1019 for (j = 0; j < partdesc->nparts; j++)
1020 {
1021 Oid partrelid = partdesc->oids[j];

This code imagines that pointers it extracted from partrel are certain
to remain valid after heap_close(partrel, NoLock), perhaps on the
strength of the fact that we still retain a lock on the relation.  But
this isn't the case.  As soon as nobody has the relation open, a call
to RelationClearRelation() will destroy the relcache entry and
everything to which it points; with CLOBBER_CACHE_ALWAYS, I see a
failure at line 1021:

#0  RelationGetPartitionDispatchInfo (rel=0x1136dddf8, lockmode=3,
leaf_part_oids=0x7fff5633b938) at partition.c:1021
1021Oidpartrelid = partdesc->oids[j];
(gdb) bt 5
#0  RelationGetPartitionDispatchInfo (rel=0x1136dddf8, lockmode=3,
leaf_part_oids=0x7fff5633b938) at partition.c:1021
#1  0x000109b8d71f in ExecInitModifyTable (node=0x7fd12984d750,
estate=0x7fd12b885438, eflags=0) at nodeModifyTable.c:1730
#2  0x000109b5e7ac in ExecInitNode (node=0x7fd12984d750,
estate=0x7fd12b885438, eflags=0) at execProcnode.c:159
#3  0x000109b58548 in InitPlan (queryDesc=0x7fd12b87b638,
eflags=0) at execMain.c:961
#4  0x000109b57dcd in standard_ExecutorStart
(queryDesc=0x7fd12b87b638, eflags=0) at execMain.c:239
(More stack frames follow...)
Current language:  auto; currently minimal
(gdb) p debug_query_string
$1 = 0x7fd12b84c238 "insert into list_parted values (null, 1);"
(gdb) p partdesc[0]
$2 = {
  nparts = 2139062143,
  oids = 0x7f7f7f7f7f7f7f7f,
  boundinfo = 0x7f7f7f7f7f7f7f7f
}

As you can see, the partdesc is no longer valid here.  I'm not
immediately sure how to fix this; this isn't a simple thinko.  You
need to keep the relations open for the whole duration of the query,
not just long enough to build the dispatch info.  I think you should
try to revise this so that each relation is opened once and kept open;
maybe the first loop should be making a pointer-list of Relations
rather than an int-list of relation OIDs.  And it seems to me (though
I'm getting a little fuzzy here because it's late) that you need all
of the partitions open, not just the ones that are subpartitioned,
because how else are you going to know how to remap the tuple if the
column order is different?  But I don't see this code doing that,
which makes me wonder if the partitions are being opened yet again in
some other location.
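
A rough sketch of that shape (my illustration, not code from the patch;
partition_oids and lockmode are stand-in names):

    /* Collect opened Relations rather than bare OIDs, so the relcache
     * entries stay valid for the life of the query. */
    List       *partrels = NIL;
    ListCell   *lc;

    foreach(lc, partition_oids)
    {
        Relation    partrel = heap_open(lfirst_oid(lc), lockmode);

        partrels = lappend(partrels, partrel);
    }
    /* ... read rd_partkey and the partition descriptor while each rel
     * is held open, and heap_close() them only at executor shutdown ... */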

I recommend that once you fix this, you run 'make check' with #define
CLOBBER_CACHE_ALWAYS 1 and look for other hazards.  Such mistakes are
easy to make with this kind of patch.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
diff --git a/src/backend/catalog/partition.c b/src/backend/catalog/partition.c
index 83dc151..cc9009d 100644
--- a/src/backend/catalog/partition.c
+++ b/src/backend/catalog/partition.c
@@ -53,16 +53,16 @@
  * Information about bounds of a partitioned relation
  *
  * A list partition datum that is known to be NULL is never put into the
- * datums array, instead it is tracked using has_null and null_index fields.
+ * datums array. Instead, it is tracked using has_null and null_index fields.
  *
- * In case of range partitioning, ndatums is far less than 2 * nparts, because
- * a partition's upper bound and the next partition's lower bound are same
- * in most common cases, and we only store one of them.
+ * In the case of range partitioning, ndatums will typically be far less than
+ * 2 * nparts, because a partition's upper bound and the next partition's lower
+ * bound are the same in most common cases, and we only store one of them.
  *
- * In case of list partitioning, the indexes array stores one entry for every
- * datum, which is the index of the partition th

Re: [HACKERS] Quorum commit for multiple synchronous replication.

2016-12-06 Thread Michael Paquier
On Wed, Dec 7, 2016 at 12:32 PM, Fujii Masao  wrote:
> So, isn't it better to compare the performance of some algorithms and
> confirm which is the best for quorum commit? Since this code is hot, i.e.,
> can be very frequently executed, I'd like to avoid wasted cycles as much
> as possible.

It seems to me that it would be simple enough to write a script to do
that and avoid any other noise: allocate an array with N random
elements, and fetch the M-th element from it after applying a sort
method. I highly doubt that you'd see much difference with a low
number of elements; now if you scale to a thousand standbys in a
quorum set you may surely see something :*)
Anybody willing to try it out?
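
Something like this could serve as the harness (my sketch, standalone C,
not backend code; swap in other selection methods next to the qsort to
compare):

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* sort descending, so a[M - 1] is the M-th largest */
static int
cmp_desc(const void *a, const void *b)
{
    uint64_t x = *(const uint64_t *) a;
    uint64_t y = *(const uint64_t *) b;

    return (x < y) - (x > y);
}

int
main(void)
{
    const int N = 1000;                 /* standbys in the quorum set */
    const int M = 3;                    /* we want the M-th newest LSN */
    const int loops = 100000;
    uint64_t *a = malloc(N * sizeof(uint64_t));
    clock_t   start = clock();

    srand(42);
    for (int l = 0; l < loops; l++)
    {
        for (int i = 0; i < N; i++)
            a[i] = ((uint64_t) rand() << 32) | (unsigned) rand();
        qsort(a, N, sizeof(uint64_t), cmp_desc);
        (void) a[M - 1];
    }
    printf("qsort: %.3f s\n", (double) (clock() - start) / CLOCKS_PER_SEC);
    free(a);
    return 0;
}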
-- 
Michael




Re: [HACKERS] Quorum commit for multiple synchronous replication.

2016-12-06 Thread Fujii Masao
On Tue, Dec 6, 2016 at 6:57 PM, Masahiko Sawada  wrote:
> On Tue, Dec 6, 2016 at 1:11 PM, Fujii Masao  wrote:
>> On Mon, Nov 28, 2016 at 8:03 PM, Masahiko Sawada  
>> wrote:
>>> On Sat, Nov 26, 2016 at 10:27 PM, Michael Paquier
>>>  wrote:
 On Tue, Nov 15, 2016 at 7:08 PM, Masahiko Sawada  
 wrote:
> Attached latest version patch incorporated review comments. After more
> thought, I agree and changed the value of standby priority in quorum
> method so that it's not set 1 forcibly. The all standby priorities are
> 1 If s_s_names = 'ANY(*)'.
> Please review this patch.

 Sorry for my late reply. Here is my final lookup.
>>>
>>> Thank you for reviewing!
>>>
  
-num_sync ( standby_name [, ...] )
+[ANY] num_sync ( standby_name [, ...] )
+FIRST num_sync ( standby_name [, ...] )
 standby_name [, ...]
 This can just be replaced with [ ANY | FIRST ]. There is no need for
 braces as the keyword is not mandatory.

 +is the name of a standby server.
 +FIRST and ANY specify the method used by
 +the master to control the standby servres.
  
 s/servres/servers/.

if (priority == 0)
values[7] = CStringGetTextDatum("async");
else if (list_member_int(sync_standbys, i))
-   values[7] = CStringGetTextDatum("sync");
+   values[7] = SyncRepConfig->sync_method == SYNC_REP_PRIORITY ?
+   CStringGetTextDatum("sync") : CStringGetTextDatum("quorum");
else
values[7] = CStringGetTextDatum("potential");
This can be simplified a bit as "quorum" is the state value for all
standbys with a non-zero priority when the method is set to
SYNC_REP_QUORUM:
if (priority == 0)
values[7] = CStringGetTextDatum("async");
+   else if (SyncRepConfig->sync_method == SYNC_REP_QUORUM)
+   values[7] = CStringGetTextDatum("quorum");
else if (list_member_int(sync_standbys, i))
values[7] = CStringGetTextDatum("sync");
else
SyncRepConfig data is made external to syncrep.c with this patch, as
walsender.c needs to look at the sync method in place; no complaint
about that after considering whether there could be a more elegant way
to do things without this change.
>>>
>>> Agreed.
>>>
 While reviewing the patch, I have found a couple of incorrectly shaped
 sentences, both in the docs and some comments. Attached is a new
 version with this word-smithing. The patch is now switched as ready
 for committer.
>>>
>>> Thanks. I found a typo in v7 patch, so attached latest v8 patch.
>>
>> +qsort(write_array, len, sizeof(XLogRecPtr), cmp_lsn);
>> +qsort(flush_array, len, sizeof(XLogRecPtr), cmp_lsn);
>> +qsort(apply_array, len, sizeof(XLogRecPtr), cmp_lsn);
>>
>> In quorum commit, we need to calculate the N-th largest LSN from
>> M quorum synchronous standbys' LSN. N would be usually 1 - 3 and
>> M would be 1 - 10, I guess. You used the algorithm using qsort for
>> that calculation. But I'm not sure if that's enough effective algorithm
>> or not.
>>
>> If M (i.e., number of quorum sync standbys) is enough large,
>> your choice would be good. But usually M seems not so large.
>>
>
> Thank you for the comment.
>
> Another possible idea is to use partial selection sort [1], which
> takes O(MN) time. Since this is more efficient when N is small, this
> would be better than qsort for this case. But I'm not sure that we
> would see such a difference in a performance measurement.

So, isn't it better to compare the performance of some algorithms and
confirm which is the best for quorum commit? Since this code is hot, i.e.,
can be very frequently executed, I'd like to avoid wasted cycles as much
as possible.

Regards,

-- 
Fujii Masao




Re: [HACKERS] Push down more full joins in postgres_fdw

2016-12-06 Thread Etsuro Fujita

On 2016/12/07 5:27, Robert Haas wrote:

> On Mon, Dec 5, 2016 at 6:20 AM, Ashutosh Bapat
>  wrote:
>> 4. I am still not happy with this change
>> +/*
>> + * Since (1) the expressions in foreignrel's reltarget doesn't contain
>> + * any PHVs and (2) foreignrel's local_conds is empty, the tlist
>> + * created by build_tlist_to_deparse must be one-to-one with the
>> + * expressions.
>> + */
>> +Assert(list_length(tlist) ==
>> list_length(foreignrel->reltarget->exprs));
>> the assertion only checks that the number of elements in both the lists are
>> same but does not check whether those lists are same i.e. they contain the same
>> elements in the same order. This equality is crucial to deparsing logic. If
>> somehow build_tlist_to_deparse() breaks that assumption in future, we have no
>> way to detect it, unless a regression test fails.
>
> If there's an easy way to do a more exact comparison, great.


Ashutosh proposed this to do the comparison:

On 2016/11/22 18:28, Ashutosh Bapat wrote:

> I guess, the reason why you are doing it this way, is SELECT clause for the
> outermost query gets deparsed before FROM clause. For later we call
> deparseRangeTblRef(), which builds the tlist. So, while deparsing SELECT
> clause, we do not have tlist to build from. In that case, I guess, we have to
> build the tlist in get_subselect_alias_id() if it's not available and stick it
> in fpinfo. Subsequent calls to get_subselect_alias_id() should find tlist
> there. Then in deparseRangeTblRef() assert that there's a tlist in fpinfo
> and use it to build the SELECT clause of subquery. That way, we don't build
> tlist unless it's needed and also use the same tlist for all searches. Please
> use tlist_member() to search into the tlist.


This would probably work, but seems to me a bit complicated.  Instead, 
I'd like to propose that we build the tlist for each relation being 
deparsed as a subquery in a given join tree, right before deparsing the 
SELECT clause in deparseSelectStmtForRel, if is_subquery is false and 
lower_subquery_rels isn't NULL, and store the tlist into the relation's 
fpinfo.  That would allow us to build the tlist only when we need it, 
and to use tlist_member for the exact comparison.  I think it would be 
much easier to implement that.


Best regards,
Etsuro Fujita






[HACKERS] Back-patch use of unnamed POSIX semaphores for Linux?

2016-12-06 Thread Tom Lane
Just saw another report of what's probably systemd killing off Postgres'
SysV semaphores, as we've discussed previously at, eg,
https://www.postgresql.org/message-id/flat/57828C31.5060409%40gmail.com
Since the systemd people are generally impervious to suggestions that
they might be mistaken, I do not expect this problem to go away.

I think we should give serious consideration to back-patching commit
ecb0d20a9, which changed the default semaphore type to unnamed-POSIX
on Linux.  We've seen no problems in the buildfarm in the two months
that that's been in HEAD.  If we don't do this, we can expect to
continue seeing complaints of this sort until pre-v10 PG releases
fall out of use ... and I don't want to wait that long.

Commit ecb0d20a9 also changed the default for FreeBSD.  I'm not convinced
we should back-patch that part, because (a) unnamed-POSIX semas have
only been there since FreeBSD 9.0, which isn't that long ago, and (b)
the argument for switching is "it'll perform better" not "your server
will fail randomly without this change".

Comments?

regards, tom lane




[HACKERS] Partitionning: support for Truncate Table WHERE

2016-12-06 Thread legrand legrand
Hello,


Working in a DSS environment, we often need to truncate table partitions
matching a WHERE condition, and have to:

- query the dictionary to identify those partitions,

- build SQL statements,

- truncate all partitions covered by the WHERE condition,

- eventually delete the rest ...

- perform additional maintenance tasks.


Wouldn't it be possible to support this directly in SQL as a TRUNCATE TABLE
... WHERE syntax?
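
Purely to illustrate the idea (this syntax does not exist; table and
column names are made up):

TRUNCATE TABLE measurement WHERE logdate < DATE '2016-01-01';
-- i.e. truncate every partition fully covered by the predicate, and
-- possibly DELETE from partitions only partially covered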

I have done something similar using a procedure in Oracle:

- identifying partitions using Oracle rowids, by a SELECT using the same WHERE
conditions,

- for each partition, checking whether it holds data NOT matching the WHERE
condition,

- then DELETE ... WHERE or TRUNCATE PARTITION depending on the data
distribution.



Maybe there are some other constraints, like locking, FK disabling/enabling,
index rebuilds ...


Would be pleased to hear your feedback regarding this.


Regards

PAscal


Re: [HACKERS] Separate connection handling from backends

2016-12-06 Thread Tom Lane
Greg Stark  writes:
> On 5 December 2016 at 19:48, Jim Nasby  wrote:
>> One solution to this would be to segregate connection handling from actual
>> backends, somewhere along the lines of separating the main loop from the
>> switch() that handles libpq commands. Benefits:

> I'm kind of mystified how a simple code restructuring could solve the
> fundamental problems with a large number of backends. It sounds like
> what you're describing would just push the problem around, you would
> end up with some other maximum instead, max_backends, or
> max_active_backends, or something like that with the same problems.

What it sounds like to me is building a connection pooler into the
backend.  I'm not really convinced we ought to go there.

> Heikki's work with CSN would actually address the main fundamental
> problem. Instead of having to scan PGPROC when taking a snapshot
> taking a snapshot would be O(1).

While that would certainly improve matters, I suspect there are still
going to be bottlenecks arising from too many backends.

regards, tom lane




Re: [HACKERS] [GENERAL] Select works only when connected from login postgres

2016-12-06 Thread Tom Lane
Joseph Brenner  writes:
> Well, my take would be that if you've taken the trouble to set an
> empty string as the PAGER that means something, and it probably means
> you don't want any pager to be used.

Yeah, on reflection that argument seems pretty persuasive.  So I propose
the attached patch.

BTW, I realized while testing this that there's still one gap in our
understanding of what went wrong for you: cases like "SELECT 'hello'"
should not have tried to use the pager, because that would've produced
less than a screenful of data.  But that's irrelevant here, because it
can easily be shown that psql doesn't behave nicely if PAGER is set to
empty when it does try to use the pager.

regards, tom lane

diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index 261652a..9915731 100644
*** a/doc/src/sgml/ref/psql-ref.sgml
--- b/doc/src/sgml/ref/psql-ref.sgml
*** $endif
*** 3801,3808 
If the query results do not fit on the screen, they are piped
through this command.  Typical values are
more or less.  The default
!   is platform-dependent.  The use of the pager can be disabled by
!   using the \pset command.
   
  
 
--- 3801,3809 
If the query results do not fit on the screen, they are piped
through this command.  Typical values are
more or less.  The default
!   is platform-dependent.  Use of the pager can be disabled by setting
!   PAGER to empty, or by using pager-related options of
!   the \pset command.
   
  
 
diff --git a/src/fe_utils/print.c b/src/fe_utils/print.c
index 1ec74f1..5c5d285 100644
*** a/src/fe_utils/print.c
--- b/src/fe_utils/print.c
*** PageOutput(int lines, const printTableOp
*** 2874,2879 
--- 2874,2885 
  			pagerprog = getenv("PAGER");
  			if (!pagerprog)
  pagerprog = DEFAULT_PAGER;
+ 			else
+ 			{
+ /* if PAGER is empty or all-white-space, don't use pager */
+ if (strspn(pagerprog, " \t\r\n") == strlen(pagerprog))
+ 	return stdout;
+ 			}
  			disable_sigpipe_trap();
  			pagerpipe = popen(pagerprog, "w");
  			if (pagerpipe)
diff --git a/src/interfaces/libpq/fe-print.c b/src/interfaces/libpq/fe-print.c
index c33dc42..e596a51 100644
*** a/src/interfaces/libpq/fe-print.c
--- b/src/interfaces/libpq/fe-print.c
*** PQprint(FILE *fout, const PGresult *res,
*** 166,173 
  			screen_size.ws_col = 80;
  #endif
  			pagerenv = getenv("PAGER");
  			if (pagerenv != NULL &&
! pagerenv[0] != '\0' &&
  !po->html3 &&
  ((po->expanded &&
    nTups * (nFields + 1) >= screen_size.ws_row) ||
--- 166,174 
  			screen_size.ws_col = 80;
  #endif
  			pagerenv = getenv("PAGER");
+ 			/* if PAGER is unset, empty or all-white-space, don't use pager */
  			if (pagerenv != NULL &&
! strspn(pagerenv, " \t\r\n") != strlen(pagerenv) &&
  !po->html3 &&
  ((po->expanded &&
    nTups * (nFields + 1) >= screen_size.ws_row) ||



Re: [HACKERS] Separate connection handling from backends

2016-12-06 Thread Greg Stark
On 5 December 2016 at 19:48, Jim Nasby  wrote:
> One solution to this would be to segregate connection handling from actual
> backends, somewhere along the lines of separating the main loop from the
> switch() that handles libpq commands. Benefits:

I'm kind of mystified how a simple code restructuring could solve the
fundamental problems with a large number of backends. It sounds like
what you're describing would just push the problem around, you would
end up with some other maximum instead, max_backends, or
max_active_backends, or something like that with the same problems.
At best it would help people who have connection pooling, or have but few
connections active at any given time.

Heikki's work with CSN would actually address the main fundamental
problem. Instead of having to scan PGPROC when taking a snapshot
taking a snapshot would be O(1). There might need to be scans of the
list of active transactions but never of all connections whether
they're in a transaction or not.

-- 
greg




Re: [HACKERS] WIP: Faster Expression Processing and Tuple Deforming (including JIT)

2016-12-06 Thread Andres Freund
On 2016-12-06 13:27:14 -0800, Peter Geoghegan wrote:
> On Mon, Dec 5, 2016 at 7:49 PM, Andres Freund  wrote:
> > I tried to address 2) by changing the C implementation. That brings some
> > measurable speedups, but it's not huge. A bigger speedup is making
> > slot_getattr, slot_getsomeattrs, slot_getallattrs very trivial wrappers;
> > but it's still not huge.  Finally I turned to just-in-time (JIT)
> > compiling the code for tuple deforming. That doesn't save the cost of
> > 1), but it gets rid of most of 2) (from ~15% to ~3% in TPCH-Q01).  The
> > first part is done in 0008, the JITing in 0012.
>
> A more complete motivating example would be nice. For example, it
> would be nice to see the overall speedup for some particular TPC-H
> query.

Well, it's a bit WIP-y for that - not all TPCH queries run JITed yet, as
I've not done that for enough expression types... And you run quickly
into other bottlenecks.

But here we go for TPCH (scale 10) Q01:
master:
Time: 33885.381 ms
  16.29%  postgres  postgres  [.] slot_getattr
  12.85%  postgres  postgres  [.] ExecMakeFunctionResultNoSets
  10.85%  postgres  postgres  [.] advance_aggregates
   6.91%  postgres  postgres  [.] slot_deform_tuple
   6.70%  postgres  postgres  [.] advance_transition_function
   4.59%  postgres  postgres  [.] ExecProject
   4.25%  postgres  postgres  [.] float8_accum
   3.69%  postgres  postgres  [.] tuplehash_insert
   2.39%  postgres  postgres  [.] float8pl
   2.20%  postgres  postgres  [.] bpchareq
   2.03%  postgres  postgres  [.] check_stack_depth

profile:

(note that all expression-evaluation work is distributed among many
functions)

dev (no jiting):
Time: 30343.532 ms

profile:
  16.57%  postgres  postgres  [.] slot_deform_tuple
  13.39%  postgres  postgres  [.] ExecEvalExpr
   8.64%  postgres  postgres  [.] advance_aggregates
   8.58%  postgres  postgres  [.] advance_transition_function
   5.83%  postgres  postgres  [.] float8_accum
   5.14%  postgres  postgres  [.] tuplehash_insert
   3.89%  postgres  postgres  [.] float8pl
   3.60%  postgres  postgres  [.] slot_getattr
   2.66%  postgres  postgres  [.] bpchareq
   2.56%  postgres  postgres  [.] heap_getnext

dev (jiting):
SET jit_tuple_deforming = on;
SET jit_expressions = true;

Time: 24439.803 ms

profile:
  11.11%  postgres  postgres [.] slot_deform_tuple
  10.87%  postgres  postgres [.] advance_aggregates
   9.74%  postgres  postgres [.] advance_transition_function
   6.53%  postgres  postgres [.] float8_accum
   5.25%  postgres  postgres [.] tuplehash_insert
   4.31%  postgres  perf-10698.map   [.] deform0
   3.68%  postgres  perf-10698.map   [.] evalexpr6
   3.53%  postgres  postgres [.] slot_getattr
   3.41%  postgres  postgres [.] float8pl
   2.84%  postgres  postgres [.] bpchareq

(note how expression eval went from 13.39% to roughly 4%)

The slot_deform_cost here is primarily cache misses. If you do the
"memory order" iteration, it drops significantly.

The JIT generated code still leaves a lot on the table, i.e. this is
definitely not the best we can do.  We also deform half the tuple twice,
because I've not yet added support for starting to deform in the middle
of a tuple.

Independent of new expression evaluation and/or JITing, if you make
advance_aggregates and advance_transition_function inline functions (or
you do profiling accounting for children), you'll notice that ExecAgg()
+ advance_aggregates + advance_transition_function themselves take up
about 20% cpu-time.  That's *not* including the hashtable management,
the actual transition functions, and such themselves.


If you have queries where tuple deforming is a bigger proportion of the
load, or where expression evalution (including projection) is a larger
part (any NULLs e.g.) you can get a lot bigger wins, even without
actually optimizing the generated code (which I've not yet done).

Just btw: float8_accum really should use an internal aggregation type
instead of using a postgres array...
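
In toy form, such a state could look like this (my sketch; the struct
and function names are made up, not existing code):

#include <stdint.h>

/*
 * Hypothetical sketch: a fixed-layout transition state for float8_accum,
 * replacing the float8[3] array (N, sum(X), sum(X*X)) it currently
 * carries around as a full-blown postgres array value.
 */
typedef struct Float8AccumState
{
	int64_t		N;			/* number of input rows seen */
	double		sumX;		/* running sum of inputs */
	double		sumX2;		/* running sum of squares of inputs */
} Float8AccumState;

/* transition step: fold one input value into the state */
static inline void
float8_accum_step(Float8AccumState *state, double newval)
{
	state->N += 1;
	state->sumX += newval;
	state->sumX2 += newval * newval;
}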


Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch to implement pg_current_logfile() function

2016-12-06 Thread Gilles Darold
Le 02/12/2016 à 02:08, Karl O. Pinc a écrit :
> On Sun, 27 Nov 2016 21:54:46 +0100
> Gilles Darold  wrote:
>
>> I've attached the v15 of the patch 
>> I've not applied patch patch_pg_current_logfile-v14.diff.backoff to
>> prevent constant call of logfile_writename() on a busy system (errno =
>> ENFILE | EMFILE).
> I don't think it should be applied and included in the basic
> functionality patch in any case. I think it needs to be submitted as a
> separate patch along with the basic functionality patch.  Backing off
> the retry of the current_logfiles write could be overly fancy and
> simply not needed.
>
>> I think this can be done quite simply by testing if
>> log rotate is still enabled. This is possible because function
>> logfile_rotate() is already testing if  errno = ENFILE | EMFILE and in
>> this case rotation_disabled is set to true. So the following test
>> should do the work:
>>
>>if (log_metainfo_stale && !rotation_disabled)
>>logfile_writename();
>>
>> This is included in v15 patch.
> I don't see this helping much, if at all.
>
> First, it's not clear just when rotation_disabled can be false
> when log_metainfo_stale is true.  The typical execution
> path is for logfile_writename() to be called after rotate_logfile()
> has already set rotation_disabled to true.  logfile_writename()
> is the only place setting log_metainfo_stale to true and
> rotate_logfile() the only place setting rotation_disabled to false.
>
> While it's possible
> that log_metainfo_stale might be set to true when logfile_writename()
> is called from within open_csvlogfile(), this does not help with the
> stderr case.  IMO better to just test for log_metainfo_stale at the
> code snippet above.
>
> Second, the whole point of retrying the logfile_writename() call is
> to be sure that the current_logfiles file is updated before the logs
> rotate.  Waiting until logfile rotation is enabled defeats the purpose.

Ok, sorry, I misunderstood your previous post. The attached v16 patch
removes your change about log_metainfo_stale and fixes the use of
sscanf to read the file.
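
For reference, a minimal sketch of reading the file line by line
(assuming one "format path" pair per line; illustrative code on my
part, not the code from the patch itself):

#include <stdio.h>
#include <string.h>

/*
 * Illustrative sketch only, assuming one "format path" pair per line
 * (e.g. "stderr /var/log/postgresql.log"); not the patch's actual code.
 */
static void
read_current_logfiles(const char *metafile)
{
	FILE	   *fd = fopen(metafile, "r");
	char		line[1200];

	if (fd == NULL)
		return;

	while (fgets(line, sizeof(line), fd) != NULL)
	{
		char	   *path;

		line[strcspn(line, "\n")] = '\0';	/* strip trailing newline */
		path = strchr(line, ' ');
		if (path == NULL)
			continue;			/* malformed line, skip it */
		*path++ = '\0';			/* line now holds the format, path the file */
		printf("format=%s path=%s\n", line, path);
	}
	fclose(fd);
}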

It seems that all fixes have been included in this patch.

Regards

-- 
Gilles Darold
Consultant PostgreSQL
http://dalibo.com - http://dalibo.org

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 0fc4e57..41144cb 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4169,6 +4169,12 @@ SELECT * FROM parent WHERE key = 2400;
  server log
 
 
+When logs are written to the file system, their paths, names, and
+types are recorded in
+the  file.  This provides
+a convenient way to find and access log content without establishing a
+database connection.
+
 
  Where To Log
 
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index eca98df..47ca846 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -15444,6 +15444,19 @@ SELECT * FROM pg_ls_dir('.') WITH ORDINALITY AS t(ls,n);
   
 
   
+   pg_current_logfile()
+   text
+   primary log file name in use by the logging collector
+  
+
+  
+   pg_current_logfile(text)
+   text
+   log file name, of log in the requested format, in use by the
+   logging collector
+  
+
+  
pg_my_temp_schema()
oid
OID of session's temporary schema, or 0 if none
@@ -15661,6 +15674,39 @@ SET search_path TO schema , schema, ..
 the time when the postmaster process re-read the configuration files.)

 
+
+pg_current_logfile
+   
+
+   
+Logging
+pg_current_logfile function
+   
+
+   
+pg_current_logfile returns, as text,
+the path of either the csv or stderr log file currently in use by the
+logging collector.  This is a path including the
+ directory and the log file name.
+Log collection must be active or the return value
+is NULL.  When multiple logfiles exist, each in a
+different format, pg_current_logfile called
+without arguments returns the path of the file having the first format
+found in the ordered
+list: stderr, csvlog.
+NULL is returned when no log file has any of these
+formats.  To request a specific file format, supply,
+as text, either csvlog
+or stderr as the value of the optional parameter. The
+return value is NULL when the log format requested
+is not a configured .
+
+   pg_current_logfile reflects the content of the
+ file.  All caveats
+regarding current_logfiles content are applicable
+to pg_current_logfile's return value.
+   
+

 pg_my_temp_schema

diff --git a/doc/src/sgml/storage.sgml b/doc/src/sgml/storage.sgml
index 5c52824..928133c 100644
--- a/doc/src/sgml/storage.sgml
+++ b/doc/src/sgml/storage.sgml
@@ -60,6 +60,79 @@ Item
  Subdirectory containing per-database subdirectories
 
 
+
+ 
+  
+   current_logfiles
+  
+  
+   Logging
+   current_logfiles file
+  
+  current_logfiles
+ 
+ 
+  

Re: [HACKERS] Compiler warnings

2016-12-06 Thread Robert Haas
On Tue, Dec 6, 2016 at 3:46 PM, Stephen Frost  wrote:
> Good thought, thanks!
>
> Updated patch attached with that change and I also added an Assert() to
> GetCachedPlan(), in case that code gets whacked around later and somehow
> we end up falling through without actually setting *plan.
>
> Thoughts?

wfm

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Separate connection handling from backends

2016-12-06 Thread Adam Brusselback
>
> BTW, it just occurred to me that having this separation would make it
> relatively easy to support re-directing DML queries from a replica to the
> master; if the backend throws the error indicating you tried to write data,
> the connection layer could re-route that.


This also sounds like it would potentially allow re-routing the other
way: when you know the replica contains up-to-date data, couldn't you
re-direct read-only queries to your replicas?


Re: [HACKERS] WIP: Faster Expression Processing and Tuple Deforming (including JIT)

2016-12-06 Thread Peter Geoghegan
On Mon, Dec 5, 2016 at 7:49 PM, Andres Freund  wrote:
> I tried to address 2) by changing the C implementation. That brings some
> measurable speedups, but it's not huge. A bigger speedup is making
> slot_getattr, slot_getsomeattrs, slot_getallattrs very trivial wrappers;
> but it's still not huge.  Finally I turned to just-in-time (JIT)
> compiling the code for tuple deforming. That doesn't save the cost of
> 1), but it gets rid of most of 2) (from ~15% to ~3% in TPCH-Q01).  The
> first part is done in 0008, the JITing in 0012.

A more complete motivating example would be nice. For example, it
would be nice to see the overall speedup for some particular TPC-H
query.


-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pgcrypto compilation error due to stack-allocated EVP_CIPHER_CTX

2016-12-06 Thread Michael Paquier
On Tue, Dec 6, 2016 at 11:42 PM, Asif Naeem  wrote:
> Thanks for updated patch. Although EVP_CIPHER_CTX_cleanup() seems deprecated
> in OpenSSL >= 1.1.0 i.e.
>
>> # if OPENSSL_API_COMPAT < 0x10100000L
>> #  define EVP_CIPHER_CTX_init(c)  EVP_CIPHER_CTX_reset(c)
>> #  define EVP_CIPHER_CTX_cleanup(c)   EVP_CIPHER_CTX_reset(c)
>> # endif
>
>
> I guess use of the deprecated function is fine as long as the OpenSSL
> library supports it.

We could use some ifdef block with the OpenSSL version number, but I
am not sure if that's worth complicating the code at this stage.
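
If we did add one, the block might look roughly like this (a sketch;
cipher_ctx_release() is a made-up wrapper name, not pgcrypto code):

#include <openssl/evp.h>
#include <openssl/opensslv.h>

/* Sketch of a version-dependent wrapper; cipher_ctx_release() is a
 * hypothetical name for illustration only. */
static void
cipher_ctx_release(EVP_CIPHER_CTX *ctx)
{
#if OPENSSL_VERSION_NUMBER >= 0x10100000L
	EVP_CIPHER_CTX_reset(ctx);		/* 1.1.0 and later */
#else
	EVP_CIPHER_CTX_cleanup(ctx);	/* older releases */
#endif
}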
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP: Faster Expression Processing and Tuple Deforming (including JIT)

2016-12-06 Thread Nico Williams
On Tue, Dec 06, 2016 at 12:36:41PM -0800, Andres Freund wrote:
> On 2016-12-06 15:25:44 -0500, Tom Lane wrote:
> > I'm not entirely thrilled with the idea of this being a configure-time
> > decision, because that forces packagers to decide for their entire
> > audience whether it's okay to depend on LLVM.  That would be an untenable
> > position to put e.g. Red Hat's packagers in: either they screw the people
> > who want performance or they screw the people who want security.

There's no security issue.  The dependency is on LLVM libraries, not
LLVM front-ends (e.g., clang(1)).

I don't think there's a real issue as to distros/packagers/OS vendors.
They already have to package LLVM, and they already package LLVM
libraries separately from LLVM front-ends.

> The argument for not installing a C compiler seems to be that it makes it
> less convenient to build an executable. I doubt that having a C(++)
> library for code generation is convenient enough to change the picture
> there.

The security argument goes back to the days of the Morris worm, which
depended on having developer tools (specifically in that case, ld(1),
the link-editor).  But JIT via LLVM won't give hackers a way to generate
or link arbitrary object code.

Nico
-- 


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Compiler warnings

2016-12-06 Thread Stephen Frost
Robert,

* Robert Haas (robertmh...@gmail.com) wrote:
> On Tue, Dec 6, 2016 at 3:23 PM, Stephen Frost  wrote:
> > Given the lack of screaming, I'll push the attached in a bit, which just
> > initializes the two variables being complained about.  As mentioned,
> > there doesn't appear to be any live bugs here, this is just to silence
> > the compiler warnings.
> 
> In LWLockRelease, why not just move mode = held_lwlocks[i].mode; after
> the if (i < 0) elog(...), instead of initializing with a bogus value?

Good thought, thanks!

Updated patch attached with that change and I also added an Assert() to
GetCachedPlan(), in case that code gets whacked around later and somehow
we end up falling through without actually setting *plan.

Thoughts?

Thanks!

Stephen
From 55ecc3367bd063f05c89b4b1ca881d2b084859f6 Mon Sep 17 00:00:00 2001
From: Stephen Frost 
Date: Tue, 6 Dec 2016 15:19:09 -0500
Subject: [PATCH] Silence compiler warnings

Rearrange a bit of code to ensure that 'mode' in LWLockRelease is
obviously always set, which seems a bit cleaner and avoids a compiler
warning (thanks to Robert for the suggestion!).

In GetCachedPlan(), initialize 'plan' to silence a compiler warning, but
also add an Assert() to make sure we don't ever actually fall through
with 'plan' still being set to NULL, since we are about to dereference
it.

Neither of these appear to be live bugs but at least gcc
5.4.0-6ubuntu1~16.04.4 doesn't quite have the smarts to realize that.

Discussion: https://www.postgresql.org/message-id/20161129152102.GR13284%40tamriel.snowman.net
---
 src/backend/storage/lmgr/lwlock.c   | 9 -
 src/backend/utils/cache/plancache.c | 4 +++-
 2 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/src/backend/storage/lmgr/lwlock.c b/src/backend/storage/lmgr/lwlock.c
index 9c6862f..ffb2f72 100644
--- a/src/backend/storage/lmgr/lwlock.c
+++ b/src/backend/storage/lmgr/lwlock.c
@@ -1780,15 +1780,14 @@ LWLockRelease(LWLock *lock)
 	 * be the latest-acquired lock; so search array backwards.
 	 */
 	for (i = num_held_lwlocks; --i >= 0;)
-	{
 		if (lock == held_lwlocks[i].lock)
-		{
-			mode = held_lwlocks[i].mode;
 			break;
-		}
-	}
+
 	if (i < 0)
 		elog(ERROR, "lock %s %d is not held", T_NAME(lock), T_ID(lock));
+
+	mode = held_lwlocks[i].mode;
+
 	num_held_lwlocks--;
 	for (; i < num_held_lwlocks; i++)
 		held_lwlocks[i] = held_lwlocks[i + 1];
diff --git a/src/backend/utils/cache/plancache.c b/src/backend/utils/cache/plancache.c
index 884cdab..aa146d6 100644
--- a/src/backend/utils/cache/plancache.c
+++ b/src/backend/utils/cache/plancache.c
@@ -1128,7 +1128,7 @@ CachedPlan *
 GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
 			  bool useResOwner)
 {
-	CachedPlan *plan;
+	CachedPlan *plan = NULL;
 	List	   *qlist;
 	bool		customplan;
 
@@ -1210,6 +1210,8 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
 		}
 	}
 
+	Assert(plan != NULL);
+
 	/* Flag the plan as in use by caller */
 	if (useResOwner)
 		ResourceOwnerEnlargePlanCacheRefs(CurrentResourceOwner);
-- 
2.7.4





Re: [HACKERS] WIP: Faster Expression Processing and Tuple Deforming (including JIT)

2016-12-06 Thread Andres Freund
On 2016-12-06 14:35:43 -0600, Nico Williams wrote:
> On Tue, Dec 06, 2016 at 12:27:51PM -0800, Andres Freund wrote:
> > On 2016-12-06 14:19:21 -0600, Nico Williams wrote:
> > > > I concur with your feeling that hand-rolled JIT is right out.  But
> > > 
> > > Yeah, that way lies maintenance madness.
> > 
> > I'm not quite that sure about that. I had a lot of fun doing some
> > hand-rolled x86 JITing. Not that that is a ward against me being mad.  But
> > more seriously: Manually doing a JIT gives you a lot faster compilation
> > times, which makes JIT applicable in a lot more situations.
> 
> What I meant is that each time there are new ISA extensions, or
> differences in how relevant/significant different implementations of the
> same ISA implement certain instructions, and/or every time you want to
> add a new architecture... someone has to do a lot of very low-level
> work.

Yea, that's why I didn't pursue this path further. I *personally* think
it'd be perfectly fine to only support JITing on linux x86_64 and
aarch64 for now. And those I'd be willing to work on. But since I know
that's not project policy...

- Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP: Faster Expression Processing and Tuple Deforming (including JIT)

2016-12-06 Thread Andres Freund
On 2016-12-06 15:25:44 -0500, Tom Lane wrote:
> Andres Freund  writes:
> > On 2016-12-06 13:56:28 -0500, Tom Lane wrote:
> >> I guess the $64 question that has to be addressed here is whether we're
> >> prepared to accept LLVM as a run-time dependency.  There are some reasons
> >> why we might not be:
>
> > Indeed.  It'd only be a soft dependency obviously.
>
> Oh, so we'd need to maintain both the LLVM and the traditional expression
> execution code?  That seems like a bit of a pain, but maybe we can live
> with it.

Yea, that's why I converted the "traditional" expression evaluation into
a different format first - that way the duplication is a lot
lower. E.g. scalar var eval looks like:

EEO_CASE(EEO_INNER_VAR):
{
	int			attnum = op->d.var.attnum;

	Assert(op->d.var.attnum >= 0);
	*op->resnull = innerslot->tts_isnull[attnum];
	*op->resvalue = innerslot->tts_values[attnum];
	EEO_DISPATCH(op);
}

in normal evaluation and like

case EEO_INNER_VAR:
{
	LLVMValueRef value, isnull;
	LLVMValueRef v_attnum;

	v_attnum = LLVMConstInt(LLVMInt32Type(), op->d.var.attnum, false);
	value = LLVMBuildLoad(builder,
		LLVMBuildGEP(builder, v_innervalues, &v_attnum, 1, ""), "");
	isnull = LLVMBuildLoad(builder,
		LLVMBuildGEP(builder, v_innernulls, &v_attnum, 1, ""), "");
	LLVMBuildStore(builder, value, v_resvaluep);
	LLVMBuildStore(builder, isnull, v_resnullp);

	LLVMBuildBr(builder, opblocks[i + 1]);
	break;
}
for JITed evaluation.
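
As an aside, EEO_CASE/EEO_DISPATCH above can be read as threaded
dispatch. A standalone toy model of that technique (my illustration,
using GCC's computed-goto extension rather than the actual macro
definitions):

#include <stdio.h>

/* Toy threaded-dispatch interpreter; everything here is illustrative,
 * not PostgreSQL code.  Requires GCC/Clang (labels as values). */
typedef enum { OP_CONST, OP_ADD, OP_DONE } OpCode;
typedef struct { OpCode op; int arg; } Instr;

static int
run(const Instr *pc)
{
	static const void *dispatch[] = { &&do_const, &&do_add, &&do_done };
	int			acc = 0;

	goto *dispatch[pc->op];

do_const:
	acc = pc->arg;
	pc++;
	goto *dispatch[pc->op];

do_add:
	acc += pc->arg;
	pc++;
	goto *dispatch[pc->op];

do_done:
	return acc;
}

int
main(void)
{
	Instr		prog[] = { {OP_CONST, 40}, {OP_ADD, 2}, {OP_DONE, 0} };

	printf("%d\n", run(prog));	/* prints 42 */
	return 0;
}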


> I'm not entirely thrilled with the idea of this being a configure-time
> decision, because that forces packagers to decide for their entire
> audience whether it's okay to depend on LLVM.  That would be an untenable
> position to put e.g. Red Hat's packagers in: either they screw the people
> who want performance or they screw the people who want security.

Hm. I've a bit of a hard time buying the security argument here. Having
LLVM (not clang!) installed doesn't really change the picture that
much. In either case you can install binaries, and you're very likely
already using some program that does JIT internally. And postgres itself
gives you plenty of ways to execute arbitrary code as superuser.

The argument for not installing a C compiler seems to be that it makes it
less convenient to build an executable. I doubt that having a C(++)
library for code generation is convenient enough to change the picture
there.

> I think it'd be all right if we can build this so that the direct
> dependency on LLVM is confined to a separately-packageable extension.
> That way, a packager can produce a core postgresql-server package
> that does not require LLVM, plus a postgresql-llvm package that does,
> and the "no compiler please" crowd simply doesn't install the latter
> package.

That should be possible, but I'm not sure it's worth the effort. The JIT
infrastructure will need resowner integration and such. We can obviously
split things so that part is independent of LLVM, but I'm unconvinced
that the benefit is large enough.


> The alternative would be to produce two independent builds of the
> server, which I suppose might be acceptable but it sure seems like
> a kluge, or at least something that simply wouldn't get done by
> most vendors.

Hm. We could make that a make target ourselves ;)

Regards,

Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP: Faster Expression Processing and Tuple Deforming (including JIT)

2016-12-06 Thread Nico Williams
On Tue, Dec 06, 2016 at 12:27:51PM -0800, Andres Freund wrote:
> On 2016-12-06 14:19:21 -0600, Nico Williams wrote:
> > A bigger concern might be interface stability.  IIRC the LLVM C/C++
> > interfaces are not very stable, but bitcode is.
> 
> The C API is a lot more stable than the C++ bit, that's the primary
> reason I ended up using it, despite the C++ docs being better.

Ah.

> > > I concur with your feeling that hand-rolled JIT is right out.  But
> > 
> > Yeah, that way lies maintenance madness.
> 
> I'm not quite that sure about that. I had a lot of fun doing some
> hand-rolled x86 JITing. Not that that is a ward against me being mad.  But
> more seriously: Manually doing a JIT gives you a lot faster compilation
> times, which makes JIT applicable in a lot more situations.

What I meant is that each time there are new ISA extensions, or
differences in how relevant/significant different implementations of the
same ISA implement certain instructions, and/or every time you want to
add a new architecture... someone has to do a lot of very low-level
work.

> > > I'm not sure that whatever performance gain we might get in this
> > > direction is worth the costs.
> > 
> > Byte-/bit-coding query plans then JITting them is very likely to improve
> > performance significantly.
> 
> Note that what I'm proposing is a far cry away from that - this converts
> two (performance wise two, size wise one) significant subsystems, but far
> from all the executors, to be JITable.  I think there's some more low

Yes, I know.

> hanging fruit (particularly aggregate transition functions), but
> converting everything seems to hit the wrong spot in the
> benefit/effort/maintainability triangle.

Maybe?  At least with the infrastructure in place for it someone might
try it and see.

Nico
-- 


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hash Indexes

2016-12-06 Thread Jeff Janes
On Tue, Dec 6, 2016 at 4:00 AM, Amit Kapila  wrote:

> On Tue, Dec 6, 2016 at 1:29 PM, Jeff Janes  wrote:
> >
> >
> > I just occasionally insert a bunch of equal tuples, which have to be in
> > overflow pages no matter how much splitting happens.
> >
> > I am getting vacuum errors against HEAD, after about 20 minutes or so (8
> > cores).
> >
> > 49233  XX002 2016-12-05 23:06:44.087 PST:ERROR:  index "foo_index_idx"
> > contains unexpected zero page at block 64941
> > 49233  XX002 2016-12-05 23:06:44.087 PST:HINT:  Please REINDEX it.
> > 49233  XX002 2016-12-05 23:06:44.087 PST:CONTEXT:  automatic vacuum of
> table
> > "jjanes.public.foo"
> >
>
> Thanks for the report.  This can happen due to vacuum trying to access
> a new page.  Vacuum can do so if (a) the calculation for maxbuckets
> (in metapage) is wrong or (b) it is trying to free the overflow page
> twice.  Offhand, I don't see how that can happen in the code.  I will
> investigate further to see if there is any other reason why vacuum
> can access the new page.  BTW, have you done the test after commit
> 2f4193c3?  That doesn't appear to be the cause of this problem, but
> still, it is better to test after that fix.  I am trying to reproduce
> the issue, but in the meantime, if by any chance you have a call
> stack, please share it.
>

It looks like I compiled the code for testing a few minutes before 2f4193c3
went in.

I've run it again with cb9dcbc1eebd8, after promoting the ERROR in
_hash_checkpage, hashutil.c:174 to a PANIC so that I can get a coredump to
feed to gdb.

This time it was an INSERT, not autovac, that got the error:

35495 INSERT XX002 2016-12-06 09:25:09.517 PST:PANIC:  XX002: index
"foo_index_idx" contains unexpected zero page at block 202328
35495 INSERT XX002 2016-12-06 09:25:09.517 PST:HINT:  Please REINDEX it.
35495 INSERT XX002 2016-12-06 09:25:09.517 PST:LOCATION:  _hash_checkpage,
hashutil.c:174
35495 INSERT XX002 2016-12-06 09:25:09.517 PST:STATEMENT:  insert into foo
(index) select $1 from generate_series(1,1)


#0  0x003838c325e5 in raise (sig=6) at
../nptl/sysdeps/unix/sysv/linux/raise.c:64
64return INLINE_SYSCALL (tgkill, 3, pid, selftid, sig);
(gdb) bt
#0  0x003838c325e5 in raise (sig=6) at
../nptl/sysdeps/unix/sysv/linux/raise.c:64
#1  0x003838c33dc5 in abort () at abort.c:92
#2  0x007d6adf in errfinish (dummy=) at
elog.c:557
#3  0x00498d93 in _hash_checkpage (rel=0x7f4d030906a0, buf=, flags=) at hashutil.c:169
#4  0x004967cf in _hash_getbuf_with_strategy (rel=0x7f4d030906a0,
blkno=, access=2, flags=1, bstrategy=)
at hashpage.c:234
#5  0x00493dbb in hashbucketcleanup (rel=0x7f4d030906a0,
cur_bucket=14544, bucket_buf=7801, bucket_blkno=22864, bstrategy=0x0,
maxbucket=276687,
highmask=524287, lowmask=262143, tuples_removed=0x0,
num_index_tuples=0x0, split_cleanup=1 '\001', callback=0,
callback_state=0x0) at hash.c:799
#6  0x00497921 in _hash_expandtable (rel=0x7f4d030906a0,
metabuf=7961) at hashpage.c:668
#7  0x00495596 in _hash_doinsert (rel=0x7f4d030906a0,
itup=0x1f426b0) at hashinsert.c:236
#8  0x00494830 in hashinsert (rel=0x7f4d030906a0, values=, isnull=, ht_ctid=0x7f4d03076404,
heapRel=, checkUnique=) at
hash.c:247
#9  0x005c81bc in ExecInsertIndexTuples (slot=0x1f28940,
tupleid=0x7f4d03076404, estate=0x1f28280, noDupErr=0 '\000',
specConflict=0x0,
arbiterIndexes=0x0) at execIndexing.c:389
#10 0x005e74ad in ExecInsert (node=0x1f284d0) at
nodeModifyTable.c:496
#11 ExecModifyTable (node=0x1f284d0) at nodeModifyTable.c:1511
#12 0x005cc9d8 in ExecProcNode (node=0x1f284d0) at
execProcnode.c:396
#13 0x005ca53a in ExecutePlan (queryDesc=0x1f21a30,
direction=, count=0) at execMain.c:1567
#14 standard_ExecutorRun (queryDesc=0x1f21a30, direction=, count=0) at execMain.c:338
#15 0x7f4d0c1a6dfb in pgss_ExecutorRun (queryDesc=0x1f21a30,
direction=ForwardScanDirection, count=0) at pg_stat_statements.c:877
#16 0x006dfc8a in ProcessQuery (plan=,
sourceText=0x1f21990 "insert into foo (index) select $1 from
generate_series(1,1)",
params=0x1f219e0, dest=0xc191c0, completionTag=0x7ffe82a959d0 "") at
pquery.c:185
#17 0x006dfeda in PortalRunMulti (portal=0x1e86900, isTopLevel=1
'\001', setHoldSnapshot=0 '\000', dest=0xc191c0, altdest=0xc191c0,
completionTag=0x7ffe82a959d0 "") at pquery.c:1299
#18 0x006e056c in PortalRun (portal=0x1e86900,
count=9223372036854775807, isTopLevel=1 '\001', dest=0x1eec870,
altdest=0x1eec870,
completionTag=0x7ffe82a959d0 "") at pquery.c:813
#19 0x006de832 in exec_execute_message (argc=,
argv=, dbname=0x1e933b8 "jjanes",
username=) at postgres.c:1977
#20 PostgresMain (argc=, argv=,
dbname=0x1e933b8 "jjanes", username=) at
postgres.c:4132
#21 0x0067e8a4 in BackendRun (argc=,
argv=) at postmaster.c:4274
#22 BackendStartup (argc=, argv=)
at postmaster.c:3946
#23 ServerLoop (argc=, argv=) at
postmaster.c:1704
#24 Postmas

Re: [HACKERS] Compiler warnings

2016-12-06 Thread Robert Haas
On Tue, Dec 6, 2016 at 3:23 PM, Stephen Frost  wrote:
> * Stephen Frost (sfr...@snowman.net) wrote:
>> Not sure if anyone else has been seeing these, but I'm getting a bit
>> tired of them.  Neither is a live bug, but they also seem pretty simple
>> to fix.  The attached patch makes both of these warnings go away.  At
>> least for my common build, these are the only warnings that are thrown.
>>
>> I'm building with:
>>
>> gcc (Ubuntu 5.4.0-6ubuntu1~16.04.4) 5.4.0 20160609
>>
>> .../src/backend/storage/lmgr/lwlock.c: In function ‘LWLockRelease’:
>> .../src/backend/storage/lmgr/lwlock.c:1802:5: warning: ‘mode’ may be used 
>> uninitialized in this function [-Wmaybe-uninitialized]
>>   if (mode == LW_EXCLUSIVE)
>>  ^
>> .../src/backend/utils/cache/plancache.c: In function ‘GetCachedPlan’:
>> .../src/backend/utils/cache/plancache.c:1232:9: warning: ‘plan’ may be used 
>> uninitialized in this function [-Wmaybe-uninitialized]
>>   return plan;
>>  ^
>
> Given the lack of screaming, I'll push the attached in a bit, which just
> initializes the two variables being complained about.  As mentioned,
> there doesn't appear to be any live bugs here, this is just to silence
> the compiler warnings.

In LWLockRelease, why not just move mode = held_lwlocks[i].mode; after
the if (i < 0) elog(...), instead of initializing with a bogus value?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP: Faster Expression Processing and Tuple Deforming (including JIT)

2016-12-06 Thread Andres Freund
Hi,

On 2016-12-06 14:19:21 -0600, Nico Williams wrote:
> A bigger concern might be interface stability.  IIRC the LLVM C/C++
> interfaces are not very stable, but bitcode is.

The C API is a lot more stable than the C++ bit, that's the primary
reason I ended up using it, despite the C++ docs being better.


> > I concur with your feeling that hand-rolled JIT is right out.  But
> 
> Yeah, that way lies maintenance madness.

I'm not quite that sure about that. I had a lot of fun doing some
hand-rolled x86 JITing. Not that that is a ward against me being mad.  But
more seriously: Manually doing a JIT gives you a lot faster compilation
times, which makes JIT applicable in a lot more situations.


> > I'm not sure that whatever performance gain we might get in this
> > direction is worth the costs.
> 
> Byte-/bit-coding query plans then JITting them is very likely to improve
> performance significantly.

Note that what I'm proposing is a far cry away from that - this converts
two (performance wise two, size wise one) significant subsystems, but far
from all the executors, to be JITable.  I think there's some more low
hanging fruit (particularly aggregate transition functions), but
converting everything seems to hit the wrong spot in the
benefit/effort/maintainability triangle.

- Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Push down more full joins in postgres_fdw

2016-12-06 Thread Robert Haas
On Mon, Dec 5, 2016 at 6:20 AM, Ashutosh Bapat
 wrote:
> 4. I am still not happy with this change
> +/*
> + * Since (1) the expressions in foreignrel's reltarget don't contain
> + * any PHVs and (2) foreignrel's local_conds is empty, the tlist
> + * created by build_tlist_to_deparse must be one-to-one with the
> + * expressions.
> + */
> +Assert(list_length(tlist) ==
> +   list_length(foreignrel->reltarget->exprs));
> the assertion only checks that the number of elements in both lists is the
> same but does not check whether those lists are the same, i.e. that they
> contain the same elements in the same order. This equality is crucial to
> the deparsing logic. If build_tlist_to_deparse() somehow breaks that
> assumption in future, we have no way to detect it unless a regression
> test fails.

If there's an easy way to do a more exact comparison, great.  But if
we can't get an awesome Assert(), a helpful Assert() is still better
than a kick in the head.
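
One relatively cheap way to make the check element-wise (a sketch on my
part, assuming the tlist entries are TargetEntry nodes; it would run
only in assert-enabled builds):

#ifdef USE_ASSERT_CHECKING
	{
		ListCell   *lc1;
		ListCell   *lc2;

		/* walk both lists in lockstep and compare the expressions */
		forboth(lc1, tlist, lc2, foreignrel->reltarget->exprs)
		{
			TargetEntry *tle = (TargetEntry *) lfirst(lc1);

			Assert(equal(tle->expr, lfirst(lc2)));
		}
	}
#endif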

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP: Faster Expression Processing and Tuple Deforming (including JIT)

2016-12-06 Thread Tom Lane
Andres Freund  writes:
> On 2016-12-06 13:56:28 -0500, Tom Lane wrote:
>> I guess the $64 question that has to be addressed here is whether we're
>> prepared to accept LLVM as a run-time dependency.  There are some reasons
>> why we might not be:

> Indeed.  It'd only be a soft dependency obviously.

Oh, so we'd need to maintain both the LLVM and the traditional expression
execution code?  That seems like a bit of a pain, but maybe we can live
with it.

>> * How will we answer people who say they can't accept having a compiler
>> installed on their production boxes for security reasons?

> I think they'll just not enable --with-llvm in that case, and get
> inferior performance. Note that installing llvm does not imply
> installing a full blown C compiler (although I think that's largely
> moot, you can get pretty much the same things done with just compiling
> LLVM IR).

I'm not entirely thrilled with the idea of this being a configure-time
decision, because that forces packagers to decide for their entire
audience whether it's okay to depend on LLVM.  That would be an untenable
position to put e.g. Red Hat's packagers in: either they screw the people
who want performance or they screw the people who want security.

I think it'd be all right if we can build this so that the direct
dependency on LLVM is confined to a separately-packageable extension.
That way, a packager can produce a core postgresql-server package
that does not require LLVM, plus a postgresql-llvm package that does,
and the "no compiler please" crowd simply doesn't install the latter
package.

The alternative would be to produce two independent builds of the
server, which I suppose might be acceptable but it sure seems like
a kluge, or at least something that simply wouldn't get done by
most vendors.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Compiler warnings

2016-12-06 Thread Stephen Frost
All,

* Stephen Frost (sfr...@snowman.net) wrote:
> Not sure if anyone else has been seeing these, but I'm getting a bit
> tired of them.  Neither is a live bug, but they also seem pretty simple
> to fix.  The attached patch makes both of these warnings go away.  At
> least for my common build, these are the only warnings that are thrown.
> 
> I'm building with:
> 
> gcc (Ubuntu 5.4.0-6ubuntu1~16.04.4) 5.4.0 20160609
> 
> .../src/backend/storage/lmgr/lwlock.c: In function ‘LWLockRelease’:
> .../src/backend/storage/lmgr/lwlock.c:1802:5: warning: ‘mode’ may be used 
> uninitialized in this function [-Wmaybe-uninitialized]
>   if (mode == LW_EXCLUSIVE)
>  ^
> .../src/backend/utils/cache/plancache.c: In function ‘GetCachedPlan’:
> .../src/backend/utils/cache/plancache.c:1232:9: warning: ‘plan’ may be used 
> uninitialized in this function [-Wmaybe-uninitialized]
>   return plan;
>  ^

Given the lack of screaming, I'll push the attached in a bit, which just
initializes the two variables being complained about.  As mentioned,
there doesn't appear to be any live bugs here, this is just to silence
the compiler warnings.

Thanks!

Stephen
From afe53a0fa8a53c080b77a7334531ff3a5dc976d4 Mon Sep 17 00:00:00 2001
From: Stephen Frost 
Date: Tue, 6 Dec 2016 15:19:09 -0500
Subject: [PATCH] Silence compiler warnings

Initialize a couple of variables to silence compiler warnings.  There
aren't any cases where these variables could actually end up being used
while uninitialized, but at least gcc 5.4.0-6ubuntu1~16.04.4 doesn't
quite have the smarts to realize that.

Discussion: https://www.postgresql.org/message-id/20161129152102.GR13284%40tamriel.snowman.net
---
 src/backend/storage/lmgr/lwlock.c   | 2 +-
 src/backend/utils/cache/plancache.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/backend/storage/lmgr/lwlock.c b/src/backend/storage/lmgr/lwlock.c
index 9c6862f..909ff45 100644
--- a/src/backend/storage/lmgr/lwlock.c
+++ b/src/backend/storage/lmgr/lwlock.c
@@ -1770,7 +1770,7 @@ LWLockUpdateVar(LWLock *lock, uint64 *valptr, uint64 val)
 void
 LWLockRelease(LWLock *lock)
 {
-	LWLockMode	mode;
+	LWLockMode	mode = LW_EXCLUSIVE;
 	uint32		oldstate;
 	bool		check_waiters;
 	int			i;
diff --git a/src/backend/utils/cache/plancache.c b/src/backend/utils/cache/plancache.c
index 884cdab..a37074b 100644
--- a/src/backend/utils/cache/plancache.c
+++ b/src/backend/utils/cache/plancache.c
@@ -1128,7 +1128,7 @@ CachedPlan *
 GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
 			  bool useResOwner)
 {
-	CachedPlan *plan;
+	CachedPlan *plan = NULL;
 	List	   *qlist;
 	bool		customplan;
 
-- 
2.7.4





Re: [HACKERS] WIP: Faster Expression Processing and Tuple Deforming (including JIT)

2016-12-06 Thread Nico Williams
On Tue, Dec 06, 2016 at 01:56:28PM -0500, Tom Lane wrote:
> Andres Freund  writes:
> > I'm posting a quite massive series of WIP patches here, to get some
> > feedback.
> 
> I guess the $64 question that has to be addressed here is whether we're
> prepared to accept LLVM as a run-time dependency.  There are some reasons
> why we might not be:
> 
> * The sheer mass of the dependency.  What's the installed footprint of
> LLVM, versus a Postgres server?  How hard is it to install from source?

As long as it's optional, does this matter?

A bigger concern might be interface stability.  IIRC the LLVM C/C++
interfaces are not very stable, but bitcode is.

> * How will we answer people who say they can't accept having a compiler
> installed on their production boxes for security reasons?

You don't need the front-ends (e.g., clang) installed in order to JIT.

> * Are there any currently-interesting platforms that LLVM doesn't work
> for?  (I'm worried about RISC-V as much as legacy systems.)

The *BSDs support more platforms than LLVM does, that's for sure.
(NetBSD supports four more, IIRC, including ia64.) But the patches make
LLVM optional anyways, so this should be a non-issue.

> I concur with your feeling that hand-rolled JIT is right out.  But

Yeah, that way lies maintenance madness.

> I'm not sure that whatever performance gain we might get in this
> direction is worth the costs.

Byte-/bit-coding query plans then JITting them is very likely to improve
performance significantly.  Whether you want the maintenance overhead is
another story.

Sometimes byte-coding + interpretation yields a significant improvement
by reducing cache pressure on the icache and the size of the program to
be interpreted.  Having the option to JIT or not JIT might be useful.

Nico
-- 


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP: Faster Expression Processing and Tuple Deforming (including JIT)

2016-12-06 Thread Andres Freund
On 2016-12-06 15:13:21 -0500, Robert Haas wrote:
> Presumably this is going to need to be something that a user can get
> via yum install  or apt-get install  on common systems.

Right. apt-get install llvm-dev (or llvm-3.9-dev or such if you want to
install a specific version), does the trick here.

It's a bit easier to develop with a hand-compiled version, because then
LLVM adds boatloads of asserts to its IR builder, which catch a fair
number of mistakes. Nothing you'd run in production though (just like
you don't use a cassert build...).


> I wonder how feasible it would be to make this a run-time dependency
> rather than a compile option.  That's probably overcomplicating
> things, but...

I don't think that's feasible at all unfortunately - the compiler IR
(which then is JITed by LLVM) is generated via another C API.  We could
rebuild that one, but that'd be a lot of work.

Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP: Faster Expression Processing and Tuple Deforming (including JIT)

2016-12-06 Thread Robert Haas
On Tue, Dec 6, 2016 at 2:10 PM, Andres Freund  wrote:
>> * The sheer mass of the dependency.  What's the installed footprint of
>> LLVM, versus a Postgres server?  How hard is it to install from source?
>
> Worked for me first try, but I'm perhaps not the best person to judge.
> It does take a while to compile though (~20min on my laptop).

Presumably this is going to need to be something that a user can get
via yum install  or apt-get install  on common systems.

I wonder how feasible it would be to make this a run-time dependency
rather than a compile option.  That's probably overcomplicating
things, but...

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP: Faster Expression Processing and Tuple Deforming (including JIT)

2016-12-06 Thread Andres Freund
On 2016-12-06 11:10:59 -0800, Andres Freund wrote:
> > * Are there any currently-interesting platforms that LLVM doesn't work
> > for?  (I'm worried about RISC-V as much as legacy systems.)
> 
> LLVM itself I don't think is a problem, it seems to target a wide range
> of platforms. The platforms that don't support JIT compiling might be a
> bit larger, since that involves more than just generating code.

The OS-specific part is handling the executable format. The JIT we'd be
using (MCJIT) has support for ELF, MachO, and COFF. The
architecture-specific bits seem to be there for x86, arm (little endian
and be), aarch64 (arm 64 bits, be/le again), mips, ppc64.

Somebody is working on RISC-V support for llvm (i.e. it appears to be
working, but is not merged) - but given it's not integrated into gcc
either, I'm not seeing that as an argument.

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP: Faster Expression Processing and Tuple Deforming (including JIT)

2016-12-06 Thread Andres Freund
On 2016-12-06 14:04:09 -0500, Robert Haas wrote:
> I've heard at least one and maybe several PGCon presentations about
> people JITing tuple deformation and getting big speedups, and I'd like
> to finally hear one from somebody who intends to integrate that into
> PostgreSQL.

I certainly want to.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP: Faster Expression Processing and Tuple Deforming (including JIT)

2016-12-06 Thread Andres Freund
Hi,


On 2016-12-06 13:56:28 -0500, Tom Lane wrote:
> Andres Freund  writes:
> > I'm posting a quite massive series of WIP patches here, to get some
> > feedback.
> 
> I guess the $64 question that has to be addressed here is whether we're
> prepared to accept LLVM as a run-time dependency.  There are some reasons
> why we might not be:

Indeed.  It'd only be a soft dependency obviously.


> * The sheer mass of the dependency.  What's the installed footprint of
> LLVM, versus a Postgres server?  How hard is it to install from source?

Worked for me first try, but I'm perhaps not the best person to judge.
It does take a while to compile though (~20min on my laptop).


> * How will we answer people who say they can't accept having a compiler
> installed on their production boxes for security reasons?

I think they'll just not enable --with-llvm in that case, and get
inferior performance. Note that installing llvm does not imply
installing a full blown C compiler (although I think that's largely
moot, you can get pretty much the same things done with just compiling
LLVM IR).


> * Are there any currently-interesting platforms that LLVM doesn't work
> for?  (I'm worried about RISC-V as much as legacy systems.)

LLVM itself I don't think is a problem, it seems to target a wide range
of platforms. The platforms that don't support JIT compiling might be a
bit larger, since that involves more than just generating code.


> I concur with your feeling that hand-rolled JIT is right out.  But
> I'm not sure that whatever performance gain we might get in this
> direction is worth the costs.

Well, I'm not impartial, but I don't think we do our users a service by
leaving significant speedups untackled, and after spending a *LOT* of
time on this, I don't see much other choice than JITing. Note that
nearly everything performance sensitive is moving towards doing JITing
in some form or another.


Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP: Faster Expression Processing and Tuple Deforming (including JIT)

2016-12-06 Thread Robert Haas
On Tue, Dec 6, 2016 at 1:56 PM, Tom Lane  wrote:
> Andres Freund  writes:
>> I'm posting a quite massive series of WIP patches here, to get some
>> feedback.
>
> I guess the $64 question that has to be addressed here is whether we're
> prepared to accept LLVM as a run-time dependency.  There are some reasons
> why we might not be:
>
> * The sheer mass of the dependency.  What's the installed footprint of
> LLVM, versus a Postgres server?  How hard is it to install from source?
>
> * How will we answer people who say they can't accept having a compiler
> installed on their production boxes for security reasons?
>
> * Are there any currently-interesting platforms that LLVM doesn't work
> for?  (I'm worried about RISC-V as much as legacy systems.)

I think anything that requires LLVM -- or, for that matter, anything
that does JIT by any means -- has got to be optional.  But I don't
think --with-llvm as a compile option is inherently problematic.
Also, I think this is probably a direction we need to go.  I've heard
at least one and maybe several PGCon presentations about people JITing
tuple deformation and getting big speedups, and I'd like to finally
hear one from somebody who intends to integrate that into PostgreSQL.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Parallel execution and prepared statements

2016-12-06 Thread Robert Haas
On Tue, Dec 6, 2016 at 1:07 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> Done.
>
> The comment seems quite confused now:
>
> * If a tuple count was supplied or data is being written to relation, we
> * must force the plan to run without parallelism, because we might exit
> * early.
>
> Exit early is exactly what we *won't* do if writing to an INTO rel, so
> I think this will confuse future readers.  I think it should be more like
>
> * If a tuple count was supplied, we must force the plan to run without
> * parallelism, because we might exit early.  Also disable parallelism
> * when writing into a relation, because [ uh, why exactly? ]
>
> Considering that the writing would happen at top level of the executor,
> and hence in the parent process, it's not actually clear to me why the
> second restriction is there at all: can't we write tuples to a rel even
> though they came from a parallel worker?  In any case, the current wording
> of the comment is a complete fail at explaining this.

Oops.  You're right.  [ uh, why exactly? ] -> no database changes
whatsoever are allowed while in parallel mode.  (This restriction
might be lifted someday, but right now we're stuck with it.)

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: Account for catalog snapshot in PGXACT->xmin updates.

2016-12-06 Thread Robert Haas
On Tue, Dec 6, 2016 at 1:36 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> On Tue, Nov 15, 2016 at 3:55 PM, Tom Lane  wrote:
>>> Account for catalog snapshot in PGXACT->xmin updates.
>
>> It seems to me that this makes it possible for
>> ThereAreNoPriorRegisteredSnapshots() to fail spuriously.  The only
>> core code that calls that function is in copy.c, and apparently we
>> never reach that point with the catalog snapshot set.  But that seems
>> fragile.
>
> Hm.  If there were some documentation explaining what
> ThereAreNoPriorRegisteredSnapshots is supposed to do, it would be possible
> for us to have a considered argument as to whether this patch broke it or
> not.  As things stand, it is not obvious whether it ought to be ignoring
> the catalog snapshot or not; one could construct plausible arguments
> either way.  It's not even very obvious why both 0 and 1 registered
> snapshots should be considered okay; that looks a lot more like a hack
> than reasoned-out behavior.  So I'm satisfied if COPY FREEZE does what
> it's supposed to do, which it still appears to do.
>
> IOW, I'm not interested in touching this unless someone first provides
> adequate documentation for the pre-existing kluge here.  There is no
> API spec at all on the function itself, and the comment in front of the
> one call site is too muddle-headed to provide any insight.

COPY FREEZE is designed to be usable only in situations where the new
tuples won't be visible earlier than would otherwise be the case.
Thus, copy.c has a check that the relfilenode was created by our
(sub)transaction, which guards against the possibility that other
transactions might see the data early, and also a check that there are
no other registered snapshots or ready portals, which guarantees that,
for example, a cursor belonging to the current subtransaction but with
a lower command-ID doesn't see the rows early.  I am fuzzy on why we need
to check for both snapshots and portals, but the overall intent seems
pretty clear to me.  What are you confused about?  It seems quite
obvious to me that no matter how old the catalog snapshot is, it's
nothing that we need to worry about here because it'll get invalidated
and refreshed before use if anything changes meanwhile.
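
For reference, the guard in copy.c is shaped roughly like this
(paraphrased from memory rather than quoted verbatim, so treat the
details as approximate):

if (cstate->freeze)
{
	if (!ThereAreNoPriorRegisteredSnapshots() || !ThereAreNoReadyPortals())
		ereport(ERROR,
				(errcode(ERRCODE_INVALID_TRANSACTION_STATE),
				 errmsg("cannot perform FREEZE because of prior transaction activity")));

	if (cstate->rel->rd_createSubid != GetCurrentSubTransactionId() &&
		cstate->rel->rd_newRelfilenodeSubid != GetCurrentSubTransactionId())
		ereport(ERROR,
				(errcode(ERRCODE_INVALID_TRANSACTION_STATE),
				 errmsg("cannot perform FREEZE because the table was not created or truncated in the current subtransaction")));

	hi_options |= HEAP_INSERT_FROZEN;
}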

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP: Faster Expression Processing and Tuple Deforming (including JIT)

2016-12-06 Thread Tom Lane
Andres Freund  writes:
> I'm posting a quite massive series of WIP patches here, to get some
> feedback.

I guess the $64 question that has to be addressed here is whether we're
prepared to accept LLVM as a run-time dependency.  There are some reasons
why we might not be:

* The sheer mass of the dependency.  What's the installed footprint of
LLVM, versus a Postgres server?  How hard is it to install from source?

* How will we answer people who say they can't accept having a compiler
installed on their production boxes for security reasons?

* Are there any currently-interesting platforms that LLVM doesn't work
for?  (I'm worried about RISC-V as much as legacy systems.)


I concur with your feeling that hand-rolled JIT is right out.  But
I'm not sure that whatever performance gain we might get in this
direction is worth the costs.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: Account for catalog snapshot in PGXACT->xmin updates.

2016-12-06 Thread Tom Lane
Robert Haas  writes:
> On Tue, Nov 15, 2016 at 3:55 PM, Tom Lane  wrote:
>> Account for catalog snapshot in PGXACT->xmin updates.

> It seems to me that this makes it possible for
> ThereAreNoPriorRegisteredSnapshots() to fail spuriously.  The only
> core code that calls that function is in copy.c, and apparently we
> never reach that point with the catalog snapshot set.  But that seems
> fragile.

Hm.  If there were some documentation explaining what
ThereAreNoPriorRegisteredSnapshots is supposed to do, it would be possible
for us to have a considered argument as to whether this patch broke it or
not.  As things stand, it is not obvious whether it ought to be ignoring
the catalog snapshot or not; one could construct plausible arguments
either way.  It's not even very obvious why both 0 and 1 registered
snapshots should be considered okay; that looks a lot more like a hack
than reasoned-out behavior.  So I'm satisfied if COPY FREEZE does what
it's supposed to do, which it still appears to do.

IOW, I'm not interested in touching this unless someone first provides
adequate documentation for the pre-existing kluge here.  There is no
API spec at all on the function itself, and the comment in front of the
one call site is too muddle-headed to provide any insight.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Parallel execution and prepared statements

2016-12-06 Thread Tom Lane
Robert Haas  writes:
> Done.

The comment seems quite confused now:

* If a tuple count was supplied or data is being written to relation, we
* must force the plan to run without parallelism, because we might exit
* early.

Exit early is exactly what we *won't* do if writing to an INTO rel, so
I think this will confuse future readers.  I think it should be more like

* If a tuple count was supplied, we must force the plan to run without
* parallelism, because we might exit early.  Also disable parallelism
* when writing into a relation, because [ uh, why exactly? ]

Considering that the writing would happen at top level of the executor,
and hence in the parent process, it's not actually clear to me why the
second restriction is there at all: can't we write tuples to a rel even
though they came from a parallel worker?  In any case, the current wording
of the comment is a complete fail at explaining this.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Separate connection handling from backends

2016-12-06 Thread Kevin Grittner
On Mon, Dec 5, 2016 at 6:54 PM, Jim Nasby  wrote:
> On 12/5/16 2:14 PM, David Fetter wrote:

>> What do you see as the relationship between this proposal and the
>> earlier one for admission control?
>>
>> https://www.postgresql.org/message-id/4b38c1c502250002d...@gw.wicourts.gov
>
> Without having read the paper reference in that email or the rest of the
> thread...

> One big difference from what Kevin describe though: I don't think it makes
> sense for the connection layer to be able to parse queries. I suspect it
> would take a very large amount of work to allow something that's not a
> full-blown backend to parse, because it needs access to the catalogs.
> *Maybe* it'd be possible if we used a method other than ProcArray to
> register the snapshot that required, but you'd still have to duplicate all
> the relcache stuff.

I don't recall ever, on the referenced thread or any other,
suggesting what you describe.  Basically, I was suggesting that we
create a number of hooks which an admission control policy (ACP) could
tie into, and we could create pluggable ACPs.  One ACP that I think
would be useful would be one that ties into a hook placed at the
point(s) where a transaction is attempting to acquire its first
"contentious resource" -- which would include at least snapshot and
locks.  If the user was a superuser it would allow the transaction
to proceed; otherwise it would check whether the number of
transactions which were holding contentious resources had reached
some (configurable) limit.  If allowing the transaction to proceed
would put it over the limit, the transaction would be blocked and
put on a queue behind any other transactions which had already been
blocked for this reason, and a transaction from the queue would be
unblocked whenever the count of transactions holding contentious
resources fell below the threshold.
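
In toy form, that gate could look something like this (my sketch: a
fixed limit, no superuser bypass, and pthreads standing in for
PostgreSQL locks and queues; illustrative only):

#include <pthread.h>

/* Toy standalone model of the admission gate described above. */
static pthread_mutex_t gate_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t gate_cv = PTHREAD_COND_INITIALIZER;
static int	active = 0;
static const int limit = 8;

void
acquire_first_contentious_resource(void)
{
	pthread_mutex_lock(&gate_lock);
	while (active >= limit)
		pthread_cond_wait(&gate_cv, &gate_lock);	/* wait in the queue */
	active++;
	pthread_mutex_unlock(&gate_lock);
}

void
release_contentious_resources(void)
{
	pthread_mutex_lock(&gate_lock);
	active--;
	pthread_cond_signal(&gate_cv);	/* unblock one queued transaction */
	pthread_mutex_unlock(&gate_lock);
}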

I don't see where parsing even enters into this.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Logical Replication WIP

2016-12-06 Thread Peter Eisentraut
On 12/5/16 6:24 PM, Petr Jelinek wrote:
> I think that the removal of changes to ReplicationSlotAcquire() that you
> did will make it impossible to reacquire a temporary slot once you have
> switched to a different one in the session, as the if (active_pid != 0)
> check will always be true for a temp slot.

I see.  I suppose it's difficult to get a test case for this.
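
For reference, the check under discussion is shaped roughly like this
(paraphrased, not a verbatim quote of ReplicationSlotAcquire()):

/* Paraphrased sketch; the MyProcPid escape is what lets the owning
 * session take its own temporary slot back. */
if (active_pid != 0 && active_pid != MyProcPid)
	ereport(ERROR,
			(errcode(ERRCODE_OBJECT_IN_USE),
			 errmsg("replication slot \"%s\" is active for PID %d",
					name, active_pid)));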

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Parallel execution and prepared statements

2016-12-06 Thread Robert Haas
On Mon, Dec 5, 2016 at 5:18 AM, Albe Laurenz  wrote:
> Tobias Bussmann wrote:
>>> I think if we don't see any impact then we should backpatch this
>>> patch. However, if we decide to go some other way, then I can provide
>>> a separate patch for back branches.  BTW, what is your opinion?
>>
>> I could not find anything on backporting guidelines in the wiki but my
>> opinion would be to backpatch the patch in total. With a different
>> behaviour between the simple and extended query protocol it would be
>> hard to debug query performance issues in user applications that use
>> PQprepare. If the user tries to replicate a query with a PREPARE in
>> psql and tries to EXPLAIN EXECUTE it, the results would be different
>> from what happens within the application. That behaviour could be
>> confusing, just as differences between EXPLAIN SELECT and EXPLAIN
>> EXECUTE can be to less experienced users.
>
> +1

Done.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] tzdata 2016j

2016-12-06 Thread David Fetter
On Tue, Dec 06, 2016 at 10:52:47AM -0500, Tom Lane wrote:
> Vladimir Borodin  writes:
> > Any chance to get tzdata 2016j in supported branches?
> 
> When the next scheduled releases come around (February), we'll update
> to whatever tzdata is current at that time.

I'm guessing that request came through because Vladimir is actually
affected by the change.

Apparently, you can replace the tzdata file and restart the server per
https://wiki.postgresql.org/wiki/FAQ#Will_PostgreSQL_handle_recent_daylight_saving_time_changes_in_various_countries.3F

Best,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david(dot)fetter(at)gmail(dot)com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] UNDO and in-place update

2016-12-06 Thread Robert Haas
On Mon, Dec 5, 2016 at 4:49 AM, Amit Kapila  wrote:
> I can very well understand the reason for not doing so (IIUC, it is
> about complexity and time to get it right when we are already trying
> to solve one big and complex problem of the system), but saying most
> of the benefit can be realized by having simple optimization seems
> less convincing.  I think having simple optimizations won't solve the
> multi-pass vacuum problem.  However, having the visibility information
> in the index will solve that and avoid index bloat much more aggressively.

I don't think that multi-pass vacuum would exist in this system,
regardless of any of the details we're talking about here.  If you are
doing in-place updates of tuples and changing the indexed columns,
then you can't rely on the current system for removing index entries
at all.  The indexed entries don't point to a "dead" line pointer;
they point to a tuple that no longer matches the index entry.

>> 3. Otherwise, you need to look at the heap page.  But if the heap
>> page's UNDO pointer is invalid, then the whole page is all-visible, so
>> you're done.  (You can also set the visibility map bit and/or the
>> index tuple's definitely-all-visible bit to avoid having to recheck
>> the heap page in the future.)
>>
>> 4. Each tuple will have a bit indicating whether it's been recently
>> modified; that is, whether the transaction whose UNDO log is
>> referenced by the page's UNDO pointer did anything to that tuple; or
>> if the page points to a TPD entry, whether any of the transactions in
>> the TPD modified that tuple.  If that bit is not set for that
>> particular tuple, it's all-visible and you're done.
>
> I think 4th optimization is especially good and can cover many cases,
> but I am not sure how we unset it once it is set.  For example,
> once the effect of the transaction that has touched that row is
> all-visible, how will we unset it?  It might be better to store the
> offset of the transaction that last modified the tuple; if the last
> transaction that touched the row is visible to the snapshot, then we
> don't need to process the UNDO.

If the page has an invalid UNDO pointer and you change it to a valid
UNDO pointer, you unset all the bits at the same time, except of
course for the bit pertaining to the tuple you are modifying.  If the
page's UNDO pointer is still valid when you update it, then you set
the bits for any tuples you are modifying for which it is not already
set.
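
In pseudo-code, the rule amounts to this (a design sketch only; all of
these names are invented):

    /* before modifying a tuple on the page */
    if (!UndoPtrIsValid(PageGetUndoPtr(page)))
        PageClearAllRecentlyModifiedBits(page); /* invalid -> valid: reset all */
    PageSetRecentlyModifiedBit(page, offnum);   /* only the tuple being touched */
    PageSetUndoPtr(page, MyUndoLocation);       /* now points at our UNDO log */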

> One more thing that needs some thought is how the rollback will work
> if we keep just one bit for delete marking.  I mean, for the heap we can
> directly apply from UNDO, but for the index, I think we also need to
> take care of undoing it in some way.  For example, search the tree to
> remove the delete marking from the old value and delete-mark the new
> value.  Now if that is what we need to do, then first we need to
> traverse the tree twice, which is okay considering rollback is a seldom
> operation; the second challenge would be how to make such an operation
> atomic with respect to WAL.

We don't necessarily need UNDO to clean up the indexes, although it
might be a good idea.  It would *work* to just periodically scan the
index for delete-marked items.  For each one, visit the heap and see
if there's a version of that tuple present in the heap or current UNDO
that matches the index entry.  If not, remove the index entry.
However, we could try to be more aggressive: when you UNDO an UPDATE
or DELETE in the heap, also go search the index and remove any
delete-marked entries that are no longer needed.  The advantage of
this is that we'd be more likely to visit the index entries while they
are still in cache.  The disadvantage is that it is complex and might
fail if the index is corrupted.  We can't have the whole system melt
down in that case; UNDO has to always succeed.

>> The UNDO chain has to remain until all transactions that modified the
>> page are all-visible, not just until the transactions are committed.
>
> Okay, I think this will work if we avoid the second insertion of the
> same value-TID pair in the index as you mentioned above.   Without
> that, I think we might not distinguish which of the two rows is
> visible for a transaction to which the latest (step-3) row is visible.

Agreed.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] tzdata 2016j

2016-12-06 Thread Tom Lane
Vladimir Borodin  writes:
> Any chance to get tzdata 2016j in supported branches?

When the next scheduled releases come around (February), we'll update
to whatever tzdata is current at that time.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PSQL commands: \quit_if, \quit_unless

2016-12-06 Thread Fabien COELHO


But if it does make sense to share, then that's another reason for not 
designing something ad-hoc for psql: integrating it later will be more 
work in total.


Yes, but not much: evaluating "[!] :var" special syntax would be dropped, 
but I do not think that it is the main issue with these features.


I understand that your conclusion is to require psql expressions + 
conditions as one (large) patch.


Given my personal experience with the patch process, my conclusion is 
that I will very probably not do it. Just adding logical expressions to 
pgbench, a minor feature for a minor client, has already been spread over 
3 CFs, and this is a part of what is required for the condition & 
expression in psql.


Hopefully someone else will do it, and I'll do some reviewing then.

--
Fabien.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] tzdata 2016j

2016-12-06 Thread Vladimir Borodin
Hi all.

Any chance to get tzdata 2016j in supported branches?

--
May the force be with you…
https://simply.name



Re: [HACKERS] [COMMITTERS] pgsql: Add support for restrictive RLS policies

2016-12-06 Thread Stephen Frost
Andres,

* Andres Freund (and...@anarazel.de) wrote:
> On 2016-12-05 20:51:02 +, Stephen Frost wrote:
> > Add support for restrictive RLS policies

> This is missing a catversion bump.

Ewps, apologies and thanks for pointing it out.

Fixed.

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] PSQL commands: \quit_if, \quit_unless

2016-12-06 Thread Robert Haas
On Tue, Dec 6, 2016 at 8:45 AM, Fabien COELHO  wrote:
> There are some differences: pgbench already had one-operator-at-a-time
> expressions, while psql has survived till today with none, which suggests a
> less pressing need.

I don't really think so.  People have been wanting expressions in psql
since I got involved in the project.

> Moreover the features are partly orthogonal and would touch psql
> significantly in different although probably overlapping areas:
>  - expressions is rather about \set, even if reused with \if as well
>  - condition is about \if ... \endif and ignoring some input lines

I don't think that difference is very relevant, really.

> The current expression evaluation in pgbench is about 1000 lines for
> scanning, parsing & evaluating, and does not yet support boolean
> expressions, although a patch for that has been in the queue for some time.
> I foresee that someone will suggest/require/demand... that the expression
> code be shared between pgbench and psql, which is another argument for
> dissociating these two features (expression and conditional in psql) from
> the start.

That seems like an argument the other way, from here.  I'm not sure
that Tom is right in wanting so much to be shared between psql and
pgbench.  They're different tools with different purposes, and I'm not
sure sharing between them makes much sense.  But if it does make sense
to share, then that's another reason for not designing something
ad-hoc for psql: integrating it later will be more work in total.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: Account for catalog snapshot in PGXACT->xmin updates.

2016-12-06 Thread Robert Haas
On Tue, Nov 15, 2016 at 3:55 PM, Tom Lane  wrote:
> Account for catalog snapshot in PGXACT->xmin updates.
>
> The CatalogSnapshot was not plugged into SnapshotResetXmin()'s accounting
> for whether MyPgXact->xmin could be cleared or advanced.  In normal
> transactions this was masked by the fact that the transaction snapshot
> would be older, but during backend startup and certain utility commands
> it was possible to re-use the CatalogSnapshot after MyPgXact->xmin had
> been cleared, meaning that recently-deleted rows could be pruned even
> though this snapshot could still see them, causing unexpected catalog
> lookup failures.  This effect appears to be the explanation for a recent
> failure on buildfarm member piculet.
>
> To fix, add the CatalogSnapshot to the RegisteredSnapshots heap whenever
> it is valid.

It seems to me that this makes it possible for
ThereAreNoPriorRegisteredSnapshots() to fail spuriously.  The only
core code that calls that function is in copy.c, and apparently we
never reach that point with the catalog snapshot set.  But that seems
fragile.
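
For reference, that check boils down to roughly the following (paraphrased,
assuming the pairing-heap representation of RegisteredSnapshots in
snapmgr.c):

    /* "no prior registered snapshots" means at most one entry in the heap;
       a registered CatalogSnapshot adds a second entry and flips the answer */
    if (pairingheap_is_empty(&RegisteredSnapshots) ||
        pairingheap_is_singular(&RegisteredSnapshots))
        return true;
    return false;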

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pgcrypto compilation error due to stack-allocated EVP_CIPHER_CTX

2016-12-06 Thread Asif Naeem
Thanks for the updated patch. Although EVP_CIPHER_CTX_cleanup() seems
deprecated in OpenSSL >= 1.1.0, i.e.

> # if OPENSSL_API_COMPAT < 0x10100000L
> #  define EVP_CIPHER_CTX_init(c)  EVP_CIPHER_CTX_reset(c)
> #  define EVP_CIPHER_CTX_cleanup(c)   EVP_CIPHER_CTX_reset(c)
> # endif

I guess use of the deprecated function is fine, as long as the OpenSSL
library supports it.
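
For what it's worth, if one wanted to call the 1.1.0 name uniformly while
still building against older OpenSSL, the usual shim would be something
like the following sketch (the patch instead just calls
EVP_CIPHER_CTX_cleanup(), which exists everywhere):

    #include <openssl/opensslv.h>
    #include <openssl/evp.h>

    /* map the new 1.1.0 name onto the pre-1.1.0 one */
    #if OPENSSL_VERSION_NUMBER < 0x10100000L
    #define EVP_CIPHER_CTX_reset(c) EVP_CIPHER_CTX_cleanup(c)
    #endif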



On Tue, Dec 6, 2016 at 6:15 PM, Michael Paquier 
wrote:

> On Tue, Dec 6, 2016 at 9:31 PM, Asif Naeem  wrote:
> > Thank you for v2 patch, I would like to comment on it. It seems that you
> > have used function EVP_CIPHER_CTX_reset in the patch that was introduced
> in
> > OpenSSL 1.1.0, older library version might not work now, is it
> intentional
> > change ?.
>
> I thought I tested that... But yes, that would not compile when linked
> with 1.0.2 or older. Using EVP_CIPHER_CTX_cleanup() is safe instead as
> that's available down to 0.9.8.
> --
> Michael
>


Re: [HACKERS] raw output from copy

2016-12-06 Thread Kohei KaiGai
2016-12-06 16:59 GMT+09:00 Pavel Stehule :
>
>
> 2016-12-06 1:50 GMT+01:00 Kohei KaiGai :
>>
>> 2016-12-05 22:45 GMT+09:00 Pavel Stehule :
>> >
>> > There are more goals:
>> >
>> > 1. user friendly import of text or binary data - importing text data (with
>> > psql) from a file is possible - but you have to load the content into a psql
>> > variable. For binary data you have to use a workaround based on LO and a
>> > transformation from LO to bytea.
>> >
>> > 2. user friendly export of text or binary data - now, the binary data can
>> > be exported only via a transformation to LO. XML has very interesting
>> > features when it is passed from/to the client in binary. This feature is
>> > impossible in psql now.
>> >
>>   :
>> 
>>   :
>> >> It seems to me extending the COPY statement for this optimization is a bit
>> >> of an overkill solution. Can we find an alternative solution that we can
>> >> build on the existing infrastructure?
>> >
>> > The advantage and sense of COPY RAW was reusing an existing interface. The
>> > question was: how can I export/import binary data simply from the psql
>> > console?
>> >
>> OK, I could get your point.
>>
>> Likely, we can implement the feature without a COPY statement enhancement
>> by adding a special-purpose function and a \xxx command in psql.
>>
>> Let's assume the two commands below on psql:
>>
>> \blob_import <table> <column> (STDIN|<filename>)
>> \blob_export <query> (STDOUT|<filename>)
>>
>> On \blob_import, the psql command reads the binary contents from either
>> stdin or a file, then calls a special-purpose function that takes three
>> arguments: table name, column name, and a binary data chunk.
>> PQexecParams() of libpq allows the data chunk to be delivered while keeping
>> the binary data format, then the special-purpose function will be able to
>> look up the destination table/column and construct a tuple that contains
>> the supplied data chunk. (I think the _recv handler shall be used for
>> data validation, but that is not an element of this feature.)
>>
>>
>> On \blob_export, the psql command also sets up a simple query as follows:
>>   SELECT blob_export((<user-supplied query>))
>> For example,
>>   \blob_export SELECT binary_data FROM my_table WHERE id = 10   /tmp/aaa
>> shall be transformed to
>>   SELECT blob_export((SELECT binary_data FROM my_table WHERE id = 10))
>
>
> This is the reason why I prefer a COPY statement - because it does all the
> necessary things naturally.  But if there is disagreement against COPY RAW
> it can be implemented as psql commands.
>
Yes, both approaches will be able to implement what you want to do.
I agree it is valuable if psql can import/export a particular item with
a simple shell-script description; however, there is no consensus on how
to implement it.

If psql supports the special \xxx commands, it is equivalently convenient
from the standpoint of users, with no enhancement of the COPY statement.
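
To illustrate the client side, here is a minimal sketch of how psql could
deliver the chunk. blob_import() and its signature are assumptions of this
proposal; only PQexecParams() itself is existing libpq, and the
table/column names are just examples:

    #include <libpq-fe.h>

    /* call a hypothetical blob_import(regclass, name, bytea), passing
       the data chunk in binary format (paramFormats[2] = 1) */
    static int
    send_blob(PGconn *conn, const char *buf, size_t buflen)
    {
        const char *paramValues[3] = {"my_table", "binary_data", buf};
        int         paramLengths[3] = {0, 0, (int) buflen};
        int         paramFormats[3] = {0, 0, 1};   /* text, text, binary */
        PGresult   *res;
        int         ok;

        res = PQexecParams(conn,
                           "SELECT blob_import($1::regclass, $2, $3)",
                           3, NULL, paramValues, paramLengths,
                           paramFormats, 0);
        ok = (PQresultStatus(res) == PGRES_TUPLES_OK);
        PQclear(res);
        return ok;
    }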

I hope committers will comment on the approach we should take.

Thanks,

> export should be similar to the \g, \gset feature
>
> so
>
> SELECT xmldoc FROM <table>
> \gbinary_store <file>.xxx
>
> import is maybe better solved by proposed file references in queries
>
> Regards
>
> Pavel
>
>
>>
>>
>> This function is declared as:
>>   blob_export(anyelement) RETURNS bytea
>> So, as long as the user-supplied query returns exactly one column and
>> one row, it can transform the argument to a binary stream, and the psql
>> command receives it and dumps it somewhere; stdout or a file.
>>
>> How about your thought?
>>
>> Thanks,
>> --
>> KaiGai Kohei 
>
>



-- 
KaiGai Kohei 


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PSQL commands: \quit_if, \quit_unless

2016-12-06 Thread Fabien COELHO


Hello Robert,

Your experience as a seasoned core developer and a committer is 
probably different:-)


Well, everybody can have their own opinion on what is reasonable.


Sure.


There are times I argue for making a patch smaller (frequent) and
times when I argue for making it bigger (rare).


Yep.

We had pretty much this exact same argument about the pgbench lexer and 
parser and in the event I coded something up in less than a day.  This 
argument feels like a rerun of that one.


There are some differences: pgbench already had one-operator-at-a-time 
expressions, while psql has survived till today with none, which suggests a 
less pressing need.


Moreover the features are partly orthogonal and would touch psql 
significantly in different although probably overlapping areas:

 - expressions is rather about \set, even if reused with \if as well
 - condition is about \if ... \endif and ignoring some input lines

The current expression evaluation in pgbench is about 1000 lines for 
scanning, parsing & evaluating, and does not yet support boolean 
expressions, although a patch for that has been in the queue for some 
time. I foresee that someone will suggest/require/demand... that the 
expression code be shared between pgbench and psql, which is another 
argument for dissociating these two features (expression and conditional 
in psql) from the start.


--
Fabien.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Assignment of valid collation for SET operations on queries with UNKNOWN types.

2016-12-06 Thread Rahila Syed
Hello,

>And ideally fix things so
>that the type of the view column will be resolved as text, so that you
>don't hit this condition in the first place; but there is no good that
>comes out of allowing a view to be created like this

Thank you for the suggestion. Attached is a patch which resolves the columns
with literal constants as TEXT for view creation.

The following views can be created with literal columns resolved as TEXT.

postgres=# create view v as select 'abc' a;
CREATE VIEW
postgres=# create view v1 as select 'def' a;
CREATE VIEW
postgres=# \d+ v1;
View "public.v1"
 Column | Type | Collation | Nullable | Default | Storage  | Description
+--+---+--+-+--+-
 a  | text |   |  | | extended |
View definition:
 SELECT 'def'::text AS a;

This allows the following queries to run successfully, which wasn't the
case earlier.

postgres=# select a from v UNION select a from v1;
  a
-----
 abc
 def
(2 rows)

AND

postgres=# select * from v order by 1;
  a
-----
 abc
(1 row)

Kindly give your opinion.

Thank you,
Rahila Syed.




On Thu, Nov 17, 2016 at 8:59 PM, Tom Lane  wrote:

> Rahila Syed  writes:
> > CASE 2:
> > postgres=# create view v as select 'abc' a;
> > 2016-11-16 15:28:48 IST WARNING:  column "a" has type "unknown"
> > 2016-11-16 15:28:48 IST DETAIL:  Proceeding with relation creation
> anyway.
> > WARNING:  column "a" has type "unknown"
> > DETAIL:  Proceeding with relation creation anyway.
> > CREATE VIEW
>
> We really ought to make that a hard error.  And ideally fix things so
> that the type of the view column will be resolved as text, so that you
> don't hit this condition in the first place; but there is no good that
> comes out of allowing a view to be created like this.
>
> > Attached WIP patch does that. Kindly let me know your opinion.
>
> This is a seriously ugly kluge that's attacking the symptom not the
> problem.  Or really, a symptom not the problem.  There are lots of
> other symptoms, for instance
>
> regression=# select * from v order by 1;
> ERROR:  failed to find conversion function from unknown to text
>
> regards, tom lane
>


unknown_view_column_to_text_convert.patch
Description: application/download

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pgcrypto compilation error due to stack-allocated EVP_CIPHER_CTX

2016-12-06 Thread Michael Paquier
On Tue, Dec 6, 2016 at 9:31 PM, Asif Naeem  wrote:
> Thank you for the v2 patch, I would like to comment on it. It seems that you
> have used the function EVP_CIPHER_CTX_reset in the patch, which was introduced
> in OpenSSL 1.1.0; older library versions might not work now. Is it an
> intentional change?

I thought I tested that... But yes, that would not compile when linked
with 1.0.2 or older. Using EVP_CIPHER_CTX_cleanup() is safe instead as
that's available down to 0.9.8.
-- 
Michael


pgcrypto-openssl11-fix-v3.patch
Description: application/download

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PSQL commands: \quit_if, \quit_unless

2016-12-06 Thread Robert Haas
On Tue, Dec 6, 2016 at 1:28 AM, Fabien COELHO  wrote:
> First, my experience as a basic patch submitter is that any patch which does
> more than one thing at a time, even somehow closely related changes, is
> asked to be split into distinct sub-patches, and is harder to get through.
>
> Second, requiring more advanced features is a recipe for getting nothing in
> the end, because even if not "that complex" it requires significant more
> time to develop. The first step I outlined is enough to handle the submitted
> use case and is compatible with grand plans which would change significantly
> psql, so seems a reasonable intermediate target.
>
> Your experience as a seasoned core developer and a committer is probably
> different:-)

Well, everybody can have their own opinion on what is reasonable.
There are times I argue for making a patch smaller (frequent) and
times when I argue for making it bigger (rare).  We had pretty much
this exact same argument about the pgbench lexer and parser and in the
event I coded something up in less than a day.  This argument feels
like a rerun of that one.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pgcrypto compilation error due to stack-allocated EVP_CIPHER_CTX

2016-12-06 Thread Asif Naeem
Hi Michael,

Thank you for the v2 patch, I would like to comment on it. It seems that you
have used the function EVP_CIPHER_CTX_reset in the patch, which was introduced
in OpenSSL 1.1.0; older library versions might not work now. Is it an
intentional change?

Regards,
Muhammad Asif Naeem

On Tue, Dec 6, 2016 at 12:15 PM, Michael Paquier 
wrote:

> On Mon, Dec 5, 2016 at 6:09 PM, Michael Paquier
>  wrote:
> > On Mon, Dec 5, 2016 at 5:11 PM, Heikki Linnakangas 
> wrote:
> >> I'm afraid if we just start using EVP_CIPHER_CTX_new(), we'll leak the
> >> context on any error. We had exactly the same problem with
> EVP_MD_CTX_init
> >> being removed, in the patch that added OpenSSL 1.1.0 support. We'll
> have to
> >> use a resource owner to track it, just like we did with EVP_MD_CTX in
> commit
> >> 593d4e47. Want to do that, or should I?
> >
> > I'll send a patch within 24 hours.
>
> And within the delay, attached is the promised patch.
>
> While testing with a linked list, I have found out that the linked
> list could leak with cases like that, where decryption and encryption
> are done in a single transaction;
> select pgp_sym_decrypt(pgp_sym_encrypt(repeat('x',65530),'1'),'1') =
> repeat('x',65530);
>
> What has taken me a bit of time was to figure out that this bit is
> needed to free correctly elements in the open list:
> @@ -219,6 +220,8 @@ encrypt_free(void *priv)
>  {
> struct EncStat *st = priv;
>
> +   if (st->ciph)
> +   pgp_cfb_free(st->ciph);
> px_memset(st, 0, sizeof(*st));
> px_free(st);
>  }
> This does not matter on back-branches as things get cleaned up once
> the transaction memory context gets freed.
> --
> Michael
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>
>


Re: [HACKERS] Hash Indexes

2016-12-06 Thread Amit Kapila
On Tue, Dec 6, 2016 at 1:29 PM, Jeff Janes  wrote:
>
>
> I just occasionally insert a bunch of equal tuples, which have to be in
> overflow pages no matter how much splitting happens.
>
> I am getting vacuum errors against HEAD, after about 20 minutes or so (8
> cores).
>
> 49233  XX002 2016-12-05 23:06:44.087 PST:ERROR:  index "foo_index_idx"
> contains unexpected zero page at block 64941
> 49233  XX002 2016-12-05 23:06:44.087 PST:HINT:  Please REINDEX it.
> 49233  XX002 2016-12-05 23:06:44.087 PST:CONTEXT:  automatic vacuum of table
> "jjanes.public.foo"
>

Thanks for the report.  This can happen due to vacuum trying to access
a new page.  Vacuum can do so if (a) the calculation for maxbuckets
(in the metapage) is wrong or (b) it is trying to free the overflow page
twice.  Offhand, I don't see how that can happen in the code.  I will
investigate further to see if there is any other reason why vacuum
can access the new page.  BTW, have you done the test after commit
2f4193c3?  That doesn't appear to be the cause of this problem, but
still, it is better to test after that fix.  I am trying to reproduce
the issue, but in the meantime, if by any chance you have a call stack,
then please share it.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Quorum commit for multiple synchronous replication.

2016-12-06 Thread Michael Paquier
On Tue, Dec 6, 2016 at 6:57 PM, Masahiko Sawada  wrote:
> On Tue, Dec 6, 2016 at 1:11 PM, Fujii Masao  wrote:
>> If M (i.e., number of quorum sync standbys) is large enough,
>> your choice would be good. But usually M seems not so large.
>>
>
> Thank you for the comment.
>
> Another possible idea is to use the partial selection sort[1],
> which takes O(MN) time. Since this is more efficient when N is small,
> this would be better than qsort for this case. But I'm not sure that
> we would see such a difference in performance measurements.
>
> [1] https://en.wikipedia.org/wiki/Selection_algorithm#Partial_selection_sort

We'll begin to see a minimal performance impact when selecting a sync
standby across hundreds of them, which is what, say, 0.1% (or less) of
existing deployments are doing. The current approach taken seems simple
enough to be kept, and performance is not something to worry about much
IMHO.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Quorum commit for multiple synchronous replication.

2016-12-06 Thread Masahiko Sawada
On Tue, Dec 6, 2016 at 1:11 PM, Fujii Masao  wrote:
> On Mon, Nov 28, 2016 at 8:03 PM, Masahiko Sawada  
> wrote:
>> On Sat, Nov 26, 2016 at 10:27 PM, Michael Paquier
>>  wrote:
>>> On Tue, Nov 15, 2016 at 7:08 PM, Masahiko Sawada  
>>> wrote:
 Attached latest version patch incorporated review comments. After more
 thought, I agree and changed the value of standby priority in quorum
 method so that it's not set to 1 forcibly. All standby priorities are
 1 if s_s_names = 'ANY(*)'.
 Please review this patch.
>>>
>>> Sorry for my late reply. Here is my final lookup.
>>
>> Thank you for reviewing!
>>
>>>  
>>> -num_sync ( <replaceable class="parameter">standby_name</replaceable> [, ...] )
>>> +[ANY] num_sync ( <replaceable class="parameter">standby_name</replaceable> [, ...] )
>>> +FIRST num_sync ( <replaceable class="parameter">standby_name</replaceable> [, ...] )
>>>  standby_name [, ...
>>> This can just be replaced with [ ANY | FIRST ]. There is no need for
>>> braces as the keyword is not mandatory.
>>>
>>> +is the name of a standby server.
>>> +FIRST and ANY specify the method used by
>>> +the master to control the standby servres.
>>>  
>>> s/servres/servers/.
>>>
>>> if (priority == 0)
>>> values[7] = CStringGetTextDatum("async");
>>> else if (list_member_int(sync_standbys, i))
>>> -   values[7] = CStringGetTextDatum("sync");
>>> +   values[7] = SyncRepConfig->sync_method == SYNC_REP_PRIORITY 
>>> ?
>>> +   CStringGetTextDatum("sync") : 
>>> CStringGetTextDatum("quorum");
>>> else
>>> values[7] = CStringGetTextDatum("potential");
>>> This can be simplified a bit as "quorum" is the state value for all
>>> standbys with a non-zero priority when the method is set to
>>> SYNC_REP_QUORUM:
>>> if (priority == 0)
>>> values[7] = CStringGetTextDatum("async");
>>> +   else if (SyncRepConfig->sync_method == SYNC_REP_QUORUM)
>>> +   values[7] = CStringGetTextDatum("quorum");
>>> else if (list_member_int(sync_standbys, i))
>>> values[7] = CStringGetTextDatum("sync");
>>> else
>>>
>>> SyncRepConfig data is made external to syncrep.c with this patch as
>>> walsender.c needs to look at the sync method in place, no complaints
>>> about that after considering if there could be a more elegant way to
>>> do things without this change.
>>
>> Agreed.
>>
>>> While reviewing the patch, I have found a couple of incorrectly shaped
>>> sentences, both in the docs and some comments. Attached is a new
>>> version with this word-smithing. The patch is now switched as ready
>>> for committer.
>>
>> Thanks. I found a typo in v7 patch, so attached latest v8 patch.
>
> +qsort(write_array, len, sizeof(XLogRecPtr), cmp_lsn);
> +qsort(flush_array, len, sizeof(XLogRecPtr), cmp_lsn);
> +qsort(apply_array, len, sizeof(XLogRecPtr), cmp_lsn);
>
> In quorum commit, we need to calculate the N-th largest LSN from
> M quorum synchronous standbys' LSN. N would be usually 1 - 3 and
> M would be 1 - 10, I guess. You used an algorithm based on qsort for
> that calculation. But I'm not sure whether that's an effective enough
> algorithm or not.
>
> If M (i.e., number of quorum sync standbys) is large enough,
> your choice would be good. But usually M seems not so large.
>

Thank you for the comment.

Another possible idea is to use the partial selection sort[1],
which takes O(MN) time. Since this is more efficient when N is small,
this would be better than qsort for this case. But I'm not sure that
we would see such a difference in performance measurements.

[1] https://en.wikipedia.org/wiki/Selection_algorithm#Partial_selection_sort
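
To be concrete, here is a sketch of that approach for our case (not code
from the patch, just the textbook algorithm over an array of LSNs):

    /* pick the n-th largest LSN among m standby positions in O(m * n);
       assumes 1 <= n <= m, and reorders the input array in place */
    static XLogRecPtr
    nth_largest_lsn(XLogRecPtr *lsn, int m, int n)
    {
        int     i,
                j;

        for (i = 0; i < n; i++)
        {
            int         max = i;
            XLogRecPtr  tmp;

            for (j = i + 1; j < m; j++)
                if (lsn[j] > lsn[max])
                    max = j;
            tmp = lsn[i];
            lsn[i] = lsn[max];
            lsn[max] = tmp;
        }
        return lsn[n - 1];
    }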

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_dump, pg_dumpall and data durability

2016-12-06 Thread Michael Paquier
On Tue, Dec 6, 2016 at 6:00 PM, Fujii Masao  wrote:
> One idea is to provide a utility or extension which fsync's the specified
> files and directories (including all the files under them). It may be 
> overkill,
> though. But it would be useful for other purposes, for example, we can
> improve the durability of archived WAL files by setting this utility in
> archive_command, and we can flush any output files generated by psql
> by using it.

That's the pg_copy utility that we talked about a couple of months
(year(s)?) back, which would really be useful for archiving, and the
main advantage is that it would be cross-platform. Still, the patch
discussed here is more targeted: it makes sure that once pg_dump
exits, the things it created are on disk, facilitating the life of users
by not involving an extra step and by making the safe behavior the
default.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_dump, pg_dumpall and data durability

2016-12-06 Thread Fujii Masao
On Wed, Nov 16, 2016 at 1:27 AM, Robert Haas  wrote:
> On Sun, Nov 13, 2016 at 4:18 AM, Andres Freund  wrote:
>> On 2016-11-08 18:18:01 -0500, Tom Lane wrote:
>>> I think this might be better addressed by adding something to backup.sgml
>>> pointing out that you'd better fsync or sync your backups before assuming
>>> that they can't be lost.
>>
>> How does a normal user do that? I don't think there's any cross-platform
>> advice we can give, and even on *nix the answer basically is 'sync;
>> sync;' which is a pretty big hammer, and might be completely
>> unacceptable on a busy server.
>
> Yeah, that's a pretty fair point.  I see the point of this patch
> pretty clearly but somehow it makes me nervous anyway.  I'm not sure
> there's any better alternative to what's being proposed, though.

One idea is to provide a utility or extension which fsync's the specified
files and directories (including all the files under them). It may be overkill,
though. But it would be useful for other purposes, for example, we can
improve the durability of archived WAL files by setting this utility in
archive_command, and we can flush any output files generated by psql
by using it.
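
The core of such a utility is small, at least on Unix-like systems. A
sketch, with minimal error handling and assuming the platform allows fsync
on a directory opened read-only:

    #include <fcntl.h>
    #include <unistd.h>

    /* fsync a file, then its containing directory, so that both the
       file contents and the directory entry are durable */
    static int
    fsync_file_and_dir(const char *fname, const char *dirname)
    {
        int     fd;

        fd = open(fname, O_RDONLY);
        if (fd < 0)
            return -1;
        if (fsync(fd) != 0)
        {
            close(fd);
            return -1;
        }
        close(fd);

        fd = open(dirname, O_RDONLY);
        if (fd < 0)
            return -1;
        if (fsync(fd) != 0)
        {
            close(fd);
            return -1;
        }
        return close(fd);
    }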

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Adding in docs the meaning of pg_stat_replication.sync_state

2016-12-06 Thread Fujii Masao
On Tue, Dec 6, 2016 at 4:54 PM, Michael Paquier
 wrote:
> On Tue, Dec 6, 2016 at 4:38 PM, Fujii Masao  wrote:
>> I changed the order of the descriptions of the walsender states to an
>> intuitive one rather than an alphabetical one. Also I enhanced
>> the description of the "potential" state.
>>
>> Could you review the latest patch?
>
> Fine for me. The additions in high-availability.sgml are a good idea.

Pushed. Thanks!

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] raw output from copy

2016-12-06 Thread Pavel Stehule
2016-12-06 1:50 GMT+01:00 Kohei KaiGai :

> 2016-12-05 22:45 GMT+09:00 Pavel Stehule :
> >
> > There are more goals:
> >
> > 1. user friendly import of text or binary data - importing text data (with
> > psql) from a file is possible - but you have to load the content into a psql
> > variable. For binary data you have to use a workaround based on LO and a
> > transformation from LO to bytea.
> >
> > 2. user friendly export of text or binary data - now, the binary data can be
> > exported only via a transformation to LO. XML has very interesting
> > features when it is passed from/to the client in binary. This feature is
> > impossible in psql now.
> >
>   :
> 
>   :
> >> It seems to me extending the COPY statement for this optimization is a bit
> >> of an overkill solution. Can we find an alternative solution that we can
> >> build on the existing infrastructure?
> >
> > The advantage and sense of COPY RAW was reusing an existing interface. The
> > question was: how can I export/import binary data simply from the psql
> > console?
> >
> OK, I could get your point.
>
> Likely, we can implement the feature without a COPY statement enhancement
> by adding a special-purpose function and a \xxx command in psql.
>
> Let's assume the two commands below on psql:
>
> \blob_import <table> <column> (STDIN|<filename>)
> \blob_export <query> (STDOUT|<filename>)
>
> On \blob_import, the psql command reads the binary contents from either
> stdin or a file, then calls a special-purpose function that takes three
> arguments: table name, column name, and a binary data chunk.
> PQexecParams() of libpq allows the data chunk to be delivered while keeping
> the binary data format, then the special-purpose function will be able to
> look up the destination table/column and construct a tuple that contains
> the supplied data chunk. (I think the _recv handler shall be used for
> data validation, but that is not an element of this feature.)
>
>
> On \blob_export, the psql command also sets up a simple query as follows:
>   SELECT blob_export((<user-supplied query>))
> For example,
>   \blob_export SELECT binary_data FROM my_table WHERE id = 10   /tmp/aaa
> shall be transformed to
>   SELECT blob_export((SELECT binary_data FROM my_table WHERE id = 10))
>

This is the reason why I prefer a COPY statement - because it does all the
necessary things naturally.  But if there is disagreement against COPY RAW
it can be implemented as psql commands.

export should be similar to the \g, \gset feature

so

SELECT xmldoc FROM <table>
\gbinary_store <file>.xxx

import is maybe better solved by proposed file references in queries

Regards

Pavel



>
> This function is declared as:
>   blob_export(anyelement) RETURNS bytea
> So, as long as the user-supplied query returns exactly one column and
> one row, it can transform the argument to a binary stream, and the psql
> command receives it and dumps it somewhere; stdout or a file.
>
> How about your thought?
>
> Thanks,
> --
> KaiGai Kohei 
>


Re: [HACKERS] Hash Indexes

2016-12-06 Thread Jeff Janes
On Thu, Dec 1, 2016 at 10:54 PM, Amit Kapila 
wrote:

> On Thu, Dec 1, 2016 at 8:56 PM, Robert Haas  wrote:
> > On Thu, Dec 1, 2016 at 12:43 AM, Amit Kapila 
> wrote:
> >> On Thu, Dec 1, 2016 at 2:18 AM, Robert Haas 
> wrote:
> >>> On Wed, Nov 23, 2016 at 8:50 AM, Amit Kapila 
> wrote:
>  [ new patch ]
> >>>
> >>> Committed with some further cosmetic changes.
> >>
> >> Thank you very much.
> >>
> >>> I think it would be worth testing this code with very long overflow
> >>> chains by hacking the fill factor up to 1000
> >>
> >> 1000 is not a valid value for fill factor. Do you intend to say 100?
> >
> > No.  IIUC, 100 would mean split when the average bucket contains 1
> > page worth of tuples.
> >
>
> I also think so.
>
> >  I want to split when the average bucket
> > contains 10 pages worth of tuples.
> >
>
> oh, I think what you mean to say is hack the code to bump the fill factor
> and then test it.  I was confused about how a user could do that from a
> SQL command.
>

I just occasionally insert a bunch of equal tuples, which have to be in
overflow pages no matter how much splitting happens.

I am getting vacuum errors against HEAD, after about 20 minutes or so (8
cores).

49233  XX002 2016-12-05 23:06:44.087 PST:ERROR:  index "foo_index_idx"
contains unexpected zero page at block 64941
49233  XX002 2016-12-05 23:06:44.087 PST:HINT:  Please REINDEX it.
49233  XX002 2016-12-05 23:06:44.087 PST:CONTEXT:  automatic vacuum of
table "jjanes.public.foo"

The testing harness is attached.  It includes a lot of code to test crash
recovery, but all of that stuff is turned off in this instance. No patches
need to be applied to the server to get this one to run.


With the latest HASH WAL patch applied, I get different but apparently
related errors

41993 UPDATE XX002 2016-12-05 22:28:45.333 PST:ERROR:  index
"foo_index_idx" contains corrupted page at block 27602
41993 UPDATE XX002 2016-12-05 22:28:45.333 PST:HINT:  Please REINDEX it.
41993 UPDATE XX002 2016-12-05 22:28:45.333 PST:STATEMENT:  update foo set
count=count+1 where index=$1

Cheers,

Jeff


count.pl
Description: Binary data


do_nocrash.sh
Description: Bourne shell script

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers