Re: [HACKERS] Speed up Clog Access by increasing CLOG buffers

2016-03-23 Thread Amit Kapila
On Wed, Mar 23, 2016 at 12:26 PM, Amit Kapila 
wrote:
>
> On Tue, Mar 22, 2016 at 4:22 PM, Andres Freund  wrote:
> >
> >
> > I think it's worthwhile to create a benchmark that does something like
> > BEGIN;SELECT ... FOR UPDATE; SELECT pg_sleep(random_time);
> > INSERT;COMMIT; you'd find that if random is a bit larger (say 20-200ms,
> > completely realistic values for network RTT + application computation),
> > the success rate of group updates shrinks noticeably.
> >
>
> Will do some tests based on the above and share results.
>

Forgot to mention that the effect of the patch is more visible with unlogged
tables, so I will do the tests with those, and I request that you use the
same if you are also planning to perform some tests.
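
For reference, here is a minimal libpq sketch of the workload Andres
describes above (the table, key, and sleep range are illustrative
assumptions, not from this thread):

/* gcc -o clog_bench clog_bench.c -lpq */
#include <stdio.h>
#include <libpq-fe.h>

static void
run(PGconn *conn, const char *sql)
{
    PGresult   *res = PQexec(conn, sql);

    if (PQresultStatus(res) != PGRES_COMMAND_OK &&
        PQresultStatus(res) != PGRES_TUPLES_OK)
        fprintf(stderr, "%s failed: %s", sql, PQerrorMessage(conn));
    PQclear(res);
}

int
main(void)
{
    PGconn     *conn = PQconnectdb("dbname=postgres");
    int         i;

    if (PQstatus(conn) != CONNECTION_OK)
        return 1;

    for (i = 0; i < 1000; i++)
    {
        run(conn, "BEGIN");
        run(conn, "SELECT abalance FROM pgbench_accounts"
                  " WHERE aid = 1 FOR UPDATE");
        /* simulate 20-200ms of network RTT + application computation */
        run(conn, "SELECT pg_sleep(0.02 + random() * 0.18)");
        run(conn, "INSERT INTO pgbench_history (aid, delta) VALUES (1, 1)");
        run(conn, "COMMIT");
    }

    PQfinish(conn);
    return 0;
}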

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] VS 2015 support in src/tools/msvc

2016-03-23 Thread Michael Paquier
On Mon, Mar 21, 2016 at 3:43 AM, Andrew Dunstan  wrote:
>
>
> On 03/20/2016 12:02 AM, Michael Paquier wrote:
>>
>> On Sun, Mar 20, 2016 at 8:06 AM, Andrew Dunstan 
>> wrote:
>>>
>>>
>>> Still to do: the non-perl pieces.
>>
>> The patch to address locales is the sensitive part. The patch from
>> Petr is taking the correct approach though I think that we had better
>> simplify it a bit and if possible we had better not rely on the else
>> block it introduces.
>
>
>
> OK, then please send an updated set of patches for what remains.

Here you go:
- 0001 fixes the global declarations of TIMEZONE_GLOBAL and
TZNAME_GLOBAL to be WIN32-compliant. I got bitten by that in the ECPG
compilation.
- 0002, support of VS2015 in the scripts of src/tools/msvc
- 0003, this is necessary in order to run the regression tests,
details are here:
http://www.postgresql.org/message-id/cab7npqtdixxs8cdl8moivh6qfq-1qv9mkn0ayjzybvzv6wc...@mail.gmail.com
- 0004 is the fix for locales. This is actually Petr's work upthread,
I have updated comments in the code though and did a bit more
polishing. This still looks like the cleanest solution we have. Other
solutions are mentioned upthread: redeclaration of struct _locale_t in
backend code is one.
-- 
Michael


0001-Fix-declaration-of-TIMEZONE_GLOBAL-and-TZNAME_GLOBAL.patch
Description: binary/octet-stream


0002-Add-support-for-VS-2015-in-MSVC-scripts.patch
Description: binary/octet-stream


0003-Fix-non-compliant-boolean-declarations-with-stdbool.patch
Description: binary/octet-stream


0004-Fix-code-page-calculation-for-Visual-Studio-2015.patch
Description: binary/octet-stream



Re: [HACKERS] Proposal: Generic WAL logical messages

2016-03-23 Thread Petr Jelinek

On 22/03/16 14:11, Andres Freund wrote:

On 2016-03-22 14:03:06 +0100, Petr Jelinek wrote:

On 22/03/16 12:47, Andres Freund wrote:

On 2016-03-21 18:10:55 +0100, Petr Jelinek wrote:


+   <sect3 id="logicaldecoding-output-plugin-message">
+    <title>Generic Message Callback</title>
+
+    <para>
+     The optional <function>message_cb</function> callback is called whenever
+     a logical decoding message has been decoded.
+<programlisting>
+typedef void (*LogicalDecodeMessageCB) (
+    struct LogicalDecodingContext *,
+    ReorderBufferTXN *txn,
+    XLogRecPtr message_lsn,
+    const char *prefix,
+    Size message_size,
+    const char *message
+);
+</programlisting>
+    </para>
+   </sect3>


I see you removed the transactional parameter. I'm doubtful that that's
a good idea: it seems like it'd be rather helpful to pass the
transaction for a non-transactional message that's emitted while an xid
was assigned?



Hmm, but won't that give the output plugin even transactions that were
later aborted? That seems quite different from how the txn parameter
behaves everywhere else.


Seems harmless to me if called out.



All right, after some consideration I agree.






+/*
+ * Handle rmgr LOGICALMSG_ID records for DecodeRecordIntoReorderBuffer().
+ */
+static void
+DecodeLogicalMsgOp(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
+{
+   SnapBuild  *builder = ctx->snapshot_builder;
+   XLogReaderState *r = buf->record;
+   uint8   info = XLogRecGetInfo(r) & ~XLR_INFO_MASK;
+   xl_logical_message *message;
+
+   if (info != XLOG_LOGICAL_MESSAGE)
+       elog(ERROR, "unexpected RM_LOGICALMSG_ID record type: %u", info);
+
+   message = (xl_logical_message *) XLogRecGetData(r);
+
+   if (message->transactional)
+   {
+       if (!SnapBuildProcessChange(builder, XLogRecGetXid(r), buf->origptr))
+           return;
+
+       ReorderBufferQueueMessage(ctx->reorder, XLogRecGetXid(r),
+                                 buf->endptr,
+                                 message->message, /* first part of message is prefix */
+                                 message->message_size,
+                                 message->message + message->prefix_size);
+   }
+   else if (SnapBuildCurrentState(builder) == SNAPBUILD_CONSISTENT &&
+            !SnapBuildXactNeedsSkip(builder, buf->origptr))
+   {
+       volatile Snapshot snapshot_now;
+       ReorderBuffer *rb = ctx->reorder;
+
+       /* setup snapshot to allow catalog access */
+       snapshot_now = SnapBuildGetOrBuildSnapshot(builder, XLogRecGetXid(r));
+       SetupHistoricSnapshot(snapshot_now, NULL);
+       rb->message(rb, NULL, buf->origptr, message->message,
+                   message->message_size,
+                   message->message + message->prefix_size);
+       TeardownHistoricSnapshot(false);
+   }
+}


A number of things:
1) The SnapBuildProcessChange needs to be toplevel, not just for
transactional messages - we can't yet necessarily build a snapshot.


Nope, the snapshot state is checked in the else if.


2) I'm inclined to move even the non-transactional stuff to reorderbuffer.


Well, it's not doing anything with the reorderbuffer, but sure, it can be
done (I didn't do it in the attached version, though).


I think there'll be some interactions if we ever do some parts in
parallel and such.  I'd rather keep decode.c to only do the lowest level
stuff.



I did it that way, but I do like the resulting code less.
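
For context, a minimal message_cb implementation matching the callback
signature documented above might look like the sketch below (illustrative
only; note that, per the discussion above, a transactional parameter was
later agreed on):

static void
pg_decode_message(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
                  XLogRecPtr lsn, const char *prefix,
                  Size message_size, const char *message)
{
    OutputPluginPrepareWrite(ctx, true);
    appendStringInfo(ctx->out, "message: prefix: %s, sz: %zu, content: ",
                     prefix, message_size);
    /* the payload is not null-terminated, so append it by length */
    appendBinaryStringInfo(ctx->out, message, message_size);
    OutputPluginWrite(ctx, true);
}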

--
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services
From 6076cb5d85cab4cfe4cb99499ed49e5a0c384033 Mon Sep 17 00:00:00 2001
From: Petr Jelinek 
Date: Wed, 24 Feb 2016 17:02:36 +0100
Subject: [PATCH] Logical Decoding Messages

---
 contrib/test_decoding/Makefile  |   2 +-
 contrib/test_decoding/expected/ddl.out  |  21 ++--
 contrib/test_decoding/expected/messages.out |  56 +++
 contrib/test_decoding/sql/ddl.sql   |   3 +-
 contrib/test_decoding/sql/messages.sql  |  17 
 contrib/test_decoding/test_decoding.c   |  18 
 doc/src/sgml/func.sgml  |  45 +
 doc/src/sgml/logicaldecoding.sgml   |  38 
 src/backend/access/rmgrdesc/Makefile|   4 +-
 src/backend/access/rmgrdesc/logicalmsgdesc.c|  41 
 src/backend/access/transam/rmgr.c   |   1 +
 src/backend/replication/logical/Makefile|   2 +-
 src/backend/replication/logical/decode.c|  46 +
 src/backend/replication/logical/logical.c   |  38 
 src/backend/replication/logical/logicalfuncs.c  |  27 ++
 src/backend/replication/logical/message.c   |  85 +
 src/backend/replication/logical/reorderbuffer.c | 121 +++

Re: [HACKERS] Reworks of CustomScan serialization/deserialization

2016-03-23 Thread Kouhei Kaigai
> -Original Message-
> From: pgsql-hackers-ow...@postgresql.org
> [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Petr Jelinek
> Sent: Thursday, March 17, 2016 5:06 PM
> To: Kaigai Kouhei(海外 浩平)
> Cc: Robert Haas; pgsql-hackers@postgresql.org
> Subject: Re: [HACKERS] Reworks of CustomScan serialization/deserialization
> 
> On 15/03/16 05:03, Kouhei Kaigai wrote:
> > Petr,
> >
> > The attached patch is the revised one that follows the new extensible-
> > node routine.
> >
> > It is almost same the previous version except for:
> > - custom-apis.[ch] was renamed to custom-node.[ch]
> > - check for the length of custom-scan-method name followed
> >the manner of RegisterExtensibleNodeMethods()
> >
> 
> Hi,
> 
> looks good, only nitpick I have is that it probably should be
> custom_node.h with underscore given that we use underscore everywhere
> (except for libpq and for some reason atomic ops).
>
Sorry for my late response.

The attached patch just renames custom-node.[ch] to custom_node.[ch].
Other portions are not changed from the previous revision.

Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei 




Re: [HACKERS] pg_xlogdump fails to handle WAL file with multi-page XLP_FIRST_IS_CONTRECORD data

2016-03-23 Thread Michael Paquier
On Wed, Mar 23, 2016 at 3:43 PM, Pavan Deolasee
 wrote:
> While investigating some issue, I found that pg_xlogdump fails to dump
> contents from a WAL file if the file has continuation data from a previous
> WAL record and the data spans more than one page. In such cases,
> XLogFindNextRecord() fails to take into account that there will be more
> than one xlog page header (long and short) and thus tries to read from an
> offset where no valid record exists. That results in pg_xlogdump throwing
> an error such as:

OK. That makes sense, that is indeed a possible scenario.

> The attached WAL file from the master branch demonstrates the issue,
> generated using synthetic data. Also, the attached patch fixes it for me.

So does it for me.

> While we could have deduced the number of short and long headers and
> skipped directly to the offset, I found reading one page at a time and
> using XLogPageHeaderSize() to find the header size of each page separately
> a much cleaner way. Also, the continuation data is not going to span many
> pages, so I don't see any harm in doing it that way.

I have to agree; the new code is pretty clean this way, calculating the
next page's LSN directly in the loop should the record be longer than a
page.
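
To illustrate the approach under discussion, here is a hedged sketch (not
the patch itself; read_page() is a hypothetical helper standing in for
ReadPageInternal()):

XLogRecPtr  tmpRecPtr = pageStart;  /* assumed page-aligned start point */

for (;;)
{
    XLogPageHeader  header = read_page(tmpRecPtr);
    uint32          pageHeaderSize = XLogPageHeaderSize(header);

    if ((header->xlp_info & XLP_FIRST_IS_CONTRECORD) == 0)
    {
        /* no continuation data: the first record follows the header */
        tmpRecPtr += pageHeaderSize;
        break;
    }

    if (pageHeaderSize + MAXALIGN(header->xlp_rem_len) < XLOG_BLCKSZ)
    {
        /* the continuation ends on this page: skip past it */
        tmpRecPtr += pageHeaderSize + MAXALIGN(header->xlp_rem_len);
        break;
    }

    /* continuation fills this whole page: inspect the next page header */
    tmpRecPtr += XLOG_BLCKSZ;
}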

> I encountered this on 9.3, but the patch applies to both 9.3 and master. I
> haven't tested it on other branches, but I have no reason to believe it
> won't apply or work. I believe we should back-patch it to all supported
> branches.

Agreed, that's a bug, and the logic of your patch looks good to me.

+   /*
+* Compute targetRecOff. It should typically be greater than short
+* page-header since a valid record can't , but can also be zero when
+* caller has supplied a page-aligned address or when we are skipping
+* multi-page continuation record. It doesn't matter though because
+* ReadPageInternal() will read at least short page-header worth of
+* data
+*/
This needs some polishing, there is an unfinished sentence here.

+   targetRecOff = tmpRecPtr % XLOG_BLCKSZ;
targetRecOff, pageHeaderSize and targetPagePtr could be declared directly
inside the new while loop.
-- 
Michael




Re: [HACKERS] Speed up Clog Access by increasing CLOG buffers

2016-03-23 Thread Amit Kapila
On Tue, Mar 22, 2016 at 12:33 PM, Dilip Kumar  wrote:
>
>
> On Thu, Mar 17, 2016 at 11:39 AM, Amit Kapila 
wrote:
>
>  I have reviewed the patch; here are some review comments. I will
continue to review.
>
> 1.
>
> +
> + /*
> +  * Add the proc to list, if the clog page where we need to update the
>
> +  */
> + if (nextidx != INVALID_PGPROCNO &&
> + ProcGlobal->allProcs[nextidx].clogGroupMemberPage !=
proc->clogGroupMemberPage)
> + return false;
>
> Should we clear all these structure variables that we set above in case
we are not adding ourselves to the group? I can see it will not cause any
problem even if we don't clear them.
> I think if we don't want to clear them, we can add a comment mentioning
the same.
>

I have updated the patch to just clear clogGroupMember as that is what is
done when we wake the processes.

>
> 2.
>
> Here we are updating our own proc; I think we don't need an atomic
operation here, as we are not yet added to the list.
>
> + if (nextidx != INVALID_PGPROCNO &&
> + ProcGlobal->allProcs[nextidx].clogGroupMemberPage !=
proc->clogGroupMemberPage)
> + return false;
> +
> + pg_atomic_write_u32(&proc->clogGroupNext, nextidx);
>
>

We won't be able to assign nextidx to clogGroupNext directly, as
clogGroupNext is of type pg_atomic_uint32.
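
For readers following along, the surrounding pattern (a sketch based on
this discussion and the patch excerpts above, not the exact patch text) is
the usual lock-free push onto a group list, which is also why the write
must go through the atomic API:

while (true)
{
    uint32      nextidx = pg_atomic_read_u32(&procglobal->clogGroupFirst);

    /* give up if the group leader is updating a different clog page */
    if (nextidx != INVALID_PGPROCNO &&
        ProcGlobal->allProcs[nextidx].clogGroupMemberPage !=
        proc->clogGroupMemberPage)
        return false;

    /* clogGroupNext is pg_atomic_uint32, hence the atomic write */
    pg_atomic_write_u32(&proc->clogGroupNext, nextidx);

    if (pg_atomic_compare_exchange_u32(&procglobal->clogGroupFirst,
                                       &nextidx,
                                       (uint32) proc->pgprocno))
        break;
}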

Thanks for the review.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] Reworks of CustomScan serialization/deserialization

2016-03-23 Thread Petr Jelinek

On 23/03/16 08:34, Kouhei Kaigai wrote:

-Original Message-
From: pgsql-hackers-ow...@postgresql.org
[mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Petr Jelinek
Sent: Thursday, March 17, 2016 5:06 PM
To: Kaigai Kouhei(海外 浩平)
Cc: Robert Haas; pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] Reworks of CustomScan serialization/deserialization

On 15/03/16 05:03, Kouhei Kaigai wrote:

Petr,

The attached patch is the revised one that follows the new extensible-
node routine.

It is almost same the previous version except for:
- custom-apis.[ch] was renamed to custom-node.[ch]
- check for the length of custom-scan-method name followed
the manner of RegisterExtensibleNodeMethods()



Hi,

looks good, only nitpick I have is that it probably should be
custom_node.h with underscore given that we use underscore everywhere
(except for libpq and for some reason atomic ops).


Sorry for my late response.

The attached patch just renames custom-node.[ch] to custom_node.[ch].
Other portions are not changed from the previous revision.



Forgot to attach?

--
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] WIP: Access method extendability

2016-03-23 Thread Alexander Korotkov
On Tue, Mar 22, 2016 at 11:53 PM, Alvaro Herrera 
wrote:

> Alexander Korotkov wrote:
> > On Tue, Mar 22, 2016 at 1:26 AM, Alvaro Herrera <
> alvhe...@2ndquadrant.com>
> > wrote:
>
> > > I noticed this state of affairs because I started reading the complete
> > > thread in order to verify whether a consensus had been reached
> regarding
> > > the move of IndexQualInfo and GenericCosts to selfuncs.h.  But I only
> > > see that Jim Nasby mentioned it and Alexander said "yes they move";
> > > nothing else.  The reason for raising this is that, according to
> > > Alexander, new index AMs will want to use those structs; but current
> > > index AMs only include index_selfuncs.h, and none of them includes
> > > selfuncs.h, so it seems completely wrong to be importing all the
> planner
> > > knowledge embodied in selfuncs.h into extension index AMs.
> > >
> > > One observation is that the bloom AM patch (0003 of this series) uses
> > > GenericCosts but not IndexQualInfo.  But btcostestimate() and
> > > gincostestimate() both use IndexQualInfo, so other AMs might well want
> > > to use it too.
> > >
> > > We could move GenericCosts and the prototype for genericcostestimate()
> > > to index_selfuncs.h; that much is pretty reasonable.  But I'm not sure
> > > what to do about IndexQualInfo.  We could just add forward struct
> > > declarations for RestrictInfo and Node to index_selfuncs.h.  But then
> > > the extension code is going to have to import the includes were those
> > > are defined anyway.  Certainly it seems that deconstruct_indexquals()
> > > (which returns a list of IndexQualInfo) is going to be needed by
> > > extension index AMs, at least ...
> >
> > Thank you for highlighting these issues.
>
> After thinking some more about this, I think it's all right to move both
> IndexQualInfo and GenericCosts to selfuncs.h.  The separation that we
> want is this: the selectivity functions themselves need access to a lot
> of planner knowledge, and so selfuncs.h knows a lot about planner.  But
> the code that _calls_ the selectivity function doesn't; and
> index_selfuncs.h is about the latter.  Properly architected extension
> index AMs are going to have their selectivity functions in a separate .c
> file which knows a lot about planner, and which includes selfuncs.h (and
> maybe index_selfuncs.h if it wants to piggyback on the existing
> > > selectivity functions, but that's unlikely); only the prototype for the
> selfunc is going to be needed elsewhere, and the rest of the index code
> is not going to know anything about planner stuff so it will not need to
> include selfuncs.h nor index_selfuncs.h.


Right, the purposes of the headers are:
 * selfuncs.h – for those who are going to estimate selectivity,
 * index_selfuncs.h – for those who are going to call the selectivity
estimation functions of built-in AMs.

From this point of view we should keep both IndexQualInfo and GenericCosts
in selfuncs.h.

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: [HACKERS] Performance degradation in commit 6150a1b0

2016-03-23 Thread Ashutosh Sharma
Hi All,

I have been working on this issue for the last few days, trying to
investigate the probable reasons for the performance degradation at commit
6150a1b0. After going through Andres's patch for moving the buffer I/O and
content locks out of the main tranche, the following two things come to
mind.

1. The content lock is no longer used as a pointer in the BufferDesc
structure; instead it is included as an LWLock structure. This basically
increases the overall structure size from 64 bytes to 80 bytes. To
investigate this, I reverted the changes related to the content lock from
commit 6150a1b0 and took at least 10 readings, and with this change I can
see that the overall performance is similar to what was observed earlier,
i.e. before commit 6150a1b0.

2. Secondly, I can see that the BufferDesc structure is padded to 64 bytes,
whereas the PG cache-line alignment (PG_CACHE_LINE_SIZE) is 128 bytes.
After changing the BufferDesc padding size to 128 bytes (see the sketch
below) along with the changes mentioned in point #1 above, I see that the
overall performance is again similar to what was observed before commit
6150a1b0.
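
For reference, commit 6150a1b0 keeps each descriptor on its own cache line
by padding the struct with a union, roughly as below; the experiment in
point #2 amounts to raising the pad target (shown here as an assumed
constant) from 64 to 128 bytes:

#define BUFFERDESC_PAD_TO_SIZE  64  /* 128 in the experiment above */

typedef union BufferDescPadded
{
    BufferDesc  bufferdesc;
    char        pad[BUFFERDESC_PAD_TO_SIZE];
} BufferDescPadded;

#define GetBufferDescriptor(id) (&BufferDescriptors[(id)].bufferdesc)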

Please have a look into the attached test report that contains the
performance test results for all the scenarios discussed above and let me
know your thoughts.

With Regards,
Ashutosh Sharma
EnterpriseDB: http://www.enterprisedb.com

On Sat, Feb 27, 2016 at 9:26 AM, Andres Freund  wrote:

> On February 26, 2016 7:55:18 PM PST, Amit Kapila 
> wrote:
> >On Sat, Feb 27, 2016 at 12:41 AM, Andres Freund 
> >wrote:
> >>
> >> Hi,
> >>
> >> On 2016-02-25 12:56:39 +0530, Amit Kapila wrote:
> >> > From past few weeks, we were facing some performance degradation in
> >the
> >> > read-only performance bench marks in high-end machines.  My
> >colleague
> >> > Mithun, has tried by reverting commit ac1d794 which seems to
> >degrade the
> >> > performance in HEAD on high-end m/c's as reported previously[1],
> >but
> >still
> >> > we were getting degradation, then we have done some profiling to
> >see
> >what
> >> > has caused it  and we found that it's mainly caused by spin lock
> >when
> >> > called via pin/unpin buffer and then we tried by reverting commit
> >6150a1b0
> >> > which has recently changed the structures in that area and it turns
> >out
> >> > that reverting that patch, we don't see any degradation in
> >performance.
> >> > The important point to note is that the performance degradation
> >doesn't
> >> > occur every time, but if the tests are repeated twice or thrice, it
> >> > is easily visible.
> >>
> >> > m/c details
> >> > IBM POWER-8
> >> > 24 cores,192 hardware threads
> >> > RAM - 492GB
> >> >
> >> > Non-default postgresql.conf settings-
> >> > shared_buffers=16GB
> >> > max_connections=200
> >> > min_wal_size=15GB
> >> > max_wal_size=20GB
> >> > checkpoint_timeout=900
> >> > maintenance_work_mem=1GB
> >> > checkpoint_completion_target=0.9
> >> >
> >> > scale_factor - 300
> >> >
> >> > Performance at commit 43cd468cf01007f39312af05c4c92ceb6de8afd8 is
> >469002 at
> >> > 64-client count and then at
> >6150a1b08a9fe7ead2b25240be46dddeae9d98e1, it
> >> > went down to 200807.  This performance numbers are median of 3
> >15-min
> >> > pgbench read-only tests.  The similar data is seen even when we
> >revert
> >the
> >> > patch on latest commit.  We have yet to perform detail analysis as
> >to
> >why
> >> > the commit 6150a1b08a9fe7ead2b25240be46dddeae9d98e1 lead to
> >degradation,
> >> > but any ideas are welcome.
> >>
> >> Ugh. Especially the varying performance is odd. Does it vary between
> >> restarts, or is it just happenstance?  If it's the former, we might
> >be
> >> dealing with some alignment issues.
> >>
> >
> >It varies between restarts.
> >
> >>
> >> If not, I wonder if the issue is massive buffer header contention. As
> >a
> >> LL/SC architecture acquiring the content lock might interrupt buffer
> >> spinlock acquisition and vice versa.
> >>
> >> Does applying the patch from
> >
> http://archives.postgresql.org/message-id/CAPpHfdu77FUi5eiNb%2BjRPFh5S%2B1U%2B8ax4Zw%3DAUYgt%2BCPsKiyWw%40mail.gmail.com
> >> change the picture?
> >>
> >
> >Not tried, but if this is alignment issue as you are suspecting above,
> >then
> >does it make sense to try this out?
>
> It's the other theory I had. And it's additionally useful testing
> regardless of this regression...
>
> ---
> Please excuse brevity and formatting - I am writing this on my mobile
> phone.
>
>
>


Performance_Results.xlsx
Description: MS-Excel 2007 spreadsheet



Re: [HACKERS] Support for N synchronous standby servers - take 2

2016-03-23 Thread Kyotaro HORIGUCHI
Hello,

At Tue, 22 Mar 2016 23:08:36 +0900, Fujii Masao  wrote 
in 
> On Tue, Mar 22, 2016 at 9:58 PM, Kyotaro HORIGUCHI
>  wrote:
> > Thank you for the revised patch.
> 
> Thanks for reviewing the patch!
> 
> > This version looks to focus on the n-priority method. Stuff for the
> > other methods like n-quorum has been removed. It is okay for me.
> 
> I don't think it's so difficult to extend this version so that
> it supports also quorum commit.

Mmm. I think I understand this just now. As Sawada-san said before, with
all standbys in a single-level quorum set having the same
sync_standby_priority, the current algorithm works as it is. It is also
true for the case where some quorum sets are in a priority set.

What about some priority sets in a quorum set?

> > StringInfo for double-quoted names seems to me to be overkill,
> > since it allocates a 1024-byte block for every such name. A static
> > buffer seems enough for this usage, as I said.
> 
> So, what about changing the scanner code as follows?
> 
> {xdstop}  {
>               yylval.str = pstrdup(xdbuf.data);
>               pfree(xdbuf.data);
>               BEGIN(INITIAL);
>               return NAME;
>           }
> 
> > The parser is called not only for SIGHUP, but also at the start of
> > every walsender. The latter is not necessary, but it is a matter of
> > trade-off between simplicity and effectiveness.
> 
> Could you elaborate why you think that's not necessary?

Sorry, the startup of a walsender is not so large a problem; 1024 bytes of
memory is just abandoned once. SIGHUP is more of a problem.

The part is called under two kinds of memory context, "config file
processing" and then "Replication command context". The former is deleted
just after reading the config file, so no harm, but the latter is a quite
long-lasting context, and every reload bloats it with abandoned memory
blocks. The memory needs to be pfreed, or a memory context with a shorter
lifetime should be used, or static storage of 64-byte length, even though
the bloat becomes visible only after very many conf reloads.
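
A sketch of the shorter-lifetime alternative (illustrative only, not from
any posted patch):

MemoryContext   parse_cxt;
MemoryContext   old_cxt;

/*
 * Parse s_s_names in a throwaway context, so that repeated SIGHUP
 * reloads don't bloat the long-lived "Replication command context".
 */
parse_cxt = AllocSetContextCreate(CurrentMemoryContext,
                                  "s_s_names parser",
                                  ALLOCSET_DEFAULT_MINSIZE,
                                  ALLOCSET_DEFAULT_INITSIZE,
                                  ALLOCSET_DEFAULT_MAXSIZE);
old_cxt = MemoryContextSwitchTo(parse_cxt);

/* ... run the syncrep parser here ... */

MemoryContextSwitchTo(old_cxt);
MemoryContextDelete(parse_cxt);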


> BTW, in the previous patch, s_s_names is parsed by the postmaster during
> server startup. A child process takes over the internal data struct for
> the parsed s_s_names when it's forked by the postmaster. This is what the
> previous patch was expecting. However, this doesn't work in an
> EXEC_BACKEND environment. In that environment, the data struct should be
> passed to a child process via the special file (like
> write_nondefault_variables() does), or it should be constructed during
> walsender startup (like the latest version of the patch does). IMO the
> latter is simpler.

Ah, I hadn't noticed that, but I agree with it.


As per my previous comment, syncrep_scanner.l doesn't reject some
(nonprintable and multibyte) characters in a name, which are silently
replaced with '?' for application_name. It would not be a problem for
almost all of us, but it might need to be documented if we don't change
the behavior to be the same as application_name's.

By the way, the following documentation fix mentioned by Thomas,

-to as 2-safe replication in computer science theory.
+to as group-safe replication in computer science theory.

should be restored if the discussion in the following message is
true. And some supplemental description would be needed.

http://www.postgresql.org/message-id/20160316.164833.188624159.horiguchi.kyot...@lab.ntt.co.jp


regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center






Re: [HACKERS] Bug in searching path in jsonb_set when walking through JSONB array

2016-03-23 Thread Oleg Bartunov
On Wed, Mar 23, 2016 at 6:37 AM, Vitaly Burovoy 
wrote:

> Hello, Hackers!
>
> While I was reviewing a patch with the "json_insert" function, I found a
> bug which isn't connected with the patch and is reproducible on master.
>
> It complains about a non-integer whereas the input values are obviously
> integers and in the allowed range.
> More testing led to the understanding that it appears when the number's
> length is a multiple of 4:
>
> postgres=# select jsonb_set('{"a":[[],1,2,3],"b":[]}', '{"a", }',
> '"4"');
> ERROR:  path element at the position 2 is not an integer
>

Hmm, I see in master

select version();
 version
-
 PostgreSQL 9.6devel on x86_64-apple-darwin15.4.0, compiled by Apple LLVM
version 7.3.0 (clang-703.0.29), 64-bit
(1 row)

select jsonb_set('{"a":[[],1,2,3],"b":[]}', '{"a", }', '"4"');
 jsonb_set

 {"a": [[], 1, 2, 3, "4"], "b": []}
(1 row)




>
> postgres=# select jsonb_set('{"a":[[],1,2,3],"b":[]}', '{"b", 1000}',
> '"4"');
> ERROR:  path element at the position 2 is not an integer
>
> postgres=# select jsonb_set('{"a":[[],1,2,3],"b":[]}', '{"a", -999}',
> '"4"');
> ERROR:  path element at the position 2 is not an integer
>
> postgres=# select jsonb_set('{"a":[[],1,2,3],"b":[]}', '{"a",
> 1000}', '"4"');
> ERROR:  path element at the position 2 is not an integer
>
> Close values are ok:
> postgres=# select jsonb_set('{"a":[[],1,2,3]}', '{"a", 0, 999}', '"4"');
> jsonb_set
> -
>  {"a": [["4"], 1, 2, 3]}
> (1 row)
>
> postgres=# select jsonb_set('{"a":[[],1,2,3]}', '{"a", 0, 1}', '"4"');
> jsonb_set
> -
>  {"a": [["4"], 1, 2, 3]}
> (1 row)
>
>
> Research led to setPathArray, where a string obtained via VARDATA_ANY is
> passed to strtol, which expects a cstring.
>
> When the string's length is not a multiple of 4, the remaining bytes are
> padded with '\0'; when the length is a multiple of 4, there is no padding,
> just garbage after the last digit of the value.
>
> The proposed patch in the attachment fixes it.
>
> There is a magic number "20" as the length of an array for copying a key
> from a path before passing it to strtol. It is the maximal length of a
> value which can be parsed by the function. I could not find a proper
> constant for it. I also found similar literal values elsewhere in the code
> (e.g. in numeric.c).
>
> I've added a comment; I hope it is enough.
>
>
> --
> Best regards,
> Vitaly Burovoy
>
>
>
>


Re: [HACKERS] Postgres_fdw join pushdown - getting server crash in left outer join of three table

2016-03-23 Thread Rajkumar Raghuwanshi
Thanks Ashutosh for the patch. I have applied it and retested; I am no
longer getting the server crash.

Thanks & Regards,
Rajkumar Raghuwanshi
QMG, EnterpriseDB Corporation

On Mon, Mar 21, 2016 at 8:02 PM, Ashutosh Bapat <
ashutosh.ba...@enterprisedb.com> wrote:

> Thanks Michael for looking into this.
>
>
>
>> In get_useful_ecs_for_relation, it seems to me that this assertion
>> should be removed and replaced by an actual check because even if
>> right_ec and left_ec are initialized, we cannot be sure that ec_relids
>> contains the relations specified:
>> /*
>>  * restrictinfo->mergeopfamilies != NIL is sufficient to guarantee
>>  * that left_ec and right_ec will be initialized, per comments in
>>  * distribute_qual_to_rels, and rel->joininfo should only contain
>> ECs
>>  * where this relation appears on one side or the other.
>>  */
>> if (bms_is_subset(relids, restrictinfo->right_ec->ec_relids))
>> useful_eclass_list =
>> list_append_unique_ptr(useful_eclass_list,
>>
>>  restrictinfo->right_ec);
>> else
>> {
>> Assert(bms_is_subset(relids,
>> restrictinfo->left_ec->ec_relids));
>> useful_eclass_list =
>> list_append_unique_ptr(useful_eclass_list,
>>
>> restrictinfo->left_ec);
>> }
>>
>
> An EC covers all the relations covered by all the equivalence members it
> contains. In the case of a mergejoinable clause for an outer join, the EC
> may cover just a single relation, whose column appears on either side of
> the clause. In this case, bms_is_subset() for a given join relation
> covering a single relation in the EC will be false. So, we have to use
> bms_overlap() instead of bms_is_subset(). The caller
> get_useful_pathkeys_for_rel() extracts the equivalence member (if any)
> which is entirely covered by the given relation. Otherwise, you are
> correct that we have to convert the assertion into a condition. I have
> added comments in get_useful_ecs_for_relation() explaining why.
>
> See for example the attached (with more tests including combinations
>> of joins, and three-table joins). I have added an open item for 9.6 on
>> the wiki.
>>
>
> Thanks for those tests. Actually, that code is relevant for joins which
> cannot be pushed down to the foreign server. For such joins we try to add
> pathkeys which will help merge joins. I have included the relevant tests,
> rewriting them to use local tables so that the entire join is not pushed
> down to the foreign server.
>
> --
> Best Wishes,
> Ashutosh Bapat
> EnterpriseDB Corporation
> The Postgres Database Company
>


Re: [HACKERS] Patch: fix lock contention for HASHHDR.mutex

2016-03-23 Thread Aleksander Alekseev
> > I have a strong feeling that we are just wasting our time here.
> 
> That is possible.  However, I would like it if you would give me the
> benefit of the doubt and assume that, if I seem to be more cautious
> than you would be were you a committer, there might possibly be some
> good reasons for that.  The fact is that, despite being more cautious
> than some people think I should be, I still manage to introduce quite
> a number of bugs via the patches I commit - see the thread 'Missing
> rows with index scan when collation is not "C"' on pgsql-bugs for just
> the very latest example of that.  Nobody thinks that will happen with
> *their* patch, of course, but it does all the same.

Oh, that explains a lot! You see, I couldn't figure out what was happening.
The patch was discussed and reviewed a long time ago, everyone seems to be
reasonably happy with it, etc. But for some reason it was ignored for
weeks, and no one said why. Now it makes sense.

I should probably mention that this patch was merged into the PostgresPro
build a few months ago. Several of our clients are already using this code
in production environments (guess who discovered this issue and wanted it
to be fixed). There have been no complaints so far.

> I'd still like an answer to the question of why this helps so much
> when there must be huge amounts of false sharing between the different
> mutexes.  Maybe it doesn't matter, though.

Well, the reason is that there is no longer a bottleneck here. This code
executes something like 1% of the time, and 99% of the time is spent
somewhere else. There is false sharing, but it's not as expensive as the
other 99% of the code. Thus optimizing this 1% any further doesn't give a
noticeable performance improvement.

I still believe that optimizing that 1% blindly, without considering the
possible side effects this optimization can bring (different data
alignment, some additional machine instructions, just to name a few) and
having no way to measure those side effects, is a bad idea. But I also
admit the possibility that I can somehow be wrong on this. So I rewrote
this patch once again :'( the way you suggested (without that
alignment-related hack I tried; it's just too ugly). I also attached the
original hashhdr-rmh.patch, just to have both patches in one message so it
would be easier to find them in this thread.

If there are any other questions or doubts regarding these patches
please don't hesitate to ask.

-- 
Best regards,
Aleksander Alekseev
http://eax.me/
diff --git a/src/backend/utils/hash/dynahash.c b/src/backend/utils/hash/dynahash.c
index 24a53da..06c413c 100644
--- a/src/backend/utils/hash/dynahash.c
+++ b/src/backend/utils/hash/dynahash.c
@@ -15,7 +15,7 @@
  * to hash_create.  This prevents any attempt to split buckets on-the-fly.
  * Therefore, each hash bucket chain operates independently, and no fields
  * of the hash header change after init except nentries and freeList.
- * A partitioned table uses a spinlock to guard changes of those two fields.
+ * A partitioned table uses spinlocks to guard changes of those fields.
  * This lets any subset of the hash buckets be treated as a separately
  * lockable partition.  We expect callers to use the low-order bits of a
  * lookup key's hash value as a partition number --- this will work because
@@ -111,6 +111,8 @@
 #define DEF_DIRSIZE			   256
 #define DEF_FFACTOR			   1	/* default fill factor */
 
+/* Number of freelists to be used for a partitioned hash table. */
+#define NUM_FREELISTS			32
 
 /* A hash bucket is a linked list of HASHELEMENTs */
 typedef HASHELEMENT *HASHBUCKET;
@@ -128,12 +130,17 @@ typedef HASHBUCKET *HASHSEGMENT;
  */
 struct HASHHDR
 {
-	/* In a partitioned table, take this lock to touch nentries or freeList */
-	slock_t		mutex;			/* unused if not partitioned table */
-
-	/* These fields change during entry addition/deletion */
-	long		nentries;		/* number of entries in hash table */
-	HASHELEMENT *freeList;		/* linked list of free elements */
+	/*
+	 * The freelist can become a point of contention on high-concurrency hash
+	 * tables, so we use an array of freelists, each with its own mutex and
+	 * nentries count, instead of just a single one.
+	 *
+	 * If the hash table is not partitioned, only nentries[0] and freeList[0]
+	 * are used, and spinlocks are not used at all.
+	 */
+	slock_t		mutex[NUM_FREELISTS];		/* array of spinlocks */
+	long		nentries[NUM_FREELISTS];	/* number of entries */
+	HASHELEMENT *freeList[NUM_FREELISTS]; /* lists of free elements */
 
 	/* These fields can change, but not in a partitioned table */
 	/* Also, dsize can't change in a shared table, even if unpartitioned */
@@ -166,6 +173,9 @@ struct HASHHDR
 
 #define IS_PARTITIONED(hctl)  ((hctl)->num_partitions != 0)
 
+#define FREELIST_IDX(hctl, hashcode) \
+	(IS_PARTITIONED(hctl) ? hashcode % NUM_FREELISTS : 0)
+
 /*
  * Top control structure for a hashtable --- in a shared table, each backend
  * has its own copy (OK since no fields change at runtime)
@@ -219,10 +229,10 @@ static long hash_accesses,
 

Re: [HACKERS] pg_xlogdump fails to handle WAL file with multi-page XLP_FIRST_IS_CONTRECORD data

2016-03-23 Thread Pavan Deolasee
On Wed, Mar 23, 2016 at 1:13 PM, Michael Paquier 
wrote:

>
> +   /*
> +* Compute targetRecOff. It should typically be greater than short
> +* page-header since a valid record can't , but can also be zero
> when
> +* caller has supplied a page-aligned address or when we are
> skipping
> +* multi-page continuation record. It doesn't matter though because
> +* ReadPageInternal() will read at least short page-header worth of
> +* data
> +*/
> This needs some polishing, there is an unfinished sentence here.
>
> +   targetRecOff = tmpRecPtr % XLOG_BLCKSZ;
> targetRecOff, pageHeaderSize and targetPagePtr could be declared
> inside directly the new while loop.
>

Thanks Michael for reviewing the patch. I've fixed these issues, and a new
version is attached.

Thanks,
Pavan

-- 
 Pavan Deolasee   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


xlogdump_multipage_cont_record_v2.patch
Description: Binary data



Re: [HACKERS] Bug in searching path in jsonb_set when walking through JSONB array

2016-03-23 Thread Vitaly Burovoy
On 2016-03-23, Oleg Bartunov  wrote:
> On Wed, Mar 23, 2016 at 6:37 AM, Vitaly Burovoy 
> wrote:
>
>> Hello, Hackers!
>>
>> While I was reviewing a patch with the "json_insert" function, I found a
>> bug which isn't connected with the patch and is reproducible on master.
>>
>> It complains about a non-integer whereas the input values are obviously
>> integers and in the allowed range.
>> More testing led to the understanding that it appears when the number's
>> length is a multiple of 4:
>>
>> postgres=# select jsonb_set('{"a":[[],1,2,3],"b":[]}', '{"a", }',
>> '"4"');
>> ERROR:  path element at the position 2 is not an integer
>>
>
> Hmm, I see in master
>
> select version();
>  version
> -
>  PostgreSQL 9.6devel on x86_64-apple-darwin15.4.0, compiled by Apple LLVM
> version 7.3.0 (clang-703.0.29), 64-bit
> (1 row)
>
> select jsonb_set('{"a":[[],1,2,3],"b":[]}', '{"a", }', '"4"');
>  jsonb_set
> 
>  {"a": [[], 1, 2, 3, "4"], "b": []}
> (1 row)

Yes, I can't reproduce it with "CFLAGS=-O2", but it is still
reproduced with "CFLAGS='-O0 -g3'".

postgres=# select version();
 version
--
 PostgreSQL 9.6devel on x86_64-pc-linux-gnu, compiled by gcc (Gentoo
4.8.4 p1.4, pie-0.6.1) 4.8.4, 64-bit
(1 row)

postgres=# select jsonb_set('{"a":[[],1,2,3],"b":[]}', '{"a", }', '"4"');
ERROR:  path element at the position 2 is not an integer


It depends on the memory after the string. In debug mode it always (most
of the time?) has garbage there (in my case the char '~' followed by
'\x7f' multiple times).

I think it is just a question of the complexity of reproducing it in a
release build, not a question of whether there is a bug or not.

All the other occurrences of strtol in the file have TextDatumGetCString
before them, except the occurrence in the setPathArray function. It seems
its type is text (which is not null-terminated), not cstring.

-- 
Best regards,
Vitaly Burovoy




Re: [HACKERS] NOT EXIST for PREPARE

2016-03-23 Thread Yury Zhuravlev

Fabrízio de Royes Mello wrote:
> I got an error when building this patch.

Thanks for the review and tests!


I'm working on it. Without ecpg everything works fine.
In ecpg we have black perl magic and special rules for PREPARE.
The error is in the perl script.

Regards
--
Yury Zhuravlev
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




Re: [HACKERS] BRIN is missing in multicolumn indexes documentation

2016-03-23 Thread Emre Hasegeli
> I guess multicolumn BRIN behaves similarly to B-tree or GiST. But I'm
> no expert, so I need someone knowledgeable to confirm this. If the
> following wording is OK, I will update the patch.

Multicolumn BRIN is like GIN.  Every column is indexed separately.
The order of the columns doesn't matter.




Re: [HACKERS] Proposal: "Causal reads" mode for load balancing reads without stale data

2016-03-23 Thread Robert Haas
On Wed, Mar 23, 2016 at 7:34 AM, Thomas Munro
 wrote:
> synchronous_commit     TPS
> ------------------    ----
> off                   9234
> local                 1223
> remote_write           907
> on                     587
> remote_apply           555
>
> synchronous_commit     TPS
> ------------------    ----
> off                   3937
> local                 1984
> remote_write          1701
> on                    1373
> remote_apply          1349

Hmm, so "remote_apply" is barely more expensive than "on".  That's encouraging.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] VS 2015 support in src/tools/msvc

2016-03-23 Thread Petr Jelinek

On 23/03/16 08:17, Michael Paquier wrote:

On Mon, Mar 21, 2016 at 3:43 AM, Andrew Dunstan  wrote:



On 03/20/2016 12:02 AM, Michael Paquier wrote:


On Sun, Mar 20, 2016 at 8:06 AM, Andrew Dunstan 
wrote:



Still to do: the non-perl pieces.


The patch to address locales is the sensitive part. The patch from
Petr is taking the correct approach though I think that we had better
simplify it a bit and if possible we had better not rely on the else
block it introduces.




OK, then please send an updated set of patches for what remains.


Here you go:
- 0001 fixes the global declarations of TIMEZONE_GLOBAL and
TZNAME_GLOBAL to be WIN32-compliant. I got bitten by that in the ECPG
compilation.
- 0002, support of VS2015 in the scripts of src/tools/msvc
- 0003, this is necessary in order to run the regression tests,
details are here:
http://www.postgresql.org/message-id/cab7npqtdixxs8cdl8moivh6qfq-1qv9mkn0ayjzybvzv6wc...@mail.gmail.com


So that's basically what Andres did? Interesting that we now actually 
really need it. I was in favor of doing those changes in any case.



- 0004 is the fix for locales. This is actually Petr's work upthread,
I have updated comments in the code though and did a bit more
polishing. This still looks like the cleanest solution we have. Other
solutions are mentioned upthread: redeclaration of struct _locale_t in
backend code is one.



Thanks for polishing this.

I think this is ready to be committed, but the 0003 might be somewhat 
controversial based on the original thread.


--
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Reworks of CustomScan serialization/deserialization

2016-03-23 Thread Kouhei Kaigai
> >> On 15/03/16 05:03, Kouhei Kaigai wrote:
> >>> Petr,
> >>>
> >>> The attached patch is the revised one that follows the new extensible-
> >>> node routine.
> >>>
> >>> It is almost same the previous version except for:
> >>> - custom-apis.[ch] was renamed to custom-node.[ch]
> >>> - check for the length of custom-scan-method name followed
> >>> the manner of RegisterExtensibleNodeMethods()
> >>>
> >>
> >> Hi,
> >>
> >> looks good, only nitpick I have is that it probably should be
> >> custom_node.h with underscore given that we use underscore everywhere
> >> (except for libpq and for some reason atomic ops).
> >>
> > Sorry for my late response.
> >
> > The attached patch just renames custom-node.[ch] to custom_node.[ch].
> > Other portions are not changed from the previous revision.
> >
> 
> Forgot to attach?
>
Yes. Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei 



pgsql-v9.6-custom-scan-serialization-reworks.3.patch
Description: pgsql-v9.6-custom-scan-serialization-reworks.3.patch



Re: [HACKERS] VS 2015 support in src/tools/msvc

2016-03-23 Thread Michael Paquier
On Wed, Mar 23, 2016 at 8:45 PM, Petr Jelinek  wrote:
> On 23/03/16 08:17, Michael Paquier wrote:
>>
>> On Mon, Mar 21, 2016 at 3:43 AM, Andrew Dunstan 
>> wrote:
>>>
>>>
>>>
>>> On 03/20/2016 12:02 AM, Michael Paquier wrote:


 On Sun, Mar 20, 2016 at 8:06 AM, Andrew Dunstan 
 wrote:
>
>
>
> Still to do: the non-perl pieces.


 The patch to address locales is the sensitive part. The patch from
 Petr is taking the correct approach though I think that we had better
 simplify it a bit and if possible we had better not rely on the else
 block it introduces.
>>>
>>>
>>>
>>>
>>> OK, then please send an updated set of patches for what remains.
>>
>>
>> Here you go:
>> - 0001 fixes the global declarations of TIMEZONE_GLOBAL and
>> TZNAME_GLOBAL to be WIN32-compliant. I got bitten by that in the ECPG
>> compilation.
>> - 0002, support of VS2015 in the scripts of src/tools/msvc
>> - 0003, this is necessary in order to run the regression tests,
>> details are here:
>>
>> http://www.postgresql.org/message-id/cab7npqtdixxs8cdl8moivh6qfq-1qv9mkn0ayjzybvzv6wc...@mail.gmail.com
>
> So that's basically what Andres did? Interesting that we now actually really
> need it. I was in favor of doing those changes in any case.

Yes, that's what Andres did. I am just attaching it here to allow
Andrew to test this patch set appropriately.

>> - 0004 is the fix for locales. This is actually Petr's work upthread,
>> I have updated comments in the code though and did a bit more
>> polishing. This still looks like the cleanest solution we have. Other
>> solutions are mentioned upthread: redeclaration of struct _locale_t in
>> backend code is one.
>
> Thanks for polishing this.
>
> I think this is ready to be committed, but the 0003 might be somewhat
> controversial based on the original thread.

I thought that the end consensus regarding 0003 was to apply it, but
we could as well wait for the final verdict (committer push) on the
other thread. And well, if this is not applied, some of the gin tests
will complain about "right sibling of GIN page is of different type" a
couple of times.
-- 
Michael




Re: [HACKERS] NOT EXIST for PREPARE

2016-03-23 Thread Craig Ringer
On 22 March 2016 at 21:01, Andres Freund  wrote:

> Hi,
>
>
> On 2016-03-22 12:41:43 +0300, Yury Zhuravlev wrote:
> > Do I understand correctly that the only way to know whether a PREPARE
> > exists is to consult pg_prepared_statements?
> > I think this is not good practice. In some cases, we may not be aware of
> > the PREPARE that was made (pgpool). Moreover, it seems a popular question
> > on the Internet:
> http://stackoverflow.com/questions/1193020/php-postgresql-check-if-a-prepared-statement-already-exists
> >
> > What do you think about adding NOT EXIST functionality to PREPARE?
>
> Not very much. If you're not in control of the prepared statements, you
> can't be sure it's not an entirely different statement. So NOT EXISTS
> doesn't really buy you anything; you'd still need to compare the
> statement somehow.
>

Yeah, agreed. I don't buy the reasoning given for using this in PgJDBC and
think it'd just be the source of new and exciting subtle bugs.

I can only see it vaguely working if the client were required to checksum
the statement text (or the server was) and compare it with a checksum
stored against the prepared statement. On mismatch, ERROR.

If the problem Yuri is trying to solve is with pgbouncer in
transaction-pooling mode, wouldn't a possible solution be PREPARE LOCAL,
i.e. transaction-scoped prepared statements?

With PREPARE IF NOT EXISTS the client is also paying parse and network
overhead every time it sends that statement. Much better not to send it
repeatedly in the first place.

I think we need to take a step back here and better define the problem
before stepping in with a proposed solution. Something that avoids the need
to spam the server with endless copies of the same PREPARE statements would
be good.

BTW, PgJDBC doesn't use SQL-level PREPARE anyway, it does it with named
portals at the protocol level. You can't add IF NOT EXISTS there, at least
not the same way.
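
To illustrate the protocol-level alternative with a libpq sketch (the
statement name and query are illustrative, and error handling is omitted
for brevity): the client prepares once and then only sends Bind/Execute,
so nothing like IF NOT EXISTS is needed:

#include <stdio.h>
#include <libpq-fe.h>

int
main(void)
{
    PGconn     *conn = PQconnectdb("dbname=postgres");
    PGresult   *res;
    const char *params[1] = {"42"};
    int         i;

    /* prepare once, at the protocol level -- no SQL PREPARE statement */
    res = PQprepare(conn, "get_value", "SELECT $1::int", 1, NULL);
    PQclear(res);

    /* ... then execute the named statement as often as needed */
    for (i = 0; i < 3; i++)
    {
        res = PQexecPrepared(conn, "get_value", 1, params, NULL, NULL, 0);
        printf("%s\n", PQgetvalue(res, 0, 0));
        PQclear(res);
    }

    PQfinish(conn);
    return 0;
}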

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] Bug in searching path in jsonb_set when walking through JSONB array

2016-03-23 Thread Michael Paquier
On Wed, Mar 23, 2016 at 7:48 PM, Vitaly Burovoy
 wrote:
> On 2016-03-23, Oleg Bartunov  wrote:
>> On Wed, Mar 23, 2016 at 6:37 AM, Vitaly Burovoy 
>> wrote:
>>
>>> Hello, Hackers!
>>>
>>> While I was reviewing a patch with the "json_insert" function, I found a
>>> bug which isn't connected with the patch and is reproducible on master.
>>>
>>> It complains about a non-integer whereas the input values are obviously
>>> integers and in the allowed range.
>>> More testing led to the understanding that it appears when the number's
>>> length is a multiple of 4:
>>>
>>> postgres=# select jsonb_set('{"a":[[],1,2,3],"b":[]}', '{"a", }',
>>> '"4"');
>>> ERROR:  path element at the position 2 is not an integer
>>>
>>
>> Hmm, I see in master
>>
>> select version();
>>  version
>> -
>>  PostgreSQL 9.6devel on x86_64-apple-darwin15.4.0, compiled by Apple LLVM
>> version 7.3.0 (clang-703.0.29), 64-bit
>> (1 row)
>>
>> select jsonb_set('{"a":[[],1,2,3],"b":[]}', '{"a", }', '"4"');
>>  jsonb_set
>> 
>>  {"a": [[], 1, 2, 3, "4"], "b": []}
>> (1 row)
>
> Yes, I can't reproduce it with "CFLAGS=-O2", but it is still
> reproduced with "CFLAGS='-O0 -g3'".

On my aging laptop (OSX 10.8) I can reproduce the failure as well.

> postgres=# select jsonb_set('{"a":[[],1,2,3],"b":[]}', '{"a", }', '"4"');
> ERROR:  path element at the position 2 is not an integer
>
> It depends on the memory after the string. In debug mode it always (most
> of the time?) has garbage there (in my case the char '~' followed by
> '\x7f' multiple times).
>
> I think it is just a question of the complexity of reproducing it in a
> release build, not a question of whether there is a bug or not.

Er, this is definitely a bug. That's not really a problem I think.

> All the other occurrences of strtol in the file have TextDatumGetCString
> before them, except the occurrence in the setPathArray function. It seems
> its type is text (which is not null-terminated), not cstring.

-char   *c = VARDATA_ANY(path_elems[level]);
+char   *keyptr = VARDATA_ANY(path_elems[level]);
+int     keylen = VARSIZE_ANY_EXHDR(path_elems[level]);
+char    c[20 + 1];      /* int64 = 18446744073709551615 (20 symbols) */
 long    lindex;
That's ugly. We should actually use TextDatumGetCString because the
index is stored as text here via a Datum, and then it is converted
back to an integer. So I propose instead the simple patch attached
that fixes the failure for me. Could you check if that works for you?
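
For the archives, the proposed approach looks roughly like this (a sketch
of the idea, not the attached patch verbatim):

/* convert the text path element into a null-terminated C string first */
char       *c = TextDatumGetCString(path_elems[level]);
char       *badp;
long        lindex;

errno = 0;
lindex = strtol(c, &badp, 10);
if (errno != 0 || badp == c || *badp != '\0' ||
    lindex > INT_MAX || lindex < INT_MIN)
    ereport(ERROR,
            (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
             errmsg("path element at position %d is not an integer",
                    level + 1)));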
-- 
Michael


jsonb-set-fix-v1.patch
Description: binary/octet-stream



Re: [HACKERS] Proposal: "Causal reads" mode for load balancing reads without stale data

2016-03-23 Thread Michael Paquier
On Wed, Mar 23, 2016 at 8:39 PM, Robert Haas  wrote:
> On Wed, Mar 23, 2016 at 7:34 AM, Thomas Munro
>  wrote:
>> synchronous_commit     TPS
>> ------------------    ----
>> off                   9234
>> local                 1223
>> remote_write           907
>> on                     587
>> remote_apply           555
>>
>> synchronous_commit     TPS
>> ------------------    ----
>> off                   3937
>> local                 1984
>> remote_write          1701
>> on                    1373
>> remote_apply          1349
>
> Hmm, so "remote_apply" is barely more expensive than "on".  That's 
> encouraging.

Indeed, interesting. This perhaps proves that just having the possibility
of remote_apply (with feedback messages managed by signals, which is the
best thing proposed for this release) is what we need to ensure the
consistency of reads across nodes, and that it would satisfy most users'
requirements. A slightly lower TPS may be worth the cost for some users if
they can ensure that reads across nodes see the data after a local commit,
and there is no need for an error-handling layer at the application level
to take care of errors caused by causal-read timeouts.
-- 
Michael




Re: [HACKERS] pg_xlogdump fails to handle WAL file with multi-page XLP_FIRST_IS_CONTRECORD data

2016-03-23 Thread Michael Paquier
On Wed, Mar 23, 2016 at 7:04 PM, Pavan Deolasee
 wrote:
>
>
> On Wed, Mar 23, 2016 at 1:13 PM, Michael Paquier 
> wrote:
>>
>>
>> +   /*
>> +* Compute targetRecOff. It should typically be greater than short
>> +* page-header since a valid record can't , but can also be zero
>> when
>> +* caller has supplied a page-aligned address or when we are
>> skipping
>> +* multi-page continuation record. It doesn't matter though
>> because
>> +* ReadPageInternal() will read at least short page-header worth
>> of
>> +* data
>> +*/
>> This needs some polishing, there is an unfinished sentence here.
>>
>> +   targetRecOff = tmpRecPtr % XLOG_BLCKSZ;
>> targetRecOff, pageHeaderSize and targetPagePtr could be declared
>> inside directly the new while loop.
>
>
> Thanks Michael for reviewing the patch. I've fixed these issues and new
> version is attached.

I'd just add dots at the end of the sentences in the comment blocks,
because that's project style, but I'm being picky; other than that, the
logic looks quite good.
-- 
Michael




Re: [HACKERS] BRIN is missing in multicolumn indexes documentation

2016-03-23 Thread Alvaro Herrera
Emre Hasegeli wrote:
> > I guess multicolumn BRIN behaves similarly to B-tree or GiST. But I'm
> > no expert, so I need someone knowledgeable to confirm this. If the
> > following wording is OK, I will update the patch.
> 
> Multicolumn BRIN is like GIN.  Every column is indexed separately.
> The order of the columns doesn't matter.

Right.  You can use one index to cover all columns; the position of the
column in the index won't matter for a query that uses one column.  The
only reason to have multiple BRIN indexes on a single table is to have
a different pages_per_range.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Postgres_fdw join pushdown - getting server crash in left outer join of three table

2016-03-23 Thread Michael Paquier
On Mon, Mar 21, 2016 at 11:32 PM, Ashutosh Bapat
 wrote:
>> In get_useful_ecs_for_relation, it seems to me that this assertion
>> should be removed and replaced by an actual check because even if
>> right_ec and left_ec are initialized, we cannot be sure that ec_relids
>> contains the relations specified:
>> /*
>>  * restrictinfo->mergeopfamilies != NIL is sufficient to guarantee
>>  * that left_ec and right_ec will be initialized, per comments in
>>  * distribute_qual_to_rels, and rel->joininfo should only contain
>> ECs
>>  * where this relation appears on one side or the other.
>>  */
>> if (bms_is_subset(relids, restrictinfo->right_ec->ec_relids))
>> useful_eclass_list =
>> list_append_unique_ptr(useful_eclass_list,
>>
>> restrictinfo->right_ec);
>> else
>> {
>> Assert(bms_is_subset(relids,
>> restrictinfo->left_ec->ec_relids));
>> useful_eclass_list =
>> list_append_unique_ptr(useful_eclass_list,
>>
>> restrictinfo->left_ec);
>> }
>
> An EC covers all the relations covered by all the equivalence members it
> contains. In the case of a mergejoinable clause for an outer join, the EC
> may cover just a single relation, whose column appears on either side of
> the clause. In this case, bms_is_subset() for a given join relation
> covering a single relation in the EC will be false. So, we have to use
> bms_overlap() instead of bms_is_subset(). The caller
> get_useful_pathkeys_for_rel() extracts the equivalence member (if any)
> which is entirely covered by the given relation. Otherwise, you are
> correct that we have to convert the assertion into a condition. I have
> added comments in get_useful_ecs_for_relation() explaining why.

Ah, OK. Thanks for the explanation. This code fixes the problem for me as
well.
(Note to self: study the planner code more.)
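
For readers of the archive, the corrected check described above amounts to
something like this (a sketch; see the actual patch for the final form):

/*
 * bms_is_subset() can be false for an EC covering only a single
 * relation, so test for overlap instead and handle each side
 * independently.
 */
if (bms_overlap(relids, restrictinfo->right_ec->ec_relids))
    useful_eclass_list = list_append_unique_ptr(useful_eclass_list,
                                                restrictinfo->right_ec);
else if (bms_overlap(relids, restrictinfo->left_ec->ec_relids))
    useful_eclass_list = list_append_unique_ptr(useful_eclass_list,
                                                restrictinfo->left_ec);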
-- 
Michael




[HACKERS] [PROPOSAL] Add SCTP network protocol to postgresql backend and frontend

2016-03-23 Thread Eduardo Morras

Hello,

I want to propose adding the SCTP network protocol to the backend and frontend code.

Brief SCTP description:

SCTP is a transport-layer (layer 4) network protocol over IP networks, sitting
alongside TCP and UDP. It is a message-oriented protocol, unlike TCP, which is
stream oriented, and it ensures that a message reaches its destination, as TCP
does with its byte stream. Each SCTP packet can carry multiple messages or
parts of messages. It provides multihoming, dynamically binding and unbinding
network devices (eth0, eth1, wifi...), mixing IPv4 and IPv6, congestion control
algorithms similar to TCP's, and message reliability and administration
(message time to live, retries on failure...). A connection between server and
client is called an association.

SCTP support exists in Linux since 2.4, FreeBSD 7+, Solaris 10+, and in Cisco,
Juniper, F5 and other routers. Windows and Mac OS X need (AFAIK) third-party
drivers.

Benefits:

Dynamic multihoming, modifiable at run time; no need to aggregate links at the
OS level or shut down servers/clients for a hardware or network topology change.
Message-oriented connections.
Message reliability.
Immune to the SYN floods that affect TCP.
Asymmetric multihoming: a client with 4 links (3x 1GbE + wifi) can connect to
a server with 1 link (10GbE).
Metadata connection messages.

Problems:

Windows and Mac OS X need third-party drivers.
Can't use TLS: encryption of TLS record n depends on the previous record, but
in SCTP the order in which packets arrive is not assured. As SCTP is message
oriented it needs DTLS over SCTP (not in this proposal), where encryption of
packet n is independent of any other. See RFC 6083.
Some network cards and drivers don't support hardware offload, so the CPU does
the CRC calculation.
Firewalls need SCTP rules.

Implementation:

The TCP code resides in src/backend/libpq/pqcomm.c (Unix/Windows sockets and
the SSL socket portion) and src/backend/libpq/ip.c (IPv6); it's similar in the
frontend. We'd need new GUCs for SCTP configuration, and a way to modify them
at runtime.
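
For a rough idea of the scope, the socket setup side is small. A minimal
sketch using the standard one-to-one style API from RFC 6458 (the function
name and stream counts are made up, and error handling is omitted):

#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <netinet/sctp.h>

static int
open_sctp_listener(int port)
{
	struct sockaddr_in sa;
	struct sctp_initmsg init;
	int			fd = socket(AF_INET, SOCK_STREAM, IPPROTO_SCTP);

	memset(&sa, 0, sizeof(sa));
	sa.sin_family = AF_INET;
	sa.sin_port = htons(port);
	sa.sin_addr.s_addr = htonl(INADDR_ANY);

	/* ask for several SCTP streams per association */
	memset(&init, 0, sizeof(init));
	init.sinit_num_ostreams = 4;
	init.sinit_max_instreams = 4;
	setsockopt(fd, IPPROTO_SCTP, SCTP_INITMSG, &init, sizeof(init));

	bind(fd, (struct sockaddr *) &sa, sizeof(sa));
	listen(fd, SOMAXCONN);
	return fd;
}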

I don't know how you would want it done, if you accept this proposal:

a) replicate pqcomm.c, replacing the TCP code with SCTP, in a new file
pqcommsctp.c (full SCTP, not a TCP-compatibility hack),
b) inside pqcomm.c and ip.c,
c) other

My main development environment is FreeBSD 10, and my contractor uses
FreeBSD 10 too, but I'll check that it works on some Linux distros.


Useful links:

http://www.bsdcan.org/2008/schedule/attachments/44_bsdcan_sctp.pdf
https://en.wikipedia.org/wiki/Stream_Control_Transmission_Protocol
https://tools.ietf.org/html/rfc6083

Thanks

---   ---
Eduardo Morras 


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PROPOSAL] Add SCTP network protocol to postgresql backend and frontend

2016-03-23 Thread Andreas Karlsson

On 03/23/2016 01:55 PM, Eduardo Morras wrote:

Benefits:

Dynamic multihoming, modifiable at run time; no need to aggregate links at the
OS level or shut down servers/clients for a hardware or network topology change.
Message-oriented connections.
Message reliability.
Immune to the SYN floods that affect TCP.
Asymmetric multihoming: a client with 4 links (3x 1GbE + wifi) can connect to
a server with 1 link (10GbE).
Metadata connection messages.


While SCTP has some nice advantages in general (I think it is a pity it 
is not used more) I wonder how well these benefits translate into the 
database space. Many databases are run either in a controlled server 
environment with no direct access from the Internet, or locally on the 
same machine as the application. In those environments you generally do 
not have to worry about SYN floods or asymmetric links.


Do you have any specific use case in mind?

Andreas


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW

2016-03-23 Thread Thom Brown
On 22 March 2016 at 02:30, Etsuro Fujita  wrote:
> On 2016/03/19 3:30, Robert Haas wrote:
>>
>> On Fri, Mar 18, 2016 at 5:15 AM, Etsuro Fujita
>>  wrote:
>>>
>>> Attached is the updated version of the patch.

I've noticed that you now can't cancel a query if there's DML pushdown
to a foreign server.  This previously worked when it was sending
individual statements, since the statement in progress could be
interrupted and rolled back.

Here's what the local server sees when trying to cancel:

# DELETE FROM remote.contacts;
^CCancel request sent
DELETE 500

This should probably be fixed.

Thom


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] NOT EXIST for PREPARE

2016-03-23 Thread Merlin Moncure
On Wed, Mar 23, 2016 at 7:27 AM, Craig Ringer  wrote:
> With PREPARE IF NOT EXISTS the client is also paying parse and network
> overhead for every time you send that statement. Much better not to send it
> repeatedly in the first place.

How did you determine that?  The client prepares the statement exactly
once.  The problem is there's no easy way to determine if someone else
prepared it first and this patch directly addresses that.

> I think we need to take a step back here and better define the problem
> before stepping in with a proposed solution. Something that avoids the need
> to spam the server with endless copies of the same PREPARE statements would
> be good.

I'm not understanding the objection at all.  You have N client
sessions connecting to the database that all utilize the same named
prepared statement.  A typical pattern is for the application to
prepare them all upon startup, but currently each PREPARE needs to be
wrapped with an exception handler in case someone else prepared it
first.  Having an IF NOT EXISTS decoration simplifies this.  This can
happen both inside and outside of connection pooling scenarios.
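
For example, today every session typically has to carry something like this
sketch (the statement itself is made up; duplicate_prepared_statement is the
condition name for error 42P05):

DO $$
BEGIN
	EXECUTE 'PREPARE get_user (int) AS SELECT * FROM users WHERE id = $1';
EXCEPTION
	WHEN duplicate_prepared_statement THEN
		NULL;	-- another client of this pooled connection got there first
END
$$;

With IF NOT EXISTS, that whole block collapses to a single statement.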

merlin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] multivariate statistics v14

2016-03-23 Thread Tomas Vondra

On 03/23/2016 06:20 AM, Tatsuo Ishii wrote:

I am now looking into the CREATE STATISTICS doc to see if the examples
appearing in it are working. I will get back if I find any problems.


I have gone through the ref doc: CREATE STATISTICS

There are nice examples of how the multivariate statistics give better
row count estimates, so I gave them a try.

"Create table t1 with two functionally dependent columns,
 i.e. knowledge of a value in the first column is sufficient for
 determining the value in the other column" The example creates table
 "t1", then populates it using generate_series. After CREATE
 STATISTICS, ANALYZE and EXPLAIN. I expected the EXPLAIN demonstrates
 how result rows estimation is enhanced by using the multivariate
 statistics.

Here is the EXPLAIN output using the multivariate statistics:

EXPLAIN ANALYZE SELECT * FROM t1 WHERE (a = 1) AND (b = 1);
QUERY PLAN
---
 Seq Scan on t1  (cost=0.00..19425.00 rows=98 width=8) (actual 
time=76.876..76.876 rows=0 loops=1)
   Filter: ((a = 1) AND (b = 1))
   Rows Removed by Filter: 100
 Planning time: 0.146 ms
 Execution time: 76.896 ms
(5 rows)

Here is the EXPLAIN output without the multivariate statistics:

EXPLAIN ANALYZE SELECT * FROM t1 WHERE (a = 1) AND (b = 1);
QUERY PLAN
--
 Seq Scan on t1  (cost=0.00..19425.00 rows=1 width=8) (actual 
time=78.867..78.867 rows=0 loops=1)
   Filter: ((a = 1) AND (b = 1))
   Rows Removed by Filter: 100
 Planning time: 0.102 ms
 Execution time: 78.885 ms
(5 rows)

It seems the row count estimate (98) using the multivariate
statistics is actually *worse* than the one (1) without the
statistics, because the actual row count is 0.


Yes, there's a mistake in the first query, because the conditions 
actually are not compatible. I.e. (i/100)=1 and (i/500)=1 have no 
overlapping rows, clearly. It should be


EXPLAIN ANALYZE SELECT * FROM t1 WHERE (a = 1) AND (b = 0);

instead. Will fix.
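
(For readers reproducing this: the doc example populates t1 roughly as in the
sketch below; the row count is assumed, and the CREATE STATISTICS spelling is
schematic, see the patch for the exact syntax. With a = i/100 and b = i/500,
(a = 1) selects i in 100..199 while (b = 1) selects i in 500..999, hence no
overlap; (b = 0) covers i in 1..499, which contains all of (a = 1).)

CREATE TABLE t1 (a int, b int);
INSERT INTO t1 SELECT i/100, i/500 FROM generate_series(1, 1000000) s(i);
CREATE STATISTICS s1 ON (a, b) FROM t1;  -- schematic syntax
ANALYZE t1;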



Next example (using table "t2") is much better than the case using t1.

Here is the EXPLAIN output using the multivariate statistics:

EXPLAIN ANALYZE SELECT * FROM t2 WHERE (a = 1) AND (b = 1);
   QUERY PLAN

 Seq Scan on t2  (cost=0.00..19425.00 rows=9633 width=8) (actual 
time=0.012..75.350 rows=1 loops=1)
   Filter: ((a = 1) AND (b = 1))
   Rows Removed by Filter: 99
 Planning time: 0.107 ms
 Execution time: 75.680 ms
(5 rows)

Here is the EXPLAIN output without the multivariate statistics:

EXPLAIN ANALYZE SELECT * FROM t2 WHERE (a = 1) AND (b = 1);
  QUERY PLAN
--
 Seq Scan on t2  (cost=0.00..19425.00 rows=91 width=8) (actual 
time=0.008..76.614 rows=1 loops=1)
   Filter: ((a = 1) AND (b = 1))
   Rows Removed by Filter: 99
 Planning time: 0.067 ms
 Execution time: 76.935 ms
(5 rows)

This time it seems the row count estimate (9633) using the
multivariate statistics is much better than the one (91) without the
statistics, because the actual row count is 1.

The last example (using table "t3") seems to show no effect from the multivariate statistics.


Yes. There's a typo in the example - it analyzes the wrong table (t2 
instead of t3). Once I fix that, the estimates are much better.



In summary, the only case which shows the effect of the multivariate
statistics is the "t2" case. So I don't see why other examples are
shown in the manual. Am I missing something?


No, thanks for spotting those mistakes. I'll fix them and submit a new 
version of the patch - either later today or perhaps tomorrow.


regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal: Generic WAL logical messages

2016-03-23 Thread Alvaro Herrera
Petr Jelinek wrote:

> +++ b/contrib/test_decoding/sql/messages.sql
> @@ -0,0 +1,17 @@
> +-- predictability
> +SET synchronous_commit = on;
> +
> +SELECT 'init' FROM pg_create_logical_replication_slot('regression_slot', 
> 'test_decoding');
> +
> +SELECT 'msg1' FROM pg_logical_emit_message(true, 'test', 'msg1');
> +SELECT 'msg2' FROM pg_logical_emit_message(false, 'test', 'msg2');
> +
> +BEGIN;
> +SELECT 'msg3' FROM pg_logical_emit_message(true, 'test', 'msg3');
> +SELECT 'msg4' FROM pg_logical_emit_message(false, 'test', 'msg4');
> +SELECT 'msg5' FROM pg_logical_emit_message(true, 'test', 'msg5');
> +COMMIT;
> +
> +SELECT data FROM pg_logical_slot_get_changes('regression_slot', NULL, NULL, 
> 'force-binary', '0', 'skip-empty-xacts', '1');
> +
> +SELECT 'init' FROM pg_drop_replication_slot('regression_slot');

No tests for a rolled back transaction?
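
Presumably something along these lines, to check that a transactional message
in an aborted xact is skipped while a non-transactional one still comes
through:

BEGIN;
SELECT 'msg6' FROM pg_logical_emit_message(true, 'test', 'msg6');
SELECT 'msg7' FROM pg_logical_emit_message(false, 'test', 'msg7');
ROLLBACK;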

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PROPOSAL] Add SCTP network protocol to postgresql backend and frontend

2016-03-23 Thread Alvaro Herrera
Andreas Karlsson escribió:
> On 03/23/2016 01:55 PM, Eduardo Morras wrote:
> >Benefits:
> >
> >Dynamic multihoming, modifiable at run time; no need to aggregate links at
> >the OS level or shut down servers/clients for a hardware or network
> >topology change.
> >Message-oriented connections.
> >Message reliability.
> >Immune to the SYN floods that affect TCP.
> >Asymmetric multihoming: a client with 4 links (3x 1GbE + wifi) can connect
> >to a server with 1 link (10GbE).
> >Metadata connection messages.
> 
> While SCTP has some nice advantages in general (I think it is a pity it is
> not used more) I wonder how well these benefits translate into the database
> space. Many databases are run either in a controlled server environment with
> no direct access from the Internet, or locally on the same machine as the
> application. In those environments you generally do not have to worry about
> SYN floods or asymmetric links.

That might or might not be the most common cases, but replication across
the ocean and similar long-range setups are a reality today and their use
will only increase.

I wonder about message ordering.  Is it possible to get messages out of
order in SCTP?  Say if you have an ordered resultset stream from the
server, it would be disastrous to get the data messages out of order.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Bug in searching path in jsonb_set when walking through JSONB array

2016-03-23 Thread Tom Lane
Michael Paquier  writes:
> That's ugly. We should actually use TextDatumGetCString because the
> index is stored as text here via a Datum, and then it is converted
> back to an integer. So I propose instead the simple patch attached
> that fixes the failure for me. Could you check if that works for you?

Yeah, this seems better.  But that code needs review anyway, as it's using
elog() for user-facing error conditions, and I'm thinking the error texts
could use a bit of attention from a native English speaker.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Add something in struct PGPROC

2016-03-23 Thread A student from china
Hi
I just added something to struct PGPROC, such as "int level", but I can't
see what I added in MyProc while debugging. That's confusing.


zhangxiaojian

Re: [HACKERS] [PROPOSAL] Add SCTP network protocol to postgresql backend and frontend

2016-03-23 Thread Andreas Karlsson

On 03/23/2016 02:13 PM, Alvaro Herrera wrote:

Andreas Karlsson escribió:

On 03/23/2016 01:55 PM, Eduardo Morras wrote:

Benefits:

Dynamic multihoming, modifiable at run time; no need to aggregate links at the
OS level or shut down servers/clients for a hardware or network topology change.
Message-oriented connections.
Message reliability.
Immune to the SYN floods that affect TCP.
Asymmetric multihoming: a client with 4 links (3x 1GbE + wifi) can connect to
a server with 1 link (10GbE).
Metadata connection messages.


While SCTP has some nice advantages in general (I think it is a pity it is
not used more) I wonder how well these benefits translate into the database
space. Many databases are run either in a controlled server environment with
no direct access from the Internet, or locally on the same machine as the
application. In those environments you generally do not have to worry about
SYN floods or asymmetric links.


That might or might not be the most common cases, but replication across
the ocean and similar long-range setups are a reality today and their use
will only increase.


Agreed. When I reread my message I realized that I implied things I did 
not mean. People run databases today in the cloud and, as you said, long 
distance replication will only get more common. What I am actually 
curious about is how the advantages of SCTP translate into the database 
space.



I wonder about message ordering.  Is it possible to get messages out of
order in SCTP?  Say if you have an ordered resultset stream from the
server, it would be disastrous to get the data messages out of order.


Message ordering is an optional feature in SCTP, so if you need message 
ordering you can get it.
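
At the API level the choice is per message. A sketch (the wrapper name is
made up, but sctp_sendmsg() and SCTP_UNORDERED are the standard RFC 6458
interfaces; link with -lsctp on Linux):

#include <stdbool.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/sctp.h>

/*
 * flags = 0 keeps strict per-stream ordering; SCTP_UNORDERED opts this
 * single message out of ordering.
 */
static int
send_proto_message(int fd, const void *buf, size_t len,
				   uint16_t stream, bool unordered)
{
	return sctp_sendmsg(fd, buf, len,
						NULL, 0,	/* use the connected association */
						0,			/* ppid */
						unordered ? SCTP_UNORDERED : 0,
						stream,		/* stream number */
						0, 0);		/* ttl, context */
}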


Andreas


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal: "Causal reads" mode for load balancing reads without stale data

2016-03-23 Thread Robert Haas
On Wed, Mar 23, 2016 at 8:43 AM, Michael Paquier
 wrote:
> On Wed, Mar 23, 2016 at 8:39 PM, Robert Haas  wrote:
>> On Wed, Mar 23, 2016 at 7:34 AM, Thomas Munro
>>  wrote:
>>> synchronous_commit    TPS
>>> ------------------    ----
>>> off                   9234
>>> local                 1223
>>> remote_write           907
>>> on                     587
>>> remote_apply           555
>>>
>>> synchronous_commit    TPS
>>> ------------------    ----
>>> off                   3937
>>> local                 1984
>>> remote_write          1701
>>> on                    1373
>>> remote_apply          1349
>>
>> Hmm, so "remote_apply" is barely more expensive than "on".  That's 
>> encouraging.
>
> Indeed, interesting. This is perhaps proving that just having the
> possibility to have remote_apply (with feedback messages managed by
> signals, which is the best thing proposed for this release) is what we
> need to ensure the consistency of reads across nodes, and what would
> satisfy most of the user's requirements. Getting a slightly lower TPS
> may be worth the cost for some users if they can ensure that reads
> across nodes are accessible after a local commit, and there is no need
> to have an error management layer at application level to take care of
> errors caused by causal read timeouts.

Well, I wouldn't go that far.  It seems pretty clear that remote_apply
by itself is useful - I can't imagine anybody seriously arguing the
contrary, whatever they think of this implementation.  My view,
though, is that by itself that's pretty limiting: you can only have
one standby, and if that standby falls over then you lose
availability.  Causal reads fixes both of those problems - admittedly
that requires some knowledge in the application or the pooler, but
it's no worse than SSI in that regard.  Still, half a loaf is better
than none, and I imagine even just getting remote_apply would make a
few people quite happy.
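
For anyone trying this at home, the remote_apply setup being measured boils
down to something like this sketch (GUC names per the patch; the standby name
is made up):

# primary's postgresql.conf
synchronous_standby_names = 'standby1'
synchronous_commit = remote_apply    # commit waits until the standby has applied the WAL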

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PROPOSAL] Add SCTP network protocol to postgresql backend and frontend

2016-03-23 Thread Eduardo Morras
On Wed, 23 Mar 2016 14:03:31 +0100
Andreas Karlsson  wrote:

> On 03/23/2016 01:55 PM, Eduardo Morras wrote:
> > Benefits:
> >
> > Dynamic multihoming, modifiable at run time; no need to aggregate
> > links at the OS level or shut down servers/clients for a hardware or
> > network topology change. Message-oriented connections. Message
> > reliability. Immune to the SYN floods that affect TCP.
> > Asymmetric multihoming: a client with 4 links (3x 1GbE + wifi) can
> > connect to a server with 1 link (10GbE). Metadata connection
> > messages.
> 
> While SCTP has some nice advantages in general (I think it is a pity
> it is not used more) I wonder how well these benefits translate into
> the database space. Many databases are run either in a controlled
> server environment with no direct access from the Internet, or
> locally on the same machine as the application. In those environments
> you generally do not have to worry about SYN floods or asymmetric
> links.
> 
> Do you have any specific use case in mind?

The main use case is changing the network topology on the fly, without
shutting down the PostgreSQL server, the PostgreSQL middleware, or any of the
applications that use it through libpq.

Specific use case; here "backup" means an OS-level backup server or pg_dump,
not a PostgreSQL slave (hope the diagrams don't wrap):

backup <-> postgresql <-> middleware <-> client apps <-> backup

At peak times you need all NICs assigned between the PostgreSQL servers,
the middleware, and the client apps:

backup <-> postgresql <=> middleware <=> client apps <-> backup

at night, at idle times, or during a backup, you can reassign the NICs to
give more network bandwidth to the backup server:

backup <=> postgresql <-> middleware <-> client apps <=> backup

On a crash restore, all NICs are used from the backup to the servers:

backup  postgresql < > middleware < > client apps  backup

> Andreas


---   ---
Eduardo Morras 


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Add something in struct PGPROC

2016-03-23 Thread Aleksander Alekseev
> Hi
> I just add something in struct PGPORC, such as "int level", But I
> cann't seet what I added in MyProc while debugging. That's confused, 
> 
> 
> zhangxiaojian

I think you forgot to run `make clean` after changing proc.h. Unless you
configured with --enable-depend, files that depend on a changed header
are not recompiled automatically.
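
That is, something like:

$ make clean && make

or, once, to get automatic header dependency tracking:

$ ./configure --enable-depend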

-- 
Best regards,
Aleksander Alekseev
http://eax.me/


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PROPOSAL] Add SCTP network protocol to postgresql backend and frontend

2016-03-23 Thread Eduardo Morras
On Wed, 23 Mar 2016 10:13:42 -0300
Alvaro Herrera  wrote:

> Andreas Karlsson escribió:
> > On 03/23/2016 01:55 PM, Eduardo Morras wrote:
> > >Benefits:
> > >
> > >Dynamic multihoming, modifiable at run time; no need to aggregate
> > >links at the OS level or shut down servers/clients for a hardware or
> > >network topology change. Message-oriented connections. Message
> > >reliability. Immune to the SYN floods that affect TCP.
> > >Asymmetric multihoming: a client with 4 links (3x 1GbE + wifi)
> > >can connect to a server with 1 link (10GbE). Metadata connection
> > >messages.
> > 
> > While SCTP has some nice advantages in general (I think it is a
> > pity it is not used more) I wonder how well these benefits
> > translate into the database space. Many databases are run either in
> > a controlled server environment with no direct access from the
> > Internet, or locally on the same machine as the application. In
> > those environments you generally do not have to worry about SYN
> > floods or asymmetric links.
> 
> That might or might not be the most common cases, but replication
> across the ocean and similar long-range setups are a reality today
> and their use will only increase.
> 
> I wonder about message ordering.  Is it possible to get messages out
> of order in SCTP?  Say if you have an ordered resultset stream from
> the server, it would be disastrous to get the data messages out of
> order.

Message ordering is optional; the server decides whether clients may consume
messages out of order, as received, or strictly in the same order as sent.

 
> -- 
> Álvaro Herrerahttp://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

---   ---
Eduardo Morras 


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: fix lock contention for HASHHDR.mutex

2016-03-23 Thread Robert Haas
On Wed, Mar 23, 2016 at 5:49 AM, Aleksander Alekseev
 wrote:
> I still believe that blindly optimizing for 1%, without considering the
> possible side effects this optimization can bring (different data
> alignment, some additional machine instructions, just to name a few) and
> having no way to measure these side effects, is a bad idea. But I also
> admit the possibility that I can somehow be wrong on this. So I rewrote
> this patch once again :'( the way you suggested (without that alignment
> related hack I tried; it's just too ugly). I also attached the original
> hashhdr-rmh.patch just to have both patches in one message, so it would
> be easier to find both patches in this thread.

Thanks!  I can't think of anything else to worry about with regards to
that version, so I have committed it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

2016-03-23 Thread Alexandr Popov



On 18.03.2016 20:19, Anastasia Lubennikova wrote:
Please, find the new version of the patch attached. Now it has WAL 
functionality.


A detailed description of the feature can be found in the README draft:
https://goo.gl/50O8Q0


This patch is pretty complicated, so I ask everyone who is interested in
this feature to help with reviewing and testing it. I will be grateful for
any feedback.
But please don't complain about code style; it is still a work in
progress.


Next things I'm going to do:
1. More debugging and testing. I'm going to attach a couple of SQL
scripts for testing in the next message.

2. Fix NULLs processing
3. Add a flag to pg_index that allows enabling/disabling compression
for each particular index.
4. Recheck locking considerations. I tried to write the code to be as
non-invasive as possible, but we need to make sure that the algorithm is
still correct.

5. Change BTMaxItemSize
6. Bring back microvacuum functionality.




Hi, hackers.

It's my first review, so please do not be too strict with me.

I have tested this patch on the following table:
create table message
(
    id      serial,
    usr_id  integer,
    text    text
);
CREATE INDEX message_usr_id ON message (usr_id);
The table has 1000 records.

I found the following:
The fewer the unique keys, the smaller the size of the index.

The next 2 tables demonstrate it.
New B-tree
Count of unique keys (usr_id); index's size; time of creation
1000;"214 MB";"00:00:34.193441"
333  ;"214 MB";"00:00:45.731173"
200  ;"129 MB";"00:00:41.445876"
100  ;"129 MB";"00:00:38.455616"
10;"86 MB"  ;"00:00:40.887626"
1  ;"79 MB"  ;"00:00:47.199774"

Old B-tree
Count of unique keys (usr_id); index's size; time of creation
1000;"214 MB";"00:00:35.043677"
333  ;"286 MB";"00:00:40.922845"
200  ;"300 MB";"00:00:46.454846"
100  ;"278 MB";"00:00:42.323525"
10;"287 MB";"00:00:47.438132"
1  ;"280 MB";"00:01:00.307873"

I inserted data both randomly and sequentially; it did not influence the
index's size.
The time of select, insert and update on random rows is unchanged. That is
great, but it certainly needs some more detailed study.


Alexander Popov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




Re: [HACKERS] NOT EXIST for PREPARE

2016-03-23 Thread Yury Zhuravlev

Fabrízio de Royes Mello wrote:

I got an error when building this patch.


I fixed it! All tests pass (including the ecpg tests).
Patch attached.

Thanks. 

PS Who uses ecpg? For me it's like hell.


--
Yury Zhuravlev
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

diff --git a/doc/src/sgml/ref/prepare.sgml b/doc/src/sgml/ref/prepare.sgml
index dbce8f2..c52879f 100644
--- a/doc/src/sgml/ref/prepare.sgml
+++ b/doc/src/sgml/ref/prepare.sgml
@@ -26,7 +26,7 @@ PostgreSQL documentation
 
  
 
-PREPARE name [ ( data_type [, ...] ) ] AS statement
+PREPARE [ IF NOT EXISTS ] name [ ( data_type [, ...] ) ] AS statement
 
  
 
@@ -86,6 +86,15 @@ PREPARE name [ ( 

+IF NOT EXISTS
+
+ 
+  Do not throw an error if a prepared statement with the same name already exists.
+ 
+
+   
+
+   
 name
 
  
diff --git a/src/backend/commands/prepare.c b/src/backend/commands/prepare.c
index cec37ce..019330f 100644
--- a/src/backend/commands/prepare.c
+++ b/src/backend/commands/prepare.c
@@ -59,6 +59,7 @@ PrepareQuery(PrepareStmt *stmt, const char *queryString)
 	int			nargs;
 	Query	   *query;
 	List	   *query_list;
+	bool		found;
 	int			i;
 
 	/*
@@ -70,6 +71,30 @@ PrepareQuery(PrepareStmt *stmt, const char *queryString)
 (errcode(ERRCODE_INVALID_PSTATEMENT_DEFINITION),
  errmsg("invalid statement name: must not be empty")));
 
+	/* Find entry in hash table */
+	if(prepared_queries)
+	{
+		hash_search(prepared_queries,
+	stmt->name,
+	HASH_FIND,
+	&found);
+
+	/* Complain or skip if a statement with this name already exists */
+		if (found && stmt->if_not_exists)
+		{
+			ereport(NOTICE,
+	(errcode(ERRCODE_DUPLICATE_PSTATEMENT),
+	 errmsg("prepared statement \"%s\" already exists, skipping",
+			stmt->name)));
+			return;
+		}
+		else if (found && !stmt->if_not_exists)
+			ereport(ERROR,
+	(errcode(ERRCODE_DUPLICATE_PSTATEMENT),
+	 errmsg("prepared statement \"%s\" already exists",
+			stmt->name)));
+	}
+
 	/*
 	 * Create the CachedPlanSource before we do parse analysis, since it needs
 	 * to see the unmodified raw parse tree.
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index df7c2fa..be8ac78 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -4021,6 +4021,7 @@ _copyPrepareStmt(const PrepareStmt *from)
 	COPY_STRING_FIELD(name);
 	COPY_NODE_FIELD(argtypes);
 	COPY_NODE_FIELD(query);
+	COPY_SCALAR_FIELD(if_not_exists);
 
 	return newnode;
 }
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index b9c3959..fbd248b 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -2017,6 +2017,7 @@ _equalPrepareStmt(const PrepareStmt *a, const PrepareStmt *b)
 	COMPARE_STRING_FIELD(name);
 	COMPARE_NODE_FIELD(argtypes);
 	COMPARE_NODE_FIELD(query);
+	COMPARE_SCALAR_FIELD(if_not_exists);
 
 	return true;
 }
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index b9aeb31..e08d95f 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -9443,6 +9443,16 @@ PrepareStmt: PREPARE name prep_type_clause AS PreparableStmt
 	n->name = $2;
 	n->argtypes = $3;
 	n->query = $5;
+	n->if_not_exists = false;
+	$$ = (Node *) n;
+}
+			| PREPARE IF_P NOT EXISTS name prep_type_clause AS PreparableStmt
+{
+	PrepareStmt *n = makeNode(PrepareStmt);
+	n->name = $5;
+	n->argtypes = $6;
+	n->query = $8;
+	n->if_not_exists = true;
 	$$ = (Node *) n;
 }
 		;
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 2fd0629..f08dee4 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -2986,6 +2986,7 @@ typedef struct PrepareStmt
 	char	   *name;			/* Name of plan, arbitrary */
 	List	   *argtypes;		/* Types of parameters (List of TypeName) */
 	Node	   *query;			/* The query itself (as a raw parsetree) */
+	bool	   if_not_exists;
 } PrepareStmt;
 
 
diff --git a/src/interfaces/ecpg/preproc/check_rules.pl b/src/interfaces/ecpg/preproc/check_rules.pl
index 4c981e0..cba9a74 100644
--- a/src/interfaces/ecpg/preproc/check_rules.pl
+++ b/src/interfaces/ecpg/preproc/check_rules.pl
@@ -43,7 +43,9 @@ my %replace_line = (
 	  => 'CREATE OptTemp TABLE create_as_target AS EXECUTE prepared_name execute_param_clause',
 
 	'PrepareStmtPREPAREnameprep_type_clauseASPreparableStmt' =>
-	  'PREPARE prepared_name prep_type_clause AS PreparableStmt');
+	  'PREPARE prepared_name prep_type_clause AS PreparableStmt',
+	'PrepareStmtPREPAREIF_PNOTEXISTSnameprep_type_clauseASPreparableStmt' =>
+	  'PREPARE IF_P NOT EXISTS prepared_name prep_type_clause AS PreparableStmt');
 
 my $block= '';
 my $yaccmode = 0;
diff --git a/src/interfaces/ecpg/preproc/ecpg.addons b/src/interfaces/ecpg/preproc/ecpg.addons
index b3b36cf..c2742cf 100644
--- a/src/interfaces/ecpg/preproc/ecpg.addons
+++ b/src/interfaces/ecpg/preproc/ecpg.addons
@@ -283,6 +283,12 @@ EC

Re: [HACKERS] README for src/backend/replication/logical

2016-03-23 Thread Alvaro Herrera
Craig Ringer wrote:
> Hi all
> 
> As part of some internal training/discussion I wrote some notes on logical
> decoding's structure and flow that seemed worth polishing up into a draft
> README for src/backend/replication/logical .

That's nice, but you attached the one in src/backend/replication
instead.


-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Relax requirement for INTO with SELECT in pl/pgsql

2016-03-23 Thread Jim Nasby

On 3/22/16 8:37 AM, Merlin Moncure wrote:

>> I afraid of useless and forgotten call of functions. But the risk is same
>> like PERFORM - so this is valid from one half. The PERFORM statement holds
>> special semantic, and it is interesting.
>
> I see your point here, but the cost of doing that far outweighs the
> risks.  And I don't think the arbitrary standard of forcing the user to
> identify whether the query should return data is a good way of
> identifying dead code.


Not to mention that there are tons of other ways to introduce unintended
inefficiencies. Off the top of my head, declaring variables that are
never referenced and have no assignment is a big one (see the sketch
below).
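
For instance (illustrative), nothing complains about this, yet the variable
is still set up on every call:

CREATE FUNCTION f() RETURNS int LANGUAGE plpgsql AS $$
DECLARE
	scratch text;	-- declared, never referenced, never assigned
BEGIN
	RETURN 1;
END
$$;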

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] fix DROP OPERATOR to reset links to itself on commutator and negator

2016-03-23 Thread Robert Haas
On Tue, Mar 22, 2016 at 9:25 PM, Tomas Vondra
 wrote:
> OK, the new code seems more comprehensible to me.

I did not find this version very clear.  It wasn't consistent about
using ObjectIdGetDatum() where needed, but the bigger problem was that
I found the logic unnecessarily convoluted.  I rewrote it - I believe
more straightforwardly - as attached.  How does this look?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
diff --git a/src/backend/catalog/pg_operator.c b/src/backend/catalog/pg_operator.c
index 3cd1899..cd66823 100644
--- a/src/backend/catalog/pg_operator.c
+++ b/src/backend/catalog/pg_operator.c
@@ -54,8 +54,6 @@ static Oid OperatorShellMake(const char *operatorName,
   Oid leftTypeId,
   Oid rightTypeId);
 
-static void OperatorUpd(Oid baseId, Oid commId, Oid negId);
-
 static Oid get_other_operator(List *otherOp,
    Oid otherLeftTypeId, Oid otherRightTypeId,
    const char *operatorName, Oid operatorNamespace,
@@ -566,7 +564,7 @@ OperatorCreate(const char *operatorName,
 		commutatorId = operatorObjectId;
 
 	if (OidIsValid(commutatorId) || OidIsValid(negatorId))
-		OperatorUpd(operatorObjectId, commutatorId, negatorId);
+		OperatorUpd(operatorObjectId, commutatorId, negatorId, false);
 
 	return address;
 }
@@ -639,125 +637,158 @@ get_other_operator(List *otherOp, Oid otherLeftTypeId, Oid otherRightTypeId,
  * OperatorUpd
  *
  *	For a given operator, look up its negator and commutator operators.
- *	If they are defined, but their negator and commutator fields
- *	(respectively) are empty, then use the new operator for neg or comm.
- *	This solves a problem for users who need to insert two new operators
- *	which are the negator or commutator of each other.
+ *	When isDelete is false, update their negator and commutator operators to
+ *	point back to the given operator; when isDelete is true, update those
+ *	operators to no longer point back to the given operator.
+ *
+ *	The !isDelete case solves a problem for users who need to insert two new
+ *  operators which are the negator or commutator of each other, while the
+ *  isDelete case is important so as not to leave dangling OID links behind
+ *  after dropping an operator.
  */
-static void
-OperatorUpd(Oid baseId, Oid commId, Oid negId)
+void
+OperatorUpd(Oid baseId, Oid commId, Oid negId, bool isDelete)
 {
-	int			i;
 	Relation	pg_operator_desc;
-	HeapTuple	tup;
-	bool		nulls[Natts_pg_operator];
-	bool		replaces[Natts_pg_operator];
-	Datum		values[Natts_pg_operator];
-
-	for (i = 0; i < Natts_pg_operator; ++i)
-	{
-		values[i] = (Datum) 0;
-		replaces[i] = false;
-		nulls[i] = false;
-	}
+	HeapTuple	tup = NULL;
 
 	/*
-	 * check and update the commutator & negator, if necessary
-	 *
-	 * We need a CommandCounterIncrement here in case of a self-commutator
-	 * operator: we'll need to update the tuple that we just inserted.
+	 * If we're making an operator into its own commutator, then we need a
+	 * command-counter increment here, since we've just inserted the tuple
+	 * we're about to update.  But when we're dropping an operator, we can
+	 * skip this.
 	 */
-	CommandCounterIncrement();
+	if (!isDelete)
+		CommandCounterIncrement();
 
+	/* Open the relation. */
 	pg_operator_desc = heap_open(OperatorRelationId, RowExclusiveLock);
 
-	tup = SearchSysCacheCopy1(OPEROID, ObjectIdGetDatum(commId));
+	/* Get a copy of the commutator's tuple. */
+	if (OidIsValid(commId))
+		tup = SearchSysCacheCopy1(OPEROID, ObjectIdGetDatum(commId));
 
-	/*
-	 * if the commutator and negator are the same operator, do one update. XXX
-	 * this is probably useless code --- I doubt it ever makes sense for
-	 * commutator and negator to be the same thing...
-	 */
-	if (commId == negId)
+	/* Update the commutator's tuple if need be. */
+	if (HeapTupleIsValid(tup))
 	{
-		if (HeapTupleIsValid(tup))
+		Form_pg_operator t = (Form_pg_operator) GETSTRUCT(tup);
+		bool		update_commutator = false;
+		bool		nulls[Natts_pg_operator];
+		bool		replaces[Natts_pg_operator];
+		Datum		values[Natts_pg_operator];
+
+		memset(values, 0, sizeof(values));
+		memset(replaces, 0, sizeof(replaces));
+		memset(nulls, 0, sizeof(nulls));
+
+		/*
+		 * Out of due caution, we only change the commutator's oprcom field
+		 * if it has the exact value we expected: InvalidOid when creating an
+		 * operator, and baseId when dropping one.
+		 */
+		if (isDelete && t->oprcom == baseId)
 		{
-			Form_pg_operator t = (Form_pg_operator) GETSTRUCT(tup);
+			values[Anum_pg_operator_oprcom - 1] = ObjectIdGetDatum(InvalidOid);
+			replaces[Anum_pg_operator_oprcom - 1] = true;
+			update_commutator = true;
+		}
+		else if (!isDelete && !OidIsValid(t->oprcom))
+		{
+			values[Anum_pg_operator_oprcom - 1] = ObjectIdGetDatum(baseId);
+			replaces[Anum_pg_operator_oprcom - 1] = true;
+			update_commutator = true;
+		}
 
-			if (!OidIsValid(t->oprcom) || !OidIsValid(t->oprnegate))
+		/*
+		 * If the commutator is also the negator, which doesn't

Re: [HACKERS] Patch: fix lock contention for HASHHDR.mutex

2016-03-23 Thread Aleksander Alekseev
> Thanks!  I can't think of anything else to worry about with regards to
> that version, so I have committed it.
> 

Thanks, Robert. And thanks everyone for contributing to this patch.


-- 
Best regards,
Aleksander Alekseev
http://eax.me/


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP: Upper planner pathification

2016-03-23 Thread Jim Nasby

On 3/22/16 7:28 AM, Michael Paquier wrote:

On Mon, Mar 21, 2016 at 7:55 AM, Jim Nasby  wrote:

On 3/17/16 9:01 AM, Robert Haas wrote:


I think that
there are an awful lot of cases where extension authors haven't been
able to quite do what they want to do without core changes because
they couldn't get control in quite the right place; or they could do
it but they had to cut-and-paste a lot of code.


FWIW, I've certainly run into this at least once, maybe twice. The case I
can think of offhand is doing function resolution with variant. I don't
remember the details anymore, but my recollection is that to get what I
needed I would have needed to copy huge swaths of the rewrite code.


Amen, I have been doing that a couple of days ago with some elog stuff.


Any ideas on ways to address this? Adding more hooks in random places 
every time we stumble across something doesn't seem like a good method.


One thing I've wondered about is making it easier to find specific 
constructs in a parsed query so that you can make specific 
modifications. I recall looking at that once and finding a roadblock 
(maybe a bunch of functions were static?)

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Relax requirement for INTO with SELECT in pl/pgsql

2016-03-23 Thread Merlin Moncure
On Wed, Mar 23, 2016 at 10:57 AM, Jim Nasby  wrote:
> On 3/22/16 8:37 AM, Merlin Moncure wrote:
>>>
>>> I afraid of useless and forgotten call of functions. But the risk is same
>>> >like PERFORM - so this is valid from one half. The PERFORM statement
>>> > holds
>>> >special semantic, and it is interesting.
>>
>> I see your point here, but the cost of doing that far outweighs the
>> risks.  And I don't think the arbitrary standard of defining forcing
>> the user to identify if the query should return data is a good way of
>> identifying dead code.
>
> Not to mention that there's tons of other ways to introduce unintended
> inefficiencies. Off the top of my head, declaring variables that are never
> referenced and have no assignment is a big one.

Yes. This comes up most often with dblink for me, mainly because of
the slight wonkiness of the dblink API, but I totally agree this should
never have to be done.  Anyway, I submitted the patch to the next
open commit fest.

Totally off-topic aside: are you in Dallas for the next PUG? I
unfortunately missed the last one and will likely miss the next one
too -- I coach the company kickball team and we play on Wednesdays --
oh well.  Maybe beers afterwards?

merlin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] NOT EXIST for PREPARE

2016-03-23 Thread Michael Meskes
> PS Who uses ecpg? For me it's like hell.

More than you think.

I doubt you want to propose removing a feature just because it makes
developing new features harder, or do you? :)

Michael
-- 
Michael Meskes
Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org)
Meskes at (Debian|Postgresql) dot Org
Jabber: michael at xmpp dot meskes dot org
VfL Borussia! Força Barça! SF 49ers! Use Debian GNU/Linux, PostgreSQL




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] NOT EXIST for PREPARE

2016-03-23 Thread Merlin Moncure
On Wed, Mar 23, 2016 at 8:21 AM, Merlin Moncure  wrote:
> I'm not understanding the objection at all.  You have N client
> sessions connecting to the database that all utilize the same named
> prepared statement.  A typical pattern is for the application to
> prepare them all upon startup, but currently each PREPARE needs to be
> wrapped with an exception handler in case someone else prepared it
> first.  Having an IF NOT EXISTS decoration simplifies this.  This can
> happen both inside and outside of connection pooling scenarios.

I'm walking that back a bit -- this is only interesting in pooler
scenarios, especially pgbouncer where you have no way of knowing if
the statement is created or not.  Of course, you can always re-prepare
them following a discard but that's quite pessimal in many cases.
Still, I've often wanted this exact feature.

merlin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] fix DROP OPERATOR to reset links to itself on commutator and negator

2016-03-23 Thread Tom Lane
Robert Haas  writes:
> I did not find this version very clear.  It wasn't consistent about
> using ObjectIdGetDatum() where needed, but the bigger problem was that
> I found the logic unnecessarily convoluted.  I rewrote it - I believe
> more straightforwardly - as attached.  How does this look?

I'd suggest that we save some code by always doing separate updates for
the commutator and negator entries.  We can handle the corner case where
they're the same by doing a CommandCounterIncrement between the updates,
instead of having convoluted and probably-never-yet-tested logic.

I'm also a bit dubious of the assumption in RemoveOperatorById that an
operator can't be its own negator.  Yeah, that should not be the case,
but if it is the case the deletion will fail outright.

We could resolve both of these issues by changing the semantics of
OprUpdate so that it unconditionally does a CommandCounterIncrement
after each update that it performs.  IMO that would be a lot simpler
and more bulletproof; it'd allow removal of a lot of these
overly-tightly-reasoned cases.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Postgres_fdw join pushdown - getting server crash in left outer join of three table

2016-03-23 Thread Robert Haas
On Wed, Mar 23, 2016 at 5:24 AM, Rajkumar Raghuwanshi
 wrote:
> Thanks Ashutosh for the patch. I have apply and retested it, now not getting
> server crash.

Thanks for the report and the testing.  I have committed the patch.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Rationalizing code-sharing among src/bin/ directories

2016-03-23 Thread Tom Lane
I'm not terribly happy with the hack I used to get pgbench to be able
to borrow psql's psqlscan.l lexer.  It's a mess for the MSVC build
scripts, and I have seen it causing two concurrent builds of psqlscan.o
in parallel builds, which is likely to cause a build failure some of
the time.  Another unresolved issue is that we can't apply FLEX_NO_BACKUP
testing to both of psql's lexers, because that causes each of those
make steps to want to write lex.backup in the current directory.

I have a modest proposal for improving this: let's move all the code
that's currently shared by two or more src/bin/ subdirectories into a
new directory, say src/feutils, and build it into a ".a" library in
the same way that src/common/ is handled.  This would remove the
following klugy cross-directory links:

src/bin/pgbench/psqlscan.o -> ../../../src/bin/psql/psqlscan.o
src/bin/psql/dumputils.c -> ../../../src/bin/pg_dump/dumputils.c
src/bin/psql/keywords.c -> ../../../src/bin/pg_dump/keywords.c
src/bin/scripts/dumputils.c -> ../../../src/bin/pg_dump/dumputils.c
src/bin/scripts/keywords.c -> ../../../src/bin/pg_dump/keywords.c
src/bin/scripts/mbprint.c -> ../../../src/bin/psql/mbprint.c
src/bin/scripts/print.c -> ../../../src/bin/psql/print.c

Note: the reason for a new subdirectory, rather than putting this
stuff into src/common/, is that src/common/ is meant for code that's
shared between frontend and backend, which this stuff is not.  Also,
many of these files depend on libpq which seems like an inappropriate
dependency for src/common.

Having said that, I also notice these dependencies:

src/bin/pg_dump/kwlookup.c -> ../../../src/backend/parser/kwlookup.c
src/bin/psql/kwlookup.c -> ../../../src/backend/parser/kwlookup.c
src/bin/scripts/kwlookup.c -> ../../../src/backend/parser/kwlookup.c
src/interfaces/ecpg/preproc/kwlookup.c -> 
../../../../src/backend/parser/kwlookup.c
src/bin/initdb/encnames.c -> ../../../src/backend/utils/mb/encnames.c
src/interfaces/libpq/encnames.c -> ../../../src/backend/utils/mb/encnames.c

which seem like they'd be better handled by moving those files into
src/common/.

Thoughts?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] PostgreSQL 9.6 behavior change with set returning (funct).*

2016-03-23 Thread Regina Obe
In the past couple of weeks our PostGIS tests against PostgreSQL 9.6 dev
started failing.  I traced the issue down to a behavior change in 9.6 when
dealing with output of set returning functions when used with (func).*
syntax.

Here is an example not involving PostGIS.  Is this an intentional change in
behavior?

CREATE OR REPLACE FUNCTION dumpset(param_num integer, param_text text)
RETURNS TABLE(id integer, junk1 text, junk2 text)
AS
$$
BEGIN
 RETURN QUERY SELECT id2 As id, $1 || $2::text As junk1, $1 || id2::text AS
junk2
FROM generate_series(1,2) As id2;
END;

$$
language 'plpgsql';

-- Get 16 rows in 9.6, Get 8 rows in 9.5
SELECT (dumpset(f.test, 'hello world' || f.test)).*
FROM generate_series(1,4) As f(test)
ORDER BY junk2;


I know that functions get called multiple times with (..).* and so it's
frowned upon, but before, the results would only be returned once, and I
suspect people who are lazy and don't mind the penalty cost might just use
this syntax.
If it's intentional, I can change the tests to follow the best-practice
approach (sketched below).
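
For reference, the best-practice form would be the LATERAL spelling, roughly:

-- the function is evaluated once per input row
SELECT d.*
FROM generate_series(1,4) AS f(test),
     LATERAL dumpset(f.test, 'hello world' || f.test) AS d
ORDER BY d.junk2;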

I think the tests started failing around March 8th, which I thought might
have to do with this commit (from around that time):
9118d03a8cca3d97327c56bf89a72e328e454e63

    When appropriate, postpone SELECT output expressions till after ORDER BY.

    It is frequently useful for volatile, set-returning, or expensive
    functions in a SELECT's targetlist to be postponed till after ORDER BY
    and LIMIT are done.

Which involved change in output sort.

I'm not absolutely sure if this has to do with that commit, because we had
another test failing (where the order of the result changed, and putting in
an order by fixed that test). 



Thanks,
Regina






-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] dealing with extension dependencies that aren't quite 'e'

2016-03-23 Thread Abhijit Menon-Sen
Hi.

I implemented "ALTER FUNCTION … DEPENDS ON EXTENSION" using a new node
(AlterObjectDependsStmt), and tried to add "ALTER TRIGGER … DEPENDS ON
EXTENSION" (partly because I wanted to make sure the code could support
multiple object types, partly because it's convenient in this particular
use case). Then I ran into a problem, which I could use some help with.

The main ExecAlterObjectDependsStmt() function is very simple:

+ObjectAddress
+ExecAlterObjectDependsStmt(AlterObjectDependsStmt *stmt)
+{
+   ObjectAddress address;
+   ObjectAddress extAddr;
+
+   address = get_object_address(stmt->objectType, stmt->objname, stmt->objargs,
+NULL, AccessExclusiveLock, false);
+   extAddr = get_object_address(OBJECT_EXTENSION, stmt->extname, NULL,
+NULL, AccessExclusiveLock, false);
+
+   recordDependencyOn(&address, &extAddr, DEPENDENCY_AUTO_EXTENSION);
+
+   return address;
+}

All that's needed is to construct the node based on variants of the
ALTER command:

+AlterObjectDependsStmt:
+ALTER FUNCTION function_with_argtypes DEPENDS ON EXTENSION name
+{
+AlterObjectDependsStmt *n = 
makeNode(AlterObjectDependsStmt);
+n->objectType = OBJECT_FUNCTION;
+n->objname = $3->funcname;
+n->objargs = $3->funcargs;
+n->extname = list_make1(makeString($7));
+$$ = (Node *)n;
+}
+| ALTER TRIGGER name ON any_name DEPENDS ON EXTENSION name
+{
+AlterObjectDependsStmt *n = 
makeNode(AlterObjectDependsStmt);
+n->objectType = OBJECT_TRIGGER;
+n->objname = list_make1(lappend($5, makeString($3)));
+n->objargs = NIL;
+n->extname = list_make1(makeString($9));
+$$ = (Node *)n;
+}
+;

Now, the first part of this works fine. But with the second part, I get
a reduce/reduce conflict if I use any_name. Here's an excerpt from the
verbose bison output:

State 2920

  1181 AlterObjectDependsStmt: ALTER TRIGGER name ON any_name DEPENDS . ON 
EXTENSION name

ON  shift, and go to state 3506


State 2921

  1165 RenameStmt: ALTER TRIGGER name ON qualified_name RENAME . TO name

TO  shift, and go to state 3507


State 2922

  829 attrs: '.' . attr_name
  2006 indirection_el: '.' . attr_name
  2007   | '.' . '*'

  …

  attr_name   go to state 3508
  ColLabelgo to state 1937
  unreserved_keyword  go to state 1938
  col_name_keywordgo to state 1939
  type_func_name_keyword  go to state 1940
  reserved_keywordgo to state 1941


State 3506

  1181 AlterObjectDependsStmt: ALTER TRIGGER name ON any_name DEPENDS ON . 
EXTENSION name

EXTENSION  shift, and go to state 4000


State 3507

  1165 RenameStmt: ALTER TRIGGER name ON qualified_name RENAME TO . name

…

namego to state 4001
ColId   go to state 543
unreserved_keyword  go to state 544
col_name_keywordgo to state 545


State 3508

  829 attrs: '.' attr_name .
  2006 indirection_el: '.' attr_name .

RENAMEreduce using rule 2006 (indirection_el)
'['   reduce using rule 2006 (indirection_el)
'.'   reduce using rule 829 (attrs)
'.'   [reduce using rule 2006 (indirection_el)]
$default  reduce using rule 829 (attrs)

I can see that the problem is that it can reduce '.' in two ways, but
I'm afraid I don't remember enough about yacc to know how to fix it. If
anyone has suggestions, I would be grateful.

Meanwhile, I tried using qualified_name instead of any_name, which does
work without conflicts, but I end up with a RangeVar instead of the sort
of List* that I can feed to get_object_address.

I could change ExecAlterObjectDependsStmt() to check if stmt->relation
is set, and if so, convert the RangeVar back to a List* in the right
format before passing it to get_object_address. That would work fine,
but it doesn't seem like a good solution.

I could write some get_object_address_rv() wrapper that does essentially
the same conversion, but that's only slightly better.

Are there any other options?

(Apart from the above, the rest of the patch is really just boilerplate
for the new node type, but I've attached it anyway for reference.)

-- Abhijit
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 951f59b..189b771 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -2864,6 +2864,19 @@
   
  
 
+
+
+ DEPENDENCY_AUTO_EXTENSION (x)
+ 
+  
+   The dependent object is not a member of the extension that is the
+   referenced object (and so should not be ignored by pg_dump), but
+   cannot function without it and should be dropped when the
+   extension itself is. The dependent obj

Re: [HACKERS] [PATCH] fix DROP OPERATOR to reset links to itself on commutator and negator

2016-03-23 Thread Robert Haas
On Wed, Mar 23, 2016 at 12:27 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> I did not find this version very clear.  It wasn't consistent about
>> using ObjectIdGetDatum() where needed, but the bigger problem was that
>> I found the logic unnecessarily convoluted.  I rewrote it - I believe
>> more straightforwardly - as attached.  How does this look?
>
> I'd suggest that we save some code by always doing separate updates for
> the commutator and negator entries.  We can handle the corner case where
> they're the same by doing a CommandCounterIncrement between the updates,
> instead of having convoluted and probably-never-yet-tested logic.

Sure, we could do that, but it isn't necessary.  If the logic never
gets hit, the question of whether it has bugs isn't that important.
And I'd rather not rejigger things more than necessary in something
that's going to be back-patched.

> I'm also a bit dubious of the assumption in RemoveOperatorById that an
> operator can't be its own negator.  Yeah, that should not be the case,
> but if it is the case the deletion will fail outright.

So what?  We've never guaranteed that things are going to work if you
start by corrupting the catalogs, and I wouldn't pick this as a place
to start.

> We could resolve both of these issues by changing the semantics of
> OprUpdate so that it unconditionally does a CommandCounterIncrement
> after each update that it performs.  IMO that would be a lot simpler
> and more bulletproof; it'd allow removal of a lot of these
> overly-tightly-reasoned cases.

I tried this, but it did not seem to work.  With the command counter
increments added and the conditional logic removed, I got:

rhaas=# CREATE OPERATOR === (PROCEDURE = int8eq, LEFTARG = bigint,
RIGHTARG = bigint);
CREATE OPERATOR
rhaas=# update pg_operator set oprnegate = oid where oprname = '===';
UPDATE 1
rhaas=# drop operator === (bigint, bigint);
ERROR:  attempted to delete invisible tuple

The same test case without those changes fails with:

ERROR:  tuple already updated by self

Interestingly, that test case passes on unpatched master.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_dump / copy bugs with "big lines" ?

2016-03-23 Thread Daniel Verite
Alvaro Herrera wrote:

> >   tuple = (HeapTuple) palloc0(HEAPTUPLESIZE + len);
> > 
> > which fails because (HEAPTUPLESIZE + len) is again considered
> > an invalid size, the  size being 1468006476 in my test.
> 
> Um, it seems reasonable to make this one be a huge-zero-alloc:
> 
>   MemoryContextAllocExtended(CurrentMemoryContext,
>  HEAPTUPLESIZE + len,
> MCXT_ALLOC_HUGE | MCXT_ALLOC_ZERO)

Good, this allows the tests to go to completion! The tests in question
are dump/reload of a row with several fields totalling 1.4GB (deflated),
with COPY TO/FROM file and psql's \copy in both directions, as well as
pg_dump followed by pg_restore|psql.

The modified patch is attached.

It provides a useful mitigation for dumping and reloading databases
that have rows in the 1GB-2GB range, but it works under these limitations:

- no single field has a text representation exceeding 1GB.
- no row as text exceeds 2GB (\copy from fails beyond that. AFAICS we
  could push this to 4GB with limited changes to libpq, by
  interpreting the Int32 field in the CopyData message as unsigned).

It's also possible to go beyond 4GB per row with this patch, but only
when not using the protocol. I've managed to get a 5.6GB single-row
file with COPY TO file. That doesn't help with pg_dump, but it might
be useful in other situations.

In StringInfo, I've changed int64 to Size, because on 32-bit platforms
the downcast from int64 to Size is problematic, and as the rest of the
allocation routines seem to favor Size, it seems more consistent
anyway.

I couldn't test on 32 bits though, as I never seem to have enough
free contiguous memory available on a 32-bit VM to handle
that kind of data.


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite
diff --git a/src/backend/access/common/heaptuple.c b/src/backend/access/common/heaptuple.c
index 6d0f3f3..ed9cabd 100644
--- a/src/backend/access/common/heaptuple.c
+++ b/src/backend/access/common/heaptuple.c
@@ -741,7 +741,9 @@ heap_form_tuple(TupleDesc tupleDescriptor,
 	 * Allocate and zero the space needed.  Note that the tuple body and
 	 * HeapTupleData management structure are allocated in one chunk.
 	 */
-	tuple = (HeapTuple) palloc0(HEAPTUPLESIZE + len);
+	tuple = MemoryContextAllocExtended(CurrentMemoryContext,
+	   HEAPTUPLESIZE + len,
+	   MCXT_ALLOC_HUGE | MCXT_ALLOC_ZERO);
 	tuple->t_data = td = (HeapTupleHeader) ((char *) tuple + HEAPTUPLESIZE);
 
 	/*
diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
index 3201476..ce681d7 100644
--- a/src/backend/commands/copy.c
+++ b/src/backend/commands/copy.c
@@ -446,6 +446,15 @@ SendCopyEnd(CopyState cstate)
 	}
 }
 
+/*
+ * Prepare for output
+ */
+static void
+CopyStartSend(CopyState cstate)
+{
+	allowLongStringInfo(cstate->fe_msgbuf);
+}
+
 /*--
  * CopySendData sends output data to the destination (file or frontend)
  * CopySendString does the same for null-terminated strings
@@ -2015,6 +2024,8 @@ CopyOneRowTo(CopyState cstate, Oid tupleOid, Datum *values, bool *nulls)
 	MemoryContextReset(cstate->rowcontext);
 	oldcontext = MemoryContextSwitchTo(cstate->rowcontext);
 
+	CopyStartSend(cstate);
+
 	if (cstate->binary)
 	{
 		/* Binary per-tuple header */
@@ -3203,6 +3214,7 @@ CopyReadLine(CopyState cstate)
 	bool		result;
 
 	resetStringInfo(&cstate->line_buf);
+	allowLongStringInfo(&cstate->line_buf);
 	cstate->line_buf_valid = true;
 
 	/* Mark that encoding conversion hasn't occurred yet */
@@ -3272,6 +3284,7 @@ CopyReadLine(CopyState cstate)
 		{
 			/* transfer converted data back to line_buf */
 			resetStringInfo(&cstate->line_buf);
+			allowLongStringInfo(&cstate->line_buf);
 			appendBinaryStringInfo(&cstate->line_buf, cvt, strlen(cvt));
 			pfree(cvt);
 		}
@@ -3696,7 +3709,7 @@ CopyReadAttributesText(CopyState cstate)
 	}
 
 	resetStringInfo(&cstate->attribute_buf);
-
+	allowLongStringInfo(&cstate->attribute_buf);
 	/*
 	 * The de-escaped attributes will certainly not be longer than the input
 	 * data line, so we can just force attribute_buf to be large enough and
diff --git a/src/backend/lib/stringinfo.c b/src/backend/lib/stringinfo.c
index 7382e08..6e451b2 100644
--- a/src/backend/lib/stringinfo.c
+++ b/src/backend/lib/stringinfo.c
@@ -47,12 +47,24 @@ initStringInfo(StringInfo str)
 {
 	int			size = 1024;	/* initial default buffer size */
 
-	str->data = (char *) palloc(size);
+	str->data = (char *) palloc(size);	/* no need for "huge" at this point */
 	str->maxlen = size;
+	str->allowlong = false;
 	resetStringInfo(str);
 }
 
 /*
+ * allowLongStringInfo
+ *
+ * Mark the StringInfo as a candidate for a "huge" allocation
+ */
+void
+allowLongStringInfo(StringInfo str)
+{
+	str->allowlong = true;
+}
+
+/*
  * resetStringInfo
  *
  * Reset the StringInfo: the data buffer remains valid, but its
@@ -64,6 +76,7 @@ resetStringInfo(StringInfo str)
 	str->data[0] = '\0';
 	str->len = 0;
 	str->cursor = 0;
+	str->allowlong = false;
 }

Re: [HACKERS] Rationalizing code-sharing among src/bin/ directories

2016-03-23 Thread Robert Haas
On Wed, Mar 23, 2016 at 12:55 PM, Tom Lane  wrote:
> I'm not terribly happy with the hack I used to get pgbench to be able
> to borrow psql's psqlscan.l lexer.  It's a mess for the MSVC build
> scripts, and I have seen it causing two concurrent builds of psqlscan.o
> in parallel builds, which is likely to cause a build failure some of
> the time.  Another unresolved issue is that we can't apply FLEX_NO_BACKUP
> testing to both of psql's lexers, because that causes each of those
> make steps to want to write lex.backup in the current directory.
>
> I have a modest proposal for improving this: let's move all the code
> that's currently shared by two or more src/bin/ subdirectories into a
> new directory, say src/feutils, and build it into a ".a" library in
> the same way that src/common/ is handled.  This would remove the
> following klugy cross-directory links:
>
> src/bin/pgbench/psqlscan.o -> ../../../src/bin/psql/psqlscan.o
> src/bin/psql/dumputils.c -> ../../../src/bin/pg_dump/dumputils.c
> src/bin/psql/keywords.c -> ../../../src/bin/pg_dump/keywords.c
> src/bin/scripts/dumputils.c -> ../../../src/bin/pg_dump/dumputils.c
> src/bin/scripts/keywords.c -> ../../../src/bin/pg_dump/keywords.c
> src/bin/scripts/mbprint.c -> ../../../src/bin/psql/mbprint.c
> src/bin/scripts/print.c -> ../../../src/bin/psql/print.c
>
> Note: the reason for a new subdirectory, rather than putting this
> stuff into src/common/, is that src/common/ is meant for code that's
> shared between frontend and backend, which this stuff is not.  Also,
> many of these files depend on libpq which seems like an inappropriate
> dependency for src/common.
>
> Having said that, I also notice these dependencies:
>
> src/bin/pg_dump/kwlookup.c -> ../../../src/backend/parser/kwlookup.c
> src/bin/psql/kwlookup.c -> ../../../src/backend/parser/kwlookup.c
> src/bin/scripts/kwlookup.c -> ../../../src/backend/parser/kwlookup.c
> src/interfaces/ecpg/preproc/kwlookup.c -> 
> ../../../../src/backend/parser/kwlookup.c
> src/bin/initdb/encnames.c -> ../../../src/backend/utils/mb/encnames.c
> src/interfaces/libpq/encnames.c -> ../../../src/backend/utils/mb/encnames.c
>
> which seem like they'd be better handled by moving those files into
> src/common/.
>
> Thoughts?

No objection.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Rationalizing code-sharing among src/bin/ directories

2016-03-23 Thread Petr Jelinek

On 23/03/16 17:55, Tom Lane wrote:

I'm not terribly happy with the hack I used to get pgbench to be able
to borrow psql's psqlscan.l lexer.  It's a mess for the MSVC build
scripts, and I have seen it causing two concurrent builds of psqlscan.o
in parallel builds, which is likely to cause a build failure some of
the time.  Another unresolved issue is that we can't apply FLEX_NO_BACKUP
testing to both of psql's lexers, because that causes each of those
make steps to want to write lex.backup in the current directory.

I have a modest proposal for improving this: let's move all the code
that's currently shared by two or more src/bin/ subdirectories into a
new directory, say src/feutils, and build it into a ".a" library in
the same way that src/common/ is handled.  This would remove the
following klugy cross-directory links:

src/bin/pgbench/psqlscan.o -> ../../../src/bin/psql/psqlscan.o
src/bin/psql/dumputils.c -> ../../../src/bin/pg_dump/dumputils.c
src/bin/psql/keywords.c -> ../../../src/bin/pg_dump/keywords.c
src/bin/scripts/dumputils.c -> ../../../src/bin/pg_dump/dumputils.c
src/bin/scripts/keywords.c -> ../../../src/bin/pg_dump/keywords.c
src/bin/scripts/mbprint.c -> ../../../src/bin/psql/mbprint.c
src/bin/scripts/print.c -> ../../../src/bin/psql/print.c

Note: the reason for a new subdirectory, rather than putting this
stuff into src/common/, is that src/common/ is meant for code that's
shared between frontend and backend, which this stuff is not.  Also,
many of these files depend on libpq which seems like an inappropriate
dependency for src/common.

Having said that, I also notice these dependencies:

src/bin/pg_dump/kwlookup.c -> ../../../src/backend/parser/kwlookup.c
src/bin/psql/kwlookup.c -> ../../../src/backend/parser/kwlookup.c
src/bin/scripts/kwlookup.c -> ../../../src/backend/parser/kwlookup.c
src/interfaces/ecpg/preproc/kwlookup.c -> 
../../../../src/backend/parser/kwlookup.c
src/bin/initdb/encnames.c -> ../../../src/backend/utils/mb/encnames.c
src/interfaces/libpq/encnames.c -> ../../../src/backend/utils/mb/encnames.c

which seem like they'd be better handled by moving those files into
src/common/.

Thoughts?



Yes please!

--
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Rationalizing code-sharing among src/bin/ directories

2016-03-23 Thread Magnus Hagander
On Wed, Mar 23, 2016 at 5:55 PM, Tom Lane  wrote:

> I'm not terribly happy with the hack I used to get pgbench to be able
> to borrow psql's psqlscan.l lexer.  It's a mess for the MSVC build
> scripts, and I have seen it causing two concurrent builds of psqlscan.o
> in parallel builds, which is likely to cause a build failure some of
> the time.  Another unresolved issue is that we can't apply FLEX_NO_BACKUP
> testing to both of psql's lexers, because that causes each of those
> make steps to want to write lex.backup in the current directory.
>
> I have a modest proposal for improving this: let's move all the code
> that's currently shared by two or more src/bin/ subdirectories into a
> new directory, say src/feutils, and build it into a ".a" library in
> the same way that src/common/ is handled.  This would remove the
> following klugy cross-directory links:
>
> src/bin/pgbench/psqlscan.o -> ../../../src/bin/psql/psqlscan.o
> src/bin/psql/dumputils.c -> ../../../src/bin/pg_dump/dumputils.c
> src/bin/psql/keywords.c -> ../../../src/bin/pg_dump/keywords.c
> src/bin/scripts/dumputils.c -> ../../../src/bin/pg_dump/dumputils.c
> src/bin/scripts/keywords.c -> ../../../src/bin/pg_dump/keywords.c
> src/bin/scripts/mbprint.c -> ../../../src/bin/psql/mbprint.c
> src/bin/scripts/print.c -> ../../../src/bin/psql/print.c
>
> Note: the reason for a new subdirectory, rather than putting this
> stuff into src/common/, is that src/common/ is meant for code that's
> shared between frontend and backend, which this stuff is not.  Also,
> many of these files depend on libpq which seems like an inappropriate
> dependency for src/common.
>
> Having said that, I also notice these dependencies:
>
> src/bin/pg_dump/kwlookup.c -> ../../../src/backend/parser/kwlookup.c
> src/bin/psql/kwlookup.c -> ../../../src/backend/parser/kwlookup.c
> src/bin/scripts/kwlookup.c -> ../../../src/backend/parser/kwlookup.c
> src/interfaces/ecpg/preproc/kwlookup.c ->
> ../../../../src/backend/parser/kwlookup.c
> src/bin/initdb/encnames.c -> ../../../src/backend/utils/mb/encnames.c
> src/interfaces/libpq/encnames.c -> ../../../src/backend/utils/mb/encnames.c
>
> which seem like they'd be better handled by moving those files into
> src/common/.
>
> Thoughts?
>
>
+1 on both. I've certainly been annoyed by that kwlookup thing many times
:)


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: [HACKERS] Reworks of CustomScan serialization/deserialization

2016-03-23 Thread Petr Jelinek

On 23/03/16 09:14, Kouhei Kaigai wrote:

On 15/03/16 05:03, Kouhei Kaigai wrote:

Petr,

The attached patch is the revised one that follows the new extensible-
node routine.

It is almost the same as the previous version, except that:
- custom-apis.[ch] was renamed to custom-node.[ch]
- the check for the length of the custom-scan-method name follows
 the manner of RegisterExtensibleNodeMethods()



Hi,

Looks good; the only nitpick I have is that it should probably be
custom_node.h with an underscore, given that we use underscores everywhere
(except for libpq and, for some reason, atomic ops).


Sorry for my late response.

The attached patch just renames custom-node.[ch] to custom_node.[ch].
Other portions are unchanged from the previous revision.



Forgot to attach?


Yes. Thanks,



OK, I am happy with it; I marked it as ready for committer (it was marked
as committed although it hadn't been).


--
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] PostgreSQL 9.6 behavior change with set returning (funct).*

2016-03-23 Thread Tom Lane
"Regina Obe"  writes:
> In the past couple of weeks our PostGIS tests against PostgreSQL 9.6 dev
> started failing.  I traced the issue down to a behavior change in 9.6 when
> dealing with output of set returning functions when used with (func).*
> syntax.

> CREATE OR REPLACE FUNCTION dumpset(param_num integer, param_text text)
> RETURNS TABLE(id integer, junk1 text, junk2 text)
> ...
> -- Get 16 rows in 9.6, Get 8 rows in 9.5
> SELECT (dumpset(f.test, 'hello world' || f.test)).*
> FROM generate_series(1,4) As f(test)
> ORDER BY junk2;

I think this is a side-effect of 9118d03a8 ("When appropriate, postpone
SELECT output expressions till after ORDER BY").  Previously, although you
got N evaluations of the SRF which is pretty horrid, they were all in the
same targetlist and hence ran in sync and produced only the expected
number of rows (thanks to the otherwise-indefensible ExecTargetList
behavior by which multiple SRF tlist expressions produce a number of rows
equal to the least common multiple of their periods, not the product).
That commit causes the evaluation of dumpset(...).junk1 to be postponed
till after the Sort step, but the evaluation of dumpset(...).junk2
necessarily can't be.  So now you get dumpset(...).junk2 inflating
the original rowcount 2X, and then dumpset(...).junk1 inflating it
another 2X after the Sort.
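
For illustration of the LCM rule, on pre-9.6 releases two tlist SRFs with
periods 2 and 4 produce LCM(2,4) = 4 rows rather than 2 * 4 = 8:

SELECT generate_series(1,2), generate_series(1,4);
-- on 9.5 this yields (1,1), (2,2), (1,3), (2,4)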

We could remain bug-compatible with the old behavior by adding some
restriction to keep all the tlist SRFs getting evaluated at the same
plan step, at least to the extent that we can.  I think you could get
similar strange behaviors in prior releases if you used GROUP BY or
another syntax that might result in early evaluation of the sort column,
and we're not going to be able to fix that.  But we could prevent this
particular optimization from introducing new strangeness.

But I'm not really sure that we should.  The way that you *should*
write this query is

SELECT ds.*
FROM generate_series(1,4) AS f(test),
 dumpset(f.test, 'hello world' || f.test) AS ds
ORDER BY junk2;

which is both SQL-standard semantics and much more efficient than
SRF-in-tlist.  We've more or less deprecated SRF-in-tlist since we
introduced LATERAL in 9.3.  How much are we willing to do to stay
bug-compatible with old behaviors here?

regards, tom lane




Re: [HACKERS] [PATCH] fix DROP OPERATOR to reset links to itself on commutator and negator

2016-03-23 Thread Tom Lane
Robert Haas  writes:
> On Wed, Mar 23, 2016 at 12:27 PM, Tom Lane  wrote:
>> I'm also a bit dubious of the assumption in RemoveOperatorById that an
>> operator can't be its own negator.  Yeah, that should not be the case,
>> but if it is the case the deletion will fail outright.

> So what?  We've never guaranteed that things are going to work if you
> start by corrupting the catalogs, and I wouldn't pick this as a place
> to start.

I would not be worried except that it breaks a case that used to work,
as your test below demonstrates.

>> We could resolve both of these issues by changing the semantics of
>> OprUpdate so that it unconditionally does a CommandCounterIncrement
>> after each update that it performs.  IMO that would be a lot simpler
>> and more bulletproof; it'd allow removal of a lot of these
>> overly-tightly-reasoned cases.

> I tried this, but it did not seem to work.

Odd.  If you post the revised patch, I'll try to chase down what's wrong.

regards, tom lane




Re: [HACKERS] NOT EXIST for PREPARE

2016-03-23 Thread Vladimir Sitnikov
2016-03-23 16:21 GMT+03:00 Merlin Moncure :
> On Wed, Mar 23, 2016 at 7:27 AM, Craig Ringer  wrote:
Craig>> With PREPARE IF NOT EXISTS the client is also paying parse and network
Craig>> overhead for every time you send that statement. Much better
not to send it
Craig>> repeatedly in the first place.
>
Merlin> How did you determine that?  The client prepares the statement exactly
Merlin> once.  The problem is there's no easy way to determine if someone else
Merlin> prepared it first and this patch directly addresses that.

With suggested "prepare if not exists", client would still have to send full
query text along with "prepare if not exist" command.
That is "network overhead".
DB would still have to check if the same query with same "query name" has
already been registered. Well, that query check would probably be easier than
"generate execution plan", yet it can be of non-zero overhead.

Craig>> I think we need to take a step back here and better define the problem
Craig>> before stepping in with a proposed solution. Something that
avoids the need
Craig>> to spam the server with endless copies of the same PREPARE
statements would
Craig>> be good.

+1

Merlin> A typical pattern is for the application to
Merlin> prepare them all upon startup, but currently each PREPARE needs to be
Merlin> wrapped with an exception handler in case someone else prepared it
Merlin> first.

If you plan to have "prepare if not exists" at startup only, why don't
you just wrap it with
exception handler then?
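
For instance, a minimal server-side guard (the statement name and query
here are purely illustrative):

DO $$
BEGIN
  PREPARE fetch_item(int) AS SELECT * FROM pg_class WHERE oid = $1;
EXCEPTION WHEN duplicate_prepared_statement THEN
  NULL;  -- already prepared on this session, nothing to do
END;
$$;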

If you plan to always issue "prepare if not exists", then you will
have to send full query text
for each prepare => overhead. Those repeated query texts are
"endless copies of the same PREPARE statements" Craig is talking about.

Merlin>The client prepares the statement exactly
Merlin>once.  The problem is there's no easy way to determine if someone else
Merlin>prepared it first

Merlin, if by "client" you somehow mean JDBC (e.g. pgjdbc), then it
does track which connections
have which queries prepared.
So the thing we/you need might be not backend support for "prepare if
not exists", but some kind of
bouncer vs jdbc integration, so jdbc knows which connection it is
using at each point in time.

Vladimir




Re: [HACKERS] Proposal: Generic WAL logical messages

2016-03-23 Thread Petr Jelinek

On 23/03/16 14:17, Alvaro Herrera wrote:

Petr Jelinek wrote:


+++ b/contrib/test_decoding/sql/messages.sql
@@ -0,0 +1,17 @@
+-- predictability
+SET synchronous_commit = on;
+
+SELECT 'init' FROM pg_create_logical_replication_slot('regression_slot', 
'test_decoding');
+
+SELECT 'msg1' FROM pg_logical_emit_message(true, 'test', 'msg1');
+SELECT 'msg2' FROM pg_logical_emit_message(false, 'test', 'msg2');
+
+BEGIN;
+SELECT 'msg3' FROM pg_logical_emit_message(true, 'test', 'msg3');
+SELECT 'msg4' FROM pg_logical_emit_message(false, 'test', 'msg4');
+SELECT 'msg5' FROM pg_logical_emit_message(true, 'test', 'msg5');
+COMMIT;
+
+SELECT data FROM pg_logical_slot_get_changes('regression_slot', NULL, NULL, 
'force-binary', '0', 'skip-empty-xacts', '1');
+
+SELECT 'init' FROM pg_drop_replication_slot('regression_slot');


No tests for a rolled back transaction?



Good point; probably worth testing, especially since we have
non-transactional messages.


--
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services
From 81fc28cedc19fe0f91f882d42989c14113a40f88 Mon Sep 17 00:00:00 2001
From: Petr Jelinek 
Date: Wed, 24 Feb 2016 17:02:36 +0100
Subject: [PATCH] Logical Decoding Messages

---
 contrib/test_decoding/Makefile  |   2 +-
 contrib/test_decoding/expected/ddl.out  |  21 ++--
 contrib/test_decoding/expected/messages.out |  71 ++
 contrib/test_decoding/sql/ddl.sql   |   3 +-
 contrib/test_decoding/sql/messages.sql  |  22 +
 contrib/test_decoding/test_decoding.c   |  18 
 doc/src/sgml/func.sgml  |  45 +
 doc/src/sgml/logicaldecoding.sgml   |  38 
 src/backend/access/rmgrdesc/Makefile|   4 +-
 src/backend/access/rmgrdesc/logicalmsgdesc.c|  41 
 src/backend/access/transam/rmgr.c   |   1 +
 src/backend/replication/logical/Makefile|   2 +-
 src/backend/replication/logical/decode.c|  46 +
 src/backend/replication/logical/logical.c   |  38 
 src/backend/replication/logical/logicalfuncs.c  |  27 ++
 src/backend/replication/logical/message.c   |  85 +
 src/backend/replication/logical/reorderbuffer.c | 121 
 src/backend/replication/logical/snapbuild.c |  19 
 src/bin/pg_xlogdump/.gitignore  |  20 +---
 src/bin/pg_xlogdump/rmgrdesc.c  |   1 +
 src/include/access/rmgrlist.h   |   1 +
 src/include/catalog/pg_proc.h   |   4 +
 src/include/replication/logicalfuncs.h  |   2 +
 src/include/replication/message.h   |  41 
 src/include/replication/output_plugin.h |  13 +++
 src/include/replication/reorderbuffer.h |  22 +
 src/include/replication/snapbuild.h |   2 +
 27 files changed, 679 insertions(+), 31 deletions(-)
 create mode 100644 contrib/test_decoding/expected/messages.out
 create mode 100644 contrib/test_decoding/sql/messages.sql
 create mode 100644 src/backend/access/rmgrdesc/logicalmsgdesc.c
 create mode 100644 src/backend/replication/logical/message.c
 create mode 100644 src/include/replication/message.h

diff --git a/contrib/test_decoding/Makefile b/contrib/test_decoding/Makefile
index 06c9546..309cb0b 100644
--- a/contrib/test_decoding/Makefile
+++ b/contrib/test_decoding/Makefile
@@ -38,7 +38,7 @@ submake-test_decoding:
$(MAKE) -C $(top_builddir)/contrib/test_decoding
 
 REGRESSCHECKS=ddl xact rewrite toast permissions decoding_in_xact \
-   decoding_into_rel binary prepared replorigin time
+   decoding_into_rel binary prepared replorigin time messages
 
 regresscheck: | submake-regress submake-test_decoding temp-install
$(MKDIR_P) regression_output
diff --git a/contrib/test_decoding/expected/ddl.out 
b/contrib/test_decoding/expected/ddl.out
index 77719e8..32cd24d 100644
--- a/contrib/test_decoding/expected/ddl.out
+++ b/contrib/test_decoding/expected/ddl.out
@@ -220,11 +220,17 @@ SELECT data FROM 
pg_logical_slot_get_changes('regression_slot', NULL, NULL, 'inc
 (7 rows)
 
 /*
- * check that disk spooling works
+ * check that disk spooling works (also for logical messages)
  */
 BEGIN;
 CREATE TABLE tr_etoomuch (id serial primary key, data int);
 INSERT INTO tr_etoomuch(data) SELECT g.i FROM generate_series(1, 10234) g(i);
+SELECT 'tx logical msg' FROM pg_logical_emit_message(true, 'test', 'tx logical msg');
+    ?column?    
+----------------
+ tx logical msg
+(1 row)
+
 DELETE FROM tr_etoomuch WHERE id < 5000;
 UPDATE tr_etoomuch SET data = - data WHERE id > 5000;
 COMMIT;
@@ -233,12 +239,13 @@ SELECT count(*), min(data), max(data)
 FROM pg_logical_slot_get_changes('regression_slot', NULL, NULL, 
'include-xids', '0', 'skip-empty-xacts', '1')
 GROUP BY substring(data, 1, 24)
 ORDER BY 1,2;
- count |   min   |  

Re: [HACKERS] NOT EXIST for PREPARE

2016-03-23 Thread Merlin Moncure
On Wed, Mar 23, 2016 at 12:46 PM, Vladimir Sitnikov
 wrote:
> 2016-03-23 16:21 GMT+03:00 Merlin Moncure :
> Merlin> A typical pattern is for the application to
> Merlin> prepare them all upon startup, but currently each PREPARE needs to be
> Merlin> wrapped with an exception handler in case someone else prepared it
> Merlin> first.
>
> If you plan to have "prepare if not exists" at startup only, why don't
> you just wrap it with
> exception handler then?

That's impolite to our users.  Virtually all other commands have been
decorated with IF [NOT] EXISTS to avoid having to guard with an
exception handler -- why not this one?  Also, if the handler is on the
client side, it tends to be racy.

> If you plan to always issue "prepare if not exists", then you will
> have to send full query text
> for each prepare => overhead. Those repeated query texts are
> "endless copies of the same PREPARE statements" Craig is talking about.

No one is arguing that you should send it every time (at
least -- I hope not).

> Merlin>The client prepares the statement exactly
> Merlin>once.  The problem is there's no easy way to determine if someone else
> Merlin>prepared it first
>
> Merlin, if by "client" you somehow mean JDBC (e.g. pgjdbc), then it
> does track which connections
> have which queries prepared.

Again, not in pooling scenarios.  The problems integrating server side
prepared statements with pgbouncer are well known.

merlin




Re: [HACKERS] NOT EXIST for PREPARE

2016-03-23 Thread Tom Lane
Vladimir Sitnikov  writes:
> 2016-03-23 16:21 GMT+03:00 Merlin Moncure :
>> On Wed, Mar 23, 2016 at 7:27 AM, Craig Ringer  wrote:
> Craig>> With PREPARE IF NOT EXISTS the client is also paying parse and network
> Craig>> overhead for every time you send that statement. Much better
> Craig>> not to send it repeatedly in the first place.

> Merlin> How did you determine that?  The client prepares the statement exactly
> Merlin> once.  The problem is there's no easy way to determine if someone else
> Merlin> prepared it first and this patch directly addresses that.

> With suggested "prepare if not exists", client would still have to send full
> query text along with "prepare if not exist" command.
> That is "network overhead".

Yes.  Also, the query would certainly go through the lex and raw-parse
steps (scan.l and gram.y) on the database side before we could get as far
as realizing that it's a PREPARE IF NOT EXISTS and we should check for
pre-existence of the statement name.  Those steps are a nontrivial
fraction of the full parse-analysis overhead, especially for simpler
statements.

So I wouldn't take it as a given that this is actually going to be
very efficient.  Some measurements would be important to make the case
that this is worth having.

regards, tom lane




Re: [HACKERS] Rationalizing code-sharing among src/bin/ directories

2016-03-23 Thread Alvaro Herrera
Tom Lane wrote:

> Note: the reason for a new subdirectory, rather than putting this
> stuff into src/common/, is that src/common/ is meant for code that's
> shared between frontend and backend, which this stuff is not.  Also,
> many of these files depend on libpq which seems like an inappropriate
> dependency for src/common.

Actually you could just list them in OBJS_FRONTEND in src/common.  That
way they're not built for the server at all; no need for a new subdir.
The libpq dependency argument AFAICS only applies to the ones in
src/bin/psql.  Perhaps we could have feutils with those only, and move
the other files to src/common.

I'm unclear on the #ifndef PGSCRIPTS thingy in mbprint.c.  Perhaps it's
no longer needed?

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] NOT EXIST for PREPARE

2016-03-23 Thread Vladimir Sitnikov
Merlin>No one is arguing that you should send it every time (at
least -- I hope not).

Well, what is your suggestion exactly?

pgjdbc is NOT using "prepare ..." sql command.
I'm inclined to suppose, it will not use "prepare..." even after your fix.

Merlin>Again, not in pooling scenarios
Merlin>The problems integrating server side
Merlin>prepared statements with pgbouncer are well known.

I'm afraid they are not.

Your words are "This feature should be immediately be incorporated
by the JDBC driver" yet you have not raised that subject on pgsql-jdbc
mailing list/github issue. That is not very fair.

Let me just name an alternative, so you can see what "a step back" could be:
What if pg-bouncer generated _fake_ ParameterStatus(backend_id=...)
messages to pgjdbc?
Then pgjdbc can learn true backend session id and it can use proper
set of prepared statements. Basically, pgjdbc could have prepared statement
cache per backend_id.
Well, it is not a 100% solution, however it is yet another approach to
"pgbouncer" problem, and it will support all the PostgreSQL versions.

It fits into current frontend-backend protocol as all clients are supposed
to handle ParameterStatus messages, etc.
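
On the wire that would just be an ordinary ParameterStatus message --
Byte1('S'), Int32 length, then two null-terminated strings -- here with
an illustrative parameter name:

    'S' | length | 'pgbouncer.backend_id\0' | '42\0'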

Vladimir




Re: [HACKERS] multivariate statistics v14

2016-03-23 Thread Petr Jelinek

Hi,

I'll add a couple of code comments from my first cursory read-through
(this is huge):


0002:
there is some whitespace noise between the varlistentries in 
alter_statistics.sgml


+   parentobject.classId = RelationRelationId;
+   parentobject.objectId = ObjectIdGetDatum(RelationGetRelid(rel));
+   parentobject.objectSubId = 0;
+   childobject.classId = MvStatisticRelationId;
+   childobject.objectId = statoid;
+   childobject.objectSubId = 0;

I wonder if this (there is similar code in several places) would be
simpler using ObjectAddressSet()
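
(i.e., the six assignments above could collapse to something like:

	ObjectAddressSet(parentobject, RelationRelationId, RelationGetRelid(rel));
	ObjectAddressSet(childobject, MvStatisticRelationId, statoid);

which would also drop the stray ObjectIdGetDatum() around a plain Oid.)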


The common.h in backend/utils/mvstat is a slightly odd choice of header
file placement and naming.



0004:
+/* used for merging bitmaps - AND (min), OR (max) */
+#define MAX(x, y) (((x) > (y)) ? (x) : (y))
+#define MIN(x, y) (((x) < (y)) ? (x) : (y))

Huh? We have Max and Min macros defined in c.h
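
For reference, c.h already provides (modulo whitespace):

#define Max(x, y)	((x) > (y) ? (x) : (y))
#define Min(x, y)	((x) < (y) ? (x) : (y))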

+   values[Anum_pg_mv_statistic_stamcv  - 1] = 
PointerGetDatum(data);

Why the double space? (That's actually in several places in several of
the patches.)


I don't really understand why 0008 and 0009 are separate patches rather
than part of one of the other patches. But otherwise, good job on
splitting the functionality into a patchset.


--
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] PostgreSQL 9.6 behavior change with set returning (funct).*

2016-03-23 Thread David G. Johnston
On Wednesday, March 23, 2016, Tom Lane  wrote:

> "Regina Obe" > writes:
> > In the past couple of weeks our PostGIS tests against PostgreSQL 9.6 dev
> > started failing.  I traced the issue down to a behavior change in 9.6
> when
> > dealing with output of set returning functions when used with (func).*
> > syntax.
>
> > CREATE OR REPLACE FUNCTION dumpset(param_num integer, param_text text)
> > RETURNS TABLE(id integer, junk1 text, junk2 text)
> > ...
> > -- Get 16 rows in 9.6, Get 8 rows in 9.5
> > SELECT (dumpset(f.test, 'hello world' || f.test)).*
> > FROM generate_series(1,4) As f(test)
> > ORDER BY junk2;
>
> I think this is a side-effect of 9118d03a8 ("When appropriate, postpone
> SELECT output expressions till after ORDER BY").  Previously, although you
> got N evaluations of the SRF which is pretty horrid, they were all in the
> same targetlist and hence ran in sync and produced only the expected
> number of rows (thanks to the otherwise-indefensible ExecTargetList
> behavior by which multiple SRF tlist expressions produce a number of rows
> equal to the least common multiple of their periods, not the product).
> That commit causes the evaluation of dumpset(...).junk1 to be postponed
> till after the Sort step, but the evaluation of dumpset(...).junk2
> necessarily can't be.  So now you get dumpset(...).junk2 inflating
> the original rowcount 2X, and then dumpset(...).junk1 inflating it
> another 2X after the Sort.
>
> We could remain bug-compatible with the old behavior by adding some
> restriction to keep all the tlist SRFs getting evaluated at the same
> plan step, at least to the extent that we can.  I think you could get
> similar strange behaviors in prior releases if you used GROUP BY or
> another syntax that might result in early evaluation of the sort column,
> and we're not going to be able to fix that.  But we could prevent this
> particular optimization from introducing new strangeness.
>
> But I'm not really sure that we should.  The way that you *should*
> write this query is
>
> SELECT ds.*
> FROM generate_series(1,4) AS f(test),
>  dumpset(f.test, 'hello world' || f.test) AS ds
> ORDER BY junk2;
>
> which is both SQL-standard semantics and much more efficient than
> SRF-in-tlist.  We've more or less deprecated SRF-in-tlist since we
> introduced LATERAL in 9.3.  How much are we willing to do to stay
> bug-compatible with old behaviors here?
>
>
My gut reaction is that this is an unnecessary regression for the sake of a
performance optimization that is likely drowned out in the usage presented
anyway.

The pivot for me would be how hard it would be to maintain the old behavior
in this "more or less deprecated" scenario.  I have no way to judge that.

David J.


Re: [HACKERS] NOT EXIST for PREPARE

2016-03-23 Thread Merlin Moncure
On Wed, Mar 23, 2016 at 1:15 PM, Vladimir Sitnikov
 wrote:
> Merlin>No one is arguing that that you should send it any every time (at
> least -- I hope not).
>
> Well, what is your suggestion exactly?
>
> pgjdbc is NOT using "prepare ..." sql command.
> I'm inclined to suppose, it will not use "prepare..." even after your fix.

Maybe so (note, the fix is not mine).  Better stated, I guess: the
proposal would allow use of server-side prepared statements with JDBC.

> Merlin>Again, not in pooling scenarios
> Merlin>The problems integrating server side
> Merlin>prepared statements with pgbouncer are well known.
>
> I'm afraid, they are not.

Yes, they are.  See:
https://pgbouncer.github.io/faq.html#how-to-use-prepared-statements-with-session-pooling

There are numerous pages on the web suggesting DISCARD ALL in
transaction mode, which is incredibly wasteful in the case of prepared
statements -- so much so that you're better off not using them if you
can help it.  With the proposal, the application can simply prepare
after opening the 'connection' and not have to worry about handling
the error or scope.

> Your words are "This feature should be immediately be incorporated
> by the JDBC driver" yet you have not raised that subject on pgsql-jdbc
> mailing list/github issue. That is not very fair.

That's fair. Although there was a very long-standing issue where jdbc
would try to prepare 'BEGIN' in such a way that it could not be
disabled -- that was fixed.

merlin




Re: [HACKERS] Relation extension scalability

2016-03-23 Thread Robert Haas
On Tue, Mar 22, 2016 at 1:12 PM, Petr Jelinek  wrote:
> I also think the code simplicity makes this worth it.

Agreed.  I went over this patch and did a cleanup pass today.   I
discovered that the LockWaiterCount() function was broken if you
tried to use it on a lock that you didn't hold or a lock that you
held in any mode other than exclusive, so I tried to fix that.  I
rewrote a lot of the comments and tightened some other things up.  The
result is attached.

I'm baffled by the same code Amit asked about upthread, even though
there's now a comment:

+			/*
+			 * Here we are calling RecordAndGetPageWithFreeSpace
+			 * instead of GetPageWithFreeSpace, because other backends
+			 * that have got the lock might have added extra blocks in
+			 * the FSM, and it's possible that the FSM tree might not
+			 * have been updated so far, so we will not get the page by
+			 * searching from the top using GetPageWithFreeSpace; we
+			 * need to search the leaf page directly using our last
+			 * valid target block.
+			 *
+			 * XXX. I don't understand what is happening here. -RMH
+			 */

I've read this over several times and looked at
RecordAndGetPageWithFreeSpace() and I'm still confused.  First of all,
if the lock was acquired by some other backend which did
RelationAddExtraBlocks(), it *will* have updated the FSM - that's the
whole point.  Second, if the other backend extended the relation in
some other manner and did not extend the FSM, how does calling
RecordAndGetPageWithFreeSpace help?  As far as I can see,
GetPageWithFreeSpace and RecordAndGetPageWithFreeSpace are both just
searching the FSM, so if one is stymied the other will be too.  What
am I missing?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
diff --git a/src/backend/access/heap/hio.c b/src/backend/access/heap/hio.c
index 8140418..3ab911f 100644
--- a/src/backend/access/heap/hio.c
+++ b/src/backend/access/heap/hio.c
@@ -169,6 +169,55 @@ GetVisibilityMapPins(Relation relation, Buffer buffer1, Buffer buffer2,
 }
 
 /*
+ * Extend a relation by multiple blocks to avoid future contention on the
+ * relation extension lock.  Our goal is to pre-extend the relation by an
+ * amount which ramps up as the degree of contention ramps up, but limiting
+ * the result to some sane overall value.
+ */
+static void
+RelationAddExtraBlocks(Relation relation, BulkInsertState bistate)
+{
+	Page		page;
+	Size		freespace;
+	BlockNumber blockNum;
+	int			extraBlocks = 0;
+	int			lockWaiters = 0;
+	Buffer		buffer;
+
+	/*
+	 * We use the length of the lock wait queue to judge how much to extend.
+	 * It might seem like multiplying the number of lock waiters by as much
+	 * as 20 is too aggressive, but benchmarking revealed that smaller numbers
+	 * were insufficient.  512 is just an arbitrary cap to prevent pathological
+	 * results (and excessive wasted disk space).
+	 */
+	lockWaiters = RelationExtensionLockWaiterCount(relation);
+	extraBlocks = Min(512, lockWaiters * 20);
+
+	while (extraBlocks-- >= 0)
+	{
+		/* Ouch - an unnecessary lseek() each time through the loop! */
+		buffer = ReadBufferBI(relation, P_NEW, bistate);
+
+		/* Extend by one page. */
+		LockBuffer(buffer, BUFFER_LOCK_EXCLUSIVE);
+		page = BufferGetPage(buffer);
+		PageInit(page, BufferGetPageSize(buffer), 0);
+		freespace = PageGetHeapFreeSpace(page);
+		MarkBufferDirty(buffer);
+		blockNum = BufferGetBlockNumber(buffer);
+		UnlockReleaseBuffer(buffer);
+
+		/*
+		 * Put the page in the freespace map so other backends can find it.
+		 * This is what will keep those other backends from also queueing up
+		 * on the relation extension lock.
+		 */
+		RecordPageWithFreeSpace(relation, blockNum, freespace);
+	}
+}
+
+/*
  * RelationGetBufferForTuple
  *
  *	Returns pinned and exclusive-locked buffer of a page in given relation
@@ -233,10 +282,11 @@ RelationGetBufferForTuple(Relation relation, Size len,
 	bool		use_fsm = !(options & HEAP_INSERT_SKIP_FSM);
 	Buffer		buffer = InvalidBuffer;
 	Page		page;
-	Size		pageFreeSpace,
-saveFreeSpace;
+	Size		pageFreeSpace = 0,
+saveFreeSpace = 0;
 	BlockNumber targetBlock,
-otherBlock;
+otherBlock,
+lastValidBlock = InvalidBlockNumber;
 	bool		needLock;
 
 	len = MAXALIGN(len);		/* be conservative */
@@ -308,6 +358,7 @@ RelationGetBufferForTuple(Relation relation, Size len,
 		}
 	}
 
+loop:
 	while (targetBlock != InvalidBlockNumber)
 	{
 		/*
@@ -388,6 +439,8 @@ RelationGetBufferForTuple(Relation relation, Size len,
  otherBlock, targetBlock, vmbuffer_other,
  vmbuffer);
 
+		lastValidBlock = targetBlock;
+
 		/*
		 * Now we can check to see if there's enough free space here. If so,
		 * we're done.
		 */

Re: [HACKERS] PostgreSQL 9.6 behavior change with set returning (funct).*

2016-03-23 Thread Merlin Moncure
On Wed, Mar 23, 2016 at 12:37 PM, Tom Lane  wrote:
> which is both SQL-standard semantics and much more efficient than
> SRF-in-tlist.  We've more or less deprecated SRF-in-tlist since we
> introduced LATERAL in 9.3.  How much are we willing to do to stay
> bug-compatible with old behaviors here?

I think we should, and the fact that this was caught so early in the
release cycle underscores that.  One of the problems is that there are
reasonable cases (note: not impacted by this bug) of this usage that
are still commonplace, for example:

ysanalysis=# select unnest(current_schemas(true));
   unnest   
------------
 pg_catalog
 public

I'm something of a backwards-compatibility zealot, but I've become one
for very good reasons.  Personally, I'd rather we define precisely
the usages that are deprecated (I guess SRF-in-tlist in the presence of
FROM) and force them to error out with an appropriate HINT rather than
give a different answer than they used to.  The problem here is that
LATERAL is still fairly new and there is a huge body of code out there
leveraging the 'bad' way, as it was for years and years the only way
to do a number of useful things.

merlin




Re: [HACKERS] Relation extension scalability

2016-03-23 Thread Petr Jelinek

On 23/03/16 19:39, Robert Haas wrote:

On Tue, Mar 22, 2016 at 1:12 PM, Petr Jelinek  wrote:

I also think the code simplicity makes this worth it.


Agreed.  I went over this patch and did a cleanup pass today.   I
discovered that the LockWaiterCount() function was broken if you
tried to use it on a lock that you didn't hold or a lock that you
held in any mode other than exclusive, so I tried to fix that.  I
rewrote a lot of the comments and tightened some other things up.  The
result is attached.

I'm baffled by the same code Amit asked about upthread, even though
there's now a comment:

+			/*
+			 * Here we are calling RecordAndGetPageWithFreeSpace
+			 * instead of GetPageWithFreeSpace, because other backends
+			 * that have got the lock might have added extra blocks in
+			 * the FSM, and it's possible that the FSM tree might not
+			 * have been updated so far, so we will not get the page by
+			 * searching from the top using GetPageWithFreeSpace; we
+			 * need to search the leaf page directly using our last
+			 * valid target block.
+			 *
+			 * XXX. I don't understand what is happening here. -RMH
+			 */

I've read this over several times and looked at
RecordAndGetPageWithFreeSpace() and I'm still confused.  First of all,
if the lock was acquired by some other backend which did
RelationAddExtraBlocks(), it *will* have updated the FSM - that's the
whole point.


That's a good point; maybe this coding is a bit too defensive.


Second, if the other backend extended the relation in
some other manner and did not extend the FSM, how does calling
RecordAndGetPageWithFreeSpace help?  As far as I can see,
GetPageWithFreeSpace and RecordAndGetPageWithFreeSpace are both just
searching the FSM, so if one is stymied the other will be too.  What
am I missing?



The RecordAndGetPageWithFreeSpace will extend FSM as it calls 
fsm_set_and_search which in turn calls fsm_readbuf with extend = true.
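
For reference, the relevant path in freespace.c looks roughly like this
(paraphrased, not a verbatim quote):

	/* in fsm_set_and_search() */
	buf = fsm_readbuf(rel, addr, true);		/* extend = true */
	LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);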


--
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Relation extension scalability

2016-03-23 Thread Robert Haas
On Wed, Mar 23, 2016 at 2:52 PM, Petr Jelinek  wrote:
>> Second, if the other backend extended the relation in
>> some other manner and did not extend the FSM, how does calling
>> RecordAndGetPageWithFreeSpace help?  As far as I can see,
>> GetPageWithFreeSpace and RecordAndGetPageWithFreeSpace are both just
>> searching the FSM, so if one is stymied the other will be too.  What
>> am I missing?
>>
>
> The RecordAndGetPageWithFreeSpace will extend FSM as it calls
> fsm_set_and_search which in turn calls fsm_readbuf with extend = true.

So how does that help?  If I'm reading this right, the new block will
be all zeroes which means no space available on any of those pages.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] PostgreSQL 9.6 behavior change with set returning (funct).*

2016-03-23 Thread Tom Lane
"David G. Johnston"  writes:
> On Wednesday, March 23, 2016, Tom Lane  wrote:
>> ...  We've more or less deprecated SRF-in-tlist since we
>> introduced LATERAL in 9.3.  How much are we willing to do to stay
>> bug-compatible with old behaviors here?

> My gut reaction is that this is an unnecessary regression for the sake of a
> performance optimization that is likely drowned out in the usage presented
> anyway.
> The pivot for me would be how hard would it be to maintain the old behavior
> in this "more or less deprecated" scenario.  I have no way to judge that.

Well, it'd make make_sort_input_target() more complicated and a bit
slower, mainly because it would have to check SRF-ness of sorting columns
that it currently has no need to inspect at all.  My concern here is not
really with that, it's with trying to draw a boundary around what behavior
we're going to promise bug-compatibility with.  To illustrate, you really
can get the multiple-expansions behavior with existing releases, eg in
9.4

regression=# SELECT (dumpset(f.test, 'hello world' || f.test)).*, f.test
FROM generate_series(1,4) As f(test)
GROUP BY junk2, f.test;
 id |     junk1     | junk2 | test 
----+---------------+-------+------
  1 | 3hello world3 | 31    |    3
  2 | 3hello world3 | 31    |    3
  1 | 4hello world4 | 42    |    4
  2 | 4hello world4 | 42    |    4
  1 | 4hello world4 | 41    |    4
  2 | 4hello world4 | 41    |    4
  1 | 1hello world1 | 11    |    1
  2 | 1hello world1 | 11    |    1
  1 | 3hello world3 | 32    |    3
  2 | 3hello world3 | 32    |    3
  1 | 2hello world2 | 21    |    2
  2 | 2hello world2 | 21    |    2
  1 | 2hello world2 | 22    |    2
  2 | 2hello world2 | 22    |    2
  1 | 1hello world1 | 12    |    1
  2 | 1hello world1 | 12    |    1
(16 rows)

which is kind of fun to wrap your head around, but after awhile you
realize that dumpset(...).junk2 is being evaluated before the GROUP BY
and dumpset.(...).id after it, which is how come the grouping behavior
is so obviously violated.

AFAICS, the only way to ensure non-crazy behavior for such examples would
be to force all SRFs in the tlist to be evaluated at the same plan step.
Which we've never done in the past, and if we were to start doing so,
it would cause a behavioral change for examples like this one.

Anyway, my concern is just that we're deciding to stay bug-compatible
with some behaviors that were not very well thought out to start with,
and we don't even have a clear understanding of where the limits of
that compatibility will be.

regards, tom lane




Re: [HACKERS] NOT EXIST for PREPARE

2016-03-23 Thread Vladimir Sitnikov
Merlin>the proposal would allow use of server-side prepared statements with JDBC.

It would not. If we discuss end-to-end scenarios in detail, we would end
up with "send the full query on each execution" -> "lex/gram on each
execution" kinds of overhead.

That is hardly a proper way of using prepared statements.
I'm not eager to merge half a solution into pgjdbc.
Just in case: I'm one of those who maintain pgjdbc.

Merlin>https://pgbouncer.github.io/faq.html#how-to-use-prepared-statements-with-session-pooling
pgbouncer's docs> the reset query must clean old prepared statements

I'm afraid, that is just "you must clean prepared statements".
Where are the issues/suggestions from pgbouncer team?
I'm just a year into pgjdbc, I've attended a couple of the largest
PostgreSQL conferences
(like 700+ attendees), however I know of exactly 0 suggestions on
improving the matter.
Everybody just keeps saying "discard all".

Merlin>With the proposal, the application can simply prepare after
Merlin>opening the 'connection' and not have to worry about handling the
Merlin>error or scope.

Can you name at least a couple of applications that indeed
"prepare after opening the connection"?
Note: it is not something the JDBC driver can do by itself; that
"prepare after opening" requires cooperation from the application.

I'm afraid I know of exactly 0 such applications, at least in the Java
world. Applications typically use framework-generated queries, so it
would be hard to impossible to "simply prepare after opening".
The set of SQL queries used is likely to be infinite even for a
single application.
That is sad, but true.

That is exactly the reason why I've implemented a _transparent_ statement
cache for pgjdbc. As the application uses generated queries, pgjdbc
detects query reuse and enables server-prepared queries behind the scenes.


If no one objects, I would just go ahead and file a
"report ParameterStatus(pgbouncer.backend_id=...)" issue for pgbouncer.
I guess we can agree on the name and semantics of the new status message
so that it would not accidentally collide with one from a newer PG version.

Merlin>Although there was a very long-standing issue where jdbc
Merlin>would try to prepare 'BEGIN' in such a way that it could not be
Merlin>disabled -- that was fixed.

What bothers me is that the current pgjdbc CI has exactly 0 pgbouncer
tests. That means "BEGIN is fixed" can easily break and no one would
notice. It is tracked under https://github.com/pgjdbc/pgjdbc/issues/509,
so if there's interest in pgbouncer vs. pgjdbc, #509 might be a good start.

Vladimir




Re: [HACKERS] Relation extension scalability

2016-03-23 Thread Petr Jelinek

On 23/03/16 20:01, Robert Haas wrote:

On Wed, Mar 23, 2016 at 2:52 PM, Petr Jelinek  wrote:

Second, if the other backend extended the relation in
some other manner and did not extend the FSM, how does calling
RecordAndGetPageWithFreeSpace help?  As far as I can see,
GetPageWithFreeSpace and RecordAndGetPageWithFreeSpace are both just
searching the FSM, so if one is stymied the other will be too.  What
am I missing?



The RecordAndGetPageWithFreeSpace will extend FSM as it calls
fsm_set_and_search which in turn calls fsm_readbuf with extend = true.


So how does that help?  If I'm reading this right, the new block will
be all zeroes which means no space available on any of those pages.



I am a bit confused as to what exactly you are saying, but what will
happen is that we get back to the while loop and try again, so eventually
we should either find a block with enough free space or add a new one
(not sure if this would actually ever happen in practice in a heavily
concurrent workload where the FSM was not correctly extended during
relation extension, though; we might just loop here forever).


--
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Relation extension scalability

2016-03-23 Thread Petr Jelinek

On 23/03/16 20:19, Petr Jelinek wrote:

On 23/03/16 20:01, Robert Haas wrote:

On Wed, Mar 23, 2016 at 2:52 PM, Petr Jelinek 
wrote:

Second, if the other backend extended the relation in
some other manner and did not extend the FSM, how does calling
RecordAndGetPageWithFreeSpace help?  As far as I can see,
GetPageWithFreeSpace and RecordAndGetPageWithFreeSpace are both just
searching the FSM, so if one is stymied the other will be too.  What
am I missing?



The RecordAndGetPageWithFreeSpace will extend FSM as it calls
fsm_set_and_search which in turn calls fsm_readbuf with extend = true.


So how does that help?  If I'm reading this right, the new block will
be all zeroes which means no space available on any of those pages.



I am a bit confused as to what exactly you are saying, but what will
happen is that we get back to the while loop and try again, so eventually
we should either find a block with enough free space or add a new one
(not sure if this would actually ever happen in practice in a heavily
concurrent workload where the FSM was not correctly extended during
relation extension, though; we might just loop here forever).



Btw, thinking about it some more, ISTM that not finding the block and
just doing the extension if the FSM wasn't extended correctly previously
is probably cleaner behavior than what we do now. The reasoning for that
opinion is that if the FSM wasn't extended, we'll fix it by doing relation
extension since we know we do both in this code path, and also if we could
not find a page before we'll most likely not find one even on retry, and if
there was a page added at the end by extension that we might reuse partially
here then there is no harm in adding a new one anyway, as the whole point of
this patch is that it does bigger extensions than strictly necessary, so
insisting on page reuse for something that seems like only a theoretical
possibility that does not even exist in the current code does not seem right.


--
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Rationalizing code-sharing among src/bin/ directories

2016-03-23 Thread Tom Lane
Alvaro Herrera  writes:
> Tom Lane wrote:
>> Note: the reason for a new subdirectory, rather than putting this
>> stuff into src/common/, is that src/common/ is meant for code that's
>> shared between frontend and backend, which this stuff is not.  Also,
>> many of these files depend on libpq which seems like an inappropriate
>> dependency for src/common.

> Actually you could just list them in OBJS_FRONTEND in src/common.  That
> way they're not built for the server at all; no need for a new subdir.

Meh.  I think stuff in OBJS_FRONTEND has to be pretty weird special
cases, such as frontend emulations of server-environment code.  Which
is what fe_memutils is, so that's OK, but I kinda question whether
restricted_token.c belongs in src/common/ at all.

> The libpq dependency argument AFAICS only applies to the ones in
> src/bin/psql.  Perhaps we could have feutils with those only, and move
> the other files to src/common.

On looking closer, the only one that doesn't depend on libpq is
keywords.c, which seems like it belongs in the same place as kwlookup.c.
So yeah, let's move both of those to src/common.

Anybody want to bikeshed the directory name src/feutils?  Maybe fe-utils
would be more readable.  And where to put the corresponding header files?
src/include/fe-utils?

regards, tom lane




Re: [HACKERS] BRIN is missing in multicolumn indexes documentation

2016-03-23 Thread Petr Jediný
>>
>> Multicolumn BRIN is like GIN.  Every column is indexed separately.
>> The order of the columns doesn't matter.
>
> Right.  You can use one index to cover all columns; the position of the
> column in the index won't matter for a query that uses one column.  The
> only reason to have multiple BRIN indexes on a single table is to have
> a different pages_per_range.

Thanks for the information, I've attached the patch updated to v2.
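
For the archives, a quick illustration of what is described above (table
and column names are invented):

-- One BRIN index can cover several columns; their order doesn't matter:
CREATE INDEX meas_brin ON measurements USING brin (logdate, sensor_id);

-- A second BRIN index on the same table only makes sense to get a
-- different pages_per_range:
CREATE INDEX meas_brin_small ON measurements USING brin (logdate)
    WITH (pages_per_range = 32);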


PJ


brin-multicolumn-doc-v2.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Relation extension scalability

2016-03-23 Thread Robert Haas
On Wed, Mar 23, 2016 at 3:33 PM, Petr Jelinek  wrote:
> Btw thinking about it some more, ISTM that not finding the block and just
> doing the extension if the FSM wasn't extended correctly previously is
> probably cleaner behavior than what we do now. The reasoning for that
> opinion is that if the FSM wasn't extended, we'll fix it by doing relation
> extension since we know we do both in this code path, and also if we could
> not find a page before we'll most likely not find one even on retry, and if
> there was a page added at the end by extension that we might reuse partially
> here then there is no harm in adding a new one anyway, as the whole point of
> this patch is that it does a bigger extension than strictly necessary, so
> insisting on page reuse for something that seems like only a theoretical
> possibility that does not even exist in current code does not seem right.

I'm not sure I completely follow this.  The fact that the last
sentence is 9 lines long may be related.  :-)

I think it's pretty clearly important to re-check the FSM after
acquiring the extension lock.  Otherwise, imagine that 25 backends
arrive at the exact same time.  The first one gets the lock and
extends the relation 500 pages; the next one, 480, and so on.  In
total, they extend the relation by 6500 pages, which is a bit rich.
Rechecking the FSM after acquiring the lock prevents that from
happening, and that's a very good thing.  We'll do one 500-page
extension and that's it.
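
(Arithmetically: 500 + 480 + ... + 20 = 20 * (1 + 2 + ... + 25) =
20 * 325 = 6500 pages across the 25 backends, assuming each one asks for
20 pages fewer than the previous.)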

However, I don't think using RecordAndGetPageWithFreeSpace rather than
GetPageWithFreeSpace is appropriate.  We've already recorded free
space on that page, and recording it again is a bad idea.  It's quite
possible that by the time we get the lock our old value is totally
inaccurate.  If there's some advantage to searching in the more
targeted way that RecordAndGetPageWithFreeSpace does over
GetPageWithFreeSpace then we need a new API into the freespace stuff
that does the more targeted search without updating anything.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Relation extension scalability

2016-03-23 Thread Petr Jelinek

On 23/03/16 20:43, Robert Haas wrote:

On Wed, Mar 23, 2016 at 3:33 PM, Petr Jelinek  wrote:

Btw thinking about it some more, ISTM that not finding the block and just
doing the extension if the FSM wasn't extended correctly previously is
probably cleaner behavior than what we do now. The reasoning for that
opinion is that if the FSM wasn't extended, we'll fix it by doing relation
extension since we know we do both in this code path, and also if we could
not find a page before we'll most likely not find one even on retry, and if
there was a page added at the end by extension that we might reuse partially
here then there is no harm in adding a new one anyway, as the whole point of
this patch is that it does a bigger extension than strictly necessary, so
insisting on page reuse for something that seems like only a theoretical
possibility that does not even exist in current code does not seem right.


I'm not sure I completely follow this.  The fact that the last
sentence is 9 lines long may be related.  :-)



I tend to do that sometimes :)


I think it's pretty clearly important to re-check the FSM after
acquiring the extension lock.  Otherwise, imagine that 25 backends
arrive at the exact same time.  The first one gets the lock and
extends the relation 500 pages; the next one, 480, and so on.  In
total, they extend the relation by 6500 pages, which is a bit rich.
Rechecking the FSM after acquiring the lock prevents that from
happening, and that's a very good thing.  We'll do one 500-page
extension and that's it.



Right, but that would only happen if all the backends did it using
different code which does not do the FSM extension, because the current
code does do FSM extension and the point of using
RecordAndGetPageWithFreeSpace seems to be "just in case" somebody is
doing extension differently (at least I don't see another reason). So
basically I am not saying we shouldn't do the search, but that I agree
GetPageWithFreeSpace should be enough, as the worst that can happen is
that we overextend the relation in case some theoretical code from
somewhere else also did extension of the relation without extending the
FSM (afaics).


But maybe Dilip had some other reason for using the 
RecordAndGetPageWithFreeSpace that is not documented.


--
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Rationalizing code-sharing among src/bin/ directories

2016-03-23 Thread Alvaro Herrera
Tom Lane wrote:
> Alvaro Herrera  writes:
> > Tom Lane wrote:

> > Actually you could just list them in OBJS_FRONTEND in src/common.  That
> > way they're not built for the server at all; no need for a new subdir.
> 
> Meh.  I think stuff in OBJS_FRONTEND has to be pretty weird special
> cases, such as frontend emulations of server-environment code.  Which
> is what fe_memutils is, so that's OK, but I kinda question whether
> restricted_token.c belongs in src/common/ at all.

OK, that makes sense.  Feel free to move restricted_token.c if you feel
like it.

> > The libpq dependency argument AFAICS only applies to the ones in
> > src/bin/psql.  Perhaps we could have feutils with those only, and move
> > the other files to src/common.
> 
> On looking closer, the only one that doesn't depend on libpq is
> keywords.c, which seems like it belongs in the same place as kwlookup.c.
> So yeah, let's move both of those to src/common.

I guess those dependencies are somewhat hidden.  No objections to that
move.

> Anybody want to bikeshed the directory name src/feutils?  Maybe fe-utils
> would be more readable.

Yes, +1 for either a - or _ in there.

> And where to put the corresponding header files?
> src/include/fe-utils?

Sounds fair but would that be installed in PREFIX/include/server?
That'd be a bit odd, but I'm not -1 on that.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] NOT EXIST for PREPARE

2016-03-23 Thread Merlin Moncure
On Wed, Mar 23, 2016 at 2:17 PM, Vladimir Sitnikov
 wrote:
> Merlin>proposed would allow use of server side prepared statements with JDBC.
>
> It would not. If we discuss end-to-end scenarios in detail, we would end up 
> with
> "send full query on each execution" -> lex/gram on each execution kind
> of overheads.

I think we're talking over each other here.  I'm not suggesting the
jdbc driver needs to be adjusted.  All I'm saying is that the use of
server side prepared statements is extremely problematic in
conjunction with pgbouncer (or any technology where the application db
connection and the server session are not 1:1) and trivial with the
proposed patch.
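
As a sketch, with hypothetical syntax (whatever the patch settles on)
and invented statement/table names, the pooled application could then
unconditionally send:

PREPARE IF NOT EXISTS fetch_account (int) AS
    SELECT * FROM accounts WHERE id = $1;
EXECUTE fetch_account(42);

and the PREPARE degrades to a no-op whenever the server session it lands
on already has the statement.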

Any discussion regarding jdbc is off topic relative to that.

merlin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] NOT EXIST for PREPARE

2016-03-23 Thread Stephen Frost
Merlin,

* Merlin Moncure (mmonc...@gmail.com) wrote:
> No one is arguing that you should send it every time (at
> least -- I hope not).

I'm not sure I follow how you can avoid that though?

pgbouncer in transaction pooling mode may let a particular connection
die off and then, when you issue a new request, create a new one -- which
won't have any prepared queries in it, even though you never lost your
connection to pgbouncer.

That's why I was saying you'd have to send it at the start of every
transaction, which does add to network load and requires parsing, etc.
Would be nice to avoid that, if possible, but I'm not quite sure how.
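
A minimal sketch of the failure mode, assuming pgbouncer's transaction
pooling hands the second transaction to a fresh backend:

BEGIN;
PREPARE q AS SELECT 1;
EXECUTE q;   -- fine, same server session
COMMIT;

BEGIN;       -- may now run on a different backend
EXECUTE q;   -- ERROR:  prepared statement "q" does not exist
COMMIT;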

One thought might be to have the server somehow have a pre-canned set of
queries already set up and ready for you to use when you connect,
without any need to explicitly prepare them, etc.

Just a thought.  I do still like the general idea of INE support for
PREPARE, but perhaps there's a better option.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] VS 2015 support in src/tools/msvc

2016-03-23 Thread Petr Jelinek

On 23/03/16 13:05, Michael Paquier wrote:


OK, the please send an updated set of patches for what remains.



Here you go:
- 0001 fixes the global declarations of TIMEZONE_GLOBAL and
TZNAME_GLOBAL to be WIN32-compliant. I got bitten by that in the ECPG
compilation.
- 0002, support of VS2015 in the scripts of src/tools/msvc
- 0003, this is necessary in order to run the regression tests,
details are here:

http://www.postgresql.org/message-id/cab7npqtdixxs8cdl8moivh6qfq-1qv9mkn0ayjzybvzv6wc...@mail.gmail.com


So that's basically what Andres did? Interesting that we now actually really
need it. I was in favor of doing those changes in any case.


Yes, that's what Andres did. I am just attaching it here to allow
Andrew to test this patch set appropriately.


- 0004 is the fix for locales. This is actually Petr's work upthread,
I have updated comments in the code though and did a bit more
polishing. This still looks like the cleanest solution we have. Other
solutions are mentioned upthread: redeclaration of struct _locale_t in
backend code is one.


Thanks for polishing this.

I think this is ready to be committed, but the 0003 might be somewhat
controversial based on the original thread.


I thought that the end consensus regarding 0003 was to apply it, but
we could as well wait for the final verdict (committer push) on the
other thread. And well, if this is not applied, some of the gin tests
will complain about "right sibling of GIN page is of different type" a
couple of times.



Hmm, I see you are right, I missed the last couple of emails. OK, I'll mark
it ready for committer - it does work fine on my VS2015 machine and I am
happy with the code too. Well, as happy as I can be given the locale mess.


--
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PostgreSQL 9.6 behavior change with set returning (funct).*

2016-03-23 Thread Stephen Frost
* Merlin Moncure (mmonc...@gmail.com) wrote:
> On Wed, Mar 23, 2016 at 12:37 PM, Tom Lane  wrote:
> > which is both SQL-standard semantics and much more efficient than
> > SRF-in-tlist.  We've more or less deprecated SRF-in-tlist since we
> > introduced LATERAL in 9.3.  How much are we willing to do to stay
> > bug-compatible with old behaviors here?
> 
> I think we should, and the fact this was caught so early in the
> release cycle underscores that.  One of the problems is that there are
> reasonable cases (note, not impacted by this bug) of this usage that
> are still commonplace, for example:
> 
> ysanalysis=# select unnest(current_schemas(true));
>unnest
> ------------
>  pg_catalog
>  public
> 
> I'm something of a backwards compatibility zealot, but I've become one
> for very good reasons.  Personally, I'd rather we'd define precisely
> the usages that are deprecated (I guess SRF-tlist in the presence of
> FROM) and force them to error out with an appropriate HINT rather than
> give a different answer than they used to.  The problem here is that
> LATERAL is still fairly new and there is a huge body of code out there
> leveraging the 'bad' way, as it was for years and years the only way
> to do a number of useful things.

I have to side with what I believe is Tom's position on this one.  I do
like the notion of throwing an error in cases where someone sent us
something that we're pretty sure is wrong, but I don't agree that we
should continue to carry on bug-compatibility with things that already
have one foot in the grave and really just need to be shoved all the
way in.

This isn't the only break in backwards compatibility we've had over the
years and is pretty far from the largest (string escaping, anyone?  or
removing implicit casts?) and I'd argue we're better off for it.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] NOT EXIST for PREPARE

2016-03-23 Thread Vladimir Sitnikov
Merlin> All I'm saying is that the use of
Merlin> server side prepared statements is extremely problematic in
Merlin> conjunction with pgbouncer

I've filed https://github.com/pgbouncer/pgbouncer/issues/126 to get
pgbouncer improved in regard to prepared statements.

Vladimir


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] NOT EXIST for PREPARE

2016-03-23 Thread Yury Zhuravlev

Michael Meskes wrote:

More than you think.

I doubt you want to propose removing features that make developing new
features harder, or do you? :)


I want to understand the situation. We may want to make the ecpg build
optional. Personally, I want to.


--
Yury Zhuravlev
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Speed up Clog Access by increasing CLOG buffers

2016-03-23 Thread Andres Freund
On 2016-03-23 12:33:22 +0530, Amit Kapila wrote:
> On Wed, Mar 23, 2016 at 12:26 PM, Amit Kapila 
> wrote:
> >
> > On Tue, Mar 22, 2016 at 4:22 PM, Andres Freund  wrote:
> > >
> > >
> > > I think it's worthwhile to create a benchmark that does something like
> > > BEGIN;SELECT ... FOR UPDATE; SELECT pg_sleep(random_time);
> > > INSERT;COMMIT; you'd find that if random is a bit larger (say 20-200ms,
> > > completely realistic values for network RTT + application computation),
> > > the success rate of group updates shrinks noticeably.
> > >
> >
> > Will do some tests based on above test and share results.
> >
> 
> Forgot to mention that the effect of patch is better visible with unlogged
> tables, so will do the test with those and request you to use same if you
> yourself is also planning to perform some tests.

I'm playing around with SELECT txid_current(); right now - that should
be about the most specific load for setting clog bits.
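
For anyone wanting to replicate that, a minimal pgbench custom script in
the spirit of the benchmark sketched upthread (table names are
placeholders; the sleep is computed server-side to emulate 20-200ms of
RTT plus application time, and per Amit the effect shows best with
unlogged tables):

-- clog-bench.sql, run as: pgbench -n -f clog-bench.sql -c 64 -j 64 testdb
BEGIN;
SELECT balance FROM accounts
    WHERE id = (SELECT 1 + (random() * 99999)::int) FOR UPDATE;
SELECT pg_sleep(0.02 + random() * 0.18);
INSERT INTO accounts_log VALUES (1, now());
COMMIT;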

Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Rationalizing code-sharing among src/bin/ directories

2016-03-23 Thread Tom Lane
Alvaro Herrera  writes:
> Tom Lane wrote:
>> And where to put the corresponding header files?
>> src/include/fe-utils?

> Sounds fair but would that be installed in PREFIX/include/server?
> That'd be a bit odd, but I'm not -1 on that.

The only other plan I can think of is to put the .h files describing
src/fe-utils files into src/fe-utils itself.  Which would sort of match
the existing precedent of src/bin/ files looking into sibling directories
for relevant .h files, but it seems a bit odd too.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PostgreSQL 9.6 behavior change with set returning (funct).*

2016-03-23 Thread Tom Lane
Stephen Frost  writes:
> * Merlin Moncure (mmonc...@gmail.com) wrote:
>> I'm something of a backwards compatibility zealot, but I've become one
>> for very good reasons.  Personally, I'd rather we'd define precisely
>> the usages that are deprecated (I guess SRF-tlist in the presence of
>> FROM) and force them to error out with an appropriate HINT rather than
>> give a different answer than they used to.  The problem here is that
>> LATERAL is still fairly new and there is a huge body of code out there
>> leveraging the 'bad' way, as it was for years and years the only way
>> to do a number of useful things.

> I have to side with what I believe is Tom's position on this one.  I do
> like the notion of throwing an error in cases where someone sent us
> something that we're pretty sure is wrong, but I don't agree that we
> should continue to carry on bug-compatibility with things that already
> have one foot in the grave and really just need to be shoved all the
> way in.

FWIW, I think the case that we're looking at only comes up if you have
more than one SRF expression in the tlist (Regina's example has that
after *-expansion, though it might not've looked so to start with),
and some of them are sort/group columns while others are not.

So the question at hand is whether that falls within the scope of "huge
body of code leveraging the old way" or whether it's too much of a corner
case to want to tie ourselves down to.

I wrote a patch that fixes this for the sort-column case, thus getting us
back to the historical behavior; see attached.  If we really wanted to be
consistent, we'd have to look at pushing all SRFs clear back to the
scanjoin_target list if any SRFs appear in grouping columns.  That would
be significantly more code and it would deviate from the historical
behavior, as I illustrated upthread, so I'm not really inclined to do it.
But the historical behavior in this area is pretty unprincipled.

It's also worth noting that we've had multiple complaints about the
ExecTargetList least-common-multiple behavior over the years; it seems
sane enough in examples like these where all the SRFs have the same
period, but it gets way less so as soon as they don't.  I'd love to
toss the entire SRF-in-tlist feature overboard one of these years,
both because of semantic issues like these and because a fair amount
of code and overhead could be ripped out if it were disallowed.
But I don't know how we get to that --- as Merlin says, there's a
pretty large amount of application code depending on the feature.
In the meantime I suppose there's a case to be made for preserving
bug compatibility as much as possible.
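
To make the least-common-multiple point concrete: two tlist SRFs with
different periods run in lockstep, each restarting as it runs dry, until
they are exhausted simultaneously:

SELECT generate_series(1,2) AS a, generate_series(1,3) AS b;
 a | b
---+---
 1 | 1
 2 | 2
 1 | 3
 2 | 1
 1 | 2
 2 | 3
(6 rows)

Six rows is LCM(2,3); periods 2 and 4 would produce 4 rows, not 8.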

So anyway the question is whether to commit the attached or not.

regards, tom lane

diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 5229c84..db347b8 100644
*** a/src/backend/optimizer/plan/planner.c
--- b/src/backend/optimizer/plan/planner.c
*** make_pathkeys_for_window(PlannerInfo *ro
*** 4615,4625 
   *
   * Our current policy is to postpone volatile expressions till after the sort
   * unconditionally (assuming that that's possible, ie they are in plain tlist
!  * columns and not ORDER BY/GROUP BY/DISTINCT columns).  We also postpone
!  * set-returning expressions unconditionally (if possible), because running
!  * them beforehand would bloat the sort dataset, and because it might cause
!  * unexpected output order if the sort isn't stable.  Expensive expressions
!  * are postponed if there is a LIMIT, or if root->tuple_fraction shows that
   * partial evaluation of the query is possible (if neither is true, we expect
   * to have to evaluate the expressions for every row anyway), or if there are
   * any volatile or set-returning expressions (since once we've put in a
--- 4615,4630 
   *
   * Our current policy is to postpone volatile expressions till after the sort
   * unconditionally (assuming that that's possible, ie they are in plain tlist
!  * columns and not ORDER BY/GROUP BY/DISTINCT columns).  We also prefer to
!  * postpone set-returning expressions, because running them beforehand would
!  * bloat the sort dataset, and because it might cause unexpected output order
!  * if the sort isn't stable.  However there's a constraint on that: all SRFs
!  * in the tlist should be evaluated at the same plan step, so that they can
!  * run in sync in ExecTargetList.  So if any SRFs are in sort columns, we
!  * mustn't postpone any SRFs.  (Note that in principle that policy should
!  * probably get applied to the group/window input targetlists too, but we
!  * have not done that historically.)  Lastly, expensive expressions are
!  * postponed if there is a LIMIT, or if root->tuple_fraction shows that
   * partial evaluation of the query is possible (if neither is true, we expect
   * to have to evaluate the expressions for every row anyway), or if there are
   * any volatile or set-returning expressions (since o

Re: [HACKERS] PostgreSQL 9.6 behavior change with set returning (funct).*

2016-03-23 Thread David G. Johnston
On Wed, Mar 23, 2016 at 2:11 PM, Tom Lane  wrote:

>
> In the meantime I suppose there's a case to be made for preserving
> bug compatibility as much as possible.
>
> So anyway the question is whether to commit the attached or not.


+1 for commit - I'll trust Tom on the quality of the patch :)

David J.


Re: [HACKERS] PostgreSQL 9.6 behavior change with set returning (funct).*

2016-03-23 Thread Stephen Frost
* David G. Johnston (david.g.johns...@gmail.com) wrote:
> On Wed, Mar 23, 2016 at 2:11 PM, Tom Lane  wrote:
> > In the meantime I suppose there's a case to be made for preserving
> > bug compatibility as much as possible.
> >
> > So anyway the question is whether to commit the attached or not.
> 
> +1 for commit - I'll trust Tom on the quality of the patch :)

I'm not going to object to it.  All-in-all, I suppose '+0' from me.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] Patch to implement pg_current_logfile() function

2016-03-23 Thread Gilles Darold
On 22/03/2016 14:44, Michael Paquier wrote:
> On Sat, Mar 12, 2016 at 12:46 AM, Gilles Darold
>  wrote:
>> Here is the patch rewritten to use alternate file
>> $PGDATA/pg_log_filename to store the current log filename used by
>> syslogger. All examples used in the first mail of this thread work the
>> exact same way. If there's no other remarks, I will add the patch to the
>> next commit fest.
> Please be sure to register this patch to the next CF:
> https://commitfest.postgresql.org/10/
> Things are hot enough with 9.6, so this will only be considered in 9.7.

Thanks for the reminder, here is v3 of the patch after a deeper
review and testing. It is now registered in the next commit fest under
the System Administration topic.

Fixes in this patch are:

- The output file has been renamed to PGDATA/pg_log_file

- The log level of the warning emitted when the logging collector is not
active has been changed to NOTICE

postgres=# select pg_current_logfile();
NOTICE:  current log can not be reported, log collection is not active
 pg_current_logfile
--------------------

(1 row)

- The log level for file access errors in function
store_current_log_filename() in src/backend/postmaster/syslogger.c
has been set to WARNING; using the ERROR level forced the backend to
stop with a FATAL error.

- Added information about the PGDATA/pg_log_file file to the storage file
layout section of doc/src/sgml/storage.sgml


-- 
Gilles Darold
Consultant PostgreSQL
http://dalibo.com - http://dalibo.org

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index ae93e69..155e76b 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -15158,6 +15158,11 @@ SELECT * FROM pg_ls_dir('.') WITH ORDINALITY AS t(ls,n);
(0 if not called, directly or indirectly, from inside a trigger)
   
 
+  pg_current_logfile()
+   text
+   current log file used by the logging collector
+  
+
   
session_user
name
@@ -15365,6 +15370,16 @@ SET search_path TO schema , schema, ..
 pg_notification_queue_usage

 
+   
+pg_current_logfile
+   
+
+   
+pg_current_logfile returns the name of the current log
+file used by the logging collector, as text. Log collection
+must be active.
+   
+

 pg_listening_channels returns a set of names of
 asynchronous notification channels that the current session is listening
diff --git a/doc/src/sgml/storage.sgml b/doc/src/sgml/storage.sgml
index 9b2e09e..6284d54 100644
--- a/doc/src/sgml/storage.sgml
+++ b/doc/src/sgml/storage.sgml
@@ -170,6 +170,13 @@ last started with
   (this file is not present after server shutdown)
 
 
+
+ pg_log_file
+ A file recording the current log file used by the syslogger
+  when log collection is active
+
+
+
 
 
 
diff --git a/src/backend/postmaster/syslogger.c b/src/backend/postmaster/syslogger.c
index e7e488a..c3bf672 100644
--- a/src/backend/postmaster/syslogger.c
+++ b/src/backend/postmaster/syslogger.c
@@ -145,6 +145,7 @@ static char *logfile_getname(pg_time_t timestamp, const char *suffix);
 static void set_next_rotation_time(void);
 static void sigHupHandler(SIGNAL_ARGS);
 static void sigUsr1Handler(SIGNAL_ARGS);
+static void store_current_log_filename(char *filename);
 
 
 /*
@@ -571,6 +572,9 @@ SysLogger_Start(void)
 
 	syslogFile = logfile_open(filename, "a", false);
 
+	/* store information about current log filename used by log collection */
+	store_current_log_filename(filename);
+
 	pfree(filename);
 
 #ifdef EXEC_BACKEND
@@ -1209,6 +1213,9 @@ logfile_rotate(bool time_based_rotation, int size_rotation_for)
 		fclose(syslogFile);
 		syslogFile = fh;
 
+		/* store information about current log filename used by log collection */
+		store_current_log_filename(filename);
+
 		/* instead of pfree'ing filename, remember it for next time */
 		if (last_file_name != NULL)
 			pfree(last_file_name);
@@ -1253,6 +1260,9 @@ logfile_rotate(bool time_based_rotation, int size_rotation_for)
 		fclose(csvlogFile);
 		csvlogFile = fh;
 
+		/* store information about current log filename used by log collection */
+		store_current_log_filename(csvfilename);
+
 		/* instead of pfree'ing filename, remember it for next time */
 		if (last_csv_file_name != NULL)
 			pfree(last_csv_file_name);
@@ -1362,3 +1372,35 @@ sigUsr1Handler(SIGNAL_ARGS)
 
 	errno = save_errno;
 }
+
+
+/*
+ * Store the name of the file where current log messages are written when
+ * log collector is enabled. Useful to find the name of the current log file
+ * when a time-based rotation is defined. 
+ */
+static void
+store_current_log_filename(char *filename)
+{
+	FILE   *fh;
+	charlogpathfilename[MAXPGPATH];
+
+	snprintf(logpathfilename, sizeof(logpathfilename), "%s",
+		CURRENT_LOG_FILENAME);
+	if ((fh = fopen(logpathfilename, "w")) == NULL)
+	{
+	   ereport(WARNING,
+			   (errcode_for_file_access(),
+			   errmsg("could not open log file \"%s\": %m",
+	   logpathfilename)));
+	   return;
+	}
+	if (f

Re: [HACKERS] PostgreSQL 9.6 behavior change with set returning (funct).*

2016-03-23 Thread Tom Lane
I wrote:
> ... I'd love to
> toss the entire SRF-in-tlist feature overboard one of these years,
> both because of semantic issues like these and because a fair amount
> of code and overhead could be ripped out if it were disallowed.
> But I don't know how we get to that --- as Merlin says, there's a
> pretty large amount of application code depending on the feature.

BTW, after further thought I seem to recall some discussion about whether
we could mechanically transform SRF-in-tlist into a LATERAL query.
That is, we could make the rewriter or planner convert

SELECT srf1(...), srf2(...)
  FROM ...
  WHERE ...

into

SELECT u.s1, u.s2
  FROM ...,  LATERAL ROWS FROM(srf1(...), srf2(...)) AS u(s1,s2)
  WHERE ...

(The SRF invocations might be buried inside expressions, but we'd find
them and convert them anyway.  Also, we'd merge textually-identical SRF
invocations into a single ROWS FROM entry to avoid multiple evaluation,
at least for SRFs not marked volatile.)  Having done that, the executor's
support for SRFs in general expressions could be removed, a significant
savings.

One problem with this is that it only duplicates the current behavior
in cases where the SRFs all have the same period.  But you could argue
that the current behavior in cases where they don't is widely considered
a bug anyway.

A possibly larger problem is that it causes the SRFs to be evaluated
before sorting/ordering/limiting.  In view of the discussions that led
up to 9118d03a8, that could be a fatal objection :-(.  Maybe we could
get around it with a different transformation, into a two-level query?
The above sketch doesn't really consider what to do with GROUP BY,
ORDER BY, etc, but maybe we could push those down into a sub-select
that becomes the first FROM item.
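
Concretely, the two-level variant might look something like this
(purely a sketch, reusing the placeholder SRFs from above):

SELECT u.s1, u.s2
  FROM (SELECT x, y FROM ... WHERE ... ORDER BY ... LIMIT ...) ss,
       LATERAL ROWS FROM(srf1(ss.x), srf2(ss.y)) AS u(s1,s2);

so that the SRFs are only evaluated for the rows surviving the sort
and limit.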

Anyway, that's food for thought not something that could realistically
happen right now.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


  1   2   >