Re: [HACKERS] [POC] Faster processing at Gather node

2017-10-17 Thread Andres Freund
Hi,

On 2017-10-17 14:39:57 -0700, Andres Freund wrote:
> I've spent some time looking into this, and I'm not quite convinced this
> is the right approach.  Let me start by describing where I see the
> current performance problems around gather stemming from.

One further approach to several of these issues could also be to change
things a bit more radically:

Instead of the current shm_mq + tqueue.c, have a drastically simpler
queue, that just stores fixed width dsa_pointers. Dealing with that
queue will be quite a bit faster. In that queue one would store dsa.c
managed pointers to tuples.

One thing that makes that attractive is that it'd move a bunch of
copying from the leader process solely to the worker processes, because
the leader could just convert the dsa_pointer into a local pointer and
hand that up the execution tree.

We'd possibly need some halfway clever way to reuse dsa allocations, but
that doesn't seem impossible.
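To make the idea concrete, here is a hedged, self-contained sketch of such a fixed-width queue: a single-producer/single-consumer ring whose slots each hold one 64-bit handle (standing in for a dsa_pointer). The names, sizing, and memory-order choices are illustrative assumptions, not PostgreSQL code — unlike shm_mq, there is no variable-length framing, so each transfer is one store plus an index bump.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical fixed-width queue: each slot holds one 64-bit
 * dsa_pointer-like handle.  Sizes and names are illustrative. */
#define QUEUE_SLOTS 8           /* must be a power of two */

typedef struct
{
    _Atomic uint64_t head;      /* next slot to write (producer side) */
    _Atomic uint64_t tail;      /* next slot to read (consumer side) */
    uint64_t    slots[QUEUE_SLOTS];
} FixedPtrQueue;

static bool
queue_push(FixedPtrQueue *q, uint64_t ptr)
{
    uint64_t    head = atomic_load_explicit(&q->head, memory_order_relaxed);
    uint64_t    tail = atomic_load_explicit(&q->tail, memory_order_acquire);

    if (head - tail == QUEUE_SLOTS)
        return false;           /* queue full */
    q->slots[head % QUEUE_SLOTS] = ptr;
    atomic_store_explicit(&q->head, head + 1, memory_order_release);
    return true;
}

static bool
queue_pop(FixedPtrQueue *q, uint64_t *ptr)
{
    uint64_t    tail = atomic_load_explicit(&q->tail, memory_order_relaxed);
    uint64_t    head = atomic_load_explicit(&q->head, memory_order_acquire);

    if (head == tail)
        return false;           /* queue empty */
    *ptr = q->slots[tail % QUEUE_SLOTS];
    atomic_store_explicit(&q->tail, tail + 1, memory_order_release);
    return true;
}
```

A real version would store the ring in DSM and park on a latch when full/empty, but the per-tuple cost is visibly smaller than copying whole tuples under a mutex.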

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Partition-wise aggregation/grouping

2017-10-17 Thread Dilip Kumar
On Tue, Oct 17, 2017 at 10:44 PM, Jeevan Chalke
 wrote:
>

> I didn't get what you mean by regression here. Can you please explain?
>
> I see that PWA plan is selected over regular plan when enabled on the basis
> of costing.
> Regular planning needs a Result node, due to which costing increases, whereas
> PWA doesn't need that and thus wins.

Sorry for not explaining clearly.  I meant that with the normal plan
the execution time is 263.678 ms, whereas with PWA it's 339.929 ms.

I only set enable_partition_wise_agg=on and it switched to PWA, and
execution time increased by 30%.
I understand that this is the worst case for PWA, where
FinalizeAggregate is getting all the tuples.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com




[HACKERS] A handful of typos in allpaths.c

2017-10-17 Thread David Rowley
A small patch to fix these is attached.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


allpath_typos_fix.patch
Description: Binary data



Re: [HACKERS] Re: Is anything preventing us from allowing write to foreign tables from standby?

2017-10-17 Thread Michael Paquier
On Wed, Oct 18, 2017 at 9:14 AM, Craig Ringer  wrote:
> Superficially at least, it sounds like a good idea.

Indeed.

> We should only need a virtual xid when we're working with foreign
> tables since we don't do any local heap changes.
>
> How's it work with savepoints?

That's one thing to worry about.

At least to me, it feels like cheating to allow an INSERT query to
happen in a transaction which is actually read-only, because
XactReadOnly is set to true when the transaction starts. I am
wondering if we should extend BEGIN TRANSACTION with a sort of "WRITE
ONLY FOREIGN" mode, which allows read queries as well as write queries
on foreign tables, because we know that those will not generate WAL
locally. This way it would also be possible to block INSERT queries
happening in a transaction which should be intrinsically read-only.

+ if (rte->relkind == 'f')
+ continue;
Better to use RELKIND_FOREIGN_TABLE here.
-- 
Michael




Re: [HACKERS] [COMMITTERS] pgsql: Implement table partitioning.

2017-10-17 Thread Amit Langote
On 2017/10/18 1:52, Alvaro Herrera wrote:
> Alvaro Herrera wrote:
>> Robert Haas wrote:
>>> Implement table partitioning.
>>
>> Is it intentional that you can use ALTER TABLE OWNER TO on the parent
>> table, and that this does not recurse to modify the partitions' owners?
>> This doesn't seem to be mentioned in comments nor documentation, so it
>> seems an oversight to me.

Hmm, I would say that the new partitioning didn't modify the
behavior that existed for inheritance.

That said, I'm not sure if the lack of recursive application of ownership
change to descendant tables is unintentional.

> The alter table docs say that ONLY must be specified if one does not
> want to modify descendants, so I think this is a bug.

Just to clarify, if we do think of it as a bug, then it will apply to the
inheritance case as well, right?

Thanks,
Amit





[HACKERS] alter table doc fix

2017-10-17 Thread Amit Langote
Hi.

Noticed that an ALTER TABLE sub-command's name in the Description (where it's
OWNER) differs from that in the synopsis (where it's OWNER TO).  Attached is a
patch to make them match, in case the difference is unintentional.

Thanks,
Amit
diff --git a/doc/src/sgml/ref/alter_table.sgml 
b/doc/src/sgml/ref/alter_table.sgml
index 68393d70b4..b4b8dab911 100644
--- a/doc/src/sgml/ref/alter_table.sgml
+++ b/doc/src/sgml/ref/alter_table.sgml
@@ -713,7 +713,7 @@ ALTER TABLE [ IF EXISTS ] name

 

-OWNER
+OWNER TO
 
  
   This form changes the owner of the table, sequence, view, materialized 
view,



Re: [HACKERS] Re: Is anything preventing us from allowing write to foreign tables from standby?

2017-10-17 Thread Craig Ringer
On 18 October 2017 at 02:01, Alexander Korotkov
 wrote:
> On Wed, Sep 6, 2017 at 4:42 PM, Alexander Korotkov
>  wrote:
>>
>> We're currently blocking writing queries on standby even if they are
>> only modifying contents of foreign tables.  But do we have serious reasons
>> for that?
>> Keeping in mind FDW-sharding, making FDW-tables writable from standby
>> would be good to prevent a single-master bottleneck.
>> I wrote a simple patch enabling writing to foreign tables from standbys.
>> It works at first glance for me.
>
>
> No interest yet, but no objections too :-)
> I'm going to add this to next commitfest.

Superficially at least, it sounds like a good idea.

We should only need a virtual xid when we're working with foreign
tables since we don't do any local heap changes.

How's it work with savepoints?

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] [POC] Faster processing at Gather node

2017-10-17 Thread Andres Freund
Hi Everyone,

On 2017-05-19 17:25:38 +0530, Rafia Sabih wrote:
> While analysing the performance of TPC-H queries for the newly developed
> parallel-operators, viz, parallel index, bitmap heap scan, etc. we noticed
> that the time taken by gather node is significant. On investigation, as per
> the current method it copies each tuple to the shared queue and notifies
> the receiver. Since, this copying is done in shared queue, a lot of locking
> and latching overhead is there.
>
> So, in this POC patch I tried to copy all the tuples in a local queue thus
> avoiding all the locks and latches. Once, the local queue is filled as per
> it's capacity, tuples are transferred to the shared queue. Once, all the
> tuples are transferred the receiver is sent the notification about the same.
>
> With this patch I could see significant improvement in performance for
> simple queries,

I've spent some time looking into this, and I'm not quite convinced this
is the right approach.  Let me start by describing where I see the
current performance problems around gather stemming from.

The observations here are made using
select * from t where i < 3000 offset 2999 - 1;
with Rafia's data. That avoids slowdowns on the leader due to too many
rows printed out. Sometimes I'll also use
SELECT * FROM lineitem WHERE l_suppkey > '5012' OFFSET 10 LIMIT 1;
on tpch data to show the effects on wider tables.

The precise query doesn't really matter; the observations here are more
general, I hope.

1) nodeGather.c re-projects every row from workers. As far as I can tell
   that's now always exactly the same targetlist as it's coming from the
   worker. Projection capability was added in 8538a6307049590 (without
   checking whether it's needed afaict), but I think it was in turn often
   obsoleted by 992b5ba30dcafdc222341505b072a6b009b248a7.  My
   measurement shows that removing the projection yields quite massive
   speedups in queries like yours, which is not too surprising.

   I suspect this just needs a tlist_matches_tupdesc check + an if
   around ExecProject(). And a test: right now the tests pass (unless
   force_parallel_mode is used) even when the projection is commented
   out unconditionally.
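   As a hedged toy model of that check (tlist_matches_tupdesc and
   ExecProject are the real PostgreSQL names; the structures below are
   simplified stand-ins, not the actual executor types):

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Simplified descriptor: which source column feeds each output column. */
typedef struct
{
    int     natts;              /* number of attributes */
    int     attnums[8];         /* source column per output column */
} ToyDesc;

/* Stand-in for tlist_matches_tupdesc(): target list already matches
 * the tuple layout arriving from the worker? */
static bool
toy_tlist_matches_tupdesc(const ToyDesc *tlist, const ToyDesc *tupdesc)
{
    return tlist->natts == tupdesc->natts &&
        memcmp(tlist->attnums, tupdesc->attnums,
               tlist->natts * sizeof(int)) == 0;
}

/* Returns true iff a projection step would actually have been run. */
static bool
toy_gather_emit(const ToyDesc *tlist, const ToyDesc *worker_desc)
{
    if (toy_tlist_matches_tupdesc(tlist, worker_desc))
        return false;           /* hand the worker's tuple up unchanged */
    /* ... ExecProject() equivalent would run only here ... */
    return true;
}
```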

   before commenting out nodeGather projection:

   rafia time: 8283.583
   rafia profile:
+   30.62%  postgres  postgres [.] shm_mq_receive
+   18.49%  postgres  postgres [.] s_lock
+   10.08%  postgres  postgres [.] SetLatch
-    7.02%  postgres  postgres [.] slot_deform_tuple
   - slot_deform_tuple
  - 88.01% slot_getsomeattrs
   ExecInterpExpr
   ExecGather
   ExecLimit

   lineitem time: 8448.468
   lineitem profile:
+   24.63%  postgres  postgres [.] slot_deform_tuple
+   24.43%  postgres  postgres [.] shm_mq_receive
+   17.36%  postgres  postgres [.] ExecInterpExpr
+    7.41%  postgres  postgres [.] s_lock
+    5.73%  postgres  postgres [.] SetLatch

after:
   rafia time: 6660.224
   rafia profile:
+   36.77%  postgres  postgres  [.] shm_mq_receive
+   19.33%  postgres  postgres  [.] s_lock
+   13.14%  postgres  postgres  [.] SetLatch
+    9.22%  postgres  postgres  [.] AllocSetReset
+    4.27%  postgres  postgres  [.] ExecGather
+    2.79%  postgres  postgres  [.] AllocSetAlloc

   lineitem time: 4507.416
   lineitem profile:
+   34.81%  postgres  postgres[.] shm_mq_receive
+   15.45%  postgres  postgres[.] s_lock
+   13.38%  postgres  postgres[.] SetLatch
+    9.87%  postgres  postgres[.] AllocSetReset
+    5.82%  postgres  postgres[.] ExecGather

   as quite clearly visible, avoiding the projection yields some major
   speedups.

   The following analysis here has the projection removed.

2) The spinlocks on both the sending and receiving side are quite hot:

   rafia query leader:
+   36.16%  postgres  postgres[.] shm_mq_receive
+   19.49%  postgres  postgres[.] s_lock
+   13.24%  postgres  postgres[.] SetLatch

   The presence of s_lock shows us that we're clearly often contending
   on spinlocks, given that's the slow-path for SpinLockAcquire(). In
   shm_mq_receive the instruction profile shows:

   │   SpinLockAcquire(&mq->mq_mutex);
   │1 5ab:   mov$0xa9b580,%ecx
   │ mov$0x4a4,%edx
   │ mov$0xa9b538,%esi
   │ mov%r15,%rdi
   │   → callq  s_lock
   │   ↑ jmpq   2a1
   │   tas():
   │1 5c7:   mov$0x1,%eax
 32.83 │ lock   xchg %al,(%r15)
   │   shm_mq_inc_bytes_read():
   │   SpinLockAcquire(&mq->mq_mutex);
and
  0.01 │ pop%r15
  0.04 │   ← retq
   │ nop
   │   tas():
   │1 338:   mov$0x1,%eax
 17.59 │ lock   xchg %al,(%r15)
   │   
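The hot `lock xchg` in the profile above is the test-and-set fast path; `s_lock` is the contended fallback. A hedged, self-contained illustration of that split (simplified stand-ins; PostgreSQL's real s_lock additionally spins with backoff and eventually sleeps):

```c
#include <assert.h>
#include <stdatomic.h>

/* Minimal tas()/s_lock-style spinlock: the fast path is one atomic
 * exchange (lock xchg on x86); only on contention do we fall into the
 * slow path, here modelled as a plain spin loop. */
typedef struct
{
    atomic_flag lock;
} MySpinLock;

static void
my_spin_acquire(MySpinLock *s)
{
    /* tas() fast path; the retry loop stands in for s_lock() */
    while (atomic_flag_test_and_set_explicit(&s->lock, memory_order_acquire))
        ;                       /* spin (real code would back off) */
}

static void
my_spin_release(MySpinLock *s)
{
    atomic_flag_clear_explicit(&s->lock, memory_order_release);
}
```

The profile showing time attributed to the `lock xchg` itself (not just to `s_lock`) is what indicates cacheline contention even on the "uncontended" fast path.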

[HACKERS] v10 release note for pg_basebackup refers to old --xlog-method argument

2017-10-17 Thread David G. Johnston
diff --git a/doc/src/sgml/release-10.sgml b/doc/src/sgml/release-10.sgml
index 116f7224da..f1f7cfed5f 100644
--- a/doc/src/sgml/release-10.sgml
+++ b/doc/src/sgml/release-10.sgml
@@ -242,7 +242,7 @@

  
   This changes pg_basebackup's
-  -X/--xlog-method default to
stream.
+  -X/--wal-method default to
stream.
   An option value none has been added to reproduce
the old
   behavior.  The pg_basebackup option
-x
   has been removed (instead, use -X fetch).

Also attached.

David J.
diff --git a/doc/src/sgml/release-10.sgml b/doc/src/sgml/release-10.sgml
index 116f7224da..f1f7cfed5f 100644
--- a/doc/src/sgml/release-10.sgml
+++ b/doc/src/sgml/release-10.sgml
@@ -242,7 +242,7 @@
 
  
   This changes pg_basebackup's
-  -X/--xlog-method default to 
stream.
+  -X/--wal-method default to 
stream.
   An option value none has been added to reproduce the 
old
   behavior.  The pg_basebackup option 
-x
   has been removed (instead, use -X fetch).



Re: [HACKERS] Determine state of cluster (HA)

2017-10-17 Thread Vik Fearing
On 10/17/2017 08:40 PM, Joshua D. Drake wrote:
> On 10/16/2017 07:31 PM, Craig Ringer wrote:
>> On 17 October 2017 at 01:02, Joshua D. Drake 
>> wrote:
>>> On 10/15/2017 07:39 PM, Craig Ringer wrote:
> 
 - Get info about master. We should finish merging recovery.conf into
 postgresql.conf.
>>>
>>> Definitely.
>>
>> There's a patch from Abhijit Menon-Sen you could adopt for PostgreSQL
>> 11 for that.
> 
> Do you have a link to this?

https://commitfest.postgresql.org/search/?searchterm=recovery.conf

-- 
Vik Fearing  +33 6 46 75 15 36
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support




Re: [HACKERS] pgbench: Skipping the creating primary keys after initialization

2017-10-17 Thread Fabien COELHO


Hello Masahiko-san,


Attached the updated version patch.


Applies, compiles, make check & tap test ok, doc is fine.

All is well for me: the feature is useful, code is simple and clean, it is 
easy to invoke, easy to extend as well, which I'm planning to do once it gets 
in.


I switched the patch to "Ready for Committers". No doubt they will have their 
own opinions about it. Let's wait and see.


The patch needs a rebase in the documentation because of the xml-ilization 
of the sgml doc.


--
Fabien.




Re: [HACKERS] Windows warnings from VS 2017

2017-10-17 Thread Andres Freund
On 2017-10-11 18:20:24 -0400, Tom Lane wrote:
> Well, it's not that much work to try it and see.  I compared results
> of this simplistic test case:
>   pgbench -S -c 1 -T 60 bench
> (using a scale-factor-10 pgbench database) on current HEAD and HEAD
> with the attached patch, which just lobotomizes all the MemSet
> macros into memset().  Median of 3 runs is 9698 tps for HEAD and
> 9675 tps with the patch; the range is ~100 tps though, making
> this difference well within the noise level.
> 
> I did this using RHEL6's gcc (Red Hat 4.4.7-18), which is pretty
> far from bleeding-edge so I think it's okay as a baseline for
> what optimizations we can expect to find used in the field.
> 
> So I can't sustain Andres' assertion that memset is actually faster
> for the cases we care about, but it certainly doesn't seem any
> slower either.  It would be interesting to check compromise
> patches, such as keeping only the MemSetLoop case.

Well, that'd be because that workload doesn't exercise either version of
memset to a meaningful degree, right?

> +#define MemSetAligned(start, val, len) memset(start, val, len)

If we actually care about this optimization, which seems to solely serve
palloc0fast, we could tell the compiler about the guaranteed alignment
etc.  I'm doubtful it's worth making this work for palloc0fast alone,
but I think it might be worthwhile to make palloc0 an inline function
like

#ifdef YES_I_AM_A_GOOD_COMPILER
/* not needed here, but probably good for plenty other places */
#define pg_attribute_assume_aligned(align) 
__attribute__((assume_aligned(align)))
#define pg_assume_aligned(ptr, align)  __builtin_assume_aligned(ptr, align)
#else
...
#endif

extern void *palloc(Size size) pg_attribute_assume_aligned(MAXALIGN);

static inline void *
palloc0(Size size)
{
void *mem = palloc(size);

memset(mem, 0, size);
return mem;
}

that should allow the compiler to always optimize the memset based on
the alignment - at least x86 gcc does so - and on the size if it's a
build-time constant (nearly every compiler does so).

I assume we probably should do this dance not just for palloc, but for
the other allocation functions too.

Greetings,

Andres Freund




Re: [HACKERS] [COMMITTERS] pgsql: Fix traversal of half-frozen update chains

2017-10-17 Thread Peter Geoghegan
On Tue, Oct 17, 2017 at 3:02 AM, Alvaro Herrera  wrote:
> Yeah, me too.  If you see another way to fix the problem, let's discuss
> it.

I doubt that there is a better way.

> I think a possible way is to avoid considering that the relfrozenxid
> value computed by the caller is final.

While that alternative seems possible, it also seems riskier.

> One thing I didn't quite investigate is why this bug only shows up with
> multixacts so far.  Is it just because multixacts provide an easy way to
> reproduce it, and that there are others, more difficult ways to cause
> the same problem without involving multixacts?  If so, then the problem
> is likely present in 9.2 as well.

The obvious explanation (although not necessarily the correct one) is
that freezing didn't have a MultiXactIdGetUpdateXid() call in 9.2. The
way we pass down both cutoff_xid and cutoff_multi to
FreezeMultiXactId() seems like it might be involved in the data
corruption that we saw (the incorrect pruning/failed to find parent
tuple thing).

I might spend some time figuring this out later in the week. It's hard
to pin down, and I've only really started to learn about MultiXacts in
the past few months.

-- 
Peter Geoghegan




Re: [HACKERS] Determine state of cluster (HA)

2017-10-17 Thread Joshua D. Drake

On 10/16/2017 07:31 PM, Craig Ringer wrote:

On 17 October 2017 at 01:02, Joshua D. Drake  wrote:

On 10/15/2017 07:39 PM, Craig Ringer wrote:



- Get info about master. We should finish merging recovery.conf into
postgresql.conf.



Definitely.


There's a patch from Abhijit Menon-Sen you could adopt for PostgreSQL
11 for that.



Do you have a link to this?

Thanks!

JD

--
Command Prompt, Inc. || http://the.postgres.company/ || @cmdpromptinc

PostgreSQL Centered full stack support, consulting and development.
Advocate: @amplifypostgres || Learn: https://pgconf.us
* Unless otherwise stated, opinions are my own.   *




[HACKERS] Re: Is anything preventing us from allowing write to foreign tables from standby?

2017-10-17 Thread Alexander Korotkov
On Wed, Sep 6, 2017 at 4:42 PM, Alexander Korotkov <
a.korot...@postgrespro.ru> wrote:

> We're currently blocking writing queries on standby even if they are
> only modifying contents of foreign tables.  But do we have serious reasons
> for that?
> Keeping in mind FDW-sharding, making FDW-tables writable from standby
> would be good to prevent a single-master bottleneck.
> I wrote a simple patch enabling writing to foreign tables from standbys.
> It works at first glance for me.
>

No interest yet, but no objections too :-)
I'm going to add this to next commitfest.

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: [HACKERS] Partition-wise aggregation/grouping

2017-10-17 Thread Jeevan Chalke
On Tue, Oct 17, 2017 at 7:13 PM, Dilip Kumar  wrote:

> On Fri, Oct 13, 2017 at 12:06 PM, Jeevan Chalke
>  wrote:
> >
> While playing around with the patch I have noticed one regression with
> the partial partition-wise aggregate.
>
> I am consistently able to reproduce this on my local machine.
>
> Scenario: Group by on non-key column and only one tuple per group.
>

I didn't get what you mean by regression here. Can you please explain?

I see that the PWA plan is selected over the regular plan when enabled, on
the basis of costing.
Regular planning needs a Result node, due to which costing increases, whereas
PWA doesn't need that and thus wins.


-- 
Jeevan Chalke
Technical Architect, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company


Re: [HACKERS] [COMMITTERS] pgsql: Implement table partitioning.

2017-10-17 Thread Alvaro Herrera
Alvaro Herrera wrote:
> Robert Haas wrote:
> > Implement table partitioning.
> 
> Is it intentional that you can use ALTER TABLE OWNER TO on the parent
> table, and that this does not recurse to modify the partitions' owners?
> This doesn't seem to be mentioned in comments nor documentation, so it
> seems an oversight to me.

The alter table docs say that ONLY must be specified if one does not
want to modify descendants, so I think this is a bug.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] SIGSEGV in BRIN autosummarize

2017-10-17 Thread Tom Lane
Alvaro Herrera  writes:
> cur_datname here seems corrupted -- it points halfway into cur_nspname,
> which is also a corrupt value.

Yeah.

> And I think that's because we're not
> checking that the namespace OID is a valid value before calling
> get_namespace_name on it.

No, because get_namespace_name should return NULL if it doesn't find
any pg_namespace entry.  That would work just as well for InvalidOid
as for any other nonexistent schema OID.

The part of your patch that adds a check on avw_database is clearly
correct and necessary.  I'm thinking the change you propose in
perform_work_item is just overcomplicating code that's okay as it
stands.  We don't need to optimize for the schema-went-away case.

What I'm suspicious of as the actual bug cause is the comment in
perform_work_item about how we need to be sure that we're allocating these
strings in a long-lived context.  If, in fact, they were allocated in some
context that could get reset during the PG_TRY (particularly in the
error-cleanup path, which I bet we don't test), then the observed symptom
that the pointers seem to be pointing at garbage could be explained.

So what I'm thinking is that you need an error during perform_work_item,
and/or more than one work_item picked up in the calling loop, to make this
bug manifest.  You would need to enter perform_work_item in a
non-long-lived context, so the first time through the loop is probably
safe anyway.

BTW, it seems like the "Perform operations on collected tables." loop in
do_autovacuum has probably got similar latent bugs.  We take care to enter
it in AutovacMemCxt initially, but it looks to me like subsequent
iterations probably start out in some transaction context, because the
PG_TRY around autovacuum_do_vac_analyze doesn't do anything about
switching back to AutovacMemCxt.  There needs to be a bit more clarity
throughout this code about what CurrentMemoryContext ought to be at each
point.  Appropriate fixes might involve switching back to AutovacMemCxt
after each of those PG_TRY blocks.
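The failure mode described above — pointers into a context that gets reset during error cleanup — can be sketched with a toy arena. Everything below is a simplified stand-in; PostgreSQL's MemoryContext machinery is far more involved, but the dangling-pointer shape is the same:

```c
#include <assert.h>
#include <string.h>

/* Toy "memory context": a bump allocator that can be reset wholesale. */
typedef struct
{
    char        buf[1024];
    size_t      used;
} Arena;

static char *
arena_strdup(Arena *a, const char *s)
{
    size_t      len = strlen(s) + 1;
    char       *p = a->buf + a->used;

    memcpy(p, s, len);
    a->used += len;
    return p;
}

static void
arena_reset(Arena *a)
{
    /* simulate a context reset: previously returned pointers now
     * point at reused/cleared memory */
    memset(a->buf, 0, sizeof(a->buf));
    a->used = 0;
}
```

A string built in the short-lived context survives only until that context is reset by the error-cleanup path; copying it into the long-lived context before entering the PG_TRY block is what makes it safe to use afterwards.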

regards, tom lane




Re: [HACKERS] [COMMITTERS] pgsql: Implement table partitioning.

2017-10-17 Thread Alvaro Herrera
Robert Haas wrote:
> Implement table partitioning.

Is it intentional that you can use ALTER TABLE OWNER TO on the parent
table, and that this does not recurse to modify the partitions' owners?
This doesn't seem to be mentioned in comments nor documentation, so it
seems an oversight to me.

Thoughts?

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] PATCH: enabling parallel execution for cursors explicitly (experimental)

2017-10-17 Thread Tomas Vondra


On 10/17/2017 03:16 PM, Robert Haas wrote:
> On Sat, Oct 14, 2017 at 2:14 PM, Tomas Vondra
>  wrote:
>> I propose is to add a new cursor option (PARALLEL), which would allow
>> parallel plans for that particular user-defined cursor. Attached is an
>> experimental patch doing this (I'm sure there are some loose ends).
> 
> I think you would need to do a huge amount of additional work in
> order to actually make this robust. I believe that a fair amount of
> what goes on in parallel mode right now is checked with elog() if we
> think that it's unreachable without writing C code -- but this will
> make a lot of those things reachable, which means they would need to
> be checked some other way.

Sure, additional checks may be necessary. I've tried to come up with
examples causing crashes, and haven't succeeded. Of course, that's no
proof of correctness, so if you have an example that will crash and burn
I'd like to see it.

In general, it may be acceptable to rely on the elog() checks - which is
pretty much what the FETCH+INSERT+SHARE example I shared in the first
message of this thread does. I agree it's not particularly convenient,
and perhaps we should replace it with checks while planning the queries.

> Also, I doubt this guarantees that we won't try to call
> parallel-unsafe functions while parallel mode is active, so things
> will just blow up in whatever way they do, maybe crashing the server
> or rendering the database inconsistent or whatever.
> 

Probably. What happens right now is that declaring the cursor switches
the whole transaction into parallel mode (EnterParallelMode), so if you
do FETCH + INSERT + FETCH that will fail with elog(ERROR).

But yeah, this should probably be checked at planning time, and the
whole query should fail if the transaction is in parallel mode and the
query contains unsafe/restricted functions.

> Possibly I'm overestimating the extent of the danger, but I don't
> think so.  You're trying to take a mechanism that was only ever meant to
> be active during the course of one query and apply it for long
> periods of time during which a user can do anything, with basically no
> upgrade of the infrastructure.  I think something like this could be
> made to work if you put a large amount of energy into it, but I think
> the patch as proposed is about the easiest 3-5% of what would actually
> be required to cover all the holes.
> 

Well, soliciting feedback like this was one of the points of sharing the
experimental patch, so thank you for that. I'm not sure if the estimate
of 3-5% is accurate, but I'm sure the patch is incomplete - which is why
I marked it as experimental, after all.

regards

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Query got Killed with CTE.

2017-10-17 Thread Craig Ringer
On 17 October 2017 at 21:18, Prabhat Sahu 
wrote:

> Hi,
>
> While querying with a CTE against PG HEAD, I found that the query got
> killed with the below error logs
> -- Machine Configuration: (d1.xlarge) CPUs : 8 , RAM : 16GB , Size : 640GB
>
> postgres=# with x as (select 5 c1 from generate_series(1,100) x)
> select * from x x1 join x x2 using(c1);
> Killed
> 2017-10-17 14:12:33.558 BST [949] LOG:  could not send data to client:
> Broken pipe
> 2017-10-17 14:12:33.558 BST [949] STATEMENT:  with x as (select 5 c1 from
> generate_series(1,100) x) select * from x x1 join x x2 using(c1);
> 2017-10-17 14:12:33.559 BST [949] FATAL:  connection to client lost
> 2017-10-17 14:12:33.559 BST [949] STATEMENT:  with x as (select 5 c1 from
> generate_series(1,100) x) select * from x x1 join x x2 using(c1);
>
>
You produced a huge cross-product by the looks of it, and psql ran out of RAM
buffering the result. The OOM killer fired (check 'dmesg' to confirm) and
killed psql.  The server noticed psql going away, and reported the fact.

None of this is surprising. What's the problem here?
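Assuming the goal was just to run such a query from psql without buffering the whole result client-side, setting psql's FETCH_COUNT variable makes it retrieve rows through a cursor in batches instead (the batch size here is arbitrary):

```
-- in ~/.psqlrc or interactively before the query
\set FETCH_COUNT 10000
```

With that set, psql's memory use stays bounded regardless of the result size, at the cost of cursor round-trips.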


-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] SIGSEGV in BRIN autosummarize

2017-10-17 Thread Craig Ringer
On 17 October 2017 at 22:39, Tom Lane  wrote:
> Justin Pryzby  writes:
>> On Tue, Oct 17, 2017 at 09:34:24AM -0400, Tom Lane wrote:
>>> So: where did you get the existing binaries?  If it's from some vendor
>>> packaging system, what you should do is fetch the package source, add
>>> the patch to the probably-nonempty set of patches the vendor is applying,
>>> and rebuild your own custom package version.  If you haven't done that
>>> before, it's a good skill to acquire ...
>
>> I'm familiar with that process; but, these are PG10 binaries from PGDG for
>> centos6 x64.
>
> Well, I'm pretty sure Devrim builds those using the RPM build process.
> I'd have grabbed his SRPM and proceeded as above.

Yep, or unpack the tarball and apply the patch then build with the
same configure options as you find in the rpm spec. That can be easier
when iterating tests and builds.

Since the patches are separate, you can skip the tarball and clone the
same tag from git instead. Then apply the rpm patches as separate
commits. That's typically what I'll do, makes it easier to keep track.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] SIGSEGV in BRIN autosummarize

2017-10-17 Thread Alvaro Herrera
Justin Pryzby wrote:

> I'm happy to try the patch, but in case it makes any difference, we have a few
> DBs/schemas:

I don't expect that it does.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] SIGSEGV in BRIN autosummarize

2017-10-17 Thread Tom Lane
Justin Pryzby  writes:
> On Tue, Oct 17, 2017 at 09:34:24AM -0400, Tom Lane wrote:
>> So: where did you get the existing binaries?  If it's from some vendor
>> packaging system, what you should do is fetch the package source, add
>> the patch to the probably-nonempty set of patches the vendor is applying,
>> and rebuild your own custom package version.  If you haven't done that
>> before, it's a good skill to acquire ...

> I'm familiar with that process; but, these are PG10 binaries from PGDG for
> centos6 x64.

Well, I'm pretty sure Devrim builds those using the RPM build process.
I'd have grabbed his SRPM and proceeded as above.

regards, tom lane




Re: [HACKERS] SIGSEGV in BRIN autosummarize

2017-10-17 Thread Justin Pryzby
On Tue, Oct 17, 2017 at 09:34:24AM -0400, Tom Lane wrote:
> Justin Pryzby  writes:
> > On Tue, Oct 17, 2017 at 12:59:16PM +0200, Alvaro Herrera wrote:
> >> Anyway, can give this patch a try?
> 
> > I've only compiled postgres once before and this is a production environment
> > (although nothing so important that the crashes are a serious concern
> > either).
> 
> > Is it reasonable to wget the postgres tarball, apply the patch, and run the
> > compiled postgres binary from the source tree, without running make install 
> > or
> > similar ?  Otherwise, would it be good enough to copy the postgres binary to
> > /usr/pgsql-10/bin (and reinstall the binary package later) ?
> 
> The trick in this sort of situation is to make sure you build binaries
> that match your existing install in every way except having the added
> patch, and maybe getting installed into a different directory.
> 
> So: where did you get the existing binaries?  If it's from some vendor
> packaging system, what you should do is fetch the package source, add
> the patch to the probably-nonempty set of patches the vendor is applying,
> and rebuild your own custom package version.  If you haven't done that
> before, it's a good skill to acquire ...

I'm familiar with that process; but, these are PG10 binaries from PGDG for
centos6 x64.  I had to add symlinks for postgis library, but otherwise seems to
be working fine (although I didn't preserve as many configure options as your
message would suggest I should have).

Justin




Re: [HACKERS] [PATCH] Add recovery_min_apply_delay_reconnect recovery option

2017-10-17 Thread Michael Paquier
On Tue, Oct 17, 2017 at 10:40 PM, Eric Radman  wrote:
> On Tue, Oct 17, 2017 at 12:34:17PM +0900, Michael Paquier wrote:
> I thought I had observed cases where the WalReceiver was shut down
> without causing the wait on XLogCtl->recoveryWakeupLatch to return. If I'm
> wrong about this then there's no reason to spin every n seconds.

I would expect a patch not to move the timeout calculation within the
loop in recoveryApplyDelay(), as you did.

> Which record are you suggesting should be forcibly "read again"?  The
> record identified by XLogCtl->replayEndRecPtr or
> XLogCtl->lastReplayedEndRecPtr?  I'll look more carefully at such an
> approach.

I have not looked at how to do that in details, but as the delay is
applied for commit WAL records, you would need to make the redo loop
look again at this same record once you have switched back to a
streaming state. Something to be careful about is that you should not
apply the same delay multiple times for the same record.
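That "apply the delay only once per record" caveat can be modeled like this (an illustrative Python sketch of the bookkeeping only, not the actual xlog.c code; all names are hypothetical):

```python
# Hypothetical model: when the redo loop re-reads a commit record after
# restarting the WAL receiver, recovery_min_apply_delay must not be
# waited out a second time for that same record.

class DelayTracker:
    def __init__(self):
        self._delayed = set()  # LSNs whose delay has already elapsed

    def should_delay(self, commit_lsn):
        """Return True only the first time a given commit record is seen."""
        if commit_lsn in self._delayed:
            return False
        self._delayed.add(commit_lsn)
        return True

tracker = DelayTracker()
assert tracker.should_delay("0/16B6C50") is True    # first replay attempt: wait
assert tracker.should_delay("0/16B6C50") is False   # re-read after reconnect: no second wait
```

In the real startup process the state would of course live alongside the replay position rather than in a growing set; the point is only that re-reading a record must be distinguishable from seeing it for the first time.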
-- 
Michael




Re: [HACKERS] Partition-wise aggregation/grouping

2017-10-17 Thread Dilip Kumar
On Fri, Oct 13, 2017 at 12:06 PM, Jeevan Chalke
 wrote:
>
While playing around with the patch I have noticed one regression with
the partial partition-wise aggregate.

I am consistently able to reproduce this on my local machine.

Scenario: Group by on non-key column and only one tuple per group.

Complete Test:

create table t(a int,b int) partition by range(a);
create table t1 partition of t for values from (1) to (10);
create table t2 partition of t for values from (10) to (20);

insert into t values (generate_series(1,19),generate_series(1, 19));
postgres=# explain analyze select sum(a) from t group by b;
                                                             QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------
 Finalize GroupAggregate  (cost=20379.55..28379.51 rows=19 width=12) (actual time=102.311..322.969 rows=19 loops=1)
   Group Key: t1.b
   ->  Merge Append  (cost=20379.55..25379.53 rows=19 width=12) (actual time=102.303..232.310 rows=19 loops=1)
         Sort Key: t1.b
         ->  Partial GroupAggregate  (cost=10189.72..11939.70 rows=9 width=12) (actual time=52.164..108.967 rows=9 loops=1)
               Group Key: t1.b
               ->  Sort  (cost=10189.72..10439.72 rows=9 width=8) (actual time=52.158..66.236 rows=9 loops=1)
                     Sort Key: t1.b
                     Sort Method: external merge  Disk: 1768kB
                     ->  Seq Scan on t1  (cost=0.00..1884.99 rows=9 width=8) (actual time=0.860..20.388 rows=9 loops=1)
         ->  Partial GroupAggregate  (cost=10189.82..11939.82 rows=10 width=12) (actual time=50.134..102.976 rows=10 loops=1)
               Group Key: t2.b
               ->  Sort  (cost=10189.82..10439.82 rows=10 width=8) (actual time=50.128..63.362 rows=10 loops=1)
                     Sort Key: t2.b
                     Sort Method: external merge  Disk: 1768kB
                     ->  Seq Scan on t2  (cost=0.00..1885.00 rows=10 width=8) (actual time=0.498..20.977 rows=10 loops=1)
 Planning time: 0.190 ms
 Execution time: 339.929 ms
(18 rows)

postgres=# set enable_partition_wise_agg=off;
SET
postgres=# explain analyze select sum(a) from t group by b;
                                                          QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------
 GroupAggregate  (cost=26116.53..29616.51 rows=19 width=12) (actual time=139.413..250.751 rows=19 loops=1)
   Group Key: t1.b
   ->  Sort  (cost=26116.53..26616.52 rows=19 width=8) (actual time=139.406..168.775 rows=19 loops=1)
         Sort Key: t1.b
         Sort Method: external merge  Disk: 3544kB
         ->  Result  (cost=0.00..5769.98 rows=19 width=8) (actual time=0.674..76.392 rows=19 loops=1)
               ->  Append  (cost=0.00..3769.99 rows=19 width=8) (actual time=0.672..40.291 rows=19 loops=1)
                     ->  Seq Scan on t1  (cost=0.00..1884.99 rows=9 width=8) (actual time=0.672..12.408 rows=9 loops=1)
                     ->  Seq Scan on t2  (cost=0.00..1885.00 rows=10 width=8) (actual time=1.407..11.689 rows=10 loops=1)
 Planning time: 0.146 ms
 Execution time: 263.678 ms
(11 rows)
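The worst case above — exactly one tuple per group — can be seen with a toy model of two-phase aggregation: the partial phase emits one state per input row, so the Finalize step still does all the work a plain aggregate would, on top of the extra partial pass (illustrative Python, not PostgreSQL internals):

```python
from collections import defaultdict

def partial_agg(rows):
    """Per-partition phase: partial sum(a) grouped by b."""
    state = defaultdict(int)
    for a, b in rows:
        state[b] += a
    return list(state.items())

def finalize_agg(partial_states):
    """Leader phase: combine the partial states from all partitions."""
    result = defaultdict(int)
    for b, s in partial_states:
        result[b] += s
    return dict(result)

# One tuple per group: b is unique, so the partial phase emits as many
# states as there were input rows -- no reduction, only extra work.
rows = [(i, i) for i in range(1, 20)]
partials = partial_agg(rows)
assert len(partials) == len(rows)   # Finalize sees every row again
totals = finalize_agg(partials)
assert totals[5] == 5
```

Partial aggregation only pays off when groups span many rows, so each partition's partial pass shrinks what the leader has to combine.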


-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] [PATCH] Add recovery_min_apply_delay_reconnect recovery option

2017-10-17 Thread Eric Radman
On Tue, Oct 17, 2017 at 12:34:17PM +0900, Michael Paquier wrote:
> On Tue, Oct 17, 2017 at 12:51 AM, Eric Radman  wrote:
> > This administrative compromise is necessary because the WalReceiver is
> > not resumed after a network interruption until all records are read,
> > verified, and applied from the archive on disk.
> 
> Taking a step back here... recoveryApplyDelay() uses
> XLogCtl->recoveryWakeupLatch which gets set if the WAL receiver has
> received new WAL, or if the WAL receiver shuts down properly.

I thought I had observed cases where the WalReceiver was shut down
without causing the wait on XLogCtl->recoveryWakeupLatch to return. If I'm
wrong about this then there's no reason to spin every n seconds.

> the WAL receiver gets down for whatever reason during the loop of
> recoveryApplyDelay(), the startup process waits for a record to be
> applied maybe for a long time, and as there is no WAL receiver we
> actually don't receive any new WAL records.
...
> indeed a waste to not have a WAL receiver online while waiting for a
> delay to be applied.

Exactly!

> If there is a flacky network between the primary and a standby, you
> may end up with a standby way behind its primary, and that could
> penalize a primary clean shutdown as the primary waits for the
> shutdown checkpoint record to be flushed on the standby.

This is another artifact that the database administrator would not
anticipate.

> I think that your way to deal with the problem is messy though. If you
> think about it, no parameters are actually needed. What you should try
> to achieve is to make recoveryApplyDelay() smarter, by making the wait
> to forcibly stop if you detect a failure by getting out of the redo
> routine, and then force again the record to be read again. This way,
> the startup process would try to start again a new WAL receiver if it
> thinks that the source it should read WAL from is a stream. That may
> turn to be a patch more complicated than you think though.

One of my earlier attempts was to break from the redo loop and try
reading the next record. This was too simple because it only starts the
WAL receiver if there is nothing more to be read from the archive. 

Which record are you suggesting should be forcibly "read again"?  The
record identified by XLogCtl->replayEndRecPtr or
XLogCtl->lastReplayedEndRecPtr?  I'll look more carefully at such an
approach.

> Your patch also breaks actually the use case of standbys doing
> recovery using only archives and no streaming. In this case
> WalRcvStreaming returns false, and recovery_min_apply_delay_reconnect
> would be used unconditionally, so you would break a lot of
> applications silently.

Excellent point--I had not thought of how this would interact with a
standby that used only archives.

All useful feedback, thank you for the thorough review!
  
--
Eric Radman  |  http://eradman.com





Re: [HACKERS] SIGSEGV in BRIN autosummarize

2017-10-17 Thread Tom Lane
Justin Pryzby  writes:
> On Tue, Oct 17, 2017 at 12:59:16PM +0200, Alvaro Herrera wrote:
>> Anyway, can give this patch a try?

> I've only compiled postgres once before and this is a production environment
> (although nothing so important that the crashes are a serious concern
> either).

> Is it reasonable to wget the postgres tarball, apply the patch, and run the
> compiled postgres binary from the source tree, without running make install or
> similar ?  Otherwise, would it be good enough to copy the postgres binary to
> /usr/pgsql-10/bin (and reinstall the binary package later) ?

The trick in this sort of situation is to make sure you build binaries
that match your existing install in every way except having the added
patch, and maybe getting installed into a different directory.

So: where did you get the existing binaries?  If it's from some vendor
packaging system, what you should do is fetch the package source, add
the patch to the probably-nonempty set of patches the vendor is applying,
and rebuild your own custom package version.  If you haven't done that
before, it's a good skill to acquire ...

regards, tom lane




[HACKERS] Query got Killed with CTE.

2017-10-17 Thread Prabhat Sahu
Hi,

While querying with a CTE against PG HEAD, I found that the query got killed
with the below error logs.
-- Machine configuration: (d1.xlarge) CPUs: 8, RAM: 16GB, Size: 640GB

postgres=# with x as (select 5 c1 from generate_series(1,100) x) select
* from x x1 join x x2 using(c1);
Killed
2017-10-17 14:12:33.558 BST [949] LOG:  could not send data to client:
Broken pipe
2017-10-17 14:12:33.558 BST [949] STATEMENT:  with x as (select 5 c1 from
generate_series(1,100) x) select * from x x1 join x x2 using(c1);
2017-10-17 14:12:33.559 BST [949] FATAL:  connection to client lost
2017-10-17 14:12:33.559 BST [949] STATEMENT:  with x as (select 5 c1 from
generate_series(1,100) x) select * from x x1 join x x2 using(c1);

Thanks & Regards,

Prabhat Kumar Sahu
Mob: 7758988455
Skype ID: prabhat.sahu1984

www.enterprisedb.com



Re: [HACKERS] PATCH: enabling parallel execution for cursors explicitly (experimental)

2017-10-17 Thread Robert Haas
On Sat, Oct 14, 2017 at 2:14 PM, Tomas Vondra
 wrote:
> I propose is to add a new cursor option (PARALLEL), which would allow
> parallel plans for that particular user-defined cursor. Attached is an
> experimental patch doing this (I'm sure there are some loose ends).

I think you would need to do a huge amount of additional work in order
to actually make this robust.  I believe that a fair amount of what
goes on in parallel mode right now is checked with elog() if we think
that it's unreachable without writing C code -- but this will make a
lot of those things reachable, which means they would need to be
checked some other way.  Also, I doubt this guarantees that we won't
try to call parallel-unsafe functions while parallel mode is active, so
things will just blow up in whatever way they do, maybe crashing the
server or rendering the database inconsistent or whatever.

Possibly I'm overestimating the extent of the danger, but I don't
think so.  You're trying to take a mechanism that was only ever meant to
be active during the course of one query and apply it for long
periods of time during which a user can do anything, with basically no
upgrade of the infrastructure.  I think something like this could be
made to work if you put a large amount of energy into it, but I think
the patch as proposed is about the easiest 3-5% of what would actually
be required to cover all the holes.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] SIGSEGV in BRIN autosummarize

2017-10-17 Thread Tomas Vondra
On 10/17/2017 02:29 PM, Justin Pryzby wrote:
> On Tue, Oct 17, 2017 at 12:59:16PM +0200, Alvaro Herrera wrote:
>> Anyway, can give this patch a try?
> 
> I've only compiled postgres once before and this is a production environment
> (although nothing so important that the crashes are a serious concern
> either).
> 
> Is it reasonable to wget the postgres tarball, apply the patch, and run the
> compiled postgres binary from the source tree, without running make install or
> similar ?  Otherwise, would it be good enough to copy the postgres binary to
> /usr/pgsql-10/bin (and reinstall the binary package later) ?
> 

You don't have to install the binaries to the same location, i.e. you
can keep both the current and modified binaries.

./configure --prefix=/path/to/alternative/binaries --enable-debug
CFLAGS="..."

To get the same compilation options you can run pg_config, look for the
CONFIGURE line, and then just add the --prefix option.

And after `make install` you can add it to $PATH and start the server
using those binaries.

$ export PATH=/path/to/alternative/binaries/bin:$PATH
$ which pg_ctl
$ pg_ctl -D $DATADIR stop
$ pg_ctl -D $DATADIR start


regards

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] SIGSEGV in BRIN autosummarize

2017-10-17 Thread Justin Pryzby
On Tue, Oct 17, 2017 at 12:59:16PM +0200, Alvaro Herrera wrote:
> Justin Pryzby wrote:
> 
> > #1  0x006a52e9 in perform_work_item (workitem=0x7f8ad1f94824) at 
> > autovacuum.c:2676
> > cur_datname = 0x298c740 "no 1 :vartype 1184 :vartypmod -1 
> > :varcollid 0 :varlevelsup 0 :varnoold 1 :varoattno 1 :location 146} {CONST 
> > :consttype 1184 :consttypmod -1 :constcollid 0 :constlen 8 :constbyval true 
> > :constisnull fal"...
> > cur_nspname = 0x298c728 "s ({VAR :varno 1 :varattno 1 :vartype 1184 
> > :vartypmod -1 :varcollid 0 :varlevelsup 0 :varnoold 1 :varoattno 1 
> > :location 146} {CONST :consttype 1184 :consttypmod -1 :constcollid 0 
> > :constlen 8 :constbyv"...
> > cur_relname = 0x298cd68 
> > "cdrs_eric_msc_sms_2017_10_14_startofcharge_idx"
> > __func__ = "perform_work_item"
> 
> cur_datname here seems corrupted -- it points halfway into cur_nspname,
> which is also a corrupt value.  And I think that's because we're not
> checking that the namespace OID is a valid value before calling
> get_namespace_name on it.  And I'm betting that these values are all not
> what we expect, because we're not checking that we're in the correct
> database before trying to execute the work item.  I don't quite
> understand how this results in an invalid string rather than just a
> NULL, as I would have expected.

I'm happy to try the patch, but in case it makes any difference, we have a few
DBs/schemas:

postgres=# \dn
  List of schemas
  Name  |  Owner
--------+----------
 public | postgres
(1 row)

postgres=# \l+
                                                                List of databases
   Name    |  Owner   | Encoding |   Collate   |    Ctype    |   Access privileges   |  Size   | Tablespace |                Description
-----------+----------+----------+-------------+-------------+-----------------------+---------+------------+--------------------------------------------
 gtt       | pryzbyj  | UTF8     | en_US.UTF-8 | en_US.UTF-8 |                       | 2215 GB | pg_default | 
 postgres  | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 |                       | 76 GB   | pg_default | default administrative connection database
 template0 | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/postgres          +| 7625 kB | pg_default | unmodifiable empty database
           |          |          |             |             | postgres=CTc/postgres |         |            | 
 template1 | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | postgres=CTc/postgres+| 7485 kB | pg_default | default template for new databases
           |          |          |             |             | =c/postgres           |         |            | 
(4 rows)

Note that db:postgres is of substantial size since I set log_statement=all and
log_destination=stderr,csv, and import all CSVs into a table hierarchy in
db:postgres.  There are 3 BRIN indices per (daily child) table in db:gtt (and
none in postgres, although that's probably a good idea for the timestamp column
once all our customers are upgraded to PG10).

Justin




Re: [HACKERS] SIGSEGV in BRIN autosummarize

2017-10-17 Thread Justin Pryzby
On Tue, Oct 17, 2017 at 12:59:16PM +0200, Alvaro Herrera wrote:
> Anyway, can give this patch a try?

I've only compiled postgres once before and this is a production environment
(although nothing so important that the crashes are a serious concern either).

Is it reasonable to wget the postgres tarball, apply the patch, and run the
compiled postgres binary from the source tree, without running make install or
similar ?  Otherwise, would it be good enough to copy the postgres binary to
/usr/pgsql-10/bin (and reinstall the binary package later) ?

Thanks,
Justin




[HACKERS] A GUC to prevent leader processes from running subplans?

2017-10-17 Thread Thomas Munro
Hi hackers,

While testing parallelism work I've wanted to be able to prevent
gather nodes from running the plan in the leader process, and I've
heard others say the same.  One way would be to add a GUC
"multiplex_gather", like in the attached patch.  If you set it to off,
Gather and Gather Merge won't run the subplan unless they have to
because no workers could be launched.  I thought about adding a new
value for force_parallel_mode instead, but someone mentioned they
might want to do this on a production system too and
force_parallel_mode is not really for end users.  Better ideas?
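The proposed behavior — the leader runs the subplan only when multiplexing is enabled, or as a fallback when no workers could be launched — amounts to the following (illustrative Python model; the function name is hypothetical, only the multiplex_gather semantics come from the description above):

```python
def leader_runs_subplan(multiplex_gather, workers_launched):
    """Model of the proposed Gather behavior: the leader process
    participates in executing the subplan if multiplexing is enabled,
    or as a last resort when no parallel workers could be launched."""
    return multiplex_gather or workers_launched == 0

assert leader_runs_subplan(True, 4) is True     # default: leader also helps
assert leader_runs_subplan(False, 4) is False   # GUC off: workers only
assert leader_runs_subplan(False, 0) is True    # no workers: leader must run it
```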

-- 
Thomas Munro
http://www.enterprisedb.com


0001-Add-a-GUC-to-control-whether-Gather-runs-subplans.patch
Description: Binary data



Re: [HACKERS] SIGSEGV in BRIN autosummarize

2017-10-17 Thread Alvaro Herrera
Justin Pryzby wrote:

> #1  0x006a52e9 in perform_work_item (workitem=0x7f8ad1f94824) at 
> autovacuum.c:2676
> cur_datname = 0x298c740 "no 1 :vartype 1184 :vartypmod -1 :varcollid 
> 0 :varlevelsup 0 :varnoold 1 :varoattno 1 :location 146} {CONST :consttype 
> 1184 :consttypmod -1 :constcollid 0 :constlen 8 :constbyval true :constisnull 
> fal"...
> cur_nspname = 0x298c728 "s ({VAR :varno 1 :varattno 1 :vartype 1184 
> :vartypmod -1 :varcollid 0 :varlevelsup 0 :varnoold 1 :varoattno 1 :location 
> 146} {CONST :consttype 1184 :consttypmod -1 :constcollid 0 :constlen 8 
> :constbyv"...
> cur_relname = 0x298cd68 
> "cdrs_eric_msc_sms_2017_10_14_startofcharge_idx"
> __func__ = "perform_work_item"

cur_datname here seems corrupted -- it points halfway into cur_nspname,
which is also a corrupt value.  And I think that's because we're not
checking that the namespace OID is a valid value before calling
get_namespace_name on it.  And I'm betting that these values are all not
what we expect, because we're not checking that we're in the correct
database before trying to execute the work item.  I don't quite
understand how this results in an invalid string rather than just a
NULL, as I would have expected.

Anyway, can give this patch a try?

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
From 0a53aaf589dfdbd2f25ae2ee36323d77c2910a60 Mon Sep 17 00:00:00 2001
From: Alvaro Herrera 
Date: Tue, 17 Oct 2017 12:58:38 +0200
Subject: [PATCH] Fix autovacuum workitems

---
 src/backend/postmaster/autovacuum.c | 10 --
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 776b1c0a9d..83366b862c 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -2525,6 +2525,8 @@ deleted:
continue;
if (workitem->avw_active)
continue;
+   if (workitem->avw_database != MyDatabaseId)
+   continue;
 
/* claim this one, and release lock while performing it */
workitem->avw_active = true;
@@ -2592,6 +2594,7 @@ perform_work_item(AutoVacuumWorkItem *workitem)
char   *cur_datname = NULL;
char   *cur_nspname = NULL;
char   *cur_relname = NULL;
+   Oid cur_nspoid;
 
/*
 * Note we do not store table info in MyWorkerInfo, since this is not
@@ -2607,9 +2610,12 @@ perform_work_item(AutoVacuumWorkItem *workitem)
 */
 
cur_relname = get_rel_name(workitem->avw_relation);
-   cur_nspname = get_namespace_name(get_rel_namespace(workitem->avw_relation));
+   cur_nspoid = get_rel_namespace(workitem->avw_relation);
+   if (!cur_relname || !OidIsValid(cur_nspoid))
+   goto deleted2;
+   cur_nspname = get_namespace_name(cur_nspoid);
cur_datname = get_database_name(MyDatabaseId);
-   if (!cur_relname || !cur_nspname || !cur_datname)
+   if (!cur_nspname || !cur_datname)
goto deleted2;
 
autovac_report_workitem(workitem, cur_nspname, cur_datname);
-- 
2.11.0




Re: [HACKERS] [COMMITTERS] pgsql: Fix traversal of half-frozen update chains

2017-10-17 Thread Alvaro Herrera
Robert Haas wrote:

> I haven't really followed this thread in depth, but I'm very nervous
> about the idea that we should ever be examining the raw-xmin of a
> tuple that has been marked HEAP_XMIN_FROZEN for anything other than
> forensic purposes.

Yeah, me too.  If you see another way to fix the problem, let's discuss
it.

I think a possible way is to avoid considering that the relfrozenxid
value computed by the caller is final.  Something like this: if page
freezing sees that there is a HOT chain which would end up half-frozen
because of that freeze age, decrease the freeze xid enough that the
whole chain remains unfrozen; communicate the new value to caller so
that it is used as the true new relfrozenxid going forwards.  That
preserves the HOT chain intact so that it can be pruned and frozen
correctly in a later VACUUM call.  Need to be careful about HOT chains
containing members that are old enough to cause a long term problem
(i.e. a table where relfrozenxid doesn't move forward enough).
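A minimal model of that cutoff adjustment, assuming plain integer XIDs and ignoring wraparound (illustrative Python sketch of the idea only, not the real vacuum/freeze code):

```python
def adjust_freeze_cutoff(cutoff_xid, hot_chains):
    """Lower the freeze cutoff so no HOT chain ends up half-frozen.

    Tuples with xmin < cutoff would be frozen.  A chain whose member
    xmins straddle the cutoff would have some members frozen and some
    not; pulling the cutoff back to that chain's oldest xmin leaves the
    whole chain unfrozen, to be pruned and frozen by a later VACUUM.
    (Simplified: a real version would need to re-check until stable and
    handle XID wraparound.)"""
    for chain_xmins in hot_chains:
        oldest = min(chain_xmins)
        newest = max(chain_xmins)
        if oldest < cutoff_xid <= newest:   # chain straddles the cutoff
            cutoff_xid = oldest
    return cutoff_xid

# A chain with members at xmin 90 and 105 straddles a cutoff of 100:
assert adjust_freeze_cutoff(100, [[90, 105]]) == 90
# A chain entirely older than the cutoff can be frozen as a whole:
assert adjust_freeze_cutoff(100, [[80, 90]]) == 100
```

The returned value would then be what the caller records as the new relfrozenxid, which is also where the "doesn't move forward enough" concern comes from: an old enough chain can pin the cutoff far in the past.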

One thing I didn't quite investigate is why this bug only shows up with
multixacts so far.  Is it just because multixacts provide an easy way to
reproduce it, and there are other, more difficult ways to cause
the same problem without involving multixacts?  If so, then the problem
is likely present in 9.2 as well.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] SIGSEGV in BRIN autosummarize

2017-10-17 Thread Alvaro Herrera
Justin Pryzby wrote:
> On Sun, Oct 15, 2017 at 02:44:58PM +0200, Tomas Vondra wrote:
> > Thanks, but I'm not sure that'll help, at this point. We already know
> > what happened (corrupted memory), we don't know "how". And core files
> > are mostly just "snapshots" so are not very useful in answering that :-(
> 
> Is there anything I should be saving for these or hints how else to debug?

Sorry, I had missed this report.  I'll look into it today.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] [PATCH] pageinspect function to decode infomasks

2017-10-17 Thread Masahiko Sawada
On Thu, Sep 14, 2017 at 11:00 PM, Craig Ringer  wrote:
> On 14 September 2017 at 19:57, Ashutosh Sharma wrote:
>
>>
>>
>> Are you planning to work on the review comments from Robert, Moon
>> Insung and supply the new patch. I just had a quick glance into this
>> mail thread (after a long time) and could understand Robert's concern
>> till some extent. I think, he is trying to say that if a tuple is
>> frozen (committed|invalid) then it shouldn't be shown as COMMITTED and
>> INVALID together in fact it should just be displayed as FROZEN tuple.
>
>
> Yes, I'd like to, and should have time for it in this CF.
>
> My plan is to emit raw flags by default, so FROZEN wouldn't be shown at all,
> only COMMITTED|INVALID. If the bool to decode combined flags is set, then
> it'll show things like FROZEN, and hide COMMITTED|INVALID. Similar for other
> combos.
>

FWIW, I agree with this direction. ISTM that showing the raw flags by
default and having an option to show combined flags would be the right
way.
I sometimes wanted to have the same mechanism for lp_flags, but maybe
it should be discussed on a separate thread.
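The raw-vs-combined decoding being discussed could look roughly like this (illustrative Python; the bit values mirror PostgreSQL's htup_details.h, but the function itself is hypothetical, not the actual pageinspect code):

```python
# Infomask bits as defined in PostgreSQL's htup_details.h; FROZEN is
# the combination of COMMITTED and INVALID on xmin.
HEAP_XMIN_COMMITTED = 0x0100
HEAP_XMIN_INVALID   = 0x0200
HEAP_XMIN_FROZEN    = HEAP_XMIN_COMMITTED | HEAP_XMIN_INVALID

def decode_infomask(infomask, decode_combined=False):
    """Emit raw flags by default; with decode_combined, recognized
    combinations (like FROZEN) replace their constituent bits."""
    flags = []
    if decode_combined and (infomask & HEAP_XMIN_FROZEN) == HEAP_XMIN_FROZEN:
        flags.append("HEAP_XMIN_FROZEN")
    else:
        if infomask & HEAP_XMIN_COMMITTED:
            flags.append("HEAP_XMIN_COMMITTED")
        if infomask & HEAP_XMIN_INVALID:
            flags.append("HEAP_XMIN_INVALID")
    return flags

assert decode_infomask(0x0300) == ["HEAP_XMIN_COMMITTED", "HEAP_XMIN_INVALID"]
assert decode_infomask(0x0300, decode_combined=True) == ["HEAP_XMIN_FROZEN"]
```

The same pattern would extend to other combinations (and, as noted, to lp_flags): the raw view never hides a bit, while the combined view shows only the higher-level interpretation.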

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center




Re: [HACKERS] SIGSEGV in BRIN autosummarize

2017-10-17 Thread Justin Pryzby
On Sun, Oct 15, 2017 at 02:44:58PM +0200, Tomas Vondra wrote:
> Thanks, but I'm not sure that'll help, at this point. We already know
> what happened (corrupted memory), we don't know "how". And core files
> are mostly just "snapshots" so are not very useful in answering that :-(

Is there anything I should be saving for these, or hints on how else to debug?
I just got a segfault while looking for logs from a segfault from 2hr
earlier... but it was a duplicate of the first stacktrace...

< 2017-10-16 23:21:23.172 -04  >LOG:  server process (PID 31543) was terminated by signal 11: Segmentation fault
< 2017-10-16 23:21:23.172 -04  >DETAIL:  Failed process was running: autovacuum: BRIN summarize public.gtt 162175

Oct 16 23:21:22 database kernel: postmaster[31543] general protection ip:4bd467 sp:7ffe08a94890 error:0 in postgres[40+692000]
...
Oct 17 01:58:36 database kernel: postmaster[8646]: segfault at 8 ip 0084a177 sp 7ffe08a94a88 error 4 in postgres[40+692000]
Oct 17 01:58:38 database abrt[9192]: Saved core dump of pid 8646 (/usr/pgsql-10/bin/postgres) to /var/spool/abrt/ccpp-2017-10-17-01:58:38-8646 (7692288 bytes)
Oct 17 01:58:38 database abrtd: Directory 'ccpp-2017-10-17-01:58:38-8646' creation detected
Oct 17 01:59:18 database abrtd: Generating core_backtrace
Oct 17 01:59:18 database abrtd: Duplicate: core backtrace
Oct 17 01:59:18 database abrtd: DUP_OF_DIR: /var/spool/abrt/ccpp-2017-10-14-18:05:35-26500
Oct 17 01:59:18 database abrtd: Deleting problem directory ccpp-2017-10-17-01:58:38-8646 (dup of ccpp-2017-10-14-18:05:35-26500)

postgres=# SELECT * FROM postgres_log_2017_10_16_2300 WHERE pid=31543;
-[ RECORD 1 ]----------+-------------------------------------------------------------------------------------------------------------
log_time               | 2017-10-16 23:21:22.021-04
pid                    | 31543
session_id             | 59e57677.7b37
session_line           | 2
session_start_time     | 2017-10-16 23:18:15-04
virtual_transaction_id | 22/54376
transaction_id         | 0
error_severity         | ERROR
sql_state_code         | 57014
message                | canceling autovacuum task
context                | automatic analyze of table "gtt.public.cdrs_eric_egsnpdprecord_2017_10_15"
-[ RECORD 2 ]----------+-------------------------------------------------------------------------------------------------------------
log_time               | 2017-10-16 23:21:22.021-04
pid                    | 31543
session_id             | 59e57677.7b37
session_line           | 3
session_start_time     | 2017-10-16 23:18:15-04
virtual_transaction_id | 22/54377
transaction_id         | 0
error_severity         | ERROR
sql_state_code         | 57014
message                | canceling autovacuum task
context                | processing work entry for relation "gtt.public.cdrs_eric_msc_voice_2017_10_16_startofcharge_idx"
-[ RECORD 3 ]----------+-------------------------------------------------------------------------------------------------------------
log_time               | 2017-10-16 23:21:21.003-04
pid                    | 31543
session_id             | 59e57677.7b37
session_line           | 1
session_start_time     | 2017-10-16 23:18:15-04
virtual_transaction_id | 22/54374
transaction_id         | 0
error_severity         | ERROR
sql_state_code         | 57014
message                | canceling autovacuum task
context                | automatic analyze of table "gtt.public.cdrs_eric_ggsnpdprecord_2017_10_15"

Note, the most recent LOCK TABLE I could find before 23:21:22 was actually
~10min earlier, at 2017-10-16 23:12:16.519-04.

Core was generated by `postgres: autovacuum worker process   gtt '.
Program terminated with signal 11, Segmentation fault.
#0  index_close (relation=0x324647603246466, lockmode=1) at indexam.c:178
178 LockRelId   relid = relation->rd_lockInfo.lockRelId;
Missing separate debuginfos, use: debuginfo-install 
audit-libs-2.4.5-6.el6.x86_64 cyrus-sasl-lib-2.1.23-15.el6_6.2.x86_64 
keyutils-libs-1.4-5.el6.x86_64 krb5-libs-1.10.3-65.el6.x86_64 
libcom_err-1.41.12-23.el6.x86_64 libgcc-4.4.7-18.el6.x86_64 
libicu-4.2.1-14.el6.x86_64 libselinux-2.0.94-7.el6.x86_64 
libstdc++-4.4.7-18.el6.x86_64 nspr-4.13.1-1.el6.x86_64 
nss-3.28.4-1.el6_9.x86_64 nss-softokn-freebl-3.14.3-23.3.el6_8.x86_64 
nss-util-3.28.4-1.el6_9.x86_64 openldap-2.4.40-16.el6.x86_64 
openssl-1.0.1e-57.el6.x86_64 pam-1.1.1-24.el6.x86_64 zlib-1.2.3-29.el6.x86_64

(gdb) p relation
$1 = (struct RelationData *) 0x324647603246466
(gdb) p relation->rd_lockInfo
Cannot access memory at address 0x3246476032464aa

(gdb) bt
#0  index_close (relation=0x324647603246466, lockmode=1) at indexam.c:178
#1  0x004bc3c2 in systable_endscan (sysscan=0x145b3c8) at