[HACKERS] Re: [BUGS] BUG #14350: VIEW with INSTEAD OF INSERT TRIGGER and COPY. Missing feature or working as designed.

2016-11-02 Thread Haribabu Kommi
On Tue, Nov 1, 2016 at 8:05 PM, Dilip Kumar  wrote:

> On Tue, Nov 1, 2016 at 12:40 PM, Haribabu Kommi
>  wrote:
> > The COPY command is treated as a utility command. During query
> > processing, rules are not applied to the COPY command, and during
> > execution of the COPY command, it just inserts the data into the heap and
> > indexes with direct calls.
> >
> > I feel supporting the COPY command for case 2 is a little bit complex
> > and may not be worth that much.
>
> I agree it will be complex..
> >
> > Attached is the updated patch with doc changes.
>
> Now that we are adding support for INSTEAD OF triggers in the COPY FROM
> command, it would be good to provide a HINT to the user if an INSTEAD OF
> trigger does not exist, as we do in the case of INSERT?
>
> INSERT case:
> postgres=# insert into ttt_v values(7);
> 2016-11-01 14:31:39.845 IST [72343] ERROR:  cannot insert into view "ttt_v"
> 2016-11-01 14:31:39.845 IST [72343] DETAIL:  Views that do not select
> from a single table or view are not automatically updatable.
> 2016-11-01 14:31:39.845 IST [72343] HINT:  To enable inserting into
> the view, provide an INSTEAD OF INSERT trigger or an unconditional ON
> INSERT DO INSTEAD rule.
>
> COPY case:
> postgres=# COPY ttt_v FROM stdin;
> 2016-11-01 14:31:52.235 IST [72343] ERROR:  cannot copy to view "ttt_v"
> 2016-11-01 14:31:52.235 IST [72343] STATEMENT:  COPY ttt_v FROM stdin;


Thanks for your suggestion. Yes, I agree that adding a hint is a good idea.
The updated patch is attached, with the hint message added:

2016-11-03 14:56:28.685 AEDT [7822] ERROR:  cannot copy to view "ttt_v"
2016-11-03 14:56:28.685 AEDT [7822] HINT:  To enable copy to view, provide
an INSTEAD OF INSERT trigger
2016-11-03 14:56:28.685 AEDT [7822] STATEMENT:  COPY ttt_v FROM stdin;
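
For reference, a minimal setup that exercises this path could look like the
following (object names are illustrative, not taken from the actual test):

-- Illustrative sketch: a view that is not automatically updatable, plus an
-- INSTEAD OF INSERT trigger, so that COPY ... FROM can route rows through it.
CREATE TABLE ttt (a int);

CREATE VIEW ttt_v AS SELECT DISTINCT a FROM ttt;   -- not auto-updatable

CREATE FUNCTION ttt_v_ins() RETURNS trigger AS $$
BEGIN
    INSERT INTO ttt VALUES (NEW.a);
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER ttt_v_instead
    INSTEAD OF INSERT ON ttt_v
    FOR EACH ROW EXECUTE PROCEDURE ttt_v_ins();

COPY ttt_v FROM stdin;   -- with the patch, each row goes through the trigger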


Regards,
Hari Babu
Fujitsu Australia


copy_to_view_2.patch
Description: Binary data



[HACKERS] who calls InitializeTimeouts() ?

2016-11-02 Thread Chapman Flack
Hi,

It looks like for about 3 years, PL/Java has been calling
InitializeTimeouts before calling RegisterTimeout. Looking over
the callers of InitializeTimeouts in core, though, it appears
that an extension like PL/Java should be able to assume it has
already been called, and in fact would be rude to call it again,
as it isn't idempotent and could conceivably clobber somebody
else's timeouts.

As PL/Java only uses it for a timeout on process exit anyway,
perhaps this is a mistake that has just never had much chance
to cause a noticeable problem.

Am I right that it's a mistake, though?

Thanks,
-Chap




Re: [HACKERS] plan_rows confusion with parallel queries

2016-11-02 Thread Tom Lane
Tomas Vondra  writes:
> On 11/02/2016 11:56 PM, Tomas Vondra wrote:
>> On 11/02/2016 09:00 PM, Tom Lane wrote:
>>> Tomas Vondra  writes:
 while eye-balling some explain plans for parallel queries, I got a bit
 confused by the row count estimates. I wonder whether I'm alone.

>>> I got confused by that a minute ago, so no you're not alone.  The problem
>>> is even worse in join cases.  For example:
>>>  Gather  (cost=34332.00..53265.35 rows=100 width=8)
>>>Workers Planned: 2
>>>->  Hash Join  (cost=2.00..52255.35 rows=100 width=8)
>>>  Hash Cond: ((pp.f1 = cc.f1) AND (pp.f2 = cc.f2))
>>>  ->  Append  (cost=0.00..8614.96 rows=417996 width=8)
>>>    ->  Parallel Seq Scan on pp  (cost=0.00..8591.67 rows=416667 width=8)
>>>    ->  Parallel Seq Scan on pp1  (cost=0.00..23.29 rows=1329 width=8)
>>>  ->  Hash  (cost=14425.00..14425.00 rows=100 width=8)
>>>    ->  Seq Scan on cc  (cost=0.00..14425.00 rows=100 width=8)
>>> There are actually 100 rows in pp, and none in pp1.  I'm not bothered
>>> particularly by the nonzero estimate for pp1, because I know where that
>>> came from, but I'm not very happy that nowhere here does it look like
>>> it's estimating a million-plus rows going into the join.

> Although - it is estimating 1M rows; only "per worker" estimates are
> shown, and because there are 2 workers planned it shows 1M/2.4, which is
> the 416k. I agree it's a bit unclear, but at least it's consistent with
> how we treat loops (i.e. the numbers are per loop).

Well, it's not *that* consistent.  If we were estimating all the numbers
underneath the Gather as being per-worker numbers, that would make some
amount of sense.  But neither the other seqscan, nor the hash on it, nor
the hashjoin's output count are scaled that way.  It's very hard to call
the above display anything but flat-out broken.

> But there's more fun with joins - consider for example this simple join:
> ...
> Suddenly we don't show per-worker estimates for the hash join - both the 
> Hash Join and the Gather have exactly the same cardinality estimate.

Yeah.  That doesn't seem to be quite the same problem as in my example,
but it's about as confused.

Maybe we need to bite the bullet and add a "number of workers" field
to the estimated and actual counts.  Not sure how much that helps for
the partial-count-for-the-leader issue, though.

regards, tom lane




[HACKERS] Row level security implementation in Foreign Table in Postgres

2016-11-02 Thread Sounak Chakraborty
Row-level security in Postgres is implemented through policies: the
row security qualifier is attached as a subquery to the main query before
query planning, and RLS is enabled on a table through an ALTER TABLE statement.
My question is why this feature is not available for foreign tables
(ALTER FOREIGN TABLE doesn't have an option for enabling row-level security).
Is it not implemented due to some limitation in the current design?
From a quick look it seems the security subquery could just as easily be
attached to the main query and passed along for processing in the foreign
database.
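
For comparison, a minimal sketch of how this is set up on a regular table
today (object names here are only illustrative):

CREATE TABLE accounts (id int, owner text, balance numeric);

-- Enable row-level security and attach a policy; the policy's USING
-- qualifier is what gets added to queries as the security subquery.
ALTER TABLE accounts ENABLE ROW LEVEL SECURITY;

CREATE POLICY account_owner ON accounts
    USING (owner = current_user);

-- The analogous command is what is missing for foreign tables:
-- ALTER FOREIGN TABLE remote_accounts ENABLE ROW LEVEL SECURITY;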

Thanks
Sounak



Re: [HACKERS] plan_rows confusion with parallel queries

2016-11-02 Thread Tomas Vondra

On 11/02/2016 11:56 PM, Tomas Vondra wrote:

On 11/02/2016 09:00 PM, Tom Lane wrote:

Tomas Vondra  writes:

while eye-balling some explain plans for parallel queries, I got a bit
confused by the row count estimates. I wonder whether I'm alone.


I got confused by that a minute ago, so no you're not alone.  The problem
is even worse in join cases.  For example:

 Gather  (cost=34332.00..53265.35 rows=100 width=8)
   Workers Planned: 2
   ->  Hash Join  (cost=2.00..52255.35 rows=100 width=8)
 Hash Cond: ((pp.f1 = cc.f1) AND (pp.f2 = cc.f2))
 ->  Append  (cost=0.00..8614.96 rows=417996 width=8)
   ->  Parallel Seq Scan on pp  (cost=0.00..8591.67 rows=416667 width=8)
   ->  Parallel Seq Scan on pp1  (cost=0.00..23.29 rows=1329 width=8)
 ->  Hash  (cost=14425.00..14425.00 rows=100 width=8)
   ->  Seq Scan on cc  (cost=0.00..14425.00 rows=100 width=8)

There are actually 100 rows in pp, and none in pp1.  I'm not bothered
particularly by the nonzero estimate for pp1, because I know where that
came from, but I'm not very happy that nowhere here does it look like
it's estimating a million-plus rows going into the join.



Although - it is estimating 1M rows; only "per worker" estimates are
shown, and because there are 2 workers planned it shows 1M/2.4, which is
the 416k. I agree it's a bit unclear, but at least it's consistent with
how we treat loops (i.e. the numbers are per loop).


But there's more fun with joins - consider for example this simple join:

   QUERY PLAN
--
 Gather  (cost=19515.96..43404.82 rows=96957 width=12)
 (actual time=295.167..746.312 rows=9 loops=1)
   Workers Planned: 2
   Workers Launched: 2
   ->  Hash Join  (cost=18515.96..32709.12 rows=96957 width=12)
  (actual time=249.281..670.309 rows=3 loops=3)
 Hash Cond: (t2.a = t1.a)
 ->  Parallel Seq Scan on t2
 (cost=0.00..8591.67 rows=416667 width=8)
 (actual time=0.100..184.315 rows=33 loops=3)
 ->  Hash  (cost=16925.00..16925.00 rows=96957 width=8)
   (actual time=246.760..246.760 rows=9 loops=3)
   Buckets: 131072  Batches: 2  Memory Usage: 2976kB
   ->  Seq Scan on t1
   (cost=0.00..16925.00 rows=96957 width=8)
   (actual time=0.065..178.385 rows=9 loops=3)
 Filter: (b < 10)
 Rows Removed by Filter: 91
 Planning time: 0.763 ms
 Execution time: 793.653 ms
(13 rows)

Suddenly we don't show per-worker estimates for the hash join - both the 
Hash Join and the Gather have exactly the same cardinality estimate.


Now, let's try forcing Nested Loops and see what happens:

QUERY PLAN
-
 Gather  (cost=1000.42..50559.65 rows=96957 width=12)
 (actual time=0.610..203.694 rows=9 loops=1)
   Workers Planned: 2
   Workers Launched: 2
   ->  Nested Loop  (cost=0.42..39863.95 rows=96957 width=12)
(actual time=0.222..182.755 rows=3 loops=3)
 ->  Parallel Seq Scan on t1
(cost=0.00..9633.33 rows=40399 width=8)
(actual time=0.030..40.358 rows=3 loops=3)
   Filter: (b < 10)
   Rows Removed by Filter: 30
 ->  Index Scan using t2_a_idx on t2
  (cost=0.42..0.74 rows=1 width=8)
  (actual time=0.002..0.002 rows=1 loops=9)
   Index Cond: (a = t1.a)
 Planning time: 0.732 ms
 Execution time: 250.707 ms
(11 rows)

So, different join method but same result - 2 workers, loops=3. But 
let's try with small tables (100k rows instead of 1M rows):


  QUERY PLAN

 Gather  (cost=0.29..36357.94 rows=100118 width=12) (actual 
time=13.219..589.723 rows=10 loops=1)

   Workers Planned: 1
   Workers Launched: 1
   Single Copy: true
   ->  Nested Loop  (cost=0.29..36357.94 rows=100118 width=12)
(actual time=0.288..442.821 rows=10 loops=1)
 ->  Seq Scan on t1  (cost=0.00..1444.18 rows=100118 width=8)
  (actual time=0.148..49.308 rows=10 loops=1)
 ->  Index Scan using t2_a_idx on t2
  (cost=0.29..0.34 rows=1 width=8)
  (actual time=0.002..0.002 rows=1 loops=10)
   Index Cond: (a = t1.a)
 Planning time: 0.483 ms
 Execution time: 648.941 ms
(10 rows)

Suddenly, we get nworkers=1 with loops=1 (and not nworkers+1 as before). 
FWIW I've only seen this with force_parallel_mode=on, and the row counts 
are correct, so perhaps that's OK. single_copy seems a bit 

Re: [HACKERS] about missing xml related functionnalities

2016-11-02 Thread Craig Ringer
On 2 November 2016 at 21:12, Jean-Paul Jorda
 wrote:
> Hi,
> I am sending this email after reading the xml2 module « Deprecation notice » in
> https://www.postgresql.org/docs/9.6/static/xml2.html.
>
> I still use the xml2 module for one thing: xslt_process(). We store some
> data in xml fields, and we sometimes need to correct / upgrade the format.
> Being able to apply an XSLT transformation is a very efficient way to
> achieve that.
>
> So, as far as I'm concerned, XSLT transformation appears to be a missing
> functionality in the newer API.

It is, but so far nobody has stepped up to do the work required to
implement it in core.


-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




[HACKERS] Making table reloading easier

2016-11-02 Thread Craig Ringer
Hi all

A very common operation that users perform is reloading tables.
Sometimes as part of an ETL process. Sometimes as part of a dump and
reload. Sometimes loading data from external DBs, etc.

Right now users have to jump through a bunch of hoops to do this efficiently:

BEGIN;

TRUNCATE TABLE my_table;

SELECT pg_get_indexdef(indexrelid::regclass)
FROM pg_index WHERE indrelid = 'table_name'::regclass;

-- Drop 'em all
DROP INDEX ... ;

COPY my_table FROM 'file';

-- Re-create indexes
CREATE INDEX ...;

COMMIT;


This is pretty clunky.
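
(For completeness, a rough sketch of how the save-and-recreate step above is
typically scripted today; the table and temp-table names are illustrative:)

-- Save the index definitions before dropping them, so they can be
-- replayed after the load.
CREATE TEMP TABLE saved_indexes AS
SELECT pg_get_indexdef(indexrelid) AS ddl
FROM pg_index
WHERE indrelid = 'my_table'::regclass;

-- ... DROP INDEX each index, COPY the data, then re-run each saved
-- definition, e.g. in psql:
-- SELECT ddl FROM saved_indexes \gexec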

We already have support for disabling indexes; it's just not exposed
to the user. So the simplest option would seem to be to expose it with
something like:

  ALTER TABLE my_table
  DISABLE INDEX ALL;

which would take an ACCESS EXCLUSIVE lock then set:

  indisready = 'f'
  indislive = 'f'
  indisvalid = 'f'

on each index, or the named index if the user specifies one particular index.

After loading the table, a REINDEX on the table would rebuild and
re-enable the indexes.

That changes the process to:

BEGIN;
TRUNCATE TABLE my_table;
ALTER TABLE my_table DISABLE INDEX ALL;
COPY ...;
REINDEX TABLE my_table;
COMMIT;

It'd be even better to also add a REINDEX flag to COPY, where it
disables indexes and re-creates them after it finishes. But that could
be done separately.

Thoughts?

I'm not sure I can tackle this in the current dev cycle, but it looks
simple enough that I can't help wondering what obvious thing I'm
missing about why it hasn't been done yet.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] plan_rows confusion with parallel queries

2016-11-02 Thread Tomas Vondra

On 11/02/2016 09:00 PM, Tom Lane wrote:

Tomas Vondra  writes:

while eye-balling some explain plans for parallel queries, I got a bit
confused by the row count estimates. I wonder whether I'm alone.


I got confused by that a minute ago, so no you're not alone.  The problem
is even worse in join cases.  For example:

 Gather  (cost=34332.00..53265.35 rows=100 width=8)
   Workers Planned: 2
   ->  Hash Join  (cost=2.00..52255.35 rows=100 width=8)
 Hash Cond: ((pp.f1 = cc.f1) AND (pp.f2 = cc.f2))
 ->  Append  (cost=0.00..8614.96 rows=417996 width=8)
   ->  Parallel Seq Scan on pp  (cost=0.00..8591.67 rows=416667 width=8)
   ->  Parallel Seq Scan on pp1  (cost=0.00..23.29 rows=1329 width=8)
 ->  Hash  (cost=14425.00..14425.00 rows=100 width=8)
   ->  Seq Scan on cc  (cost=0.00..14425.00 rows=100 width=8)

There are actually 100 rows in pp, and none in pp1.  I'm not bothered
particularly by the nonzero estimate for pp1, because I know where that
came from, but I'm not very happy that nowhere here does it look like
it's estimating a million-plus rows going into the join.



Yeah. I wonder how tools visualizing explain plans are going to compute 
time spent in a given node (i.e. excluding the time spent in child 
nodes), or expected cost of that node.


So far we could do something like

self_time = total_time - child_node_time * nloops

and for example in this plan it's pretty clear we spend ~130ms in the Aggregate:

 QUERY PLAN

 Aggregate  (cost=17140.50..17140.51 rows=1 width=8)
   (actual time=306.675..306.675 rows=1 loops=1)
   ->  Seq Scan on tables  (cost=0.00..16347.60 rows=317160 width=0)
(actual time=0.188..170.993 rows=317160 loops=1)
 Planning time: 0.201 ms
 Execution time: 306.860 ms
(4 rows)

But in parallel plans it can easily happen that

child_node_time * nloops > total_time

Consider for example this parallel plan:

QUERY PLAN

 Finalize Aggregate  (cost=15455.19..15455.20 rows=1 width=8)
 (actual time=107.636..107.636 rows=1 loops=1)
   ->  Gather  (cost=15454.87..15455.18 rows=3 width=8)
   (actual time=107.579..107.629 rows=4 loops=1)
 Workers Planned: 3
 Workers Launched: 3
 ->  Partial Aggregate  (cost=14454.87..14454.88 rows=1 ...)
   (actual time=103.895..103.895 rows=1 loops=4)
   ->  Parallel Seq Scan on tables
   (cost=0.00..14199.10 rows=102310 width=0)
  (actual time=0.059..59.217 rows=79290 loops=4)
 Planning time: 0.052 ms
 Execution time: 109.250 ms
(8 rows)

Reading explains for parallel plans will always be complicated, but 
perhaps overloading the nloops like this makes it more complicated?


regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] delta relations in AFTER triggers

2016-11-02 Thread Adam Brusselback
>
> There may be some situations where crawling the indexes a row at a
> time will perform better than this by enough to want to retain that
> option.


If an index existed, wouldn't it still be able to use that in the set-based
implementation? Is there something which would inherently make doing the check
set-based ever worse than row-based?


Re: [HACKERS] WAL consistency check facility

2016-11-02 Thread Michael Paquier
On Wed, Nov 2, 2016 at 11:30 PM, Kuntal Ghosh
 wrote:
> On Wed, Nov 2, 2016 at 1:11 PM, Kuntal Ghosh  
> wrote:
>> Rest of the suggestions are well-taken. I'll update the patch accordingly.
> I've updated the last submitted patch (v10) with the following changes:
> - added a block-level flag BKPIMAGE_APPLY to distinguish backup image
> blocks which need to be restored during replay.
> - at present, hash index operations are not WAL-logged; hence, I've removed
> the consistency check option for hash indices. It can be added later.

Both make sense.

> - Another suggestion was to remove wal_consistency from PostgresNode.pm
> because small buildfarm machines may suffer from it. Although I've no
> experience in this matter, I would like to be certain that nothing breaks
> in recovery tests after some modifications.

An extra idea that I have here would be to extend the TAP tests to
accept an environment variable that would be used to specify extra
options when starting Postgres instances. Buildfarm machines could use
it.

+/*
+ * Remember that, if WAL consistency check is enabled for the current rmid,
+ * we always include backup image with the WAL record. But, during redo we
+ * restore the backup block only if needs_backup is set.
+ */
+if (needs_backup)
+    bimg.bimg_info |= BKPIMAGE_APPLY;
+
+
You should be careful about extra newlines and noise in the code.

-/* If it's a full-page image, restore it. */
-if (XLogRecHasBlockImage(record, block_id))
+/* If full-page image should be restored, do it. */
+if (XLogRecBlockImageApply(record, block_id))
Hm. It seems to me that this modification is incorrect. If the page
does not need to be applied, i.e. if it only needs to be used for
consistency checks, what should be done is something more like the
following in XLogReadBufferForRedoExtended:
if (Apply(record, block_id))
return;
if (HasImage)
{
//do what needs to be done with an image
}
else
{
//do default things
}

XLogRecBlockImageApply() should only check for BKP_APPLY and not imply
HasImage(). This will be more flexible when for example using it for
assertions.

With this patch the comments on top of XLogReadBufferForRedo are
wrong. A block is not unconditionally applied.

+#define XLogRecBlockImageApply(decoder, block_id) \
+(XLogRecHasBlockImage(decoder, block_id) && \
+(((decoder)->blocks[block_id].bimg_info & BKPIMAGE_APPLY) > 0))
Maybe != 0? That's the most common practice in the code.

It would be more consistent to have a boolean flag to treat
BKPIMAGE_APPLY in xlogreader.c. Have a look at has_image for example.
This will also reduce dependencies on the header xlog_record.h.

+/*
+ * For a speculative tuple, the content of t_ctid is conflicting
+ * between the backup page and current page. Hence, we set it
+ * to the current block number and current offset.
+ */
+if (HeapTupleHeaderIsSpeculative(page_htup))
+    ItemPointerSet(&page_htup->t_ctid, blkno, off);
In the set of masking functions this is the only portion of the code
depending on blkno. But isn't that actually a bug in speculative
inserts? Andres (added now in CC), Peter, could you provide some input
regarding that? The masking functions should not prevent the detection
of future errors, and this code is making me uneasy.
-- 
Michael




Re: [HACKERS] delta relations in AFTER triggers

2016-11-02 Thread Kevin Grittner
> 2016-11-02 15:57 GMT+01:00 Kevin Grittner :
>> On Sun, Oct 30, 2016 at 10:35 AM, Kevin Grittner  wrote:
>>
>>> SPI support would also
>>> allow us to consider using set logic for validating foreign keys,
>>> instead of the one-row-at-a-time approach currently used.
>>
>> Just as a proof of concept for this I used the attached test case
>> to create foreign keys using current techniques versus set-oriented
>> queries with the transition-tsr code.  These probably can be
>> improved, since this is a "first cut" off the top of my head.
>>
>> The delete of about one million rows from a "parent" table with no
>> matching rows in the "child" table, and no index on referencing
>> column in the child table, took 24:17.969 using current triggers
>> and 00:03.262 using the set-based triggers.  Yes, that reduces
>> current run time for that case by 99.78%

On Wed, Nov 2, 2016 at 11:07 AM, Pavel Stehule  wrote:
>
> this is great number

On Wed, Nov 2, 2016 at 11:41 AM, Adam Brusselback
 wrote:
>
> That is really incredible.  Gets rid of the need for an index on referencing
> columns for a ton of use cases.

Keep in mind that this is just a quick proof of concept.  Unless
all participating transactions are at serializable isolation level
something would need to be done to handle race conditions, and that
might affect performance.  I do think that this approach is likely
to be better in enough circumstances, even after that is covered,
that it will be worth pursuing -- either as an option when
declaring a foreign key, or as the only implementation.  Until we
have a version that covers the race conditions and can benchmark it in
a variety of workloads, it is hard to feel sure about the latter.
There may be some situations where crawling the indexes a row at a
time will perform better than this by enough to want to retain that
option.

A big plus of a single set-oriented statement is that it doesn't
suck unlimited RAM -- it will use work_mem to limit each tuplestore
and each query node, spilling to disk if needed.  The current FK
implementation sometimes runs for a very long time and can run
people out of memory.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] plan_rows confusion with parallel queries

2016-11-02 Thread Tom Lane
Tomas Vondra  writes:
> while eye-balling some explain plans for parallel queries, I got a bit 
> confused by the row count estimates. I wonder whether I'm alone.

I got confused by that a minute ago, so no you're not alone.  The problem
is even worse in join cases.  For example:

 Gather  (cost=34332.00..53265.35 rows=100 width=8)
   Workers Planned: 2
   ->  Hash Join  (cost=2.00..52255.35 rows=100 width=8)
 Hash Cond: ((pp.f1 = cc.f1) AND (pp.f2 = cc.f2))
 ->  Append  (cost=0.00..8614.96 rows=417996 width=8)
   ->  Parallel Seq Scan on pp  (cost=0.00..8591.67 rows=416667 width=8)
   ->  Parallel Seq Scan on pp1  (cost=0.00..23.29 rows=1329 width=8)
 ->  Hash  (cost=14425.00..14425.00 rows=100 width=8)
   ->  Seq Scan on cc  (cost=0.00..14425.00 rows=100 width=8)

There are actually 100 rows in pp, and none in pp1.  I'm not bothered
particularly by the nonzero estimate for pp1, because I know where that
came from, but I'm not very happy that nowhere here does it look like
it's estimating a million-plus rows going into the join.

regards, tom lane




[HACKERS] plan_rows confusion with parallel queries

2016-11-02 Thread Tomas Vondra

Hi,

while eye-balling some explain plans for parallel queries, I got a bit 
confused by the row count estimates. I wonder whether I'm alone.


Consider for example a simple seq scan query, which in non-parallel 
explain looks like this:


  QUERY PLAN
-
 Seq Scan on tables t  (cost=0.00..16347.60 rows=317160 width=356)
   (actual rows=317160 loops=1)
 Planning time: 0.173 ms
 Execution time: 47.707 ms
(3 rows)

but a parallel plan looks like this:

 QUERY PLAN
-
 Gather  (cost=0.00..14199.10 rows=317160 width=356)
 (actual rows=317160 loops=1)
   Workers Planned: 3
   Workers Launched: 3
   ->  Parallel Seq Scan on tables t  (cost=... rows=102310 width=356)
  (actual rows=79290 loops=4)
 Planning time: 0.209 ms
 Execution time: 150.812 ms
(6 rows)


Now, for actual rows we can simply do 79290 * 4 = 317160, and we get the 
correct number of rows produced by the plan (i.e. matching the 
non-parallel query).


But for the estimate, it doesn't work like that:

102310 * 4 = 409240

which is ~30% above the actual estimate. It's clear why this is 
happening - when computing plan_rows, we don't count the leader as a 
full worker, but use this:


leader_contribution = 1.0 - (0.3 * path->parallel_workers);

so with 3 workers, the leader is only worth ~0.1 of a worker:

102310 * 3.1 = 317161

It's fairly easy to spot this here, because the Gather node is right
above the Parallel Seq Scan, and the values in the Gather are accurate. But
in many plans the Gather will not be immediately above the node (e.g.
there may be a parallel aggregate in between).


Of course, the fact that we use the planned number of workers when computing
plan_rows but the actual number of workers for the actually produced rows
makes this even more confusing.


BTW, is it really a good idea to use nloops to track the number of
workers executing a given node? How will that work once we get
parallel nested loops and index scans?


regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Patch: Implement failover on libpq connect level.

2016-11-02 Thread Mithun Cy
On Tue, Nov 1, 2016 at 9:42 PM, Robert Haas  wrote:
> Ah, nuts.  Thanks, good catch.  Should be fixed in the attached version.

I repeated the test on the new patch; it works fine now. I also did some more
negative tests, forcibly failing some internal calls. All tests have passed.
This patch works as described and looks good to me.

-- 
Thanks and Regards
Mithun C Y
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] pageinspect: Hash index support

2016-11-02 Thread Tom Lane
Peter Eisentraut  writes:
> So, we're down to crashes in gin_metapage_info() on ia64 and sparc64.
> My guess is that the raw page data that is passed into the function
> needs to be 8-byte aligned before being accessed as GinMetaPageData.

That's what it looks like to me, too.  The "bytea" page image is
guaranteed to be improperly aligned for 64-bit access, since it will have
an int32 length word before the actual page data, breaking the alignment
that would exist for a page sitting in a page buffer.  This is likely to
be a problem for more things than just gin_metapage_info(); sooner or
later it could affect just about everything in pageinspect.

> (Maybe GinPageGetMeta() should do it?)

I think the right thing is likely to be to copy the presented bytea
into a palloc'd (and therefore properly aligned) buffer.  And not
just in this one function.

regards, tom lane




Re: [HACKERS] pageinspect: Hash index support

2016-11-02 Thread Peter Eisentraut
On 11/1/16 3:28 PM, Peter Eisentraut wrote:
> I have also committed the tests
> that I proposed and will work through the failures.

So, we're down to crashes in gin_metapage_info() on ia64 and sparc64.
My guess is that the raw page data that is passed into the function
needs to be 8-byte aligned before being accessed as GinMetaPageData.
(Maybe GinPageGetMeta() should do it?)  Does anyone have a box like this
handy to experiment?  Of course an actual core dump could tell more.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Speed up Clog Access by increasing CLOG buffers

2016-11-02 Thread Tomas Vondra

On 11/02/2016 05:52 PM, Amit Kapila wrote:

On Wed, Nov 2, 2016 at 9:01 AM, Tomas Vondra
 wrote:

On 11/01/2016 08:13 PM, Robert Haas wrote:


On Mon, Oct 31, 2016 at 5:48 PM, Tomas Vondra
 wrote:




The one remaining thing is the strange zig-zag behavior, but that might
easily be due to scheduling in the kernel, or something else. I don't consider
it a blocker for any of the patches, though.



The only reason I could think of for that zig-zag behaviour is
frequent multiple clog page accesses and it could be due to below
reasons:

a. A transaction and its subtransactions (IIRC, Dilip's case has one
main transaction and two subtransactions) can't fit into the same page,
in which case the group_update optimization won't apply and I don't
think we can do anything for it.
b. In the same group, multiple clog pages are being accessed.  It is
not a likely scenario, but it can happen and we might be able to
improve a bit if that is happening.
c. Transactions running at the same time try to update different clog
pages.  I think, as mentioned upthread, we can handle this by using
slots and allowing multiple groups to work together instead of a
single group.

To check if there is any impact due to (a) or (b), I have added a few
log messages in the code (patch group_update_clog_v9_log). The log message
could be "all xacts are not on same page" or "Group contains
different pages".

Patch group_update_clog_v9_slots tries to address (c). So if there
is any problem due to (c), this patch should improve the situation.

Can you please try to run the test where you saw the zig-zag behaviour
with both patches separately? If there is anything due to Postgres,
then you will either see one of the new log messages or performance
will improve; OTOH, if we see the same behaviour, then I think we can
probably assume it is due to scheduler activity and move on. Also, one
point to note here is that even when the performance is down in that
curve, it is equal to or better than HEAD.



Will do.

Based on the results with more client counts (increment by 6 clients 
instead of 36), I think this really looks like something unrelated to 
any of the patches - kernel, CPU, or something already present in 
current master.


The attached results show that:

(a) master shows the same zig-zag behavior - No idea why this wasn't 
observed on the previous runs.


(b) group_update actually seems to improve the situation, because the 
performance keeps stable up to 72 clients, while on master the 
fluctuation starts way earlier.


I'll redo the tests with a newer kernel - this was on 3.10.x, which is
what Red Hat 7.2 uses; I'll try 4.8.6. Then I'll try with the patches
you submitted, if the 4.8.6 kernel does not help.


Overall, I'm convinced this issue is unrelated to the patches.

regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


zig-zag.ods
Description: application/vnd.oasis.opendocument.spreadsheet



Re: [HACKERS] Speed up Clog Access by increasing CLOG buffers

2016-11-02 Thread Amit Kapila
On Wed, Nov 2, 2016 at 9:01 AM, Tomas Vondra
 wrote:
> On 11/01/2016 08:13 PM, Robert Haas wrote:
>>
>> On Mon, Oct 31, 2016 at 5:48 PM, Tomas Vondra
>>  wrote:
>>>
>
> The one remaining thing is the strange zig-zag behavior, but that might
> easily be due to scheduling in the kernel, or something else. I don't consider
> it a blocker for any of the patches, though.
>

The only reason I could think of for that zig-zag behaviour is
frequent multiple clog page accesses and it could be due to below
reasons:

a. A transaction and its subtransactions (IIRC, Dilip's case has one
main transaction and two subtransactions) can't fit into the same page,
in which case the group_update optimization won't apply and I don't
think we can do anything for it.
b. In the same group, multiple clog pages are being accessed.  It is
not a likely scenario, but it can happen and we might be able to
improve a bit if that is happening.
c. Transactions running at the same time try to update different clog
pages.  I think, as mentioned upthread, we can handle this by using
slots and allowing multiple groups to work together instead of a
single group.

To check if there is any impact due to (a) or (b), I have added a few
log messages in the code (patch group_update_clog_v9_log).  The log message
could be "all xacts are not on same page" or "Group contains
different pages".

Patch group_update_clog_v9_slots tries to address (c). So if there is
any problem due to (c), this patch should improve the situation.

Can you please try to run the test where you saw the zig-zag behaviour
with both patches separately?  If there is anything due to Postgres,
then you will either see one of the new log messages or performance
will improve; OTOH, if we see the same behaviour, then I think we can
probably assume it is due to scheduler activity and move on.
Also, one point to note here is that even when the performance is down
in that curve, it is equal to or better than HEAD.


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


group_update_clog_v9_log.patch
Description: Binary data


group_update_clog_v9_slots.patch
Description: Binary data



Re: [HACKERS] auto_explain vs. parallel query

2016-11-02 Thread Tomas Vondra

On 11/01/2016 08:32 PM, Robert Haas wrote:

On Tue, Nov 1, 2016 at 10:58 AM, Tomas Vondra
 wrote:

Damn! You're right of course. Who'd guess I need more coffee this early?

Attached is a fix replacing the flag with an array of flags, indexed by
ParallelMasterBackendId. Hopefully that makes it work with multiple
concurrent parallel queries ... still, I'm not sure this is the right
solution.


I feel like it isn't.  I feel like this ought to go in the DSM for
that parallel query, not the main shared memory segment, but I'm not
sure how to accomplish that offhand.  Also, if we do solve it this
way, surely we don't need the locking.  The flag's only set before any
workers have started and never changes thereafter.



I'm not sure what you mean by "DSM for that parallel query" - I thought 
the segments are created for Gather nodes, no? Or is there a DSM for the 
whole query that we could use?


Another thing is that maybe we don't really want to give extensions 
access to any of those segments - my impression was those segments are 
considered internal (is there RequestAddinShmemSpace for them?). And 
hacking something just for auto_explain seems a bit ugly.


regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] pageinspect: Hash index support

2016-11-02 Thread Jesper Pedersen

On 11/01/2016 03:28 PM, Peter Eisentraut wrote:

Ok, thanks for your feedback.

Maybe "Returned with Feedback" is more appropriate, as there is still
development left.


I have applied the documentation change that introduced subsections,
which seems quite useful independently.  I have also committed the tests
that I proposed and will work through the failures.

As no new patch has been posted for the 2016-11 CF, I will close the
patch entry now.  Please submit an updated patch when you have the time,
keeping an eye on ongoing work to update hash indexes.



Agreed.

Best regards,
 Jesper





Re: [HACKERS] delta relations in AFTER triggers

2016-11-02 Thread Adam Brusselback
>
> The delete of about one million rows from a "parent" table with no
> matching rows in the "child" table, and no index on referencing
> column in the child table, took 24:17.969 using current triggers
> and 00:03.262 using the set-based triggers.  Yes, that reduces
> current run time for that case by 99.78%


That is really incredible.  Gets rid of the need for an index on
referencing columns for a ton of use cases.


[HACKERS] Substantial bloat in postgres_fdw regression test runtime

2016-11-02 Thread Tom Lane
In 9.6, "make installcheck" in contrib/postgres_fdw takes a shade
under 3 seconds on my machine.  In HEAD, it's taking 10 seconds.
I am not happy, especially not since there's no parallelization
of the contrib regression tests.  That's a direct consumption of
my time and all other developers' time too.  This evidently came
in with commit 7012b132d (Push down aggregates to remote servers),
and while that's a laudable feature, I cannot help thinking that
it does not deserve this much of an imposition on every make check
that's ever done for the rest of eternity.

regards, tom lane




Re: [HACKERS] split up psql \d Modifiers column

2016-11-02 Thread Kuntal Ghosh
On Fri, Oct 28, 2016 at 9:00 AM, Peter Eisentraut
 wrote:
> I propose to change the psql \d output a bit, best shown with an example:
>
>  \d persons3
> -   Table "public.persons3"
> - Column |  Type   |Modifiers
> -+-+--
> - id | integer | not null
> - name   | text| default ''::text
> +  Table "public.persons3"
> + Column |  Type   | Collation | Nullable | Default
> ++-+---+--+--
> + id | integer |   | not null |
> + name   | text|   |  | ''::text
>
+1.
psql-ref.sgml(line 4085) needs to be modified. Otherwise, the patch
looks good to me.

-- 
Thanks & Regards,
Kuntal Ghosh
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] Logical Replication WIP

2016-11-02 Thread Peter Eisentraut
On 10/24/16 9:22 AM, Petr Jelinek wrote:
> I added one more prerequisite patch (the first one) which adds ephemeral
> slots (or, well, implements a UI on top of the code that was mostly already
> there). The ephemeral slots are different in that they go away either on
> error or when the session is closed. This means the initial data sync does
> not have to worry about cleaning up the slots after itself. I think this
> will be useful in other places as well (for example basebackup). I
> originally wanted to call them temporary slots in the UI but since the
> behavior is a bit different from temp tables I decided to go with what the
> underlying code calls them in the UI as well.

I think it makes sense to expose this.

Some of the comments need some polishing.

Eventually, we might want to convert the option list in
CREATE_REPLICATION_SLOT into a list instead of adding more and more
keywords (see also VACUUM), but not necessarily now.

I find the way Acquire and Release are handled now quite confusing.
Because Release of an ephemeral slot means to delete it, you have
changed most code to never release them until the end of the session.
So there is a lot of ugly and confusing code that needs to know this
difference.  I think we need to use some different verbs for different
purposes here.  Acquire and release should keep their meaning of "I'm
using this", and the calls in proc.c and postgres.c should be something
like ReplicationSlotCleanup().

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] delta relations in AFTER triggers

2016-11-02 Thread Pavel Stehule
2016-11-02 15:57 GMT+01:00 Kevin Grittner :

> On Sun, Oct 30, 2016 at 10:35 AM, Kevin Grittner 
> wrote:
>
> > SPI support would also
> > allow us to consider using set logic for validating foreign keys,
> > instead of the one-row-at-a-time approach currently used.
>
> Just as a proof of concept for this I used the attached test case
> to create foreign keys using current techniques versus set-oriented
> queries with the transition-tsr code.  These probably can be
> improved, since this is a "first cut" off the top of my head.
>
> The delete of about one million rows from a "parent" table with no
> matching rows in the "child" table, and no index on referencing
> column in the child table, took 24:17.969 using current triggers
> and 00:03.262 using the set-based triggers.  Yes, that reduces
> current run time for that case by 99.78%
>

this is a great number

Pavel


>
> --
> Kevin Grittner
> EDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>
>
>
>


Re: [HACKERS] emergency outage requiring database restart

2016-11-02 Thread Oskari Saarenmaa

On 26.10.2016 at 21:34, Andres Freund wrote:

Any chance that plsh or the script it executes does anything with the file
descriptors it inherits? That'd certainly be one way to get into odd corruption
issues.

We really should use O_CLOEXEC for the majority of our file handles.


Attached a patch to always use O_CLOEXEC in BasicOpenFile if we're not 
using EXEC_BACKEND.  It'd be nice to not expose all fds to most 
pl-languages either, but I guess there's no easy solution to that 
without forcibly closing all fds whenever any functions are called.


/ Oskari
>From 50d7410b895a1aae26c7001f11bd0d71a200ef96 Mon Sep 17 00:00:00 2001
From: Oskari Saarenmaa 
Date: Wed, 2 Nov 2016 16:42:36 +0200
Subject: [PATCH] BasicOpenFile: always use O_CLOEXEC if it is available

---
 src/backend/storage/file/fd.c | 14 +-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git src/backend/storage/file/fd.c src/backend/storage/file/fd.c
index b7ff5ef..6cbe378 100644
--- src/backend/storage/file/fd.c
+++ src/backend/storage/file/fd.c
@@ -894,7 +894,19 @@ BasicOpenFile(FileName fileName, int fileFlags, int fileMode)
 	int			fd;
 
 tryAgain:
-	fd = open(fileName, fileFlags, fileMode);
+	fd = open(fileName, fileFlags, fileMode
+#if defined(O_CLOEXEC) && !defined(EXEC_BACKEND)
+	/*
+	 * We don't want exec'd processes to inherit our file handles unless
+	 * EXEC_BACKEND is used.  We don't expect execve() calls inside the
+	 * postgres code, but extensions and pl-languages may spawn new
+	 * processes that either don't work due to having no usable file
+	 * descriptors or write garbage in the files previously opened by
+	 * us.
+	 */
+	| O_CLOEXEC
+#endif
+		);
 
 	if (fd >= 0)
 		return fd;/* success! */
-- 
2.5.5




Re: [HACKERS] delta relations in AFTER triggers

2016-11-02 Thread Kevin Grittner
Attached is a minor fix to go on top of transition-tsr for issues
found yesterday in testing.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c
index 53bfd4b..2da841e 100644
--- a/src/backend/optimizer/path/costsize.c
+++ b/src/backend/optimizer/path/costsize.c
@@ -4512,7 +4512,7 @@ set_cte_size_estimates(PlannerInfo *root, RelOptInfo *rel, double cte_rows)
 void
 set_tuplestore_size_estimates(PlannerInfo *root, RelOptInfo *rel)
 {
-	RangeTblEntry *rte;
+	RangeTblEntry *rte PG_USED_FOR_ASSERTS_ONLY;
 
 	/* Should only be applied to base relations that are tuplestore references */
 	Assert(rel->relid > 0);
diff --git a/src/backend/parser/parse_relation.c b/src/backend/parser/parse_relation.c
index 7c943c3..0b45139 100644
--- a/src/backend/parser/parse_relation.c
+++ b/src/backend/parser/parse_relation.c
@@ -1160,12 +1160,17 @@ parserOpenTable(ParseState *pstate, const RangeVar *relation, int lockmode)
 	else
 	{
 		/*
+		 * An unqualified name might be a tuplestore relation name.
+		 */
+		if (get_visible_tuplestore_metadata(pstate->p_tsrcache, relation->relname))
+			rel = NULL;
+		/*
 		 * An unqualified name might have been meant as a reference to
 		 * some not-yet-in-scope CTE.  The bare "does not exist" message
 		 * has proven remarkably unhelpful for figuring out such problems,
 		 * so we take pains to offer a specific hint.
 		 */
-		if (isFutureCTE(pstate, relation->relname))
+		else if (isFutureCTE(pstate, relation->relname))
 			ereport(ERROR,
 					(errcode(ERRCODE_UNDEFINED_TABLE),
 					 errmsg("relation \"%s\" does not exist",
diff --git a/src/include/parser/parse_node.h b/src/include/parser/parse_node.h
index a9f5d6e..57906c0 100644
--- a/src/include/parser/parse_node.h
+++ b/src/include/parser/parse_node.h
@@ -158,7 +158,7 @@ struct ParseState
 	bool		p_locked_from_parent;
 	Relation	p_target_relation;
 	RangeTblEntry *p_target_rangetblentry;
-	Tsrcache	*p_tsrcache;	/* visible named tuplestore relations */
+	Tsrcache   *p_tsrcache; /* visible named tuplestore relations */
 
 	/*
 	 * Optional hook functions for parser callbacks.  These are null unless



[HACKERS] about missing xml related functionnalities

2016-11-02 Thread Jean-Paul Jorda
Hi,
I am sending this email after reading the xml2 module « Deprecation notice » in
https://www.postgresql.org/docs/9.6/static/xml2.html.

I still use the xml2 module for one thing: xslt_process(). We store some
data in xml fields, and we sometimes need to correct / upgrade the format.
Being able to apply an XSLT transformation is a very efficient way to
achieve that.

So, as far as I'm concerned, XSLT transformation appears to be a missing
functionality in the newer API.
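
(For illustration, the kind of in-place format upgrade meant here, assuming a
documents table with an xml column and stylesheets stored in another table;
all names are made up:)

UPDATE documents d
   SET body = xslt_process(d.body::text, s.stylesheet)::xml
  FROM xsl_stylesheets s
 WHERE s.name = 'upgrade-to-v2';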

Best regards,

Jean-Paul


-- 
Jean-Paul JORDA
Centre de ressources technologiques
Coordinateur de l'équipe des développeurs
EDP Sciences
BP 112
17, avenue du Hoggar
P.A. de Courtaboeuf
F-91944 Les Ulis Cedex A
tel: 33 (0)1 69 18 17 64




Re: [HACKERS] delta relations in AFTER triggers

2016-11-02 Thread Kevin Grittner
On Sun, Oct 30, 2016 at 10:35 AM, Kevin Grittner  wrote:

> SPI support would also
> allow us to consider using set logic for validating foreign keys,
> instead of the one-row-at-a-time approach currently used.

Just as a proof of concept for this I used the attached test case
to create foreign keys using current techniques versus set-oriented
queries with the transition-tsr code.  These probably can be
improved, since this is a "first cut" off the top of my head.

The delete of about one million rows from a "parent" table with no
matching rows in the "child" table, and no index on referencing
column in the child table, took 24:17.969 using current triggers
and 00:03.262 using the set-based triggers.  Yes, that reduces
current run time for that case by 99.78%.
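
(To make the idea concrete, here is a rough sketch of the kind of set-based
ON DELETE check involved, written against the transition-table syntax the
transition-tsr patch provides; the table and trigger names are illustrative,
and the attached ri-set-logic.sql contains the actual test case:)

CREATE FUNCTION parent_delete_check() RETURNS trigger AS $$
BEGIN
    -- One set-oriented join over the transition relation replaces a
    -- per-row index probe for every deleted parent row.
    IF EXISTS (SELECT 1
               FROM deleted_parents d
               JOIN child c ON c.parent_id = d.id) THEN
        RAISE EXCEPTION 'rows in "child" still reference deleted parent keys';
    END IF;
    RETURN NULL;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER parent_ri_del
    AFTER DELETE ON parent
    REFERENCING OLD TABLE AS deleted_parents
    FOR EACH STATEMENT
    EXECUTE PROCEDURE parent_delete_check();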

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


ri-set-logic.sql
Description: application/sql



Re: [HACKERS] WAL consistency check facility

2016-11-02 Thread Kuntal Ghosh
On Wed, Nov 2, 2016 at 1:11 PM, Kuntal Ghosh  wrote:
> Rest of the suggestions are well-taken. I'll update the patch accordingly.
I've updated the last submitted patch (v10) with the following changes:
- added a block-level flag BKPIMAGE_APPLY to distinguish backup image
blocks which need to be restored during replay.
- at present, hash index operations are not WAL-logged; hence, I've removed
the consistency check option for hash indices. It can be added later.

A few comments:
- Michael suggested using an integer variable and bitwise-shift
operations to store the RMGR values instead of a boolean array, but the
boolean-array implementation looks cleaner to me. For example:
+if (wal_consistency[rmid])
+   rechdr->xl_info |= XLR_CHECK_CONSISTENCY;

+include_image = needs_backup || wal_consistency[rmid];

- Another suggestion was to remove wal_consistency from PostgresNode.pm
because small buildfarm machines may suffer from it. Although I've no
experience in this matter, I would like to be certain that nothing breaks
in recovery tests after some modifications.

--
Thanks & Regards,
Kuntal Ghosh
EnterpriseDB: http://www.enterprisedb.com


walconsistency_v11.patch
Description: application/download



Re: [HACKERS] Patch to implement pg_current_logfile() function

2016-11-02 Thread Karl O. Pinc
On Wed, 2 Nov 2016 07:55:45 -0500
"Karl O. Pinc"  wrote:

> On Wed, 2 Nov 2016 10:07:34 +0100
> Gilles Darold  wrote:
> 
> > Please have a look at line  1137 on HEAD of syslogger.c 

> Ok.  Thanks.  Sorry for the confusion.

And yes, we did talk about this before.  I should have remembered.

Karl 
Free Software:  "You don't pay back, you pay forward."
 -- Robert A. Heinlein




Re: [HACKERS] make coverage-html on OS X

2016-11-02 Thread Peter Eisentraut
On 10/29/16 4:31 PM, Jim Nasby wrote:
> tl;dr: It's critical that you actually do a make install, or at least it 
> is if you've set --prefix with configure. If you don't, then even if you 
> do make check you're going to get the *installed* libpq, and not the 
> *built* libpq.

I was not able to reproduce this.  I suspect that this would happen if
you don't disable System Integrity Protection, because if SIP is on,
then the environment variable DYLD_LIBRARY_PATH is disabled, and a
regression test run would be prone to use a mix of source tree and
install tree components.  Often, this is not a problem, but if your
source tree is built with different options than the previous install,
then it could make a mess.  Building with coverage instrumentation is
obviously one case where the build options are quite different.

> Also, looks like there's a race between the .info and .c.gcov targets 
> with parallel make; I'm wondering if there's a way to fix that by not 
> allowing parallelism in each directory...? (Presumably the issue is the 
> rm *.gcov). It'd be nice to fix this because -j speeds up coverage-html 
> a lot, even with just 2 cores.

I was not able to reproduce this.  The make rules look reasonable to me
for parallelism.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Proposal: scan key push down to heap [WIP]

2016-11-02 Thread Dilip Kumar
On Sat, Oct 29, 2016 at 12:17 PM, Dilip Kumar  wrote:
> What about putting the slot reference inside HeapScanDesc? I know it
> will make the heap layer use an executor structure, but just a thought.
>
> I have quickly hacked it up this way, where we keep a slot reference in
> HeapScanDesc and directly use slot_getattr inside HeapKeyTest (only if we
> have a valid slot, otherwise use heap_getattr), and measured the worst-case
> performance (what you have mentioned above).
>
> My test: a 21-column table with a varchar at the beginning + quals on the
> last few columns + varying selectivity.
>
> postgres=# \d test
>   Table "public.test"
>  Column |   Type| Modifiers
> +---+---
>  f1 | integer   |
>  f2 | character varying |
>  f3 | integer   |
>  f4 | integer   |
>  f5 | integer   |
>  f6 | integer   |
>  f7 | integer   |
>  f8 | integer   |
>  f9 | integer   |
>  f10| integer   |
>  f11| integer   |
>  f12| integer   |
>  f13| integer   |
>  f14| integer   |
>  f15| integer   |
>  f16| integer   |
>  f17| integer   |
>  f18| integer   |
>  f19| integer   |
>  f20| integer   |
>  f21| integer   |
>
> tuple count : 1000 (10 Million)
> explain analyze select * from test where f21< $1 and f20 < $1 and f19
> < $1 and f15 < $1 and f10 < $1; ($1 vary from 1Million to 1Million).
>
> Target code base:
> ---
> 1. Head
> 2. Heap_scankey_pushdown_v1
> 3. My hack for keeping slot reference in HeapScanDesc
> (v1+use_slot_in_HeapKeyTest)
>
> Result:
> Selectivity Head   scan_key_pushdown_v1 v1+use_slot_in_HeapKeyTest
> 0.1 3880  2980 2747
> 0.2 4041  3187 2914
> 0.5 5051  4921 3626
> 0.8 5378  7296 3879
> 1.0 6161  8525 4575
>
> The performance graph is attached to the mail.
>
> Observation:
> 
> 1. Heap_scankey_pushdown_v1 starts degrading at very high
> selectivity (this behaviour is only visible if the table has 20 or more
> columns; I tested with 10 columns and with that I did not see any
> regression in v1).
>
> 2. (v1+use_slot_in_HeapKeyTest) is always the winner, even at very high
> selectivity.

The patch for this (storing the slot reference in HeapScanDesc) is attached.
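
To make the shape of the hack a bit more concrete, the key test roughly
becomes something like this (sketch only -- rs_slot is the hypothetical slot
reference added to HeapScanDesc, and the real HeapKeyTest is a macro with
more cases):

static bool
heap_key_test_one(HeapScanDesc scan, HeapTuple tuple, ScanKey key)
{
    Datum       datum;
    bool        isnull;

    if (scan->rs_slot != NULL)          /* hypothetical field */
        datum = slot_getattr(scan->rs_slot, key->sk_attno, &isnull);
    else
        datum = heap_getattr(tuple, key->sk_attno,
                             RelationGetDescr(scan->rs_rd), &isnull);

    if (isnull)
        return false;                   /* NULL never satisfies the key */

    return DatumGetBool(FunctionCall2Coll(&key->sk_func, key->sk_collation,
                                          datum, key->sk_argument));
}

The gain comes from slot_getattr() deforming the tuple incrementally and
caching already-deformed columns in the slot, instead of walking the tuple
from the start for every key.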

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com


heap_scankey_pushdown_v2.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch to implement pg_current_logfile() function

2016-11-02 Thread Karl O. Pinc
On Wed, 2 Nov 2016 10:07:34 +0100
Gilles Darold  wrote:

> Please have a look at line 1137 on HEAD of syslogger.c; you will see
> that, in case of failure, logfile_open() reports a FATAL or LOG
> error with the message:
> 
> errmsg("could not open log file \"%s\": %m", filename);

Ok.  Thanks.  Sorry for the confusion.

Karl 
Free Software:  "You don't pay back, you pay forward."
 -- Robert A. Heinlein


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Transactions involving multiple postgres foreign servers

2016-11-02 Thread Ashutosh Bapat
On Mon, Oct 31, 2016 at 6:17 AM, Masahiko Sawada  wrote:
> On Fri, Oct 28, 2016 at 3:19 AM, Robert Haas  wrote:
>> On Wed, Oct 26, 2016 at 2:00 AM, Masahiko Sawada  
>> wrote:
>>> I think we can consider the atomic commit and the atomic visibility
>>> separately, and the atomic visibility can build on the top of the
>>> atomic commit.
>>
>> It is true that we can do that, but I'm not sure whether it's the best 
>> design.
>
> I'm not sure best design, too. We need to discuss more. But this is
> not a particular feature for the sharing solution. The atomic commit
> using 2PC is useful for other servers that can use 2PC, not only
> postgres_fdw.
>

I think we need to discuss the big picture, i.e. the architecture of a
distributed transaction manager for PostgreSQL, divide it into smaller
problems, and then solve each of them as a series of commits, possibly
producing a useful feature with each commit. I think what Robert is
pointing out is that if we spend time solving smaller problems, we might
end up with something which cannot be used to solve the bigger
problem. Instead, if we define the bigger problem and come up with
clear subproblems that, when solved, would solve the bigger problem, we
may not end up in such a situation.

There are many distributed transaction models discussed in various
papers like [1], [2], [3]. We need to assess which one(s) would suit the
PostgreSQL FDW infrastructure, and maybe specifically
postgres_fdw. There is some discussion at [4]. It lists a few
approaches, but I could not find a discussion of the pros and cons of each
of them, or a conclusion as to which of the approaches suits
PostgreSQL. Maybe we want to start that discussion.

I know that it's hard to come up with a single model that would suit
FDWs or would serve all kinds of applications. We may not be able to
support a full distributed transaction manager for every FDW out
there. It's possible that, because of the lack of a big picture, we will
not see anything happen in this area for another release. Given that,
and since all of the models in those papers require 2PC as a basic
building block, I was of the opinion that we could at least start with
a 2PC implementation. But I think the request for a bigger picture is also
valid, for the reasons stated above.

> Attached latest 3 patches that incorporated review comments so far.
> But recovery speed improvement that is discussed on another thread is
> not incorporated yet.
> Please give me feedback.
>


[1] http://link.springer.com/article/10.1007/s00778-014-0359-9
[2] 
https://domino.mpi-inf.mpg.de/intranet/ag5/ag5publ.nsf/1c0a12a383dd2cd8c125613300585c64/7684dd8109a5b3d5c1256de40051686f/$FILE/tdd99.pdf
[3] http://docs.lib.purdue.edu/cgi/viewcontent.cgi?article=1713=cstech
[4] https://wiki.postgresql.org/wiki/DTM

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Declarative partitioning - another take

2016-11-02 Thread Amit Langote
On 2016/11/02 2:44, Robert Haas wrote:
> On Fri, Oct 28, 2016 at 3:53 AM, Amit Langote
>  wrote:
>>> Insisting that you can't drop a child without detaching it first seems
>>> wrong to me.  If I already made this comment and you responded to it,
>>> please point me back to whatever you said.  However, my feeling is
>>> this is flat wrong and absolutely must be changed.
>>
>> I said the following [1]:
>>
>> | Hmm, I don't think I like this.  Why should it be necessary to detach
>> | a partition before dropping it?  That seems like an unnecessary step.
>>
>> I thought we had better lock the parent table when removing one of its
>> partitions and it seemed a bit odd to lock the parent table when dropping
>> a partition using DROP TABLE?  OTOH, with ALTER TABLE parent DETACH
>> PARTITION, the parent table is locked anyway.
> 
> That "OTOH" part seems like a pretty relevant point.
> 
> Basically, I think people expect to be able to say "DROP THINGTYPE
> thingname" or at most "DROP THINGTYPE thingname CASCADE" and have that
> thing go away.  I'm opposed to anything which requires some other
> series of steps without a very good reason, and possible surprise
> about the precise locks that the command requires isn't a good enough
> reason from my point of view.

OK, I changed things so that DROP TABLE a_partition no longer complains
that the partition must be detached first.  Much like how index_drop() locks the
parent table ('parent' in a different sense, of course) and later
invalidates its relcache, heap_drop_with_catalog(), if the passed-in relid
is a partition, locks the parent table using AccessExclusiveLock, performs
its usual business, and finally invalidates the parent's relcache before
closing it without relinquishing the lock.  Does that sound sane?  One
downside is that if the specified command is DROP TABLE parent CASCADE,
the above-described invalidation is a waste of cycles, because the parent
will be dropped anyway after all the partitions are dropped.
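
In rough outline it looks like this (sketch only; relispartition and
get_partition_parent() are names from the proposed patches, so treat them as
assumptions):

    Oid         parentOid = InvalidOid;

    if (rel->rd_rel->relispartition)
    {
        parentOid = get_partition_parent(relid);
        LockRelationOid(parentOid, AccessExclusiveLock);
    }

    /* ... the usual heap_drop_with_catalog() business ... */

    if (OidIsValid(parentOid))
    {
        Relation    parent = heap_open(parentOid, NoLock);  /* locked above */

        CacheInvalidateRelcache(parent);
        heap_close(parent, NoLock);     /* keep the lock till end of xact */
    }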

Thanks,
Amit




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WAL consistency check facility

2016-11-02 Thread Kuntal Ghosh
On Wed, Nov 2, 2016 at 1:34 PM, Michael Paquier
 wrote:
> Hm... Right. That was broken. And actually, while the record-level
> flag is useful so that you don't need to rely on checking
> wal_consistency when doing WAL redo, the block-level flag is useful to
> make a distinction between blocks that have to be replayed and the
> ones that are used only for consistency, and both types could be mixed
> in a record. Using it in bimg_info would be fine... Perhaps a better
> name for the flag would be something like BKPIMAGE_APPLY, to mean that
> the FPW needs to be applied at redo. Or BKPIMAGE_IGNORE, to bypass it
> when replaying it. IS_REQUIRED_FOR_REDO is quite confusing.
BKPIMAGE_APPLY seems reasonable.

-- 
Thanks & Regards,
Kuntal Ghosh
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch to implement pg_current_logfile() function

2016-11-02 Thread Gilles Darold
On 02/11/2016 at 03:51, Karl O. Pinc wrote:
> On Mon, 31 Oct 2016 09:26:27 +0100
> Gilles Darold  wrote:
>
>> On 30/10/2016 at 08:04, Karl O. Pinc wrote:
>> Attached patch v11 includes your patch.
>>
>>> I have some questions about logfile_writename():
>>>
>>>  Why does the logfile_open() call fail silently?
>> logfile_open() "fail silently" in logfile_writename(), like in other
>> parts of syslogger.c where it is called, because this is a function()
>> that already report a message when an error occurs ("could not open
>> log file..."). I think I have already replied to this question.
> Please apply the attached patch on top of your v11 patch.
> It simulates a logfile_open() failure.  Upon simulated failure you do
> not get a "currrent_logfile" file, and neither do you get any
> indication of any problems anywhere in the logs.  It's failing
> silently.

Please have a look at line 1137 on HEAD of syslogger.c; you will see
that, in case of failure, logfile_open() reports a FATAL or LOG
error with the message:

errmsg("could not open log file \"%s\": %m", filename);

I don't think adding more error messages after this would be of any use,
which is why logfile_writename() just returns without doing anything.
Other calls to logfile_open() have the exact same behavior.
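
To make that concrete, the structure is roughly the following (sketch only;
the file name and the function's shape are assumptions based on the patch):

static void
logfile_writename(const char *filename)
{
    FILE       *fh;

    fh = logfile_open("current_logfiles.tmp", "w", true);
    if (fh == NULL)
        return;     /* "could not open log file ..." already reported */

    /* ... write "filename" and rename the temp file into place ... */
}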

For a real example, create the temporary file and change its system
privileges:

   touch /usr/local/postgresql/data/current_logfiles.tmp
   chmod 400 /usr/local/postgresql/data/current_logfiles.tmp

So in this case of failure it prints:

2016-11-02 09:56:00.791 CET [6321] LOG:  could not open log file
"current_logfiles.tmp": Permission denied

I don't understand what more you are expecting. I agree that "open log"
can be a little confusing, but this is also a file where we log the current
log file name, so I can live with that.


-- 
Gilles Darold
Consultant PostgreSQL
http://dalibo.com - http://dalibo.org



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Declarative partitioning - another take

2016-11-02 Thread Amit Langote
On 2016/11/02 16:41, Amit Langote wrote:
> Having said all that, I am open to switching to the catalogued partition
> constraints if the arguments I make above in favor of this patch are not
> all that sound.

One problem I didn't immediately see a solution for if we go with the
catalogued partition constraints is how do we retrieve a relation's
partition constraint?  There are a couple of cases where that becomes
necessary.  Consider that we are adding a partition to a partitioned table
that is itself a partition.  In this case, the new partition's constraint
consists of whatever we generate from its FOR VALUES specification *ANDed*
with the parent's constraint.  We must somehow get hold of the latter.
Which of the parent's named check constraints in the pg_constraint catalog
is supposed to be the partition constraint?  With the uncatalogued
partition constraints approach, we simply generate it from the parent's
pg_class.relpartbound (or we might get the relcache copy of the same
stored in rd_partcheck).  Hmm.

Thanks,
Amit




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WAL consistency check facility

2016-11-02 Thread Michael Paquier
On Wed, Nov 2, 2016 at 4:41 PM, Kuntal Ghosh  wrote:
> On Wed, Nov 2, 2016 at 10:23 AM, Michael Paquier
>  wrote:
>> On Tue, Nov 1, 2016 at 10:31 PM, Robert Haas  wrote:
>>> IMHO, your rewrite of this patch was a bit heavy-handed.
>>
>> OK... Sorry for that.
>>
>>> I haven't
>>> scrutinized the code here so maybe it was a big improvement, and if so
>>> fine, but if not it's better to collaborate with the author than to
>>> take over.
>>
>> While reviewing the code, that has finished by being a large rewrite,
>> and that was more understandable than a review looking at all the
>> small tweaks and things I have been through while reading it. I have
>> also experimented a couple of ideas with the patch that I added, so at
>> the end it proves to be a gain for everybody. I think that the last
>> patch is an improvement, if you want to make your own opinion on the
>> matter looking at the differences between both patches would be the
>> most direct way to go.
>>
> If my understanding of this feature is correct, the last two patches
> completely break the fundamental idea of the WAL consistency check feature.
> As I mentioned in my last reply, we have to use some flag to indicate
> whether an image should be restored during replay or not. Otherwise,
> XLogReadBufferForRedoExtended will always restore the image, skipping the
> usual redo operation. What's happening now is the following:
> 1. If wal_consistency is on, a backup block image is included with the WAL record.
> 2. During replay, XLogReadBufferForRedoExtended always restores the backup block
> image in a local buffer, since XLogRecHasBlockImage is true for each block.
> 3. In checkConsistency, you compare the local buffer with the backup block image
> from the WAL record. It will always be consistent.
> This feature aims to validate whether the WAL replay operation is
> happening correctly.
> To achieve that aim, we should not alter the WAL replay operation itself.

Hm... Right. That was broken. And actually, while the record-level
flag is useful so that you don't need to rely on checking
wal_consistency when doing WAL redo, the block-level flag is useful to
make a distinction between blocks that have to be replayed and the
ones that are used only for consistency, and both types could be mixed
in a record. Using it in bimg_info would be fine... Perhaps a better
name for the flag would be something like BKPIMAGE_APPLY, to mean that
the FPW needs to be applied at redo. Or BKPIMAGE_IGNORE, to bypass it
when replaying it. IS_REQUIRED_FOR_REDO is quite confusing.
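
At redo, the decision could then look roughly like this (sketch only;
BKPIMAGE_APPLY is just the name suggested above, and the block access mirrors
xlogreader's DecodedBkpBlock):

static bool
block_image_should_be_applied(XLogReaderState *record, uint8 block_id)
{
    if (!XLogRecHasBlockImage(record, block_id))
        return false;               /* no image at all */

    /* apply only when the writer flagged the image for replay */
    return (record->blocks[block_id].bimg_info & BKPIMAGE_APPLY) != 0;
}

XLogReadBufferForRedoExtended() would call RestoreBlockImage() only when this
returns true, and take the normal redo path otherwise, leaving the image for
the consistency check.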
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WAL consistency check facility

2016-11-02 Thread Kuntal Ghosh
On Wed, Nov 2, 2016 at 10:23 AM, Michael Paquier
 wrote:
> On Tue, Nov 1, 2016 at 10:31 PM, Robert Haas  wrote:
>> IMHO, your rewrite of this patch was a bit heavy-handed.
>
> OK... Sorry for that.
>
>> I haven't
>> scrutinized the code here so maybe it was a big improvement, and if so
>> fine, but if not it's better to collaborate with the author than to
>> take over.
>
> While reviewing the code, that has finished by being a large rewrite,
> and that was more understandable than a review looking at all the
> small tweaks and things I have been through while reading it. I have
> also experimented a couple of ideas with the patch that I added, so at
> the end it proves to be a gain for everybody. I think that the last
> patch is an improvement, if you want to make your own opinion on the
> matter looking at the differences between both patches would be the
> most direct way to go.
>
If my understanding of this feature is correct, the last two patches
completely break the fundamental idea of the WAL consistency check feature.
As I mentioned in my last reply, we have to use some flag to indicate
whether an image should be restored during replay or not. Otherwise,
XLogReadBufferForRedoExtended will always restore the image, skipping the
usual redo operation. What's happening now is the following:
1. If wal_consistency is on, a backup block image is included with the WAL record.
2. During replay, XLogReadBufferForRedoExtended always restores the backup block
image in a local buffer, since XLogRecHasBlockImage is true for each block.
3. In checkConsistency, you compare the local buffer with the backup block image
from the WAL record. It will always be consistent.
This feature aims to validate whether the WAL replay operation is
happening correctly.
To achieve that aim, we should not alter the WAL replay operation itself.

The rest of the suggestions are well taken. I'll update the patch accordingly.
-- 
Thanks & Regards,
Kuntal Ghosh
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Declarative partitioning - another take

2016-11-02 Thread Amit Langote
On 2016/11/02 2:34, Robert Haas wrote:
> On Fri, Oct 28, 2016 at 3:53 AM, Amit Langote
>  wrote:
>> [ new patches ]
> 
> Reviewing 0006:

Thanks for the review!

> This patch seems scary.  I sort of assumed from the title -- "Teach a
> few places to use partition check quals." -- that this was an optional
> thing, some kind of optimization from which we could reap further
> advantage once the basic infrastructure was in place.  But it's not
> that at all.  It's absolutely necessary that we do this, or data
> integrity is fundamentally compromised.  How do we know that we've
> found all of the places that need to be taught about these new,
> uncatalogued constraints?

Making this a separate commit from 0003 was essentially to avoid this
getting lost among all of its other changes.  In fact, it was to draw
attention, for closer scrutiny, to whether all the sites in the backend code
that are critical for data integrity in the face of the implicit partition
constraints are being informed about those constraints.

> I'm feeling fairly strongly like you should rewind and make the
> partitioning constraints normal catalogued constraints.  That's got a
> number of advantages, most notably that we can be sure they will be
> properly enforced by the entire system (modulo existing bugs, of
> course).  Also, they'll show up automatically in tools like psql's \d
> output, pgAdmin, and anything else that is accustomed to being able to
> find constraints in the catalog.  We do need to make sure that those
> constraints can't be dropped (or altered?) inappropriately, but that's
> a relatively small problem.  If we stick with the design you've got
> here, every client tool in the world needs to be updated, and I'm not
> seeing nearly enough advantage in this system to justify that kind of
> upheaval.

As for which parts of the system need to know about these implicit
partition constraints to *enforce* them for data integrity, we could say
that it's really just one site - ExecConstraints() called from
ExecInsert()/ExecUpdate().

Admittedly, the current error message style as in this patch exposes the
implicit constraint approach to a certain criticism: "ERROR:  new row
violates the partition boundary specification of \"%s\"".  It would say
the following if it were a named constraint: "ERROR: new row for relation
\"%s\" violates check constraint \"%s\""

For constraint exclusion optimization, we teach get_relation_constraints()
to look at these constraints.  Although greatly useful, it's not the case
of being absolutely critical.

Besides the above two cases, there is a bunch of code (relcache, DDL) that
looks at regular constraints, but IMHO, we need not let any of that code
concern itself with the implicit partition constraints.  In particular, I
wonder why client tools would want the implicit partitioning
constraint to be shown as a CHECK constraint.  As the proposed patch 0004
(psql) currently does, isn't it better to instead show the partition
bounds as follows?

+CREATE TABLE part_b PARTITION OF parted (
+   b WITH OPTIONS NOT NULL DEFAULT 1 CHECK (b >= 0),
+   CONSTRAINT check_a CHECK (length(a) > 0)
+) FOR VALUES IN ('b');

+\d part_b
+      Table "public.part_b"
+ Column |  Type   |     Modifiers
+--------+---------+--------------------
+ a      | text    |
+ b      | integer | not null default 1
+Partition of: parted FOR VALUES IN ('b')
+Check constraints:
+    "check_a" CHECK (length(a) > 0)
+    "part_b_b_check" CHECK (b >= 0)

Needless to say, that could save a lot of trouble thinking about
generating collision-free names of these constraints, their dependency
handling, unintended altering of these constraints, pg_dump, etc.

> In fact, as far as I can see, the only advantage of this approach is
> that when the insert arrives through the parent and is routed to the
> child by whatever tuple-routing code we end up with (I guess that's
> what 0008 does), we get to skip checking the constraint, saving CPU
> cycles.  That's probably an important optimization, but I don't think
> that putting the partitioning constraint in the catalog in any way
> rules out the possibility of performing that optimization.  It's just
> that instead of having the partitioning excluded-by-default and then
> sometimes choosing to include it, you'll have it included-by-default
> and then sometimes choose to exclude it.

Hmm, doesn't it seem like we would be making *more* modifications to the
existing code (backend or otherwise) to teach it about excluding the
implicit partition constraints than the other way around?  The other way
around being to modify the existing code to include the implicit
constraints, which is what this patch is about.

Having said all that, I am open to switching to the catalogued partition
constraints if the arguments I make above in favor of this patch are not
all that sound.

Thanks,
Amit




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] Re: [COMMITTERS] pgsql: Add make rules to download raw Unicode mapping files

2016-11-02 Thread Michael Paquier
On Wed, Nov 2, 2016 at 12:58 AM, Peter Eisentraut  wrote:
> Add make rules to download raw Unicode mapping files
>
> This serves as implicit documentation and is handy if someone wants to
> tweak things.  The rules are not part of a normal build, like this
> entire directory.

If the goal is to prevent to have those files in the code tree,
wouldn't it make sense as well to have a .gitignore with all those
files in it?
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Change the way that LWLocks for extensions are allocated.

2016-11-02 Thread Ashutosh Sharma
Hi,

I have started reviewing this patch and would like to share
some of my initial review comments that require the author's attention.

1) I am getting some trailing whitespace errors when trying to apply
this patch using the git apply command. Following are the error messages I
am getting.

[edb@localhost postgresql]$ git apply test_pg_stat_statements-v1.patch
test_pg_stat_statements-v1.patch:28: trailing whitespace.
$(MAKE) -C $(top_builddir)/src/test/regress all
test_pg_stat_statements-v1.patch:41: space before tab in indent.
 $(REGRESSCHECKS)
warning: 2 lines add whitespace errors.

2) The test case you have added is failing because the queryid is
going to be different every time you execute the test case. I think it
would be better to remove the queryid column from "SELECT queryid,
query, calls, rows from pg_stat_statements ORDER BY rows". Below is
the output I got upon test case execution.

make regresscheck

== running regression test queries==
test pg_stat_statements   ... FAILED
== shutting down postmaster   ==

3) OK. As you mentioned upthread, I do agree with the fact that we
can't add pg_stat_statements tests for installcheck, as this module
requires shared_preload_libraries to be set. But if there is an
already existing installation with the pg_stat_statements module loaded, we
should allow the test to run on that installation if requested
explicitly by the user. I think we can add a target like
'installcheck-force' in the Makefile that would help end users
run the test on their own installation. Thoughts?


Also, I have changed the status in the commitfest from "Needs review"
to "Waiting on Author".

On Mon, Sep 19, 2016 at 6:26 PM, Amit Kapila  wrote:
> On Sat, Sep 10, 2016 at 5:02 PM, Amit Kapila  wrote:
>> On Tue, Aug 30, 2016 at 8:01 AM, Andres Freund  wrote:
>>> On 2016-08-30 07:57:19 +0530, Amit Kapila wrote:
 I will write such a test case either in this week or early next week.
>>>
>>> Great.
>>>
>>
>> Okay, attached patch adds some simple tests for pg_stat_statements.
>> One thing to note is that we can't add pg_stat_statements tests for
>> installcheck as this module requires shared_preload_libraries to be
>> set.  Hope this suffices the need.
>>
>
> Registered this patch for next CF.
>
> --
> With Regards,
> Amit Kapila.
> EnterpriseDB: http://www.enterprisedb.com
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Specifying the log file name of pgbench -l option

2016-11-02 Thread Beena Emerson
Hello,

On Wed, Nov 2, 2016 at 4:16 AM, Fabien COELHO  wrote:

>
> Hello Masahiko,
>
> So I would suggest to:
>>>  - fix the compilation issue
>>>  - leave -l/--log as it is, i.e. use "pgbench_log" as a prefix
>>>  - add --log-prefix=... (long option only) for changing this prefix
>>>
>>
>> I agree. It's better to add the separated option to specify the prefix
>> of log file instead of changing the existing behaviour. Attached
>> latest patch incorporated review comments.
>> Please review it.
>>
>
> Patch applies, compiles and works for me.
>

It works for me as well.


>
> This new option is for benchmarking, so "benchmarking_option_set" should
> be set to true.
>
> To simplify the code, I would suggest to initialize explicitely
> "logfile_prefix" to the default value which is then overriden when the
> option is set, which allows to get rid of the "prefix" variable later.
>
> There is no reason to change the documentation by breaking a line at
> another place if the text is not changed (where ... with 1).
>
> I would suggest to simplify a little bit the documentation:
>   "prefix of log file ..." ->
>   "default log file prefix that can be changed with ..."
>
> Otherwise the explanations seem clear enough to me. If a native English
> speaker could check them though, it would be nice.


For the explanation of the option --log-prefix, I feel it would be better
to say "Custom prefix for transaction log file. Default is pgbench_log"


-   pgbench_log.nnn, where
+   pgbench_log.nnn,
+   where pgbench_log is the prefix of log file that can
+   be changed by specifying --log-prefix, and where

It could just say "the default prefix pgbench_log can be changed with
option --log-prefix, and ..."; we need not use "where" again.

Also, the error message is made similar to that of --aggregate-interval, but I
think the one for --sampling-rate is better:

$ pgbench --log-prefix=chk -t 20
log file prefix (--log-prefix) is allowed only when actually logging
transactions

$ pgbench --sampling-rate=1 -t 20
log sampling (--sampling-rate) is allowed only when logging transactions
(-l)


Thanks,

Beena Emerson

Have a Great Day!


Re: [HACKERS] Using a latch between a background worker process and a thread

2016-11-02 Thread Craig Ringer
On 2 November 2016 at 02:10, Robert Haas  wrote:
> On Tue, Nov 1, 2016 at 12:35 PM, Abbas Butt  
> wrote:
>> Hi,
>> Consider this situation:
>> 1. I have a background worker process.
>> 2. The process creates a latch, initializes it using InitLatch & resets it.
>> 3. It then creates a thread and passes the latch created in step 2 to it.
>> To pass it, the process uses the last argument of pthread_create.
>> 4. The thread blocks by calling WaitLatch.
>> 5. The process after some time sets the latch using SetLatch.
>>
>> The thread does not notice that the latch has been set and keeps waiting.
>>
>> My question is:
>> Are latches supposed to work between a process and a thread created by that
>> process?
>
> Nothing in the entire backend is guaranteed to work if you spawn
> multiple threads within the same process.
>
> Including this.

Yep.

You could have the main thread wait on the latch, then signal the
other threads via appropriate pthread primitives. But you must ensure
your other threads do nothing that calls into backend code. Including
things like atexit handlers. They need to coordinate with the main
thread to do everything PostgreSQL related, and you'd need to make
sure the main thread handles all signals. That's the default for Linux
- the main thread gets first chance at all signals and other threads'
sigmasks are only checked if the main thread has masked the signal,
but that means your other threads should be sure to mask all signals
used by PostgreSQL. Good luck doing that portably.
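
If you do go down that path anyway, the usual POSIX trick is to block
everything before pthread_create() so the new thread inherits a fully-blocked
mask, then restore the mask in the main thread. A minimal sketch, plain POSIX
and not PostgreSQL code:

#include <pthread.h>
#include <signal.h>
#include <stddef.h>

static void *
worker_main(void *arg)
{
    /* must never call into backend code; coordinate with the main thread */
    return NULL;
}

static int
spawn_worker(pthread_t *tid)
{
    sigset_t    all_signals;
    sigset_t    saved_mask;
    int         rc;

    sigfillset(&all_signals);
    pthread_sigmask(SIG_BLOCK, &all_signals, &saved_mask);
    rc = pthread_create(tid, NULL, worker_main, NULL);  /* inherits mask */
    pthread_sigmask(SIG_SETMASK, &saved_mask, NULL);    /* main unchanged */
    return rc;
}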

There are exceptions where you can call some backend functions and
macros from other threads. But you'd have to analyse each on a case by
case basis, and there's no guarantee they'll _stay_ safe.

I'd just avoid using threads in the backend if at all possible.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] sequences and pg_upgrade

2016-11-02 Thread Michael Paquier
On Mon, Oct 31, 2016 at 9:53 PM, Peter Eisentraut
 wrote:
> On 9/30/16 12:50 PM, Anastasia Lubennikova wrote:
>> The patches are good, no complaints.
>> But again, I have the same question.
>> I was confused, why do we always dump sequence data,
>> because I'd overlooked the --sequence-data key. I'd rather leave this
>> option,
>> because it's quite non intuitive behaviour...
>>   /* dump sequence data even in schema-only mode */
>
> Here are rebased patches.
>
> Regarding your question:  The initial patch had a separate option for
> this behavior, which was then used by pg_upgrade.  It was commented that
> this option is not useful outside of pg_upgrade, so it doesn't need to
> be exposed as a user-facing option.  I agreed with that and removed the
> option.  We can always add the option back easily if someone really
> wants it, but so far no use case has been presented.  So I suggest we
> proceed with this proposal ignoring whether this option is exposed or not.

I had a look at those fresh patches, and 0001 looks like a good thing.
This makes the separation between the sequence and table data dumps
cleaner. I ran some tests with pg_upgrade and 0002, and things look
fine. And +1 for the way it is done in the patch, i.e. no pg_dump option
exposed to the user, while still keeping the option tracking as a separate value.

One small thing here:

 static void
-getTableData(DumpOptions *dopt, TableInfo *tblinfo, int numTables, bool oids)
+getTableData(DumpOptions *dopt, TableInfo *tblinfo, int numTables, bool oids, char relkind)
 {
     int         i;

     for (i = 0; i < numTables; i++)
     {
-        if (tblinfo[i].dobj.dump & DUMP_COMPONENT_DATA)
+        if (tblinfo[i].dobj.dump & DUMP_COMPONENT_DATA &&
+            (!relkind || tblinfo[i].relkind == relkind))
             makeTableDataInfo(dopt, &(tblinfo[i]), oids);

One idea here would be to have an extra routine, getSequenceData, and
not extend getTableData() with relkind as an extra argument. I am fine
with the way the patch does things, so I have just marked the patch as ready
for committer.
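
Just to illustrate, that alternative could look roughly like this (sketch
only, reusing makeTableDataInfo() the same way the patch does):

static void
getSequenceData(DumpOptions *dopt, TableInfo *tblinfo, int numTables, bool oids)
{
    int         i;

    for (i = 0; i < numTables; i++)
    {
        if ((tblinfo[i].dobj.dump & DUMP_COMPONENT_DATA) != 0 &&
            tblinfo[i].relkind == RELKIND_SEQUENCE)
            makeTableDataInfo(dopt, &(tblinfo[i]), oids);
    }
}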
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers