Re: [HACKERS] Test code is worth the space

2015-08-13 Thread Magnus Hagander
On Thu, Aug 13, 2015 at 1:31 AM, Robert Haas robertmh...@gmail.com wrote:

 On Wed, Aug 12, 2015 at 7:20 PM, Tom Lane t...@sss.pgh.pa.us wrote:
  FWIW, I've objected in the past to tests that would significantly
  increase the runtime of make check, unless I thought they were
  especially valuable (which enumerating every minor behavior of a
  feature patch generally isn't IMO).  I still think that that's an
  important consideration: every second you add to make check is
  multiplied many times over when you consider how many developers
  run that how many times a day.
 
  We've talked about having some sort of second rank of tests that
  people wouldn't necessarily run before committing, and that would
  be allowed to eat more time than the core regression tests would.
  I think that might be a valuable direction to pursue if people start
  submitting very bulky tests.

 Maybe.  Adding a whole new test suite is significantly more
 administratively complex, because the BF client has to get updated to
 run it.  And if expected outputs in that test suite change very often
 at all, then committers will have to run it before committing anyway.



We could always do that the other way - meaning put everything in make
check, and invent a make quickcheck for developers with the smaller
subset.



 The value of a core regression suite that takes less time to run has
 to be weighed against the possibility that a better core regression
 suite might cause us to find more bugs before committing.  That could
 easily be worth the price in runtime.



Or have a quickcheck you run all the time and then run the bigger one
once before committing perhaps?


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: [HACKERS] count_nulls(VARIADIC any)

2015-08-13 Thread Pavel Stehule
2015-08-13 9:47 GMT+02:00 Shulgin, Oleksandr oleksandr.shul...@zalando.de:

 On Thu, Aug 13, 2015 at 9:25 AM, Pavel Stehule pavel.steh...@gmail.com
 wrote:



 2015-08-13 9:21 GMT+02:00 Marko Tiikkaja ma...@joh.to:

 On 8/13/15 9:18 AM, Shulgin, Oleksandr wrote:

 nnulls()


 I think I'd prefer num_nulls() over that.


 can be

 what about similar twin function num_nonulls()?


 Yes.  But I can't see any precedent for naming it like num_*...  And if
 anything, should it be num_nonnulls() then?


It is a detail - it depends on the final naming convention.


Re: [HACKERS] count_nulls(VARIADIC any)

2015-08-13 Thread Pavel Stehule
2015-08-13 9:21 GMT+02:00 Marko Tiikkaja ma...@joh.to:

 On 8/13/15 9:18 AM, Shulgin, Oleksandr wrote:

 nnulls()


 I think I'd prefer num_nulls() over that.


Could be.

What about a similar twin function, num_nonulls()?

Pavel



 .m



 --
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers



Re: [HACKERS] count_nulls(VARIADIC any)

2015-08-13 Thread Shulgin, Oleksandr
On Thu, Aug 13, 2015 at 9:25 AM, Pavel Stehule pavel.steh...@gmail.com
wrote:



 2015-08-13 9:21 GMT+02:00 Marko Tiikkaja ma...@joh.to:

 On 8/13/15 9:18 AM, Shulgin, Oleksandr wrote:

 nnulls()


 I think I'd prefer num_nulls() over that.


 can be

 what about similar twin function num_nonulls()?


Yes.  But I can't see any precedent for naming it like num_*...  And if
anything, should it be num_nonnulls() then?


Re: [HACKERS] PL/pgSQL, RAISE and error context

2015-08-13 Thread Pavel Stehule
Hi

2015-08-12 11:07 GMT+02:00 Marko Tiikkaja ma...@joh.to:

 On 8/12/15 9:36 AM, Pavel Stehule wrote:

 So, there is common agreement on this version.


 There are several instances of double semicolons.  Also,
 PsqlSettings.show_context doesn't look like a boolean to me.  For
 SHOW_CONTEXT, it would be good if the documentation mentioned the default
 value.

 I'm somewhat worried that this is hiding important context from some
 NOTICE or WARNING messages intended for novice users, but probably not
 worried enough to go through all of them.  +3/8 from me, I guess.


I fixed mentioned issues.

Regards

Pavel



 .m



libpq-context-filter-20150813-01.patch.gz
Description: GNU Zip compressed data



Re: [HACKERS] Test code is worth the space

2015-08-13 Thread Fabien COELHO



FWIW, I've objected in the past to tests that would significantly
increase the runtime of make check, unless I thought they were
especially valuable (which enumerating every minor behavior of a
feature patch generally isn't IMO).  I still think that that's an
important consideration: every second you add to make check is
multiplied many times over when you consider how many developers
run that how many times a day.


Sure. These are developers' tests, run 50 times per day just to check that 
nothing was just completely broken. It's just one kind of test.


I agree that having ISN conversions tested every time does not make much 
sense, especially as the source file is touched every two years. On the 
other hand, when I tried to fix ISN bugs I found that there was not a 
single test for the module, which made it hard to spot regressions. So the 
tests should be there even if they are not run often: testing the obvious 
is useful when you start meddling in the code.



We've talked about having some sort of second rank of tests that
people wouldn't necessarily run before committing, and that would
be allowed to eat more time than the core regression tests would.
I think that might be a valuable direction to pursue if people start
submitting very bulky tests.


Yep.

For regression tests, the list of tests run is maintained in the 
*_schedule files. There could be a large_parallel_schedule which 
included more tests. This is already more or less the case, as there are 
big* targets which run numeric_big in addition to the others.

This could be expanded and taken into account by the build farm.
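As a purely hypothetical sketch of what such a file could look like (neither the file name nor the grouping below exists today), it would follow the same format as the existing schedule files:

```
# large_parallel_schedule -- hypothetical example file.
# Same format as parallel_schedule: each "test:" line is one
# group of tests run in parallel.
test: numeric_big
test: isn
```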

I recall test submissions which were rejected on the grounds that 'it takes 1 
second of my time every day' and is 'not very useful, as the feature is 
known to work'.


I think that a key change needed is for committers to be more open to such 
additions, with the discussion focusing instead on (1) are the tests portable, 
(2) do the tests cover features not already tested, and (3) should they be in 
the default list of tests run under make check. But some tests should be 
accepted in *core* nevertheless, and not pushed out into the bin.


--
Fabien.




Re: [HACKERS] Test code is worth the space

2015-08-13 Thread Peter Geoghegan
On Wed, Aug 12, 2015 at 11:23 PM, Magnus Hagander mag...@hagander.net wrote:
 The value of a core regression suite that takes less time to run has
 to be weighed against the possibility that a better core regression
 suite might cause us to find more bugs before committing.  That could
 easily be worth the price in runtime.


 Or have a quickcheck you run all the time and then run the bigger one once
 before committing perhaps?

I favor splitting the regression tests to add "all the time" and
"before commit" targets as you describe.  I think that once the
facility is there, we can determine over time how expansive that
second category gets to be.

-- 
Peter Geoghegan




Re: [HACKERS] count_nulls(VARIADIC any)

2015-08-13 Thread Marko Tiikkaja

On 8/13/15 9:18 AM, Shulgin, Oleksandr wrote:

nnulls()


I think I'd prefer num_nulls() over that.


.m




Re: [HACKERS] count_nulls(VARIADIC any)

2015-08-13 Thread Atri Sharma
On Thu, Aug 13, 2015 at 12:55 PM, Pavel Stehule pavel.steh...@gmail.com
wrote:



 2015-08-13 9:21 GMT+02:00 Marko Tiikkaja ma...@joh.to:

 On 8/13/15 9:18 AM, Shulgin, Oleksandr wrote:

 nnulls()


 I think I'd prefer num_nulls() over that.


 can be

 what about similar twin function num_nonulls()?


+1


[HACKERS] Views created by UNION ALL

2015-08-13 Thread Tatsuo Ishii
Views created by UNION ALL could be updated with both UPDATE and
DELETE operations; these are just transformed into UPDATE/DELETE
operations on the base tables (or views). INSERT, however, I think
cannot be accepted.  For example, suppose we have v1:

CREATE VIEW v1 AS SELECT * FROM t1 UNION ALL SELECT * FROM t2;

Inserting into v1 is ambiguous, because it could be transformed into an
insert into either t1 or t2.

So I think we could make v1 an auto-updatable view.
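The asymmetry can be modelled concretely. Below is an illustrative Python sketch (not PostgreSQL's rewriter; all names are invented) of why DELETE routes cleanly while INSERT has no unique target:

```python
# Toy model of v1 AS SELECT * FROM t1 UNION ALL SELECT * FROM t2.
t1 = [{"id": 1, "v": "a"}]
t2 = [{"id": 2, "v": "b"}]

def view_rows():
    return t1 + t2

def delete_from_view(pred):
    # DELETE is unambiguous: every view row lives in exactly one base
    # table, so the delete is routed to whichever table holds it.
    for tbl in (t1, t2):
        tbl[:] = [r for r in tbl if not pred(r)]

def insert_into_view(row):
    # INSERT is ambiguous: the new row could go to t1 or t2 equally well.
    raise ValueError("cannot route INSERT: target base table is ambiguous")

delete_from_view(lambda r: r["id"] == 2)
print(view_rows())  # -> [{'id': 1, 'v': 'a'}]
```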

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp




Re: [HACKERS] count_nulls(VARIADIC any)

2015-08-13 Thread Shulgin, Oleksandr
On Thu, Aug 13, 2015 at 2:19 AM, David G. Johnston 
david.g.johns...@gmail.com wrote:

 On Wed, Aug 12, 2015 at 4:11 PM, Tom Lane t...@sss.pgh.pa.us wrote:

 Peter Geoghegan p...@heroku.com writes:
  On Wed, Aug 12, 2015 at 10:30 AM, Alvaro Herrera
  alvhe...@2ndquadrant.com wrote:
  The name count_nulls() suggest an aggregate function to me, though.

  I thought the same.

 Ditto.  I'd be fine with this if we can come up with a name that
 doesn't sound like an aggregate.  The best I can do offhand is
 number_of_nulls(), which doesn't seem very pretty.


 nulls_in(a, b, c) IN (0, 3) - yes the repetition is not great...


How about these:

nulls_rank() (the analogy being 0 &lt;= rank &lt;= set size)
nnulls()

or just

nulls() (this one might be a bit confusing due to existing NULLS LAST/FIRST
syntax, though)
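Whichever name wins, the intended behavior is simple to pin down. A throwaway Python sketch of the semantics under discussion (the function names follow the proposals above, not any committed API; None stands in for NULL):

```python
def num_nulls(*args):
    # Count how many of the variadic arguments are NULL.
    return sum(1 for a in args if a is None)

def num_nonnulls(*args):
    # The proposed twin: count the non-NULL arguments.
    return sum(1 for a in args if a is not None)

print(num_nulls(1, None, "x", None))     # -> 2
print(num_nonnulls(1, None, "x", None))  # -> 2
```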

--
Alex


Re: [HACKERS] pg_dump quietly ignore missing tables - is it bug?

2015-08-13 Thread Pavel Stehule
Hi

2015-07-30 12:44 GMT+02:00 Heikki Linnakangas hlinn...@iki.fi:

 On 07/25/2015 07:08 PM, Pavel Stehule wrote:

 I am sending a new patch - without checking wildcard chars.


 The documentation says the option is called --strict-names, while the code
 has --strict-mode. I like --strict-names more, mode seems redundant, and
 it's not clear what it's strict about.


ok


 For symmetry, it would be good to also support this option in pg_restore.
 It seems even more useful there.


I'll do it


 Can we do better than issuing a separate query for each table/schema name?
 The performance of this isn't very important, but still it seems like you
 could fairly easily refactor the code to avoid that. Perhaps return an
 extra constant for part of the UNION to distinguish which result row came
 from which pattern, and check that at least one row is returned for each.


I did a few tests; for 1K tables the union is faster by about 50 ms, but the
code is much more complex. For 10K tables, the union is significantly
slower (probably due to planning): 2 sec vs. 7 sec. So if we expect the backup
to run over a not-too-slow network, the simple solution is the winner - Postgres
processes simple read queries quickly.
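For the record, Heikki's tagging idea - one UNION query where each branch carries a constant identifying its pattern, so the client can verify that every pattern matched something - can be sketched like this in Python (table names and patterns are invented; this is not pg_dump code):

```python
import fnmatch

# Hypothetical catalog contents and user-supplied -t patterns.
tables = ["public.accounts", "public.orders", "audit.log"]
patterns = ["public.*", "nosuch.*"]

# Emulate one UNION query in which each branch carries its pattern's
# index as an extra constant column, instead of one query per pattern.
rows = [(i, t) for i, p in enumerate(patterns)
               for t in tables if fnmatch.fnmatchcase(t, p)]

# A pattern that produced no row at all is the --strict-names error case.
matched = {i for i, _ in rows}
missing = [p for i, p in enumerate(patterns) if i not in matched]
print(missing)  # -> ['nosuch.*']
```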

Regards

Pavel



 - Heikki




[HACKERS] [Proposal] Table partition + join pushdown

2015-08-13 Thread Taiki Kondo
Hi all,

I saw the email about the idea from KaiGai-san[1],
and I worked to implement this idea.

Now, I have implemented a part of this idea,
so I want to propose this feature.

The attached patch just shows my concept of this feature.
It works fine for EXPLAIN, but sadly it returns wrong results for other
operations.



Table partition + join pushdown
===

Motivation
--
To make the join logic work more effectively,
it is important to make the relations smaller.

Especially in Hash-join, it is meaningful to make the inner relation smaller,
because a smaller inner relation can be stored in a smaller hash table.
This means that memory usage can be reduced when joining with big tables.


Design
--
It was described in the email from KaiGai-san, so I quote it below...

--- begin quotation ---
Let's assume a table which is partitioned to four portions,
and individual child relations have constraint by hash-value
of its ID field.

  tbl_parent
   + tbl_child_0 ... CHECK(hash_func(id) % 4 = 0)
   + tbl_child_1 ... CHECK(hash_func(id) % 4 = 1)
   + tbl_child_2 ... CHECK(hash_func(id) % 4 = 2)
   + tbl_child_3 ... CHECK(hash_func(id) % 4 = 3)

If someone tried to join another relation with tbl_parent
using an equivalence condition, like X = tbl_parent.ID, we
know that inner tuples which do not satisfy the condition
  hash_func(X) % 4 = 0
can never be joined to the tuples in tbl_child_0.
So we can skip loading these tuples into the inner hash table
in the first place, which potentially allows us to split the
inner hash table.

Current typical plan structure is below:

  HashJoin
   -> Append
      -> SeqScan on tbl_child_0
      -> SeqScan on tbl_child_1
      -> SeqScan on tbl_child_2
      -> SeqScan on tbl_child_3
   -> Hash
      -> SeqScan on other_table

It may be rewritable to:

  Append
   -> HashJoin
      -> SeqScan on tbl_child_0
      -> Hash ... Filter: hash_func(X) % 4 = 0
         -> SeqScan on other_table
   -> HashJoin
      -> SeqScan on tbl_child_1
      -> Hash ... Filter: hash_func(X) % 4 = 1
         -> SeqScan on other_table
   -> HashJoin
      -> SeqScan on tbl_child_2
      -> Hash ... Filter: hash_func(X) % 4 = 2
         -> SeqScan on other_table
   -> HashJoin
      -> SeqScan on tbl_child_3
      -> Hash ... Filter: hash_func(X) % 4 = 3
         -> SeqScan on other_table
--- end quotation ---

In the quotation above, it was written that the filter is set at the Hash node,
but I implemented it so that the filter is set at the SeqScan node under the
Hash node. In my opinion, filtering tuples is the scanner's job.

  Append
   -> HashJoin
      -> SeqScan on tbl_child_0
      -> Hash
         -> SeqScan on other_table ... Filter: hash_func(X) % 4 = 0
   -> HashJoin
      -> SeqScan on tbl_child_1
      -> Hash
         -> SeqScan on other_table ... Filter: hash_func(X) % 4 = 1
   -> HashJoin
      -> SeqScan on tbl_child_2
      -> Hash
         -> SeqScan on other_table ... Filter: hash_func(X) % 4 = 2
   -> HashJoin
      -> SeqScan on tbl_child_3
      -> Hash
         -> SeqScan on other_table ... Filter: hash_func(X) % 4 = 3
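The key claim - each per-partition hash table only needs the inner rows whose hash value matches that partition's CHECK constraint, without changing the join result - can be checked with a small Python simulation (hash_func here is a stand-in, not PostgreSQL's hash opclass):

```python
def hash_func(x):
    return x  # stand-in for a real hash function

NPARTS = 4
outer = [(i, "o%d" % i) for i in range(20)]        # partitioned by hash % 4
inner = [(i, "i%d" % i) for i in range(0, 20, 3)]  # other_table

# Original plan: one big hash table over all inner rows.
big_hash = {}
for k, v in inner:
    big_hash.setdefault(k, []).append(v)
expected = sorted((k, ov, iv) for k, ov in outer for iv in big_hash.get(k, []))

# Rewritten plan: per-partition joins, each with a filtered hash table.
got = []
for part in range(NPARTS):
    part_outer = [(k, v) for k, v in outer if hash_func(k) % NPARTS == part]
    small_hash = {}
    for k, v in inner:
        if hash_func(k) % NPARTS == part:  # Filter: hash_func(X) % 4 = part
            small_hash.setdefault(k, []).append(v)
    got.extend((k, ov, iv) for k, ov in part_outer
                           for iv in small_hash.get(k, []))

print(sorted(got) == expected)  # -> True: same join result, smaller tables
```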


API
---
There are 3 new internal (static) functions implementing this feature.
try_hashjoin_pushdown(), the main function of this feature,
is called from try_hashjoin_path(), and tries to push the HashPath down
under the AppendPath.

To do so, this function does following operations.

  1. Check whether this Hash-join can be pushed down under the AppendPath.
  2. To avoid influencing other path-making operations,
     copy the inner path's RelOptInfo and make a new SeqScan path from it.
     Here, get the CHECK() constraints from the OUTER path, and convert their
     Var nodes according to the join condition. Also convert the Var nodes
     in the join condition itself.
  3. Create new HashPath nodes between each sub-path of the AppendPath and
     the inner path made above.
  4. When operations 1 to 3 are done for each sub-path,
     create a new AppendPath whose sub-paths are the HashPath nodes made above.

get_replaced_clause_constr() is called from try_hashjoin_pushdown(),
and get_var_nodes_recurse() is called from get_replaced_clause_constr().
These 2 functions help with the operations above.
(I may revise this part to use expression_tree_walker() and
 expression_tree_mutator().)

In patch attached, it has the following limitations.
  o It only works for hash-join operation.
(I want to support not only hash-join but also other logic.)
  o Join conditions must be = operator with int4 variables.
  o Inner path must be SeqScan.
(I want to support other path-node.)
  o And now, the planner may not choose this plan,
    because its estimated costs are usually larger than those of the
    original (non-pushdown) plan.

Also, one internal (static) function, get_relation_constraints() defined in
plancat.c, has been made global. This function will be called from
get_replaced_clause_constr() to get CHECK() constraints.


Usage
-
To use this feature, create partitioned tables and a small table to join,
and run a SELECT that joins these tables.

For your convenience, I attach 

Re: [HACKERS] Warnings around booleans

2015-08-13 Thread Heikki Linnakangas

On 08/12/2015 03:46 PM, Stephen Frost wrote:

* Andres Freund (and...@anarazel.de) wrote:

On 2015-08-12 08:16:09 -0400, Stephen Frost wrote:

1) gin stores/queries some bools as GinTernaryValue.

Part of this is easy to fix, just adjust GinScanKeyData->entryRes to
be a GinTernaryValue (it actually is compared against MAYBE).

What I find slightly worrisome is that in gin_tsquery_consistent()
checkcondition_gin (returning GinTernaryValue) is passed as a
callback that's expected to return bool. And the field that
checkcondition_gin is returning (GinChkVal->check[i]) actually is a
ternary.


Is there a potential corruption issue from that..?


I honestly don't understand the gin code well enough to answer that.


Yeah, neither do I, so I've added Heikki.  Heikki, any idea as to the
impact of this?


It's harmless. gin_tsquery_consistent() places a boolean array as the 
'check' array, and therefore checkcondition_gin will also only return 
TRUEs and FALSEs, never MAYBEs. A comment to explain why that's OK would 
probably be in order though.
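The general hazard is worth spelling out: once a ternary value passes through a strictly boolean field or callback, the MAYBE case silently collapses. A tiny Python analogue (GIN_MAYBE's value of 2 matches the C enum; the rest is illustrative):

```python
GIN_FALSE, GIN_TRUE, GIN_MAYBE = 0, 1, 2

def as_bool(ternary):
    # What happens when a ternary result is squeezed through a
    # bool-typed field: any nonzero value collapses to True.
    return bool(ternary)

# After the narrowing, MAYBE is indistinguishable from TRUE, and any
# later comparison against GIN_MAYBE is always false:
print(as_bool(GIN_MAYBE) == GIN_MAYBE)  # -> False
print(as_bool(GIN_MAYBE) == GIN_TRUE)   # -> True
```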


- Heikki





Re: [HACKERS] Warnings around booleans

2015-08-13 Thread Andres Freund
On 2015-08-13 13:27:33 +0200, Michael Meskes wrote:
  Playing around a bit led me to find that this is caused by a wrong type
  declaration: 'isarray' is declared as bool instead of enum
  ARRAY_TYPE in two places. This appears to be an oversight, perhaps
  caused by the boolean-sounding name.
 
 I vaguely remember that it used to be bool back in the day.

I dug a bit back, but got tired of it around 2003 ;)

  Does this look right to you? If so, will you apply it?
 
 Yes and yes. Thanks for spotting, fixed in master and the back branches.

Thanks!




Re: [HACKERS] [PROPOSAL] VACUUM Progress Checker.

2015-08-13 Thread Syed, Rahila
Hello,

Autovacuum knows what % of a table needs to be cleaned - that is how it is 
triggered. 
When a vacuum runs we should calculate how many TIDs we will collect and 
therefore how many trips to the indexes we need for given memory.
We can use the VM to find out how many blocks we'll need to scan in the table. 
So overall, we know how many blocks we need to scan.

Total heap pages to be scanned can be obtained from the VM as mentioned. To 
figure out the number of index scans, we need an estimate of dead tuples.

IIUC, autovacuum learns that a table has to be cleaned by looking at the 
pgstat entry for the table, i.e. n_dead_tuples.
Hence, an initial estimate of dead tuple TIDs can be made using n_dead_tuples 
in pgstat.
n_dead_tuples in the pgstat table entry is the value updated by the last 
analyze and may not be up to date.
In cases where the pgstat entry for the table is NULL, the number of dead 
tuple TIDs cannot be estimated.
In such cases, we can start with an initial estimate of 1 index scan and 
later add the number of index pages to the total count of pages (heap + 
index) if the count of index scans exceeds the estimate.
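A hedged back-of-the-envelope version of this estimate (the 6-bytes-per-TID figure and the fallback rule are assumptions for illustration, not values from any patch):

```python
import math

def estimate_total_pages(heap_pages, index_pages, n_dead_tuples,
                         maintenance_work_mem_bytes, tid_bytes=6):
    """Rough total-work estimate for one VACUUM, in pages.

    heap_pages: blocks to scan, per the visibility map.
    n_dead_tuples: pgstat estimate; None when no pgstat entry exists.
    """
    if n_dead_tuples is None:
        # No pgstat entry: assume one index scan, adjust later if exceeded.
        index_scans = 1
    else:
        tids_per_pass = maintenance_work_mem_bytes // tid_bytes
        index_scans = max(1, math.ceil(n_dead_tuples / tids_per_pass))
    return heap_pages + index_scans * index_pages

# 1M dead TIDs but memory for only 500k per pass -> 2 index scans.
print(estimate_total_pages(10000, 3000, 1_000_000, 500_000 * 6))  # -> 16000
```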

Thank you,
Rahila Syed.

 

__
Disclaimer: This email and any attachments are sent in strictest confidence
for the sole use of the addressee and may contain legally privileged,
confidential, and proprietary data. If you are not the intended recipient,
please advise the sender by replying promptly to this email and then delete
and destroy this email and any attachments without any further use, copying
or forwarding.



Re: [HACKERS] Our trial to TPC-DS but optimizer made unreasonable plan

2015-08-13 Thread Greg Stark
On Thu, Aug 13, 2015 at 2:49 AM, Kouhei Kaigai kai...@ak.jp.nec.com wrote:
 In fact, cost of HashJoin underlying Sort node is:
 ->  Hash Join  (cost=621264.91..752685.48 rows=1 width=132)

 On the other hands, NestedLoop on same place is:
 ->  Nested Loop  (cost=0.00..752732.26 rows=1 width=132)

 Probably, small GUC adjustment may make optimizer to prefer HashJoin towards
 these kind of queries.

With that kind of discrepancy, I doubt adjusting GUCs will be sufficient.

 Do you have a good idea?

Do you have EXPLAIN ANALYZE from the plan that finishes? Are there any
row estimates that are way off?


-- 
greg




Re: [HACKERS] Our trial to TPC-DS but optimizer made unreasonable plan

2015-08-13 Thread Kouhei Kaigai
 On Thu, Aug 13, 2015 at 2:49 AM, Kouhei Kaigai kai...@ak.jp.nec.com wrote:
  In fact, cost of HashJoin underlying Sort node is:
  ->  Hash Join  (cost=621264.91..752685.48 rows=1 width=132)
 
  On the other hands, NestedLoop on same place is:
  ->  Nested Loop  (cost=0.00..752732.26 rows=1 width=132)
 
  Probably, small GUC adjustment may make optimizer to prefer HashJoin towards
  these kind of queries.
 
 With that kind of discrepancy I doubt adjusting GUCs will be sufficient
 
  Do you have a good idea?
 
 Do you have EXPLAIN ANALYZE from the plan that finishes? Are there any
 row estimates that are way off?

Yes, EXPLAIN ANALYZE is attached.

According to this, CTE year_total generates 384,208 rows. It is much smaller
than the estimate (4.78M rows); however, the filter selectivity of the CTE Scan
was not as large as expected.
For example, the deepest CTE Scans return 37923 rows and 26314 rows, even
though 40 rows were expected. On the next level, relations join 11324 rows
with 9952 rows, against an estimate of 40 rows x 8 rows.
If a NestLoop were placed instead of the HashJoin, it would cause an explosion
in the number of loops.

Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei kai...@ak.jp.nec.com

[kaigai@ayu tpcds]$ (echo SET enable_nestloop=off;; echo EXPLAIN ANALYZE; cat 
query04.sql) | psql tpcds
SET

  QUERY PLAN

--
 Limit  (cost=1248761.93..1248761.93 rows=1 width=132) (actual 
time=10831.134..10831.134 rows=8 loops=1)
   CTE year_total
 -  Append  (cost=193769.66..496076.44 rows=4778919 width=220) (actual 
time=5510.862..10034.982 rows=384208 loops=1)
   -  HashAggregate  (cost=193769.66..226692.26 rows=2633808 
width=178) (actual time=5510.862..5654.366 rows=190581 loops=1)
 Group Key: customer.c_customer_id, customer.c_first_name, 
customer.c_last_name, customer.c_preferred_cust_flag, customer.c_birth_country, 
customer.c_login, customer.c_email_address, date_dim.d_year
 -  Custom Scan (GpuJoin)  (cost=14554.84..108170.90 
rows=2633808 width=178) (actual time=987.623..1221.769 rows=2685453 loops=1)
   Bulkload: On (density: 100.00%)
   Depth 1: Logic: GpuHashJoin, HashKeys: 
(ss_sold_date_sk), JoinQual: (ss_sold_date_sk = d_date_sk), nrows_ratio: 
0.95623338
   Depth 2: Logic: GpuHashJoin, HashKeys: (ss_customer_sk), 
JoinQual: (ss_customer_sk = c_customer_sk), nrows_ratio: 0.91441411
   -  Custom Scan (BulkScan) on store_sales  
(cost=0.00..96501.23 rows=2880323 width=38) (actual time=10.139..935.822 
rows=2880404 loops=1)
   -  Seq Scan on date_dim  (cost=0.00..2705.49 rows=73049 
width=16) (actual time=0.012..13.443 rows=73049 loops=1)
   -  Seq Scan on customer  (cost=0.00..4358.00 
rows=10 width=156) (actual time=0.004..18.978 rows=10 loops=1)
   -  HashAggregate  (cost=125474.72..143301.10 rows=1426111 
width=181) (actual time=2784.068..2882.514 rows=136978 loops=1)
 Group Key: customer_1.c_customer_id, customer_1.c_first_name, 
customer_1.c_last_name, customer_1.c_preferred_cust_flag, 
customer_1.c_birth_country, customer_1.c_login, customer_1.c_email_address, 
date_dim_1.d_year
 -  Custom Scan (GpuJoin)  (cost=14610.07..79126.11 
rows=1426111 width=181) (actual time=319.825..431.830 rows=1430939 loops=1)
   Bulkload: On (density: 100.00%)
   Depth 1: Logic: GpuHashJoin, HashKeys: 
(cs_bill_customer_sk), JoinQual: (c_customer_sk = cs_bill_customer_sk), 
nrows_ratio: 0.99446636
   Depth 2: Logic: GpuHashJoin, HashKeys: 
(cs_sold_date_sk), JoinQual: (cs_sold_date_sk = d_date_sk), nrows_ratio: 
0.98929483
   -  Custom Scan (BulkScan) on catalog_sales  
(cost=0.00..65628.43 rows=1441543 width=41) (actual time=9.649..260.027 
rows=1441548 loops=1)
   -  Seq Scan on customer customer_1  (cost=0.00..4358.00 
rows=10 width=156) (actual time=0.010..13.686 rows=10 loops=1)
   -  Seq Scan on date_dim date_dim_1  (cost=0.00..2705.49 
rows=73049 width=16) (actual time=0.004..9.383 rows=73049 loops=1)
   -  HashAggregate  (cost=69306.38..78293.88 rows=719000 width=181) 
(actual time=1435.470..1469.888 rows=56649 loops=1)
 Group Key: customer_2.c_customer_id, customer_2.c_first_name, 
customer_2.c_last_name, customer_2.c_preferred_cust_flag, 

Re: [HACKERS] [COMMITTERS] pgsql: Close some holes in BRIN page assignment

2015-08-13 Thread Andres Freund
On 2015-08-13 00:24:29 -0300, Alvaro Herrera wrote:
 Andres Freund wrote:
  On 2015-08-12 16:08:08 -0300, Alvaro Herrera wrote:
   Alvaro Herrera wrote:
Close some holes in BRIN page assignment
   
   buildfarm evidently didn't like this one :-(
  
  clang seems to see a (the?) problem:
 
 Ahh, right.  There's an identical problem in the other function
 (brin_doupdate); maybe the conditional there is more complicated than it
 wants to analyze.  I was trying randomized memory and wasn't finding
 anything ...
 
 BTW I looked around the buildfarm and couldn't find a single member that
 displayed these warnings :-(

I tripped on clang 3.7 (unreleased) and a fairly extensive set of
warning flags (-Weverything + -Wno-... of the stupid ones)...

FWIW, this very likely would have tripped in valgrind. Just running that
single test will not even be too slow.

Regards,

Andres




Re: [HACKERS] Warnings around booleans

2015-08-13 Thread Michael Meskes
 Playing around a bit led me to find that this is caused by a wrong type
 declaration: 'isarray' is declared as bool instead of enum
 ARRAY_TYPE in two places. This appears to be an oversight, perhaps
 caused by the boolean-sounding name.

I vaguely remember that it used to be bool back in the day.

 Does this look right to you? If so, will you apply it?

Yes and yes. Thanks for spotting, fixed in master and the back branches.

Michael
-- 
Michael Meskes
Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org)
Meskes at (Debian|Postgresql) dot Org
Jabber: michael.meskes at gmail dot com
VfL Borussia! Força Barça! Go SF 49ers! Use Debian GNU/Linux, PostgreSQL




Re: [HACKERS] Warnings around booleans

2015-08-13 Thread Andres Freund
On 2015-08-13 14:28:58 +0300, Heikki Linnakangas wrote:
 On 08/12/2015 03:46 PM, Stephen Frost wrote:
 * Andres Freund (and...@anarazel.de) wrote:
 On 2015-08-12 08:16:09 -0400, Stephen Frost wrote:
 1) gin stores/queries some bools as GinTernaryValue.
 
  Part of this is easy to fix, just adjust GinScanKeyData->entryRes to
  be a GinTernaryValue (it's actually is compared against
  MAYBE).

Does that bit look sane to you? That appears to be an actual misdeclaration
to me.

 What I find slightly worrysome is that in gin_tsquery_consistent()
 checkcondition_gin (returning GinTernaryValue) is passed as a
 callback that's expected to return bool. And the field
 checkcondition_gin is returning (GinChkVal->check[i]) actually is a
 ternary.
 
 Is there a potential corruption issue from that..?
 
 I honestly don't understand the gin code well enough to answer that.
 
 Yeah, neither do I, so I've added Heikki.  Heikki, any idea as to the
 impact of this?
 
 It's harmless. gin_tsquery_consistent() places a boolean array as the
 'check' array, and therefore checkcondition_gin will also only return TRUEs
 and FALSEs, never MAYBEs. A comment to explain why that's OK would probably
 be in order though.

Ok. As I get warnings here, it seems best to add a cast when passing the
function (i.e. (bool (*) (void *, QueryOperand *)) checkcondition_gin)
and to add a comment to checkcondition_gin() explaining that it had better
only return booleans?

For reference, those are the warnings.

/home/andres/src/postgresql/src/backend/access/gin/ginlogic.c: In function 
‘shimTriConsistentFn’:
/home/andres/src/postgresql/src/backend/access/gin/ginlogic.c:171:24: warning: 
comparison of constant ‘2’ with boolean expression is always false 
[-Wbool-compare]
   if (key->entryRes[i] == GIN_MAYBE)
^
/home/andres/src/postgresql/src/backend/utils/adt/tsginidx.c: In function 
‘gin_tsquery_consistent’:
/home/andres/src/postgresql/src/backend/utils/adt/tsginidx.c:287:13: warning: 
assignment from incompatible pointer type [-Wincompatible-pointer-types]
   gcv.check = check;
 ^
/home/andres/src/postgresql/src/backend/utils/adt/tsginidx.c:294:8: warning: 
passing argument 4 of ‘TS_execute’ from incompatible pointer type 
[-Wincompatible-pointer-types]
checkcondition_gin);
^
In file included from 
/home/andres/src/postgresql/src/backend/utils/adt/tsginidx.c:20:0:
/home/andres/src/postgresql/src/include/tsearch/ts_utils.h:107:13: note: 
expected ‘_Bool (*)(void *, QueryOperand *) {aka _Bool (*)(void *, struct 
anonymous *)}’ but argument is of type ‘GinTernaryValue (*)(void *, 
QueryOperand *) {aka char (*)(void *, struct anonymous *)}’
 extern bool TS_execute(QueryItem *curitem, void *checkval, bool calcnot,
 ^

Greetings,

Andres Freund




Re: [HACKERS] [DESIGN] ParallelAppend

2015-08-13 Thread Kouhei Kaigai
 On Fri, Aug 7, 2015 at 2:15 PM, Kouhei Kaigai kai...@ak.jp.nec.com wrote:
 
   On Sat, Aug 1, 2015 at 6:39 PM, Kouhei Kaigai kai...@ak.jp.nec.com 
   wrote:
   
 
  If we pull Funnel here, I think the plan shall be as follows:
Funnel
 -- SeqScan on rel1
 -- PartialSeqScan on rel2
 -- IndexScan on rel3
 

 So if we go this route, then Funnel should have capability
 to execute non-parallel part of plan as well, like in this
 case it should be able to execute non-parallel IndexScan on
 rel3 as well and then it might need to distinguish between
 parallel and non-parallel part of plans.  I think this could
 make Funnel node complex.

It is different from what I plan now. In the above example, the
Funnel node has two non-parallel-aware nodes (rel1 and rel3)
and one parallel-aware node, so three PlannedStmts, one for each,
shall be put on the TOC segment. Even though the background
workers pick up PlannedStmts from the three, only one worker
can pick up the PlannedStmt for rel1 or rel3; however, rel2
can be executed by multiple workers simultaneously.
  
   Okay, now I got your point, but I think the cost of execution
   of non-parallel node by additional worker is not small considering
   the communication cost and setting up an addional worker for
   each sub-plan (assume the case where out of 100-child nodes
   only few (2 or 3) nodes actually need multiple workers).
  
  It is a competition between traditional Append that takes Funnel
  children and the new appendable Funnel that takes parallel and
  non-parallel children. Probably, key factors are cpu_tuple_comm_cost,
  parallel_setup_cost and degree of selectivity of sub-plans.
  Both cases have advantages and disadvantages depending on the query,
  so we can never determine which is better without path consideration.
 
 Sure, that is what we should do, however the tricky part would be when
 the path for doing local scan is extremely cheaper than path for parallel
 scan for one of the child nodes.  For such cases, pulling up Funnel-node
 can incur more cost.  I think some of the other possible ways to make this
 work could be to extend Funnel so that it is capable of executing both 
 parallel
 and non-parallel nodes, or have a new Funnel-like node which has such a
 capability.

I think it is the job of a (more intelligent) planner, but not in the
first version. If the subplans of Append are a mixture of nodes that
are and are not worth executing in parallel, we will be able to arrange
the original form:

  Append
   + Scan on rel1 (large)
   + Scan on rel2 (large)
   + Scan on rel3 (middle)
   + Scan on rel4 (tiny)
   + Scan on rel5 (tiny)

to Funnel aware form, but partially:

  Append
   + Funnel
   |  + Scan on rel1 (large)
   |  + Scan on rel2 (large)
   |  + Scan on rel3 (large)  
   + Scan on rel4 (tiny)
   + Scan on rel5 (tiny)

It does not require special functionalities of Append/Funnel beyond
what we have discussed, as long as the planner is intelligent enough.
One downside of this approach is that the plan tree tends to become more
complicated, which also complicates the logic to push down joins.
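As an editorial illustration only, the partial grouping described above amounts to splitting Append's children by whether parallel execution is expected to pay off. A minimal sketch (invented names and a made-up row-count threshold; the real planner would compare path costs):

```python
# Sketch: group Append children into one Funnel for the subplans that
# look worth running in parallel, leaving tiny relations as plain scans.
# The threshold and plan representation are invented for illustration.

PARALLEL_WORTHWHILE_ROWS = 100_000

def arrange_append(children):
    """children: list of (relname, estimated_rows) tuples."""
    parallel = [c for c in children if c[1] >= PARALLEL_WORTHWHILE_ROWS]
    local = [c for c in children if c[1] < PARALLEL_WORTHWHILE_ROWS]
    plan = []
    if parallel:
        # One Funnel covering every parallel-worthy subplan
        plan.append(("Funnel", parallel))
    # Remaining children stay as ordinary scans under Append
    plan.extend(("Scan", [c]) for c in local)
    return ("Append", plan)
```

With the five relations in the example above, this yields an Append whose first child is a Funnel over rel1..rel3 and whose remaining children are plain scans of rel4 and rel5.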


Here is one other issue I found. Existing code assumes a TOC segment has
only one entry per node type, so it uses a pre-defined key (like
PARALLEL_KEY_SCAN) per node type; however, that is problematic if we put
multiple PlannedStmt or PartialSeqScan nodes on a TOC segment.
My idea is to enhance the Plan node to have a unique identifier within
a particular plan tree. Once a unique identifier is assigned, we can
put individual information on the TOC segment, even if multiple
PartialSeqScan nodes are packed.
Did we have a discussion about this topic in the past?
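As an editorial sketch of the unique-identifier idea (the key constants and layout here are invented, not PostgreSQL's actual PARALLEL_KEY_* values): combine a per-node-type base key with the node's planner-assigned id, so several nodes of the same type can coexist in one TOC segment.

```python
# Sketch: derive a distinct shared-memory (TOC) key per plan node by
# adding a planner-assigned unique plan node id to a per-node-type
# base key. Base values below are invented for illustration.

PARALLEL_KEY_PLANNEDSTMT = 0xE000_0000
PARALLEL_KEY_PARTIAL_SEQSCAN = 0xE100_0000

def toc_key(base, plan_node_id):
    # Each node gets its own slot, so multiple PartialSeqScan nodes
    # packed into one segment no longer collide on a single key.
    return base + plan_node_id

# Three PartialSeqScan nodes get three distinct keys:
keys = {toc_key(PARALLEL_KEY_PARTIAL_SEQSCAN, nid) for nid in (1, 2, 3)}
```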

Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei kai...@ak.jp.nec.com




Re: [HACKERS] Warnings around booleans

2015-08-13 Thread Heikki Linnakangas

On 08/13/2015 02:41 PM, Andres Freund wrote:

On 2015-08-13 14:28:58 +0300, Heikki Linnakangas wrote:

On 08/12/2015 03:46 PM, Stephen Frost wrote:

* Andres Freund (and...@anarazel.de) wrote:

On 2015-08-12 08:16:09 -0400, Stephen Frost wrote:

1) gin stores/queries some bools as GinTernaryValue.

Part of this is easy to fix, just adjust GinScanKeyData->entryRes to
be a GinTernaryValue (it actually is compared against
MAYBE).


That bit looks sane to you? That appears to be an actual misdeclaration
to me.


Yeah, changing entryRes to GinTernaryValue seems good to me.

- Heikki





Re: [HACKERS] WIP: SCRAM authentication

2015-08-13 Thread Stephen Frost
* Robert Haas (robertmh...@gmail.com) wrote:
 On Wed, Aug 12, 2015 at 9:36 PM, Stephen Frost sfr...@snowman.net wrote:
  Yes, the SCRAM implementation could be buggy.  But also, OpenSSL has
  repeatedly had security bugs that were due to forcing servers to
  downgrade to older protocols.  I wouldn't like us to start growing
  similar vulnerabilities, where SCRAM would have been just fine but an
  attack that involves forcing a downgrade to MD5 is possible.
 
  I agree that such similar vulnerabilities would be unfortunate, but
  the way to avoid that is to not implement the actual hashing or
  encryption algorithms ourselves and to stick to the protocol as defined
  in the specification.
 
 Nothing in that will protect us if the client can request a non-SCRAM
 form of authentication.

The exact same risk exists for OpenSSL, as you mention, but also for
Kerberos-based authentication systems as well.  The way to address those
risks is to not have an md5-based password verifier for the role.  If
only one password verifier exists per role then it makes moving off of
md5 more difficult.  I am not proposing a solution where both verifiers
exist forever and I specifically proposed that we even remove all MD5
based verifiers in the future, similar to how older encryption
algorithms are no longer supported by modern releases of MIT Kerberos.

  I don't think you are quite correct about the scenario where pg_authid
  is compromised.  Even if the hash stored there is functionally
  equivalent to the password itself as far as this instance of
  PostgreSQL is concerned, the same password may have been used for
  other services, so cracking it has a purpose.
 
  I attempted to address that also by stating that, should an attacker
  compromise a system with the goal of gaining the cleartext password,
  they would attempt the following, in order:
 
 What if they steal a pg_dump?  All of the password verifiers are
 there, but the live system is not.

From my previous list, 4, 5, and 6 are all data-at-rest attacks.  My
comments below the list also indicated my feeling about the data-at-rest
attacks, but to clarify, I strongly feel that an attacker would go after
the md5 hash far before even considering attacking either the SCRAM
password verifier or trying to use the combination of the two in a novel
way.  That isn't to say that the two couldn't be combined in some novel
way or that having both doesn't increase the risk - it certainly
does, but that exposure is really that we would then have two
algorithms which we depend on to not be broken, for the period of
time that both are current.

More generally, however, the way to address these kinds of data-at-rest
attacks and loss of backup data is to have a password aging system where
the passwords are changed on a regular basis.  I certainly feel that we
should be looking to add that functionality and that we need to step up
and seriously move forward on implementing these capabilities that other
systems have had for decades or more.  We have excellent examples of
what enterprise authentication systems do and the kinds of capabilities
which they provide, both directly from examples such as Active Directory
and PAM, but also from requirements definitions used around the world
(eg: PCI compliance, NIST standards, and similar standards used by other
governments).

Thanks,

Stephen




Re: [HACKERS] Test code is worth the space

2015-08-13 Thread David Steele

On 8/12/15 9:24 PM, Stephen Frost wrote:

* Michael Paquier (michael.paqu...@gmail.com) wrote:

On Thu, Aug 13, 2015 at 5:54 AM, Stephen Frost wrote:

The regression tests included in pgBackRest (available here:
https://github.com/pgmasters/backrest) go through a number of different
recovery tests.  There's vagrant configs for a few different VMs too
(CentOS 6, CentOS 7, Ubuntu 12.04 and Ubuntu 14.04) which is what we've
been testing.
We're working to continue expanding those tests and will also be adding
tests for replication and promotion in the future.  Eventually, we plan
to write a buildfarm module for pgBackRest, to allow it to be run in the
same manner as the regular buildfarm animals with the results posted.


Interesting. Do you mind if I pick up from it some ideas for the
in-core replication test suite based on TAP stuff? That's still in the
works for the next CF.


Certainly don't mind at all, entirely open source under the MIT
license.


Absolutely, and I'm always looking to add new tests so some
cross-pollination may be beneficial as well.


Currently the regression tests work with 9.0 - 9.5alpha1 (and back to 
8.3, actually).  The pause_on_recovery_target test is currently disabled 
for 9.5 and I have not implemented recovery_target_action or 
recovery_target = immediate yet.  I haven't tested alpha2 yet but it 
should work unless catalog or other IDs have changed.


--
-David
da...@pgmasters.net




Re: [HACKERS] Test code is worth the space

2015-08-13 Thread Andres Freund
On 2015-08-13 09:32:02 -0400, David Steele wrote:
 On 8/12/15 9:32 PM, Robert Haas wrote:
 On Wed, Aug 12, 2015 at 9:24 PM, Stephen Frost sfr...@snowman.net wrote:
 Certainly don't mind at all, entirely open source under the MIT
 license.
 
 Why not the PG license?  It would be nicer if we didn't have to worry
 about license contamination here.

I don't think MIT is particularly problematic, it's rather similar to a
3 clause BSD and both are pretty similar to PG's license.

 There are actually a few reasons I chose the MIT license:
 
 1) It's one of the most permissive licenses around.

 2) I originally had plans to extend backrest to other database systems.
 Nearly two years into development I don't think that sounds like a great
 idea anymore but it was the original plan.

I don't see the difference to/with postgres' license there.

Andres




Re: [HACKERS] [COMMITTERS] pgsql: Further fixes for degenerate outer join clauses.

2015-08-13 Thread Stephen Frost
* Tom Lane (t...@sss.pgh.pa.us) wrote:
 I'm not entirely sure what to do about this.  We could back-patch that
 patch into 9.0 and 9.1, but it's conceivable somebody would squawk about
 planner behavioral changes.  The only other idea that seems practical is
 to remove regression test cases that have platform-specific results in
 those branches.  Probably that wouldn't result in a real reduction in the
 quality of the test coverage for those branches (we could still execute
 the query, just not EXPLAIN it).  But it seems like a pretty ad-hoc
 answer, and the next case might be one that hurts more not to test.
 
 Thoughts?

Have an alternate file for those other cases, rather than remove the
test?  The complaint was about one buildfarm member, so hopefully that's
practical and doesn't require a lot of different permutations.

Thanks!

Stephen




Re: [HACKERS] [COMMITTERS] pgsql: Further fixes for degenerate outer join clauses.

2015-08-13 Thread Tom Lane
Stephen Frost sfr...@snowman.net writes:
 * Tom Lane (t...@sss.pgh.pa.us) wrote:
 I'm not entirely sure what to do about this.  We could back-patch that
 patch into 9.0 and 9.1, but it's conceivable somebody would squawk about
 planner behavioral changes.  The only other idea that seems practical is
 to remove regression test cases that have platform-specific results in
 those branches.  Probably that wouldn't result in a real reduction in the
 quality of the test coverage for those branches (we could still execute
 the query, just not EXPLAIN it).  But it seems like a pretty ad-hoc
 answer, and the next case might be one that hurts more not to test.
 
 Thoughts?

 Have an alternate file for those other cases, rather than remove the
 test?  The complaint was about one buildfarm member, so hopefully that's
 practical and doesn't require a lot of different permutations.

I considered that but don't find it practical or attractive, especially
not if the only way to keep such a file updated is to wait and see whether
the buildfarm complains.

On the whole I'm leaning towards back-patching 33e99153e.  While the case
of exactly equal plan costs does come up in the regression tests (which
tend to inspect plans for queries on small simple tables), I think it's
relatively unlikely to happen with real-world data.
 
regards, tom lane




Re: [HACKERS] Test code is worth the space

2015-08-13 Thread David Steele

On 8/12/15 9:32 PM, Robert Haas wrote:

On Wed, Aug 12, 2015 at 9:24 PM, Stephen Frost sfr...@snowman.net wrote:

* Michael Paquier (michael.paqu...@gmail.com) wrote:

Interesting. Do you mind if I pick up from it some ideas for the
in-core replication test suite based on TAP stuff? That's still in the
works for the next CF.


Certainly don't mind at all, entirely open source under the MIT
license.


Why not the PG license?  It would be nicer if we didn't have to worry
about license contamination here.


There are actually a few reasons I chose the MIT license:

1) It's one of the most permissive licenses around.

2) I originally had plans to extend backrest to other database systems. 
 Nearly two years into development I don't think that sounds like a 
great idea anymore but it was the original plan.


3) It's common for GitHub projects and backrest has lived there its 
entire life.


I'm not against a license change in theory though I can't see why it 
matters very much.


--
-David
da...@pgmasters.net




Re: [HACKERS] [patch] A \pivot command for psql

2015-08-13 Thread Daniel Verite
David Fetter wrote:

 That said, a thing in psql that could slice serialized output into
 columns would be handy as a broad, general part of reporting in
 psql

To avoid any confusion with server-side PIVOT, I suggest that the
currently proposed command in psql should have a different name
than \pivot.

The general idea is indeed pivoting, but what it precisely does
is project a resultset from a 3-column query onto cells in a
2D grid-like arrangement.

It differs significantly from what existing PIVOT SQL commands
do; looking closely at them, it appears in particular that:

- they don't care about a leftmost column to hold row titles,
  whereas it's essential to the psql feature.

- as mentioned upthread, the need to enumerate in advance the output
  columns is a crucial point with the SQL pivot, whereas psql doesn't care,
  as it just displays a resultset that is already obtained.

- they operate with an aggregate function at their heart, whereas it's
  also irrelevant to the psql display feature whether there's an
  aggregate  in the query.

For a different name, I've thought of these alternatives that belong
to client-side vocabulary:

\rotate
\sheetview
\gridview
\grid3view (to insist that it works on 3 columns).
\matrix
\matview
\matrixview
\crosstab (at the risk of confusion with the contrib feature)

Opinions? I'd go for \rotate personally.

Also I'll try to demonstrate the use case with concrete examples.
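As an editorial illustration of the proposed behavior (a client-side sketch, not the actual psql implementation): rows of (row title, column title, value) are projected onto a 2-D grid, with no aggregation and no advance declaration of the output columns.

```python
# Minimal sketch of client-side "rotation": project a 3-column
# resultset of (row_title, col_title, value) onto a 2-D grid, which is
# what the proposed psql command does with an already-obtained result.
# Column headers appear in first-seen order; missing cells stay empty.

def rotate(rows):
    row_keys, col_keys, cells = [], [], {}
    for r, c, v in rows:
        if r not in row_keys:
            row_keys.append(r)
        if c not in col_keys:
            col_keys.append(c)
        cells[(r, c)] = v
    header = [""] + col_keys
    grid = [header]
    for r in row_keys:
        grid.append([r] + [cells.get((r, c), "") for c in col_keys])
    return grid
```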

Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite




Re: [HACKERS] Test code is worth the space

2015-08-13 Thread David Steele

On 8/13/15 9:55 AM, Andres Freund wrote:

On 2015-08-13 09:32:02 -0400, David Steele wrote:

On 8/12/15 9:32 PM, Robert Haas wrote:

On Wed, Aug 12, 2015 at 9:24 PM, Stephen Frost sfr...@snowman.net wrote:

Certainly don't mind at all, entirely open source under the MIT
license.


Why not the PG license?  It would be nicer if we didn't have to worry
about license contamination here.


I don't think MIT is particularly problematic, it's rather similar to a
3 clause BSD and both are pretty similar to PG's license.


There are actually a few reasons I chose the MIT license:

1) It's one of the most permissive licenses around.



2) I originally had plans to extend backrest to other database systems.
Nearly two years into development I don't think that sounds like a great
idea anymore but it was the original plan.


I don't see the difference to/with postgres' license there.


Perhaps just me but it was really about perception.  If I extended 
BackRest to work with MySQL (shudders) then I thought it would be weird 
if it used the PostgreSQL license.


Anyway, BackRest is now pretty solidly pgBackRest and I don't have
any immediate plans to port to other database systems so the point is 
probably moot.


--
-David
da...@pgmasters.net




Re: [HACKERS] [COMMITTERS] pgsql: Further fixes for degenerate outer join clauses.

2015-08-13 Thread Stephen Frost
* Tom Lane (t...@sss.pgh.pa.us) wrote:
 Stephen Frost sfr...@snowman.net writes:
  * Tom Lane (t...@sss.pgh.pa.us) wrote:
  I'm not entirely sure what to do about this.  We could back-patch that
  patch into 9.0 and 9.1, but it's conceivable somebody would squawk about
  planner behavioral changes.  The only other idea that seems practical is
  to remove regression test cases that have platform-specific results in
  those branches.  Probably that wouldn't result in a real reduction in the
  quality of the test coverage for those branches (we could still execute
  the query, just not EXPLAIN it).  But it seems like a pretty ad-hoc
  answer, and the next case might be one that hurts more not to test.
  
  Thoughts?
 
  Have an alternate file for those other cases, rather than remove the
  test?  The complaint was about one buildfarm member, so hopefully that's
  practical and doesn't require a lot of different permutations.
 
 I considered that but don't find it practical or attractive, especially
 not if the only way to keep such a file updated is to wait and see whether
 the buildfarm complains.

I agree, that's a bit unfortunate, but it strikes me as pretty unlikely
that we're ever going to change those tests or that a code change would
end up causing yet another different plan before 9.1 is completely out
of support in the next couple years.

 On the whole I'm leaning towards back-patching 33e99153e.  While the case
 of exactly equal plan costs does come up in the regression tests (which
 tend to inspect plans for queries on small simple tables), I think it's
 relatively unlikely to happen with real-world data.

I agree it's unlikely, but I don't particularly like changing our mind
on a back-patching decision 3 years later to satisfy our regression
tests.

Still, I don't feel particularly strongly about either side of this, so
I'm happy with you making the decision.

Thanks!

Stephen




Re: [HACKERS] [COMMITTERS] pgsql: Further fixes for degenerate outer join clauses.

2015-08-13 Thread Tom Lane
Stephen Frost sfr...@snowman.net writes:
 * Tom Lane (t...@sss.pgh.pa.us) wrote:
 Stephen Frost sfr...@snowman.net writes:
 Have an alternate file for those other cases, rather than remove the
 test?  The complaint was about one buildfarm member, so hopefully that's
 practical and doesn't require a lot of different permutations.

 I considered that but don't find it practical or attractive, especially
 not if the only way to keep such a file updated is to wait and see whether
 the buildfarm complains.

 I agree, that's a bit unfortunate, but it strikes me as pretty unlikely
 that we're ever going to change those tests or that a code change would
 end up causing yet another different plan before 9.1 is completely out
 of support in the next couple years.

Meh.  9.0 will be dead in a month or two, but 9.1 still has a ways to go,
and I don't really want to be dealing with this issue every time Andreas'
fuzz testing finds another corner case :-(.  It was hard enough devising
regression tests that used stable table data for those cases as it was.

Frequently, when building a planner regression test, it's important that
a particular plan shape be selected (eg because what you're trying to test
is that setrefs.c postprocessing does the right thing with a given case).
So if we take supporting brolga's behavior as-is as a requirement, it's
not always going to be good enough to just maintain an alternate output
file; we'd have to try to adjust the test query to make it do the right
thing.  It's bad enough dealing with 32/64bit and endianness dependencies
for that, I don't want minute compiler codegen choices entering into it
as well.

 On the whole I'm leaning towards back-patching 33e99153e.  While the case
 of exactly equal plan costs does come up in the regression tests (which
 tend to inspect plans for queries on small simple tables), I think it's
 relatively unlikely to happen with real-world data.

 I agree it's unlikely, but I don't particularly like changing our mind
 on a back-patching decision 3 years later to satisfy our regression
 tests.

Actually, I'd take that the other way around: now that the patch has been
out for awhile, and we've not heard squawks that could be traced to it,
there's a much better argument that back-patching would be safe than there
was at the time.

There is certainly some risk in making changes that are only for ease of
maintenance and don't fix a provable bug ... but it's not like the other
changes I've committed into 9.0 and 9.1 lately don't change any
corner-case planner behaviors.  Besides, you could argue that it is a bug
if rebuilding with a different compiler can change the planner's behavior.

regards, tom lane




Re: [HACKERS] [COMMITTERS] pgsql: Further fixes for degenerate outer join clauses.

2015-08-13 Thread Stephen Frost
* Tom Lane (t...@sss.pgh.pa.us) wrote:
 Stephen Frost sfr...@snowman.net writes:
  I agree, that's a bit unfortunate, but it strikes me as pretty unlikely
  that we're ever going to change those tests or that a code change would
  end up causing yet another different plan before 9.1 is completely out
  of support in the next couple years.
 
 Meh.  9.0 will be dead in a month or two, but 9.1 still has a ways to go,
 and I don't really want to be dealing with this issue every time Andreas'
 fuzz testing finds another corner case :-(.  It was hard enough devising
 regression tests that used stable table data for those cases as it was.
 
 Frequently, when building a planner regression test, it's important that
 a particular plan shape be selected (eg because what you're trying to test
 is that setrefs.c postprocessing does the right thing with a given case).
 So if we take supporting brolga's behavior as-is as a requirement, it's
 not always going to be good enough to just maintain an alternate output
 file; we'd have to try to adjust the test query to make it do the right
 thing.  It's bad enough dealing with 32/64bit and endianness dependencies
 for that, I don't want minute compiler codegen choices entering into it
 as well.

Certainly a good point.

  On the whole I'm leaning towards back-patching 33e99153e.  While the case
  of exactly equal plan costs does come up in the regression tests (which
  tend to inspect plans for queries on small simple tables), I think it's
  relatively unlikely to happen with real-world data.
 
  I agree it's unlikely, but I don't particularly like changing our mind
  on a back-patching decision 3 years later to satisfy our regression
  tests.
 
 Actually, I'd take that the other way around: now that the patch has been
 out for awhile, and we've not heard squawks that could be traced to it,
 there's a much better argument that back-patching would be safe than there
 was at the time.

I definitely agree with that when it comes to bug cases, but I'm a bit
more wary around planner changes.  Still, I don't recall any specific
complaints about plan changes from pre-9.2 to 9.2+ in such corner cases
and I do agree with you that real data would make this far less likely
to happen.

 There is certainly some risk in making changes that are only for ease of
 maintenance and don't fix a provable bug ... but it's not like the other
 changes I've committed into 9.0 and 9.1 lately don't change any
 corner-case planner behaviors.  Besides, you could argue that it is a bug
 if rebuilding with a different compiler can change the planner's behavior.

Good point.

Thanks!

Stephen




Re: [HACKERS] TAP tests are badly named

2015-08-13 Thread Andrew Dunstan



On 08/01/2015 07:13 PM, Andrew Dunstan wrote:


On 08/01/2015 04:44 PM, Noah Misch wrote:



--enable-tap-tests is a reasonable configuration setting, because it's
about whether or not we have a TAP testing framework available, but I
think we should stop calling the bin tests TAP tests and we should
change the test name in vcregress.pl to a more appropriate name. In the
buildfarm I'm calling the step bin-check:
http://www.pgbuildfarm.org/cgi-bin/show_stage_log.pl?nm=crake&dt=2015-07-30%2012%3A25%3A58&stg=bin-check


Thoughts?
While lack of granularity in vcregress.pl is hostile, with so much
about MSVC development being hostile, this one is below my noise floor.



Speaking with my buildfarm hat on, it's well above mine. And these 
changes make the situation worse quite gratuitously. Anyway, I'll fix it.







here's what I propose.

cheers

andrew


diff --git a/src/tools/msvc/vcregress.pl b/src/tools/msvc/vcregress.pl
index d3d736b..b50a160 100644
--- a/src/tools/msvc/vcregress.pl
+++ b/src/tools/msvc/vcregress.pl
@@ -34,7 +34,7 @@ if (-e "src/tools/msvc/buildenv.pl")
 
 my $what = shift || "";
 if ($what =~
-/^(check|installcheck|plcheck|contribcheck|modulescheck|ecpgcheck|isolationcheck|upgradecheck|tapcheck)$/i
+/^(check|installcheck|plcheck|contribcheck|modulescheck|ecpgcheck|isolationcheck|upgradecheck|bincheck)$/i
   )
 {
 	$what = uc $what;
@@ -61,7 +61,14 @@ unless ($schedule)
 	$schedule = "parallel" if ($what eq 'CHECK' || $what =~ /PARALLEL/);
 }
 
-$ENV{PERL5LIB} = "$topdir/src/tools/msvc;$ENV{PERL5LIB}";
+if ($ENV{PERL5LIB})
+{
+	$ENV{PERL5LIB} = "$topdir/src/tools/msvc;$ENV{PERL5LIB}";
+}
+else
+{
+	$ENV{PERL5LIB} = "$topdir/src/tools/msvc";
+}
 
 my $maxconn = "";
 $maxconn = "--max_connections=$ENV{MAX_CONNECTIONS}"
@@ -81,7 +88,7 @@ my %command = (
 	CONTRIBCHECK   => \&contribcheck,
 	MODULESCHECK   => \&modulescheck,
 	ISOLATIONCHECK => \&isolationcheck,
-	TAPCHECK   => \&tapcheck,
+	BINCHECK   => \&bincheck,
 	UPGRADECHECK   => \&upgradecheck,);
 
 my $proc = $command{$what};
@@ -168,44 +175,46 @@ sub isolationcheck
 	exit $status if $status;
 }
 
-sub tapcheck
+sub tap_check
 {
-	InstallTemp();
+	die "Tap tests not enabled in configuration"
+	  unless $config->{tap_tests};
+
+	my $dir = shift;
+	chdir $dir;
 
 	my @args = ("prove", "--verbose", "t/*.pl");
-	my $mstat = 0;
 
+	# Reset those values, they may have been changed by another test.
+	# XXX is this true?
+	local %ENV = %ENV;
 	$ENV{PERL5LIB} = "$topdir/src/test/perl;$ENV{PERL5LIB}";
 	$ENV{PG_REGRESS} = "$topdir/$Config/pg_regress/pg_regress";
 
+	$ENV{TESTDIR} = $dir;
+
+	system(@args);
+	my $status = $? >> 8;
+	return $status;
+}
+
+sub bincheck
+{
+	InstallTemp();
+
+	my $mstat = 0;
+
 	# Find out all the existing TAP tests by looking for t/ directories
 	# in the tree.
-	my $tap_dirs = [];
-	my @top_dir = ($topdir);
-	File::Find::find(
-		{   wanted => sub {
-				/^t\z/s
-				  && push(@$tap_dirs, $File::Find::name);
-			  }
-		},
-		@top_dir);
+	my @bin_dirs = glob("$topdir/src/bin/*");
 
 	# Process each test
-	foreach my $test_path (@$tap_dirs)
+	foreach my $dir (@bin_dirs)
 	{
-		# Like on Unix make check-world, don't run the SSL test suite
-		# automatically.
-		next if ($test_path =~ /\/src\/test\/ssl\//);
-
-		my $dir = dirname($test_path);
-		chdir $dir;
-		# Reset those values, they may have been changed by another test.
-		$ENV{TESTDIR} = $dir;
-		system(@args);
-		my $status = $? >> 8;
-		$mstat ||= $status;
+		next unless -d "$dir/t";
+		my $status = tap_check($dir);
+		exit $status if $status;
 	}
-	exit $mstat if $mstat;
 }
 
 sub plcheck



Re: [HACKERS] pg_dump quietly ignore missing tables - is it bug?

2015-08-13 Thread Pavel Stehule
Hi

I am sending an updated version

news:

* strict-names everywhere
* checking table names in pg_dump simplified - not necessary to create a
single query
* pg_restore support

Regards

Pavel


2015-08-13 9:17 GMT+02:00 Pavel Stehule pavel.steh...@gmail.com:

 Hi

 2015-07-30 12:44 GMT+02:00 Heikki Linnakangas hlinn...@iki.fi:

 On 07/25/2015 07:08 PM, Pavel Stehule wrote:

 I am sending a new patch - without checking wildcard chars.


 The documentation says the option is called --strict-names, while the
 code has --strict-mode. I like --strict-names more, mode seems redundant,
 and it's not clear what it's strict about.


 ok


 For symmetry, it would be good to also support this option in pg_restore.
 It seems even more useful there.


 I'll do it


 Can we do better than issuing a separate query for each table/schema
 name? The performance of this isn't very important, but still it seems like
 you could fairly easily refactor the code to avoid that. Perhaps return an
 extra constant for part of the UNION to distinguish which result row came
 from which pattern, and check that at least one row is returned for each.


 I did a few tests: for 1K tables the union is faster by about 50ms, but the
 code is much more complex; for 10K tables, the union is significantly
 slower (probably due to planning): 2sec vs 7sec. So if we are expecting a
 backup over a not-too-slow network, then the simple solution is the winner -
 Postgres processes simple read queries quickly.

 Regards

 Pavel



 - Heikki



diff --git a/doc/src/sgml/ref/pg_dump.sgml b/doc/src/sgml/ref/pg_dump.sgml
new file mode 100644
index 7467e86..7c071fb
*** a/doc/src/sgml/ref/pg_dump.sgml
--- b/doc/src/sgml/ref/pg_dump.sgml
*** PostgreSQL documentation
*** 545,550 ****
--- 545,566 ----
   </varlistentry>
  
   <varlistentry>
+   <term><option>--strict-names</option></term>
+   <listitem>
+    <para>
+ Require that table and/or schema match at least one entity each.
+ Without any entity in the database to be dumped, an error message
+ is printed and dump is aborted.
+    </para>
+    <para>
+ This option has no effect on the exclude table and schema patterns
+ (and also <option>--exclude-table-data</option>): not matching any entities
+ isn't considered an error.
+    </para>
+   </listitem>
+  </varlistentry>
+ 
+  <varlistentry>
    <term><option>-T <replaceable class="parameter">table</replaceable></option></term>
    <term><option>--exclude-table=<replaceable class="parameter">table</replaceable></option></term>
    <listitem>
diff --git a/doc/src/sgml/ref/pg_restore.sgml b/doc/src/sgml/ref/pg_restore.sgml
new file mode 100644
index 97e3420..9df7f69
*** a/doc/src/sgml/ref/pg_restore.sgml
--- b/doc/src/sgml/ref/pg_restore.sgml
***
*** 432,437 ****
--- 432,448 ----
   </varlistentry>
  
   <varlistentry>
+   <term><option>--strict-names</option></term>
+   <listitem>
+    <para>
+ Require that table and/or schema and/or function and/or trigger match
+ at least one entity each. Without any entity in the backup file,
+ an error message is printed and restore is aborted.
+    </para>
+   </listitem>
+  </varlistentry>
+ 
+  <varlistentry>
    <term><option>-T <replaceable class="parameter">trigger</replaceable></option></term>
    <term><option>--trigger=<replaceable class="parameter">trigger</replaceable></option></term>
    <listitem>
diff --git a/src/bin/pg_dump/dumputils.c b/src/bin/pg_dump/dumputils.c
new file mode 100644
index d7506e1..52b2b98
*** a/src/bin/pg_dump/dumputils.c
--- b/src/bin/pg_dump/dumputils.c
*** simple_string_list_append(SimpleStringLi
*** 1220,1225 ****
--- 1220,1226 ----
  		pg_malloc(offsetof(SimpleStringListCell, val) + strlen(val) + 1);
  
  	cell->next = NULL;
+ 	cell->touched = false;
  	strcpy(cell->val, val);
  
  	if (list->tail)
*** simple_string_list_member(SimpleStringLi
*** 1237,1243 
--- 1238,1260 
  	for (cell = list->head; cell; cell = cell->next)
  	{
  		if (strcmp(cell->val, val) == 0)
+ 		{
+ 			cell->touched = true;
  			return true;
+ 		}
  	}
  	return false;
  }
+ 
+ const char *
+ simple_string_list_not_touched(SimpleStringList *list)
+ {
+ 	SimpleStringListCell *cell;
+ 
+ 	for (cell = list->head; cell; cell = cell->next)
+ 	{
+ 		if (!cell->touched)
+ 			return cell->val;
+ 	}
+ 	return NULL;
+ }
diff --git a/src/bin/pg_dump/dumputils.h b/src/bin/pg_dump/dumputils.h
new file mode 100644
index b176746..9f31bbc
*** a/src/bin/pg_dump/dumputils.h
--- b/src/bin/pg_dump/dumputils.h
*** typedef struct SimpleOidList
*** 38,43 
--- 38,45 
  typedef struct SimpleStringListCell
  {
  	struct SimpleStringListCell *next;
+ 	bool		touched;		/* true when this string was searched
+ 								   for and matched */
  	char		val[FLEXIBLE_ARRAY_MEMBER];		/* null-terminated string here */
  } SimpleStringListCell;
  
*** extern void set_dump_section(const char
*** 103,107 
--- 105,111 
  
  extern void 

Re: [HACKERS] WIP: SCRAM authentication

2015-08-13 Thread Josh Berkus
On 08/12/2015 06:36 PM, Stephen Frost wrote:
 I attempted to address that also by stating that, should an attacker
 compromise a system with the goal of gaining the cleartext password,
 they would attempt the following, in order:
 
 1) attempt to compromise a superuser account, if not already done, and
 then modify the system to get the 'password' auth mechanism to be used
 whereby the password is sent in the clear
 
 2) change the existing password, or encourge the user to do so and
 somehow capture that activity
 
 3) social engineering attacks
 
 4) attempt to crack the md5 hash
 
 5) attempt to crack the SCRAM password verifier
 
 6) try to work out a way to use both the md5 hash and the SCRAM password
 verifier to figure out the password
 

I don't feel like you've correctly assessed the risk inherent in the
md5 auth method, which is that, having captured an md5 auth string by
whatever means, an attacker can reuse that md5 string on other
databases in the network *without* cracking it.  That's the biggest risk
as long as md5 is present.

Aside from code complexity, the user security concern with a
multiple-verifier-per-role approach is that DBAs would never remember to
completely disable md5 auth, and attackers would capture md5 hashes either
in flight or from backups.  An attacker could capture an md5 hash from a
non-critical database which is poorly secured, and then re-use it
against an important database.

Now, the counter-argument to this is that a DBA is just as likely to
remember to remove md5 verifiers as she is to remember to remove roles
with md5 auth.

Regardless of the approach we take, encouraging users to migrate is
going to be more of a matter of documentation, publicity, and
administrative tools than one of multiple verifiers vs. multiple roles.
 That is, giving DBAs the ability to see and log who's using what kind
of verifier, and what account has what verifier(s) available, will make
more of a difference.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] buildfarm does not test make check

2015-08-13 Thread Andrew Dunstan



On 08/13/2015 02:11 PM, Alvaro Herrera wrote:

So I added the brin isolation test back.  Because it needs the
pageinspect module, it can only work under make check, not
installcheck.  The problem is that this means buildfarm will not run it,
because it only runs installcheck :-(

I realized that the important modules that run under make check
already have custom buildfarm modules, namely pg_upgrade and
test_decoding.  I wonder if it would make more sense to have this be
customizable from the buildfarm server side, without having to resort to
writing a buildfarm module for each new thing.  I envision a file hosted
in the buildfarm server, perhaps a JSON blob, that lists modules that
need make check treatment.  Within the JSON object for each such
module there could live the details such as what files need to be
saved.  That way, we could add more things to test without requiring the
buildfarm client to be upgraded each time.

This approach seems to have worked very well to customize the branches
that each animal builds, which is why I suggest that it be used here
too.




The buildfarm server doesn't control anything the clients do except it 
provides them with a list of branches we are interested in. Even that is 
invisible to the main program, and only used by the wrapper 
run_branches.pl. The main program can run entirely offline and I'm going 
to resist moves to make it otherwise. (Yes it needs access to git, but 
even that can be finessed.)


If you want some way of specifying this, put it in the source, e.g. 
invent a directory src/test/buildfarm_support and put it there. That way 
committers have automatic access to it, which they certainly don't to 
the buildfarm server. I'd rather just make it a small piece of perl 
instead of JSON, though.


You're also going to have to handle the msvc side of things. That won't 
be trivial. See discussion elsewhere today about how we've got that 
wrong recently.


cheers

andrew




[HACKERS] buildfarm does not test make check

2015-08-13 Thread Alvaro Herrera
So I added the brin isolation test back.  Because it needs the
pageinspect module, it can only work under make check, not
installcheck.  The problem is that this means buildfarm will not run it,
because it only runs installcheck :-(

I realized that the important modules that run under make check
already have custom buildfarm modules, namely pg_upgrade and
test_decoding.  I wonder if it would make more sense to have this be
customizable from the buildfarm server side, without having to resort to
writing a buildfarm module for each new thing.  I envision a file hosted
in the buildfarm server, perhaps a JSON blob, that lists modules that
need make check treatment.  Within the JSON object for each such
module there could live the details such as what files need to be
saved.  That way, we could add more things to test without requiring the
buildfarm client to be upgraded each time.

This approach seems to have worked very well to customize the branches
that each animal builds, which is why I suggest that it be used here
too.

-- 
Álvaro Herrera                33.5S 70.5W
The Gord often wonders why people threaten never to come back after they've
been told never to return (www.actsofgord.com)




Re: [HACKERS] [COMMITTERS] pgsql: Re-add BRIN isolation test

2015-08-13 Thread Alvaro Herrera
Andrew Dunstan wrote:

 On 08/13/2015 04:26 PM, Alvaro Herrera wrote:

 Argh, and this now broke MSVC :-(

 An immediate solution would probably be to add it to @contrib_excludes.

Ah, that seems simpler.  Pushed that, let's see if it cures the problem.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] TAP tests are badly named

2015-08-13 Thread Michael Paquier
On Fri, Aug 14, 2015 at 1:18 AM, Andrew Dunstan and...@dunslane.net wrote:


 On 08/13/2015 12:03 PM, Michael Paquier wrote:

 On Fri, Aug 14, 2015 at 12:17 AM, Andrew Dunstan wrote:

 here's what I propose.

 This patch does not take into account that there may be other code
 paths than src/bin/ that may have TAP tests (see my pending patch to
 test pg_dump with extensions including dumpable relations for
 example). I guess that it is done on purpose, now what are we going to
 do about the following things:
 - for src/test/ssl, should we have a new target in vcregress? Like
 ssltest?
 - for the pending patch I just mentioned, what should we do then?
 Should we expect it to work under modulescheck?



 Of course it takes it into account.

Yes, sorry. I misread your patch, reading code late in the evening is
not a good idea :)

 What it does is let you add extra checks
 easily. But I am not going to accept the divergence in vcregress.pl from the
 way we run tests using the standard tools.

OK, this sounds fair to keep up with. And sorry for actually breaking
vcregress.pl regarding this consistency. Shouldn't we remove
upgradecheck and move it under the banner bincheck then? Perhaps
testing the upgrade made sense before but now it is located under
src/bin so it sounds weird to do things separately.

 Yes, ssl tests should have a new
 target, as should any other new set of tests. We don't have a single make
 target for tap tests and neither should we in vcregress.pl.

Actually it is not only ssl, I would think that all those targets
should run under the same banner:
ALWAYS_SUBDIRS = examples locale thread ssl
But that's a separate discussion...

+die "Tap tests not enabled in configuration"
+  unless $config->{tap_tests};
I think that you are missing to update a couple of places with this
parameter, one being getfakeconfig...@solution.pm, a second being
config_default.pl. Also, I would think that it is saner to simply
return here rather than die, and then log a message telling the user
that the test has been skipped. I am thinking here of modules with
TAP tests that would be located under src/test/modules.
Regards,
-- 
Michael




[HACKERS] Re: [COMMITTERS] pgsql: Add support for INSERT ... ON CONFLICT DO NOTHING/UPDATE.

2015-08-13 Thread Alvaro Herrera
Andres Freund wrote:
 This is implemented using a new infrastructure called speculative
 insertion. It is an optimistic variant of regular insertion that first
 does a pre-check for existing tuples and then attempts an insert.  If a
 violating tuple was inserted concurrently, the speculatively inserted
 tuple is deleted and a new attempt is made.  If the pre-check finds a
 matching tuple the alternative DO NOTHING or DO UPDATE action is taken.
 If the insertion succeeds without detecting a conflict, the tuple is
 deemed inserted.

Is there a reason why heap_finish_speculative and heap_abort_speculative
were made to take a HeapTuple rather than just an ItemPointer?  They
only use tuple->t_self in their code (except the former, which in an
assert verifies that the tuple is indeed speculative).

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] [COMMITTERS] pgsql: Re-add BRIN isolation test

2015-08-13 Thread Andrew Dunstan



On 08/13/2015 04:26 PM, Alvaro Herrera wrote:

Alvaro Herrera wrote:

Re-add BRIN isolation test

This time, instead of using a core isolation test, put it on its own
test module; this way it can require the pageinspect module to be
present before running.

The module's Makefile is loosely modeled after test_decoding's, so that
it's easy to add further tests for either pg_regress or isolationtester
later.

Argh, and this now broke MSVC :-(

I'm tempted to change this block from Mkvcbuild.pm,

 else
 {
        croak "Could not determine contrib module type for $n\n";
 }

(i.e. it doesn't find any of PROGRAM, MODULES, MODULE_big in the
Makefile) so that instead of raising an error it simply skips the module
altogether.  That's pretty much equivalent to what Make would do.  Maybe
restrict this behavior to within src/test/modules.




An immediate solution would probably be to add it to @contrib_excludes.

cheers

andrew




Re: [HACKERS] buildfarm does not test make check

2015-08-13 Thread Andrew Dunstan



On 08/13/2015 02:52 PM, Alvaro Herrera wrote:

You're also going to have to handle the msvc side of things. That won't be
trivial. See discussion elsewhere today about how we've got that wrong
recently.

Oh my.  The pg_upgrade code in src/tools/msvc/vcregress.pl looks rather
unhealthy, and I don't see anything there to handle test_decoding, so I
assume that's done differently somehow.  (For pg_upgrade surely we should
not be using a shell script ...)

I don't have time to go through this right now, but it's on my list of
things to think about.





Don't worry about the existing cases. Just create something that handles 
new test cases in a generic way.


cheers

andrew




Re: [HACKERS] [BUG] CustomScan-custom_plans are not copied

2015-08-13 Thread Alvaro Herrera
Kouhei Kaigai wrote:
 Hello,
 
 The attached patch fixes up my careless misses when custom_plans field was 
 added.
 Please apply this fix.

This was applied by Noah as part of b8fe12a83622b.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] can't coax query planner into using all columns of a gist index

2015-08-13 Thread Gideon Dresdner
That did it! I certainly should have been able to figure that out on my
own. Thanks for the help!

Unfortunately, I'm still looking at rather slow queries across my entire
dataset. I might wind up having to find another solution.

Gideon.

On Wed, Aug 12, 2015 at 6:29 PM Tom Lane t...@sss.pgh.pa.us wrote:

 Gideon Dresdner gide...@gmail.com writes:
  I've created a small dump of my database that recreates the problem. I
 hope
  that this will help recreate the problem. It is attached. I'd be happy to
  hear if there is an easier way of doing this.

 Ah.  Now that I see the database schema, the problem is here:

 regression=# \d vcf
 ...
  chr   | smallint |
 ...

 So chr is smallint in one table and integer in the other.  That means
 the parser translates qcregions.chr = vcf.chr using the int42eq operator
 instead of int4eq --- and nobody's ever taught btree_gist about crosstype
 operators.  So the clause simply isn't considered indexable with this
 index.  If you change the query to qcregions.chr = vcf.chr::int then
 all is well.

 Personally I'd just change vcf.chr to integer --- it's not even saving you
 any space, with that table schema, because the next column has to be
 int-aligned anyway.

 regards, tom lane



Re: [HACKERS] buildfarm does not test make check

2015-08-13 Thread Alvaro Herrera
Andrew Dunstan wrote:

 The buildfarm server doesn't control anything the clients do except it
 provides them with a list of branches we are interested in. Even that is
 invisible to the main program, and only used by the wrapper run_branches.pl.
 The main program can run entirely offline and I'm going to resist moves to
 make it otherwise. (Yes it needs access to git, but even that can be
 finessed.)

Understood.

 If you want some way of specifying this, put it in the source, e.g. invent a
 directory src/test/buildfarm_support and put it there. That way committers
 have automatic access to it, which they certainly don't to the buildfarm
 server. I'd rather just make it a small piece of perl instead of
 JSON,though.

Makes plenty of sense, thanks.

 You're also going to have to handle the msvc side of things. That won't be
 trivial. See discussion elsewhere today about how we've got that wrong
 recently.

Oh my.  The pg_upgrade code in src/tools/msvc/vcregress.pl looks rather
unhealthy, and I don't see anything there to handle test_decoding, so I
assume that's done differently somehow.  (For pg_upgrade surely we should
not be using a shell script ...)

I don't have time to go through this right now, but it's on my list of
things to think about.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] [COMMITTERS] pgsql: Re-add BRIN isolation test

2015-08-13 Thread Alvaro Herrera
Alvaro Herrera wrote:
 Re-add BRIN isolation test
 
 This time, instead of using a core isolation test, put it on its own
 test module; this way it can require the pageinspect module to be
 present before running.
 
 The module's Makefile is loosely modeled after test_decoding's, so that
 it's easy to add further tests for either pg_regress or isolationtester
 later.

Argh, and this now broke MSVC :-(

I'm tempted to change this block from Mkvcbuild.pm,

else
{
        croak "Could not determine contrib module type for $n\n";
}

(i.e. it doesn't find any of PROGRAM, MODULES, MODULE_big in the
Makefile) so that instead of raising an error it simply skips the module
altogether.  That's pretty much equivalent to what Make would do.  Maybe
restrict this behavior to within src/test/modules.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




[HACKERS] Multi-tenancy with RLS

2015-08-13 Thread Haribabu Kommi
This is regarding supporting multi-tenancy in a single PostgreSQL instance
using the row level security feature. The main idea is to have row
level security
enabled on the system catalog tables, so that a user sees only the rows for
objects that are either system objects or objects the user owns.


Example:

postgres=# create role test login;
postgres=# create role test1 login;

postgres=# \c postgres test
postgres=> create table test(f1 int);
postgres=> \d
   List of relations
 Schema | Name | Type  | Owner
+--+---+---
 public | test | table | test
(1 row)

postgres=> \c postgres test1
postgres=> create table test1(f1 int);
postgres=> \d
   List of relations
 Schema | Name  | Type  | Owner
+---+---+---
 public | test1 | table | test1
(1 row)

postgres=# \c postgres test
postgres=> \d
   List of relations
 Schema | Name | Type  | Owner
+--+---+---
 public | test | table | test
(1 row)


To enable row level security on system catalog tables, currently I
added a new database option to create/alter database. The syntax can
be changed later. Adding an option to database makes it easier for
users to enable/disable the row level security on system catalog
tables.

CREATE DATABASE USERDB WITH ROW LEVEL SECURITY = TRUE;
ALTER DATABASE USERDB WITH ROW LEVEL SECURITY = FALSE;

A new boolean column datrowlevelsecurity is added to pg_database
system catalog table to display the status of row level security on
that database.

Currently, as a proof of concept, I implemented row level security only
for the pg_class system table. Whenever row level security on the
database is enabled or disabled, it internally fires the create
policy/remove policy commands using SPI interfaces.

Here I attached the proof-of-concept patch.

Pending items:
1. Supporting of RLS on all system catalog tables
2. Instead of SPI interfaces, any better way to create/remove policies.

Any comments/suggestions regarding the way to achieve multi-tenancy in
PostgreSQL?

Regards,
Hari Babu
Fujitsu Australia


multi-tenancy_with_rls_poc.patch
Description: Binary data



[HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.

2015-08-13 Thread Michael Paquier
On Fri, Aug 14, 2015 at 12:54 AM, Michael Paquier
michael.paqu...@gmail.com wrote:
 On Mon, Jun 29, 2015 at 10:11 PM, Michael Paquier
 michael.paqu...@gmail.com wrote:
 On Wed, Mar 18, 2015 at 1:59 PM, Michael Paquier
 michael.paqu...@gmail.com wrote:
 Feedback is of course welcome, but note that I am not seriously
 expecting any until we get into 9.6 development cycle and I am adding
 this patch to the next CF.

 I have moved this patch to CF 2015-09, as I have enough patches to
 take care of for now... Let's focus on Windows support and improvement
 of logging for TAP in the first round. That will be already a good
 step forward.

 OK, attached is a new version of this patch, that I have largely
 reworked to have more user-friendly routines for the tests. The number
 of tests is still limited, but it shows what this facility can do:
 that's on purpose as it does not make much sense to code a complete
 and complicated set of tests as long as the core routines are not
 stable, hence let's focus on that first.
 I have not yet run tests on Windows; I am expecting some tricks to be
 needed for the archive and recovery commands generated for the tests.

Attached is v3. I have tested and fixed the tests such as they can run
on Windows. archive_command and restore_command are using Windows'
copy when needed. There was also a bug with the use of a hot standby
instead of a warm one, causing test 002 to fail.
I am rather happy with the shape of this patch now, so feel free to review it...
Regards,
-- 
Michael
diff --git a/src/bin/pg_rewind/RewindTest.pm b/src/bin/pg_rewind/RewindTest.pm
index 22e5cae..8dd0fb7 100644
--- a/src/bin/pg_rewind/RewindTest.pm
+++ b/src/bin/pg_rewind/RewindTest.pm
@@ -125,38 +125,6 @@ sub check_query
 	}
 }
 
-# Run a query once a second, until it returns 't' (i.e. SQL boolean true).
-sub poll_query_until
-{
-	my ($query, $connstr) = @_;
-
-	my $max_attempts = 30;
-	my $attempts = 0;
-	my ($stdout, $stderr);
-
-	while ($attempts < $max_attempts)
-	{
-		my $cmd = [ 'psql', '-At', '-c', $query, '-d', $connstr ];
-		my $result = run $cmd, '>', \$stdout, '2>', \$stderr;
-
-		chomp($stdout);
-		$stdout =~ s/\r//g if $Config{osname} eq 'msys';
-		if ($stdout eq "t")
-		{
-			return 1;
-		}
-
-		# Wait a second before retrying.
-		sleep 1;
-		$attempts++;
-	}
-
-	# The query result didn't change in 30 seconds. Give up. Print the stderr
-	# from the last attempt, hopefully that's useful for debugging.
-	diag $stderr;
-	return 0;
-}
-
 sub append_to_file
 {
 	my ($filename, $str) = @_;
diff --git a/src/test/Makefile b/src/test/Makefile
index b713c2c..d6e51eb 100644
--- a/src/test/Makefile
+++ b/src/test/Makefile
@@ -17,7 +17,7 @@ SUBDIRS = regress isolation modules
 # We don't build or execute examples/, locale/, or thread/ by default,
 # but we do want make clean etc to recurse into them.  Likewise for ssl/,
 # because the SSL test suite is not secure to run on a multi-user system.
-ALWAYS_SUBDIRS = examples locale thread ssl
+ALWAYS_SUBDIRS = examples locale thread ssl recovery
 
 # We want to recurse to all subdirs for all standard targets, except that
 # installcheck and install should not recurse into the subdirectory modules.
diff --git a/src/test/perl/TestLib.pm b/src/test/perl/TestLib.pm
index 4927d45..cc1287f 100644
--- a/src/test/perl/TestLib.pm
+++ b/src/test/perl/TestLib.pm
@@ -12,6 +12,7 @@ our @EXPORT = qw(
   configure_hba_for_replication
   start_test_server
   restart_test_server
+  poll_query_until
   psql
   slurp_dir
   slurp_file
@@ -219,6 +220,37 @@ END
 	}
 }
 
+sub poll_query_until
+{
+	my ($query, $connstr) = @_;
+
+	my $max_attempts = 30;
+	my $attempts = 0;
+	my ($stdout, $stderr);
+
+	while ($attempts < $max_attempts)
+	{
+		my $cmd = [ 'psql', '-At', '-c', $query, '-d', $connstr ];
+		my $result = run $cmd, '>', \$stdout, '2>', \$stderr;
+
+		chomp($stdout);
+		$stdout =~ s/\r//g if $Config{osname} eq 'msys';
+		if ($stdout eq "t")
+		{
+			return 1;
+		}
+
+		# Wait a second before retrying.
+		sleep 1;
+		$attempts++;
+	}
+
+	# The query result didn't change in 30 seconds. Give up. Print the stderr
+	# from the last attempt, hopefully that's useful for debugging.
+	diag $stderr;
+	return 0;
+}
+
 sub psql
 {
 	my ($dbname, $sql) = @_;
diff --git a/src/test/recovery/.gitignore b/src/test/recovery/.gitignore
new file mode 100644
index 000..499fa7d
--- /dev/null
+++ b/src/test/recovery/.gitignore
@@ -0,0 +1,3 @@
+# Generated by test suite
+/regress_log/
+/tmp_check/
diff --git a/src/test/recovery/Makefile b/src/test/recovery/Makefile
new file mode 100644
index 000..16c063a
--- /dev/null
+++ b/src/test/recovery/Makefile
@@ -0,0 +1,17 @@
+#-------------------------------------------------------------------------
+#
+# Makefile for src/test/recovery
+#
+# Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
+# Portions Copyright (c) 1994, Regents of the University of California
+#
+# src/test/recovery/Makefile
+#

Re: [HACKERS] Test code is worth the space

2015-08-13 Thread Jim Nasby

On 8/13/15 1:31 AM, Peter Geoghegan wrote:

On Wed, Aug 12, 2015 at 11:23 PM, Magnus Hagander mag...@hagander.net wrote:

The value of a core regression suite that takes less time to run has
to be weighed against the possibility that a better core regression
suite might cause us to find more bugs before committing.  That could
easily be worth the price in runtime.



Or have a quickcheck you run all the time and then run the bigger one once
before committing perhaps?


I favor splitting the regression tests to add "all the time" and
"before commit" targets as you describe. I think that once the
facility is there, we can determine over time how expansive that
second category gets to be.


I don't know how many folks work in a github fork of Postgres, but 
anyone that does could run slow checks on every single push via 
Travis-CI. [1] is an example of that. That wouldn't work directly since 
it depends on Peter Eisentraut's scripts [2] that pull from 
apt.postgresql.org, but presumably it wouldn't be too hard to get tests 
running directly out of a Postgres repo.


[1] https://travis-ci.org/decibel/variant
[2] See wget URLs in 
https://github.com/decibel/variant/blob/master/.travis.yml

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Data in Trouble? Get it in Treble! http://BlueTreble.com




Re: [HACKERS] Re: [COMMITTERS] Using contrib modules in check (Re: pgsql: Fix BRIN to use SnapshotAny during summarization)

2015-08-13 Thread Alvaro Herrera
Alvaro Herrera wrote:
 Tom Lane wrote:

  (If I were tasked with fixing it, I'd be tempted to rewrite it to do
  all the work in one call and return a tuplestore; the alternative
  seems to be to try to keep the index open across multiple calls,
  which would be a mess.)
 
 Here's a patch doing that.

Pushed, thanks.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] TAP tests are badly named

2015-08-13 Thread Michael Paquier
On Fri, Aug 14, 2015 at 12:17 AM, Andrew Dunstan wrote:
 here's what I propose.

This patch does not take into account that there may be other code
paths than src/bin/ that may have TAP tests (see my pending patch to
test pg_dump with extensions including dumpable relations for
example). I guess that it is done on purpose, now what are we going to
do about the following things:
- for src/test/ssl, should we have a new target in vcregress? Like ssltest?
- for the pending patch I just mentioned, what should we do then?
Should we expect it to work under modulescheck?
-- 
Michael




Re: [HACKERS] TAP tests are badly named

2015-08-13 Thread Andrew Dunstan



On 08/13/2015 12:03 PM, Michael Paquier wrote:

On Fri, Aug 14, 2015 at 12:17 AM, Andrew Dunstan wrote:

here's what I propose.

This patch does not take into account that there may be other code
paths than src/bin/ that may have TAP tests (see my pending patch to
test pg_dump with extensions including dumpable relations for
example). I guess that it is done on purpose, now what are we going to
do about the following things:
- for src/test/ssl, should we have a new target in vcregress? Like ssltest?
- for the pending patch I just mentioned, what should we do then?
Should we expect it to work under modulescheck?



Of course it takes it into account. What it does is let you add extra 
checks easily. But I am not going to accept the divergence in 
vcregress.pl from the way we run tests using the standard tools. Yes, 
ssl tests should have a new target, as should any other new set of 
tests. We don't have a single make target for tap tests and neither 
should we in vcregress.pl.


cheers

andrew




[HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.

2015-08-13 Thread Michael Paquier
On Mon, Jun 29, 2015 at 10:11 PM, Michael Paquier
michael.paqu...@gmail.com wrote:
 On Wed, Mar 18, 2015 at 1:59 PM, Michael Paquier
 michael.paqu...@gmail.com wrote:
 Feedback is of course welcome, but note that I am not seriously
 expecting any until we get into 9.6 development cycle and I am adding
 this patch to the next CF.

 I have moved this patch to CF 2015-09, as I have enough patches to
 take care of for now... Let's focus on Windows support and improvement
 of logging for TAP in the first round. That will be already a good
 step forward.

OK, attached is a new version of this patch, that I have largely
reworked to have more user-friendly routines for the tests. The number
of tests is still limited, but it shows what this facility can do:
that's on purpose as it does not make much sense to code a complete
and complicated set of tests as long as the core routines are not
stable, hence let's focus on that first.
I have not yet run tests on Windows; I am expecting some tricks to be
needed for the archive and recovery commands generated for the tests.
Regards,
-- 
Michael
diff --git a/src/bin/pg_rewind/RewindTest.pm b/src/bin/pg_rewind/RewindTest.pm
index 22e5cae..8dd0fb7 100644
--- a/src/bin/pg_rewind/RewindTest.pm
+++ b/src/bin/pg_rewind/RewindTest.pm
@@ -125,38 +125,6 @@ sub check_query
 	}
 }
 
-# Run a query once a second, until it returns 't' (i.e. SQL boolean true).
-sub poll_query_until
-{
-	my ($query, $connstr) = @_;
-
-	my $max_attempts = 30;
-	my $attempts = 0;
-	my ($stdout, $stderr);
-
-	while ($attempts < $max_attempts)
-	{
-		my $cmd = [ 'psql', '-At', '-c', $query, '-d', $connstr ];
-		my $result = run $cmd, '>', \$stdout, '2>', \$stderr;
-
-		chomp($stdout);
-		$stdout =~ s/\r//g if $Config{osname} eq 'msys';
-		if ($stdout eq "t")
-		{
-			return 1;
-		}
-
-		# Wait a second before retrying.
-		sleep 1;
-		$attempts++;
-	}
-
-	# The query result didn't change in 30 seconds. Give up. Print the stderr
-	# from the last attempt, hopefully that's useful for debugging.
-	diag $stderr;
-	return 0;
-}
-
 sub append_to_file
 {
 	my ($filename, $str) = @_;
diff --git a/src/test/Makefile b/src/test/Makefile
index b713c2c..d6e51eb 100644
--- a/src/test/Makefile
+++ b/src/test/Makefile
@@ -17,7 +17,7 @@ SUBDIRS = regress isolation modules
 # We don't build or execute examples/, locale/, or thread/ by default,
 # but we do want make clean etc to recurse into them.  Likewise for ssl/,
 # because the SSL test suite is not secure to run on a multi-user system.
-ALWAYS_SUBDIRS = examples locale thread ssl
+ALWAYS_SUBDIRS = examples locale thread ssl recovery
 
 # We want to recurse to all subdirs for all standard targets, except that
 # installcheck and install should not recurse into the subdirectory modules.
diff --git a/src/test/perl/TestLib.pm b/src/test/perl/TestLib.pm
index 4927d45..cc1287f 100644
--- a/src/test/perl/TestLib.pm
+++ b/src/test/perl/TestLib.pm
@@ -12,6 +12,7 @@ our @EXPORT = qw(
   configure_hba_for_replication
   start_test_server
   restart_test_server
+  poll_query_until
   psql
   slurp_dir
   slurp_file
@@ -219,6 +220,37 @@ END
 	}
 }
 
+sub poll_query_until
+{
+	my ($query, $connstr) = @_;
+
+	my $max_attempts = 30;
+	my $attempts = 0;
+	my ($stdout, $stderr);
+
+	while ($attempts < $max_attempts)
+	{
+		my $cmd = [ 'psql', '-At', '-c', $query, '-d', $connstr ];
+		my $result = run $cmd, '>', \$stdout, '2>', \$stderr;
+
+		chomp($stdout);
+		$stdout =~ s/\r//g if $Config{osname} eq 'msys';
+		if ($stdout eq "t")
+		{
+			return 1;
+		}
+
+		# Wait a second before retrying.
+		sleep 1;
+		$attempts++;
+	}
+
+	# The query result didn't change in 30 seconds. Give up. Print the stderr
+	# from the last attempt, hopefully that's useful for debugging.
+	diag $stderr;
+	return 0;
+}
+
 sub psql
 {
 	my ($dbname, $sql) = @_;
diff --git a/src/test/recovery/.gitignore b/src/test/recovery/.gitignore
new file mode 100644
index 0000000..499fa7d
--- /dev/null
+++ b/src/test/recovery/.gitignore
@@ -0,0 +1,3 @@
+# Generated by test suite
+/regress_log/
+/tmp_check/
diff --git a/src/test/recovery/Makefile b/src/test/recovery/Makefile
new file mode 100644
index 0000000..16c063a
--- /dev/null
+++ b/src/test/recovery/Makefile
@@ -0,0 +1,17 @@
+#-------------------------------------------------------------------------
+#
+# Makefile for src/test/recovery
+#
+# Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
+# Portions Copyright (c) 1994, Regents of the University of California
+#
+# src/test/recovery/Makefile
+#
+#-------------------------------------------------------------------------
+
+subdir = src/test/recovery
+top_builddir = ../../..
+include $(top_builddir)/src/Makefile.global
+
+check:
+	$(prove_check)
diff --git a/src/test/recovery/README b/src/test/recovery/README
new file mode 100644
index 0000000..20b98e0
--- /dev/null
+++ b/src/test/recovery/README
@@ -0,0 +1,19 @@
+src/test/recovery/README
+
+Regression tests for recovery and replication