Re: [HACKERS] No Issue Tracker - Say it Ain't So!

2015-09-29 Thread Torsten Zuehlsdorff

Hello,

On 29.09.2015 00:34, Tom Lane wrote:

Jim Nasby  writes:

On 9/28/15 11:43 AM, Andres Freund wrote:

It has been stated pretty clearly in this thread by a number of senior
community people that we're not going to use a closed source system.



GitLab OTOH is released under an MIT license, so it is an option. I don't
know how it compares to other suggested options, but if YUriy wants to
propose it, it's at least a viable option.


I think a more accurate summary of what's been said is that we won't
consider putting any important functionality on proprietary platforms,
of which closed-source tools would be a subset.  The intention of the
community is that we'll be around for as long as there's a critical mass
of people interested in maintaining Postgres.  We will not be dependent
on any one company, and that's why e.g. github is out.  (A lot of smaller
open-source projects don't have the luxury of rejecting such options ...
but we do, and we will.)

Now, running gitlab on community-owned hardware would potentially be an
option, if we find gitlab attractive from a functionality standpoint.
The question I'd have about that is whether it has a real development
community, or is open-source in name only.  If github did go belly up,
would we find ourselves maintaining the gitlab code all by ourselves?
That might not be the end of the world, but it wouldn't be a good use
of community time either.


I ported GitLab to run on FreeBSD. From that process I can say that 
there is an active community. Many of the features (and bugfixes) came 
directly from the community.


I'm not for or against GitLab. I just wanted to point this out.

As mentioned in another thread, Bugzilla is used by LibreOffice and 
Linux. I want to add FreeBSD to the list.


Greetings,
Torsten



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Foreign join pushdown vs EvalPlanQual

2015-09-29 Thread Etsuro Fujita

On 2015/09/29 13:55, Kouhei Kaigai wrote:

From: pgsql-hackers-ow...@postgresql.org
[mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Etsuro Fujita
On 2015/09/29 9:13, Kouhei Kaigai wrote:



From: pgsql-hackers-ow...@postgresql.org
[mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Robert Haas
On Mon, Sep 28, 2015 at 3:34 AM, Kouhei Kaigai  wrote:



The attached patch allows FDW driver to handle EPQ recheck by its own
preferable way, even if it is alternative local join or ExecQual to
the expression being pushed down.



Thanks.  I was all set to commit this, or at least part of it, when I
noticed that we already have an FDW callback called RefetchForeignRow.
We seem to be intending that this new callback should refetch the row
from the foreign server and verify that any pushed-down quals apply to
it.  But why can't RefetchForeignRow do that?  That seems to be what
it's for.



I thought the same thing [1].  While I thought it was relatively easy to
make changes to RefetchForeignRow that way for the foreign table case
(scanrelid>0), I was not sure how hard it would be to do so for the
foreign join case (scanrelid==0).  So, I proposed to leave those changes
for 9.6.  I'll have a rethink on this issue along the lines of that
approach.



Even in the base relation case, is it really that easy to do?

RefetchForeignRow() does not take a ForeignScanState as its argument,
so it is not obvious how to access its private field, is it?
ExecRowMark contains an "rti" field, so it might be feasible to find
the target PlanState using the walker routine recently added, although
that is not simple.
Unless we have a reference to the private field, it is not feasible
to access the expression that was pushed down to the remote side, and
therefore we cannot apply proper rechecks here.

In addition, it is problematic when scanrelid==0 because we have no
relevant ForeignScanState which represents the base relations, even
though ExecRowMark is associated with a particular base relation.
In the scanrelid==0 case, the EPQ recheck routine also has to ensure
that the EPQ tuple is visible with respect to the join condition in
addition to the qualifier of the base relation. This information is also
stored within the private data field, so it needs a reference to the
private data of the ForeignScanState of the remote join (scanrelid==0)
which contains the target relation.

Could you explain (1) how to access the private data field of
ForeignScanState from the RefetchForeignRow callback, and (2) why it
is more reasonable to implement this there than in the ForeignRecheck()
callback?


For the foreign table case (scanrelid>0), I imagined an approach 
different than yours.  In that case, I thought the issue would be 
probably addressed by just modifying the remote query performed in 
RefetchForeignRow, which would be of the form "SELECT ctid, * FROM 
remote table WHERE ctid = $1", so that the modified query would be of 
the form "SELECT ctid, * FROM remote table WHERE ctid = $1 AND *remote 
quals*".


For the foreign join case (scanrelid==0), in my view, I think we would 
need some changes not only to RefetchForeignRow but also to the existing 
EvalPlanQual machinery in core.  I don't have a clear picture yet, though.


Best regards,
Etsuro Fujita



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] unclear about row-level security USING vs. CHECK

2015-09-29 Thread Charles Clavadetscher
I had not seen this.

> -Original Message-
> From: pgsql-hackers-ow...@postgresql.org 
> [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Robert Haas
> Sent: Montag, 28. September 2015 21:43
> To: Stephen Frost 
> Cc: Peter Eisentraut ; pgsql-hackers 
> ; Charles Clavadetscher
> 
> Subject: Re: [HACKERS] unclear about row-level security USING vs. CHECK
> 
> On Mon, Sep 28, 2015 at 3:15 PM, Stephen Frost  wrote:
> > I listed out the various alternatives but didn't end up getting any
> > responses to it.  I'm still of the opinion that the documentation is the
> > main thing which needs improving here, but we can also change CREATE
> > POLICY, et al, to require an explicit WITH CHECK clause for the commands
> > where that makes sense if that's the consensus.
> 
> My vote is to remove the behavior where USING flows over to WITH
> CHECK.  So you only get a WITH CHECK policy if you explicitly specify
> one.
> 
> If there's some other consensus, OK, but tempus fugit.

If the behavior of USING doesn't flow to WITH CHECK, that is the same as making WITH 
CHECK mandatory for ALL and UPDATE, I guess. Otherwise there would be 
partially unspecified behavior. Or am I misunderstanding your idea?

Charles





-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] No Issue Tracker - Say it Ain't So!

2015-09-29 Thread Torsten Zuehlsdorff

On 29.09.2015 05:54, Tom Lane wrote:

Stephen Frost  writes:

* Jim Nasby (jim.na...@bluetreble.com) wrote:

2 years ago is when they first released the enterprise edition,
which according to [1] had "The most important new feature is that
you can now add members to groups of projects."



It needed a lot more than a single feature.


Just going to their primary web page, and noting that the first line gives
links to "Features" and "Pricing" (but not "Documentation"), and the
second line makes it clear that there's a big difference between the
"Community Edition" and the "Enterprise Edition", is already enough to
turn me off.  We've seen that model before (mysql...) and it doesn't bode
well in the long run.

Further poking shows no evidence of any decent email integration, to name
just one of the things that have been mentioned as requirements in this
thread.


That is a fair point. First steps in this direction were taken with 
version 8.0, which was released a week ago.



On the other hand, they are big on LDAP logins, and even
two-factor authentication.  (We need this for an issue tracker that's
supposed to provide visibility and easier reporting to people outside the
project?)


Login methods are well supported; various login strategies are 
available.



And JIRA integration, which seems to be an anti-feature to some
on this thread.


It is not only JIRA; JIRA is just one entry in a long list of integrations. 
Many people like the Jenkins integration to support CI, for example.



And they'd sure love to be in charge of our code repo.


Mh - I'm not a native speaker. I didn't understand this line.


And the main, possibly only, metadata they can track about an issue is
"assignee" and "milestone".


Indeed - GitLab is *not* a bug tracker. It's a web-based Git repository 
manager. It also offers issue tracking, but that is not the main idea of 
GitLab. Therefore I doubt that it's the best choice for the community, too.


Greetings,
Torsten



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [Proposal] Table partition + join pushdown

2015-09-29 Thread Taiki Kondo
Hello, KaiGai-san

Thank you for your introduction and comments.

> * Suppose we focus on only HashJoin in the first version?
> This patch also add support on NestLoop and MergeJoin, however, NestLoop
> has no valuable scenario without parallel execution capability, and the
> most valuable scenario on MergeJoin is reduction of rows prior to Sort.
> Once input rows gets sorted, it is less attractive to filter out rows.

I agree that handling NestLoop doesn't make sense at this point,
but I think that handling MergeJoin still does.

In my v1 patch, the additional filter is applied for qualification at the
same place as the join filter, the same as for NestLoop.
That is indeed not useful; I agree with you on this point.

I think, and you also mentioned, that a large factor in the cost estimation
for MergeJoin is the Sort under the MergeJoin, so I believe additional
filtering at the Sort is a good choice for this situation, in the same way
as at the Hash under a HashJoin.

Furthermore, I think a better way is for the additional filtering to be
"added" to the Scan node under each child (pushed-down) Join node, because
then we don't have to implement additional qualification at the Join nodes
and only have to concatenate the original and additional RestrictInfos for
filtering.

As a rough idea for realizing this, I think we have to implement copyObject()
for Scan path nodes and use ppi_clauses for this purpose.

What is your opinion?


> * MultiExecHash() once put slot on outer_slot then move it to inner_slot
> This patch add set_hash_references() to replace varno in the expression
> of Hash->filterqual to OUTER_VAR. Why not INNER_VAR?
> If Var nodes would be initialized t oreference inner_slot, you don't need
> to re-assign slot.

The node under the Hash node is connected as the OUTER node. This
implementation probably comes from set_dummy_tlist_references(), commonly
used by Material, Sort, Unique, SetOp, and Hash.

I also faced a problem when implementing EXPLAIN for the additional filter.
I implemented it as you mentioned above, and then an error occurred when
running EXPLAIN. I think EXPLAIN expects an expression's varno to match the
position the underlying node is connected to; i.e. if it is connected to
OUTER, the varno must be OUTER_VAR.


> It seems to me it is not a fair estimation because inner_path_rows means
> number of rows already filtered out, but filter_qual shall be applied to
> all the inner input rows.

OK. I'll fix it.


Best regards,
--
Taiki Kondo

NEC Solution Innovators, Ltd.



-Original Message-
From: pgsql-hackers-ow...@postgresql.org 
[mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Kouhei Kaigai
Sent: Tuesday, September 29, 2015 11:46 AM
To: Taiki Kondo
Cc: Akio Iwaasa; pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] [Proposal] Table partition + join pushdown

> -Original Message-
> From: pgsql-hackers-ow...@postgresql.org
> [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Taiki Kondo
> Sent: Thursday, September 24, 2015 8:06 PM
> To: Kaigai Kouhei(海外 浩平)
> Cc: Iwaasa Akio(岩浅 晃郎); pgsql-hackers@postgresql.org
> Subject: Re: [HACKERS] [Proposal] Table partition + join pushdown
> 
> Hello, KaiGai-san.
> 
> Thank you for your comment, and sorry for late response.
> 
> The attached patch is completely rewritten from previous patch[1], at your
> suggestion[2].
> Please find attached.
>
Thanks for your work, and let me briefly introduce the purpose of the work,
because the last submission was in August.

His work intends (1) to reduce the resource consumption of table joins right
now, and (2) to provide an infrastructure for one parallel join scenario
once the Funnel node becomes capable of it.

Even if we construct partitioned tables, they currently cannot be utilized to
filter out candidate rows of a join. As a result, the size of the Hash table
may grow larger than necessary, causing an unnecessary nBatch increase.

Below is the scenario this project tries to tackle. When a join takes a
partitioned table on one side, we usually need to scan the entire
partitioned table unless we can drop particular child tables.

  Join  cond (x = y)
-> Append
  -> SeqScan on tbl_child_0  ... CHECK (hash_func(x) % 4 = 0)
  -> SeqScan on tbl_child_1  ... CHECK (hash_func(x) % 4 = 1)
  -> SeqScan on tbl_child_2  ... CHECK (hash_func(x) % 4 = 2)
  -> SeqScan on tbl_child_3  ... CHECK (hash_func(x) % 4 = 3)
-> Hash
  -> SeqScan on other_table

However, the CHECK() constraints assigned to the child tables give us a hint
about which rows on the other side can never be relevant to this join.
For example, all the rows in other_table to be joined with tbl_child_0
must satisfy hash_func(y) % 4 = 0. We may be able to omit the
unrelated rows from the hash table in this case, which eventually allows
us to reduce the size of the hash table.
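
As a rough SQL-level illustration (this query is not from the patch; it only
reuses the table and function names from the example above), the effect for
the first child is as if the other side were pre-filtered before its hash
table is built:

  SELECT *
    FROM tbl_child_0 t
    JOIN (SELECT * FROM other_table
           WHERE hash_func(y) % 4 = 0) o   -- only rows that can match tbl_child_0
      ON t.x = o.y;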

In the case of INNER JOIN, we can rewrite the query execution plan as below.

  Append
-> HashJoin  cond (x = y)
  -> SeqScan on 

Re: [HACKERS] track_commit_timestamp and COMMIT PREPARED

2015-09-29 Thread Petr Jelinek

On 2015-09-29 13:44, Fujii Masao wrote:

On Tue, Sep 29, 2015 at 12:05 PM, Alvaro Herrera
 wrote:

Petr Jelinek wrote:

On 2015-09-02 16:14, Fujii Masao wrote:

On Wed, Aug 5, 2015 at 2:16 AM, Robert Haas  wrote:

On Mon, Aug 3, 2015 at 10:31 AM, Fujii Masao  wrote:

track_commit_timestamp tracks COMMIT PREPARED as expected in standby server,
but not in master server. Is this intentional? It should track COMMIT PREPARED
even in master? Otherwise, we cannot use commit_timestamp feature to check
the replication lag properly while we use 2PC.


That sounds like it must be a bug.  I think you should add it to the
open items list.


Attached fixes this. It includes advancement of replication origin as well.
I didn't feel like doing refactor of commit code this late in 9.5 cycle
though, so I went with the code duplication + note in xact.c.


Thanks, your proposed behavior looks reasonable.  I didn't like the
existing coding nor the fact that with your patch we'd have two copies
of it, so I changed a bit instead to be more understandable.  Hopefully I
didn't break too many things.  This patch includes the patch for the
other commitTS open item too.


-#define RecoveryRequiresBoolParameter(param_name, currValue, masterValue) \
-do { \
-bool _currValue = (currValue); \
-bool _masterValue = (masterValue); \
-if (_currValue != _masterValue) \
-ereport(ERROR, \
-(errcode(ERRCODE_INVALID_PARAMETER_VALUE), \
- errmsg("hot standby is not possible because it
requires \"%s\" to be same on master and standby (master has \"%s\",
standby has \"%s\")", \
-param_name, \
-_masterValue ? "true" : "false", \
-_currValue ? "true" : "false"))); \
-} while(0)

This code should not be deleted because there is still the caller of
the macro function.



Looks like Alvaro didn't merge the second patch correctly, the only 
caller should have been removed as well.


--
 Petr Jelinek  http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Foreign join pushdown vs EvalPlanQual

2015-09-29 Thread Kouhei Kaigai




> -Original Message-
> From: pgsql-hackers-ow...@postgresql.org
> [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Etsuro Fujita
> Sent: Tuesday, September 29, 2015 8:00 PM
> To: Kaigai Kouhei(海外 浩平); Robert Haas
> Cc: PostgreSQL-development; 花田茂
> Subject: Re: [HACKERS] Foreign join pushdown vs EvalPlanQual
> 
> On 2015/09/29 17:49, Kouhei Kaigai wrote:
> >> From: pgsql-hackers-ow...@postgresql.org
> >> [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Etsuro Fujita
> 
> >>> RefetchForeignRow() does not take ForeignScanState as its argument,
> >>> so it is not obvious to access its private field, isn't it?
> >>> ExecRowMark contains "rti" field, so it might be feasible to find out
> >>> the target PlanState using walker routine recently supported, although
> >>> it is not a simple enough.
> >>> Unless we don't have reference to the private field, it is not feasible
> >>> to access expression that was pushed down to the remote-side, therefore,
> >>> it does not allow to apply proper rechecks here.
> 
> >>> Could you introduce us (1) how to access private data field of
> >>> ForeignScanState from the RefetchForeignRow callback?
> 
> >> For the foreign table case (scanrelid>0), I imagined an approach
> >> different than yours.  In that case, I thought the issue would be
> >> probably addressed by just modifying the remote query performed in
> >> RefetchForeignRow, which would be of the form "SELECT ctid, * FROM
> >> remote table WHERE ctid = $1", so that the modified query would be of
> >> the form "SELECT ctid, * FROM remote table WHERE ctid = $1 AND *remote
> >> quals*".
> 
> Sorry, I forgot to add "FOR UPDATE" to the before/after queries.
> 
> > My question is how to pull expression of the remote query.
> > It shall be stored at somewhere private field of ForeignScanState,
> > however, RefetchForeignRow does not have direct access to the
> > relevant ForeignScanState node.
> > It is what I asked at the question (1).
> 
> I imagined the following steps to get the remote query string: (1)
> create the remote query string and store it in fdw_private during
> postgresGetForeignPlan, (2) extract the string from fdw_private and
> store it in erm->ermExtra during postgresBeginForeignScan, and (3)
> extract the string from erm->ermExtra in postgresRefetchForeignRow.
> 
> > Also note that EvalPlanQualFetchRowMarks() will raise an error
> > if RefetchForeignRow callback returned NULL tuple.
> > Is it right or expected behavior?
> 
> IIUC, I think that that behavior is reasonable.
> 
> > It looks to me this callback is designed to pull out a particular
> > tuple identified by the remote-row-id, regardless of the qualifier
> > checks based on the latest value.
> 
> Because erm->markType==ROW_MARK_REFERENCE, I don't think that that
> behavior would cause any problem.  Maybe I'm missing something, though.
>
Really?

ExecLockRows() calls EvalPlanQualFetchRowMarks() to fill up the EPQ tuple
slots prior to EvalPlanQualNext(), because these tuples are referenced
during EPQ rechecks.
The purpose of EvalPlanQualNext() is to evaluate whether the current bunch
of rows is visible with respect to the qualifiers of the underlying scan/join.
Then, if not visible, it *ignores* the current tuples, as follows.

    /*
     * Now fetch any non-locked source rows --- the EPQ logic knows how to
     * do that.
     */
    EvalPlanQualSetSlot(&node->lr_epqstate, slot);
    EvalPlanQualFetchRowMarks(&node->lr_epqstate);   <--- LOAD REMOTE ROWS

    /*
     * And finally we can re-evaluate the tuple.
     */
    slot = EvalPlanQualNext(&node->lr_epqstate);     <--- EVALUATE QUALIFIERS
    if (TupIsNull(slot))
    {
        /* Updated tuple fails qual, so ignore it and go on */
        goto lnext;   <-- IGNORE THE ROW, NOT RAISE AN ERROR
    }

What happens if RefetchForeignRow raises an error when the latest
row exists but violates the "remote quals"?
That is a case which should be ignored, unlike the case where the remote row
identified by the row-id doesn't exist.

Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei 


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Improving test coverage of extensions with pg_dump

2015-09-29 Thread Michael Paquier
On Sat, Sep 26, 2015 at 10:22 PM, Andres Freund  wrote:
> On 2015-09-26 21:54:46 +0900, Michael Paquier wrote:
>> On Wed, Sep 23, 2015 at 1:04 AM, Alvaro Herrera 
>> wrote:
>> > We discussed this in some other thread, not long ago.  I looked briefly
>> > in the archives but couldn't find it.  I think the conclusion was
>> > something along the lines of "hmm, tough".
>
>> That's hours, even days of fun ahead for sure.
>
> To me that's a somewhat serious bug, and something that we probably need
> to address at some point.

Well, addressing it at the root of the problem is... let's say...
really tough. I have put my head into this stuff for a couple of hours
and tried to draw up a couple of ways to fix it. First, even if
pg_dump relies heavily on the assumption that attributes need to be
ordered by attnum, I thought that it would have been possible to use
attinhcount to order the columns in the same way when taking the dump
from a source db and a target (where the dump of the source has been
restored to). This would work if there is no more than one level of child
relations, but with grandchild relations the order gets messed up
again.

Locating a fix on the backend side would make things easier for pg_dump;
an idea would be to simply reorder the attribute numbers when
a column is added to a parent table. For example, with something like
this:
CREATE TABLE test_parent (c1 integer, c2 integer);
CREATE TABLE test_child_1 (c3 integer) inherits (test_parent);
CREATE TABLE test_child_2 (c4 integer) inherits (test_child_1);
ALTER TABLE test_parent ADD COLUMN c5 integer;
We get the following attribute ordering:
=# SELECT c.relname, a.attname, a.attnum
FROM pg_attribute a
  JOIN pg_class c ON (a.attrelid = c.oid)
WHERE a.attrelid IN ('test_parent'::regclass,
 'test_child_1'::regclass,
 'test_child_2'::regclass)
  AND a.attnum > 0 ORDER BY c.relname, a.attnum;
   relname    | attname | attnum
--------------+---------+--------
 test_child_1 | c1      |      1
 test_child_1 | c2      |      2
 test_child_1 | c3      |      3
 test_child_1 | c5      |      4
 test_child_2 | c1      |      1
 test_child_2 | c2      |      2
 test_child_2 | c3      |      3
 test_child_2 | c4      |      4
 test_child_2 | c5      |      5
 test_parent  | c1      |      1
 test_parent  | c2      |      2
 test_parent  | c5      |      3
(12 rows)

Now, what we could do on a child relation when a new attribute is added to
its parent relation is to increment the attnum of the existing columns
with attinhcount = 0 by 1, and insert the new column at the end of the
existing ones where attinhcount > 0, to give something like this:
   relname    | attname | attnum
--------------+---------+--------
 test_child_1 | c1      |      1
 test_child_1 | c2      |      2
 test_child_1 | c5      |      3
 test_child_1 | c3      |      4
 test_child_2 | c1      |      1
 test_child_2 | c2      |      2
 test_child_2 | c3      |      3
 test_child_2 | c5      |      4
 test_child_2 | c4      |      5
 test_parent  | c1      |      1
 test_parent  | c2      |      2
 test_parent  | c5      |      3
(12 rows)
Looking at tablecmds.c, this could be intuitively done as a part of
ATExecAddColumn by scanning the attributes of the child relation from
the end. But it is of course not that simple; a lot of code paths rely
on the attnum of the current attributes. One is CookedConstraint, but
that's a can of worms for back branches.

Another band-aid solution, which would be less invasive for a backpatch,
is to reproduce the sequence of DDL commands within the dump itself:
we would need to dump attinhcount in getTableAttrs and use it to guess
which columns on the child relations have been added to a
parent relation after the children were created. This would not
solve the case of data-only dumps containing INSERT queries without
column names, taken from databases restored from past schema dumps, though.
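
For instance, with the tables above such a data-only dump could contain
something like

  INSERT INTO test_child_1 VALUES (1, 2, 3, 4);

which maps values purely by position, so the same statement restores
different data depending on whether the physical column order is
(c1, c2, c3, c5) or (c1, c2, c5, c3).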

Perhaps you did not look at the last patch I sent on this thread, but
I changed it so that a schedule is used with a call to pg_regress.
That's a more scalable approach regarding your concern, as we can
more easily plug in new modules and new regression tests without
modifying the perl script used to kick off the tests. It does not
run the main regression test suite, though, and it actually cannot run it,
except with a modified schedule or with a filter on the child-parent
tables. The patch is attached for consideration; it looks like a good
base for future support, but feel free to disagree :)
Thanks,
-- 
Michael
From a95e5bc29bebb2017399c777aee0123f5d4c8a15 Mon Sep 17 00:00:00 2001
From: Michael Paquier 
Date: Tue, 29 Sep 2015 21:33:50 +0900
Subject: [PATCH] Add test facility to check dump/restore with extensions

The test added compares a dump taken from a source database containing a set
of extensions and a target database after dumping the contents from source and
restore them 

Re: [HACKERS] No Issue Tracker - Say it Ain't So!

2015-09-29 Thread Steve Crawford
On Mon, Sep 28, 2015 at 4:41 PM, Jim Nasby  wrote:

> Note that since they also offer a hosted solution we should use that to
> play with instead of trying to install it at this point.
>
> Integrating the issue tracker looks like it's just a call to this API:
> http://doc.gitlab.com/ce/api/issues.html#new-issue. I don't normally do
> web development myself so I'd rather not figuring out how to setup a copy
> of the website to hack on, but if no one else wants to try it I can take a
> stab at it.
>
> Presumably mirroring our git repository would work the same as it does for
> mirroring to GitHub. My guess is that would be enough to get the basic
> git/issue tracker integration working.
>
> Commitfest could be tied in as well. Presumably each commitfest would be a
> milestone (http://doc.gitlab.com/ce/api/milestones.html) and each
> submission an issue.
>
>
One of the issues identified with GitHub is that it is closed and
commercial, which goes against the expressed desires of the PostgreSQL
community. As such, it's important to note that GitLab seems to follow the
"freemium" model with a "community" and an "enterprise" version, so for
comparison purposes we should only look at the features in the open-source
community version:
https://about.gitlab.com/features/#compare

Cheers,
Steve


Re: [HACKERS] Rework the way multixact truncations work

2015-09-29 Thread Alvaro Herrera
Robert Haas wrote:
> On Mon, Sep 28, 2015 at 10:48 PM, Jim Nasby  wrote:
> > Maybe I'm confused, but I thought the whole purpose of this was to get rid
> > of the risk associated with that calculation in favor of explicit truncation
> > boundaries in the WAL log.
> 
> Yes.  But if the master hasn't been updated yet, then we still need to
> do something based on a calculation.

Right.

> > Even if that's not the case, ISTM that being big and in your face about a
> > potential data corruption bug is a good thing, as long as the DBA has a way
> > to "hit the snooze button".
> 
> Panicking the standby because the master hasn't been updated does not
> seem like a good thing to me in any way.

If we had a way to force the master to upgrade, I think it would be good
because we have a mechanism to get rid of the legacy truncation code;
but as I said several messages ago this doesn't actually work which is
why I dropped the idea of panicking.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Foreign join pushdown vs EvalPlanQual

2015-09-29 Thread Etsuro Fujita

On 2015/09/29 17:49, Kouhei Kaigai wrote:

From: pgsql-hackers-ow...@postgresql.org
[mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Etsuro Fujita



RefetchForeignRow() does not take a ForeignScanState as its argument,
so it is not obvious how to access its private field, is it?
ExecRowMark contains an "rti" field, so it might be feasible to find
the target PlanState using the walker routine recently added, although
that is not simple.
Unless we have a reference to the private field, it is not feasible
to access the expression that was pushed down to the remote side, and
therefore we cannot apply proper rechecks here.



Could you explain (1) how to access the private data field of
ForeignScanState from the RefetchForeignRow callback?



For the foreign table case (scanrelid>0), I imagined an approach
different than yours.  In that case, I thought the issue would be
probably addressed by just modifying the remote query performed in
RefetchForeignRow, which would be of the form "SELECT ctid, * FROM
remote table WHERE ctid = $1", so that the modified query would be of
the form "SELECT ctid, * FROM remote table WHERE ctid = $1 AND *remote
quals*".


Sorry, I forgot to add "FOR UPDATE" to the before/after queries.


My question is how to pull the expression for the remote query.
It would be stored somewhere in the private field of the ForeignScanState;
however, RefetchForeignRow does not have direct access to the
relevant ForeignScanState node.
That is what I asked in question (1).


I imagined the following steps to get the remote query string: (1) 
create the remote query string and store it in fdw_private during 
postgresGetForeignPlan, (2) extract the string from fdw_private and 
store it in erm->ermExtra during postgresBeginForeignScan, and (3) 
extract the string from erm->ermExtra in postgresRefetchForeignRow.



Also note that EvalPlanQualFetchRowMarks() will raise an error
if the RefetchForeignRow callback returns a NULL tuple.
Is that the right or expected behavior?


IIUC, I think that that behavior is reasonable.


It looks to me like this callback is designed to pull out a particular
tuple identified by the remote row-id, regardless of the qualifier
checks based on the latest values.


Because erm->markType==ROW_MARK_REFERENCE, I don't think that that 
behavior would cause any problem.  Maybe I'm missing something, though.


Best regards,
Etsuro Fujita



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] track_commit_timestamp and COMMIT PREPARED

2015-09-29 Thread Fujii Masao
On Tue, Sep 29, 2015 at 12:05 PM, Alvaro Herrera
 wrote:
> Petr Jelinek wrote:
>> On 2015-09-02 16:14, Fujii Masao wrote:
>> >On Wed, Aug 5, 2015 at 2:16 AM, Robert Haas  wrote:
>> >>On Mon, Aug 3, 2015 at 10:31 AM, Fujii Masao  wrote:
>> >>>track_commit_timestamp tracks COMMIT PREPARED as expected in standby 
>> >>>server,
>> >>>but not in master server. Is this intentional? It should track COMMIT 
>> >>>PREPARED
>> >>>even in master? Otherwise, we cannot use commit_timestamp feature to check
>> >>>the replication lag properly while we use 2PC.
>> >>
>> >>That sounds like it must be a bug.  I think you should add it to the
>> >>open items list.
>>
>> Attached fixes this. It includes advancement of replication origin as well.
>> I didn't feel like doing refactor of commit code this late in 9.5 cycle
>> though, so I went with the code duplication + note in xact.c.
>
> Thanks, your proposed behavior looks reasonable.  I didn't like the
> existing coding nor the fact that with your patch we'd have two copies
> of it, so I changed a bit instead to be more understandable.  Hopefully I
> didn't break too many things.  This patch includes the patch for the
> other commitTS open item too.

-#define RecoveryRequiresBoolParameter(param_name, currValue, masterValue) \
-do { \
-bool _currValue = (currValue); \
-bool _masterValue = (masterValue); \
-if (_currValue != _masterValue) \
-ereport(ERROR, \
-(errcode(ERRCODE_INVALID_PARAMETER_VALUE), \
- errmsg("hot standby is not possible because it
requires \"%s\" to be same on master and standby (master has \"%s\",
standby has \"%s\")", \
-param_name, \
-_masterValue ? "true" : "false", \
-_currValue ? "true" : "false"))); \
-} while(0)

This code should not be deleted because there is still the caller of
the macro function.

@@ -5321,7 +5333,7 @@ xact_redo_commit(xl_xact_parsed_commit *parsed,
 /* Set the transaction commit timestamp and metadata */
 TransactionTreeSetCommitTsData(xid, parsed->nsubxacts, parsed->subxacts,
commit_time, origin_id,
-   false);
+   false, true);

Why does xact_redo_commit always set replaying_xlog and write_xlog to
false and true, respectively? ISTM that they should be opposite...

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] track_commit_timestamp and COMMIT PREPARED

2015-09-29 Thread Petr Jelinek

On 2015-09-29 05:05, Alvaro Herrera wrote:

Petr Jelinek wrote:

On 2015-09-02 16:14, Fujii Masao wrote:

On Wed, Aug 5, 2015 at 2:16 AM, Robert Haas  wrote:

On Mon, Aug 3, 2015 at 10:31 AM, Fujii Masao  wrote:

track_commit_timestamp tracks COMMIT PREPARED as expected in standby server,
but not in master server. Is this intentional? It should track COMMIT PREPARED
even in master? Otherwise, we cannot use commit_timestamp feature to check
the replication lag properly while we use 2PC.


That sounds like it must be a bug.  I think you should add it to the
open items list.


Attached fixes this. It includes advancement of replication origin as well.
I didn't feel like doing refactor of commit code this late in 9.5 cycle
though, so I went with the code duplication + note in xact.c.


Thanks, your proposed behavior looks reasonable.  I didn't like the
existing coding nor the fact that with your patch we'd have two copies
of it, so I changed a bit instead to be more understandable.  Hopefully I
didn't break too many things.  This patch includes the patch for the
other commitTS open item too.



Looks good. It does change the logic slightly - previous code didn't 
advance session origin lsn if origin timestamp wasn't set, your code 
does, but I think the new behavior is better.


--
 Petr Jelinek  http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PATCH: index-only scans with partial indexes

2015-09-29 Thread Kyotaro HORIGUCHI
Hi,

At Sat, 26 Sep 2015 18:00:33 +0200, Tomas Vondra  
wrote in <5606c121.10...@2ndquadrant.com>
> Hi,
> 
> On 09/26/2015 01:28 PM, Tomas Vondra wrote:
> 
> > The patch does not change the check_index_only implementation - it
> > still needs to check the clauses, just like in v1 of the patch. To
> > make this re-check unnecessary, we'd have to stick the remaining
> > clauses somewhere, so that check_index_only can use only the filtered
> > list (instead of walking through the complete list of restrictions).
> 
> After thinking about this a bit more, I realized we already have a
> good place for keeping this list - IndexClauseSet. So I've done that,

The place looks fine, but it might be better for rclauses to hold
baserestrictinfo itself when indpred == NIL. That would make the
semantics of rclauses more consistent.

> and removed most of the code from check_index_only - it still needs to
> decide whether to use the full list of restrictions (regular indexes)
> or the filtered list (for partial indexes).

The change above would allow reducing the modification still left in
check_index_only.

> Calling predicate_implied_by in match_clause_to_index makes the check
> a bit earlier, compared to the v1 of the patch. So theoretically there
> might be cases where we'd interrupt the execution between those
> places, and the v1 would not pay the price for the call. But those
> places are fairly close together, so I find that unlikely and I've
> been unable to reproduce a plausible example of such regression
> despite trying.
> 
> The only regression test this breaks is "aggregates", and I believe
> that's actually correct because it changes the plans like this:
> 
>->  Index Only Scan Backward using minmaxtest2i on minmaxtest2
>  Index Cond: (f1 IS NOT NULL)
>->  Index Only Scan using minmaxtest3i on minmaxtest3
> -Index Cond: (f1 IS NOT NULL)
> 
> i.e. it removes the Index Condition from scans of minmaxtest3i, which
> is perfectly sensible because the index is defined like this:
> 
> create index minmaxtest3i on minmaxtest3(f1) where f1 is not null;
> 
> So it's partial index and that condition is implied by the predicate
> (it's actually exactly the same).

Agreed. It looks like an unexpected but positive by-product.

> I haven't fixed those regression tests for now.
> 
> >
> > Or perhaps we can do the check in match_clause_to_index, pretty much
> > for free? If the clause is not implied by the index predicate (which
> > we know thanks to the fix), and we don't assign it to any of the
> > index columns, it means we can'd use IOS, no?
> 
> After thinking about this a bit more, this is clearly nonsense. It
> fails on conditions referencing multiple columns (WHERE a=b) which
> can't be assigned to a single index column. So the logic would have to
> be much more complicated, effectively doing what check_index_only is
> doing just a tiny bit later.

cost_index() seems to need to be fixed as well; it would otherwise count
the excluded clauses in its estimate.


regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] On-demand running query plans using auto_explain and signals

2015-09-29 Thread Shulgin, Oleksandr
On Tue, Sep 29, 2015 at 12:28 AM, Jim Nasby 
wrote:

> On 9/18/15 5:05 AM, Shulgin, Oleksandr wrote:
>
>>
>> So this has to be the responsibility of the reply sending backend in the
>> end: to create and release the DSM *at some point*.
>>
>
> What's wrong with just releasing it at the end of the statement? When the
> statement is done there's no point to reading it asynchronously anymore.


That was only one of the problems in the signals-based design, and it has other
more significant problems.  The current approach does exactly that:
allocate the segment before running the plan, release at the end of the
statement.

--
Alex


Re: [HACKERS] On-demand running query plans using auto_explain and signals

2015-09-29 Thread Shulgin, Oleksandr
On Tue, Sep 29, 2015 at 12:57 PM, Andres Freund  wrote:

> On 2015-09-25 19:13:13 +0200, Shulgin, Oleksandr wrote:
> > the auto_explain contrib module.  I now propose the most simple thing
> > possible in my opinion: a new GUC option for auto_explain.  It will make
> > every backend, in which the module is loaded via *_preload_libraries
> > mechanism or using CREATE EXTENSION/LOAD commands, to actively publish
> the
> > plans of queries in dynamic shared memory as long as
> > auto_explain.publish_plans is turned on.
>
> Wait. You're proposing that every query does this unconditionally? For
> every single query? That sounds prohibitively expensive to me.
>

Only if the extension is loaded AND the option is turned on.  Likely you
don't want to do this for an OLTP database, but for other types of load it
might make sense.  Also, I think it should be possible to toggle this on a
per-process basis.

Finally, we can put in a cost threshold, so the plans only get published if
they have total_cost >= publish_plan_cost_threshold.
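
To make that concrete, usage would look roughly like this
(auto_explain.publish_plans and the cost threshold are GUCs *proposed* in
this thread, not existing settings):

  LOAD 'auto_explain';
  SET auto_explain.publish_plans = on;
  SET auto_explain.publish_plan_cost_threshold = 100000;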

Also, do you mean expensive in terms of CPU or the dynamic shared memory?

--
Alex


Re: [HACKERS] Rework the way multixact truncations work

2015-09-29 Thread Robert Haas
On Mon, Sep 28, 2015 at 10:48 PM, Jim Nasby  wrote:
> Maybe I'm confused, but I thought the whole purpose of this was to get rid
> of the risk associated with that calculation in favor of explicit truncation
> boundaries in the WAL log.

Yes.  But if the master hasn't been updated yet, then we still need to
do something based on a calculation.

> Even if that's not the case, ISTM that being big and in your face about a
> potential data corruption bug is a good thing, as long as the DBA has a way
> to "hit the snooze button".

Panicking the standby because the master hasn't been updated does not
seem like a good thing to me in any way.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Rework the way multixact truncations work

2015-09-29 Thread Joel Jacobson
On Tue, Sep 22, 2015 at 3:20 PM, Andres Freund  wrote:
> What I've tested is the following:
> * continuous burning of multis, both triggered via members and offsets
> * a standby keeping up when the primary is old
> * a standby keeping up when the primary is new
> * basebackups made while a new primary is under load
> * verified that we properly PANIC when a truncation record is replayed
>   in an old standby.

Are these test scripts available somewhere?
I understand they might be undocumented and perhaps tricky to set up,
but I would be very interested in them anyway.
Do you think you could push them somewhere?

Thanks a lot for working on this!


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] unclear about row-level security USING vs. CHECK

2015-09-29 Thread Dean Rasheed
On 28 September 2015 at 20:15, Stephen Frost  wrote:
> I listed out the various alternatives but didn't end up getting any
> responses to it.  I'm still of the opinion that the documentation is the
> main thing which needs improving here, but we can also change CREATE
> POLICY, et al, to require an explicit WITH CHECK clause for the commands
> where that makes sense if that's the consensus.
>

My vote would be to keep it as-is.

It feels perfectly natural to me. USING clauses add to the query's
WHERE clause controlling which existing rows you can SELECT, UPDATE or
DELETE. WITH CHECK clauses control what new data you can add via
INSERT or UPDATE. UPDATE allows both, but most of the time I expect
you'll want them to be the same.

So having the WITH CHECK clause default to being the same as the USING
clause for UPDATE matches what I expect to be the most common usage.
Users granted permission to update a subset of the table's rows
probably don't want to give those rows away. More advanced use-cases
are still supported, but the simplest/most common case is the default,
which means that you don't have to supply the same expression twice.

I agree that the documentation could be improved.

As things stand, you have to read quite a lot of text on the CREATE
POLICY page before you get to the description of how the USING and
WITH CHECK expressions interact. I'd suggest rewording the 2nd
paragraph where these clauses are first introduced. Perhaps something
like:

"""
A policy grants the ability to SELECT, INSERT, UPDATE, or DELETE rows
which match the relevant policy expression. For SELECT, UPDATE and
DELETE, the USING expression from the policy is combined with the
query's WHERE clause to control which existing table rows can be
retrieved, updated or deleted. For INSERT and UPDATE, the WITH CHECK
expression is used to constrain what new data can be added to the
table. A policy that applies to UPDATE may have both USING and WITH
CHECK expressions, which may be different from one another, but if
they are the same, the WITH CHECK expression can be omitted and the
USING expression will be used automatically in its place.

Policy expressions may be any expressions that evaluate to give a
result of type boolean. When a USING expression returns true for a
given row then the query is allowed to act upon that row, while rows
for which the expression returns false or null are skipped. When a
WITH CHECK expression returns true for a new row then the system
allows that row to be added to the table, but if the expression
returns false or null an error is raised.
"""

Regards,
Dean


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] On-demand running query plans using auto_explain and signals

2015-09-29 Thread Andres Freund
On 2015-09-25 19:13:13 +0200, Shulgin, Oleksandr wrote:
> the auto_explain contrib module.  I now propose the most simple thing
> possible in my opinion: a new GUC option for auto_explain.  It will make
> every backend, in which the module is loaded via *_preload_libraries
> mechanism or using CREATE EXTENSION/LOAD commands, to actively publish the
> plans of queries in dynamic shared memory as long as
> auto_explain.publish_plans is turned on.

Wait. You're proposing that every query does this unconditionally? For
every single query? That sounds prohibitively expensive to me.

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Foreign join pushdown vs EvalPlanQual

2015-09-29 Thread Kouhei Kaigai
> -Original Message-
> From: pgsql-hackers-ow...@postgresql.org
> [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Etsuro Fujita
> Sent: Tuesday, September 29, 2015 4:36 PM
> To: Kaigai Kouhei(海外 浩平); Robert Haas
> Cc: PostgreSQL-development; 花田茂
> Subject: Re: [HACKERS] Foreign join pushdown vs EvalPlanQual
> 
> On 2015/09/29 13:55, Kouhei Kaigai wrote:
> >> From: pgsql-hackers-ow...@postgresql.org
> >> [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Etsuro Fujita
> >> On 2015/09/29 9:13, Kouhei Kaigai wrote:
> 
>  From: pgsql-hackers-ow...@postgresql.org
>  [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Robert Haas
>  On Mon, Sep 28, 2015 at 3:34 AM, Kouhei Kaigai  
>  wrote:
> 
> > The attached patch allows FDW driver to handle EPQ recheck by its own
> > preferable way, even if it is alternative local join or ExecQual to
> > the expression being pushed down.
> 
>  Thanks.  I was all set to commit this, or at least part of it, when I
>  noticed that we already have an FDW callback called RefetchForeignRow.
>  We seem to be intending that this new callback should refetch the row
>  from the foreign server and verify that any pushed-down quals apply to
>  it.  But why can't RefetchForeignRow do that?  That seems to be what
>  it's for.
> 
> >> I thought the same thing [1].  While I thought it was relatively easy to
> >> make changes to RefetchForeignRow that way for the foreign table case
> >> (scanrelid>0), I was not sure how hard it would be to do so for the
> >> foreign join case (scanrelid==0).  So, I proposed to leave that changes
> >> for 9.6.  I'll have a rethink on this issue along the lines of that
> >> approach.
> 
> > Even if base relation case, is it really easy to do?
> >
> > RefetchForeignRow() does not take ForeignScanState as its argument,
> > so it is not obvious to access its private field, isn't it?
> > ExecRowMark contains "rti" field, so it might be feasible to find out
> > the target PlanState using walker routine recently supported, although
> > it is not a simple enough.
> > Unless we don't have reference to the private field, it is not feasible
> > to access expression that was pushed down to the remote-side, therefore,
> > it does not allow to apply proper rechecks here.
> >
> > In addition, it is problematic when scanrelid==0 because we have no
> > relevant ForeignScanState which represents the base relations, even
> > though ExecRowMark is associated with a particular base relation.
> > In case of scanrelid==0, EPQ recheck routine also have to ensure
> > the EPQ tuple is visible towards the join condition in addition to
> > the qualifier of base relation. These information is also stored within
> > private data field, so it has to have a reference to the private data
> > of ForeignScanState of the remote join (scanrelid==0) which contains
> > the target relation.
> >
> > Could you introduce us (1) how to access private data field of
> > ForeignScanState from the RefetchForeignRow callback? (2) why it
> > is reasonable to implement than the callback on ForeignRecheck().
> 
> For the foreign table case (scanrelid>0), I imagined an approach
> different than yours.  In that case, I thought the issue would be
> probably addressed by just modifying the remote query performed in
> RefetchForeignRow, which would be of the form "SELECT ctid, * FROM
> remote table WHERE ctid = $1", so that the modified query would be of
> the form "SELECT ctid, * FROM remote table WHERE ctid = $1 AND *remote
> quals*".
>
My question is how to pull the expression for the remote query.
It would be stored somewhere in the private field of the ForeignScanState;
however, RefetchForeignRow does not have direct access to the
relevant ForeignScanState node.
That is what I asked in question (1).

Also note that EvalPlanQualFetchRowMarks() will raise an error
if the RefetchForeignRow callback returns a NULL tuple.
Is that the right or expected behavior?
It looks to me like this callback is designed to pull out a particular
tuple identified by the remote row-id, regardless of the qualifier
checks based on the latest values.

> For the foreign join case (scanrelid==0), in my vision, I think we would
> need some changes not only to RefetchForeignRow but to the existing
> EvalPlanQual machinery in the core.  I've not had a clear image yet, though.
>
If people agree that the FDW remote join is an incomplete feature in v9.5,
the attached fix-up is the minimum requirement from the standpoint
of custom scan/join.

Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei 


pgsql-fdw-epq-recheck.v3.patch
Description: pgsql-fdw-epq-recheck.v3.patch

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Rework the way multixact truncations work

2015-09-29 Thread Andres Freund
On 2015-09-28 21:48:00 -0500, Jim Nasby wrote:
> On 9/28/15 8:49 PM, Robert Haas wrote:
> >If at some point we back-patch this further, then it potentially
> >becomes a live issue, but I would like to respectfully inquire what
> >exactly you think making it a PANIC would accomplish?  There are a lot
> >of scary things about this patch, but the logic for deciding whether
> >to perform a legacy truncation is solid as far as I know.
> 
> Maybe I'm confused, but I thought the whole purpose of this was to get rid
> of the risk associated with that calculation in favor of explicit truncation
> boundaries in the WAL log.

> Even if that's not the case, ISTM that being big and in your face about a
> potential data corruption bug is a good thing, as long as the DBA has a way
> to "hit the snooze button".

So we'd end up with a guc that everyone has to set while they
upgrade. That seems like a pointless hassle.

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Comment update to pathnode.c

2015-09-29 Thread Robert Haas
On Tue, Sep 29, 2015 at 1:55 AM, Etsuro Fujita
 wrote:
> Thanks for the comments!  Attached is an updated version of the patch.

Committed and back-patched to 9.5.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] No Issue Tracker - Say it Ain't So!

2015-09-29 Thread Merlin Moncure
On Wed, Sep 23, 2015 at 1:18 PM, Kam Lasater  wrote:
> Hello,
>
> Last night I heard that Postgres had no issue/bug tracker. At first I
> thought the guy was trolling me. Seriously, how could this be. Certainly a
> mature open source project that is the database for a measurable percentage
> of the internet would have an issue tracker.
>
> Sadly, I was not being trolled. I'm new around here so I will keep the
> preaching to a minimum and cut right to the chase...
>
> ___It is time for an issue tracker___

This thread is depressing me.  We use all these fancy tools at $work
and I'm not sure they are much of an improvement over informal
processes run by capable people.   I regularly advocate, pretty much
to the wind, to use JIRA less and email *more*.   The main benefit of
the system, various reports to corporate taskmasters, seems pretty
much irrelevant here.  If you're advocating introduction of new
tooling with all the associated processes and complexities, can you
point to specific breakdowns in the project and exactly how said
tooling would have helped the situation?

merlin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.5: Can't connect with PGSSLMODE=require on Windows

2015-09-29 Thread Asif Naeem
Thank you, Tom. The issue no longer seems reproducible with the latest PG95
source code (commit 60fcee9e5e77dc748a9787fae34328917683b95e) Windows build,
i.e.

C:\PG\postgresql\pg95_with_openssl>bin\psql.exe -d postgres -h
> 172.16.141.232
> psql (9.5alpha2)
> WARNING: Console code page (437) differs from Windows code page (1252)
>  8-bit characters might not work correctly. See psql reference
>  page "Notes for Windows users" for details.
> SSL connection (protocol: TLSv1.2, cipher: ECDHE-RSA-AES256-GCM-SHA384,
> bits: 256, compression: off)
> Type "help" for help.
> postgres=# select version();
>  version
> -
>  PostgreSQL 9.5alpha2, compiled by Visual C++ build 1800, 64-bit
> (1 row)


Regards,
Muhammad Asif Naeem

On Tue, Sep 29, 2015 at 3:03 AM, Tom Lane  wrote:

> Thom Brown  writes:
> > With 9.5 alpha 2 on Windows 8 (64-bit), trying to require SSL results
> > in a blocking error:
>
> I've pushed a patch for this; can you verify it on Windows?
>
> regards, tom lane
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>


Re: [HACKERS] row_security GUC, BYPASSRLS

2015-09-29 Thread Adam Brightwell
> I've attached a patch to implement it.  It's not fully polished but it's
> sufficient for comment, I believe.  Additional comments, documentation
> and regression tests are to be added, if we have agreement on the
> grammer and implementation approach.

I have given the first patch a quick review.  I think I agree with the
grammar; it makes sense to me.  At first I didn't really like NO
FORCE, but I couldn't think of anything better.
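
For anyone following along, the syntax in question (as I read the patch;
"accounts" is just an example table) would be:

  ALTER TABLE accounts FORCE ROW LEVEL SECURITY;
  ALTER TABLE accounts NO FORCE ROW LEVEL SECURITY;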

One comment:

   if (pg_class_ownercheck(relid, user_id))
-      return RLS_NONE_ENV;
+  {
+      if (relforcerowsecurity)
+          return RLS_ENABLED;
+      else
+          return RLS_NONE_ENV;
+  }

Is the 'else' even necessary in this block?

Other than that, the approach looks good to me.

The patch applied cleanly against master (758fcfd).  As well
'check-world' was successful (obviously understanding that more
related tests are necessary).

> I have a hard time with this.  We're not talking about the application
> logic in this case, we're talking about the guarantees which the
> database is making to the user, be it an application or an individual.
>
> I've included a patch (the second in the set attached) which adds a
> SECURITY_NOFORCE_RLS bit which has a much tighter scope- it only applies
> after the regular owner check is done.  This reduces the risk of it
> being set mistakenly dramatically, I believe.  Further, the cascaded
> case is correctly handled.  This also needs more polish and regression
> tests, but I wanted to solicit for comment before moving forward with
> that since we don't have a consensus about if this should be done.

I'm not sure that I understand the previous concerns around this bit
well enough to speak intelligently on this point.  However, with that
said, I believe the changes made by this patch make sense.

-Adam

-- 
Adam Brightwell - adam.brightw...@crunchydatasolutions.com
Database Engineer - www.crunchydatasolutions.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pageinspect patch, for showing tuple data

2015-09-29 Thread Nikolay Shaplov
In a message of 26 September 2015 20:57:25, Michael Paquier wrote:

> > So I would consider two options: Either move t_infomask/t_infomask2
> > details
> > into storage.sgml or leave as it is.
> 
> The documentation redirects the reader to look at htup_details.h (the
> documentation is actually incorrect, I sent a separate patch), and I
> see no point in duplicating such low-level descriptions, that would be
> double maintenance for the same result.

What do you think about this? (I've changed only heap_tuple_items 
documentation there)



-- 
Nikolay Shaplov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
diff --git a/contrib/pageinspect/Makefile b/contrib/pageinspect/Makefile
index aec5258..e4bc1af 100644
--- a/contrib/pageinspect/Makefile
+++ b/contrib/pageinspect/Makefile
@@ -5,7 +5,7 @@ OBJS		= rawpage.o heapfuncs.o btreefuncs.o fsmfuncs.o \
 		  brinfuncs.o ginfuncs.o $(WIN32RES)
 
 EXTENSION = pageinspect
-DATA = pageinspect--1.3.sql pageinspect--1.2--1.3.sql \
+DATA = pageinspect--1.4.sql pageinspect--1.3--1.4.sql pageinspect--1.2--1.3.sql \
 	pageinspect--1.1--1.2.sql pageinspect--1.0--1.1.sql \
 	pageinspect--unpackaged--1.0.sql
 PGFILEDESC = "pageinspect - functions to inspect contents of database pages"
diff --git a/contrib/pageinspect/heapfuncs.c b/contrib/pageinspect/heapfuncs.c
index 8d1666c..52a0663 100644
--- a/contrib/pageinspect/heapfuncs.c
+++ b/contrib/pageinspect/heapfuncs.c
@@ -29,7 +29,14 @@
 #include "funcapi.h"
 #include "utils/builtins.h"
 #include "miscadmin.h"
+#include "utils/array.h"
+#include "utils/rel.h"
+#include "catalog/pg_type.h"
 
+Datum split_tuple_data(char *tuple_data, uint16 tuple_data_len,
+			TupleDesc tuple_desc, uint16 t_infomask,
+			uint16 t_infomask2, bits8 *t_bits, bool
+			do_detoast, int error_level);
 
 /*
  * bits_to_text
@@ -91,7 +98,7 @@ heap_page_items(PG_FUNCTION_ARGS)
 		if (raw_page_size < SizeOfPageHeaderData)
 			ereport(ERROR,
 	(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
-  errmsg("input page too small (%d bytes)", raw_page_size)));
+	 errmsg("input page too small (%d bytes)", raw_page_size)));
 
 		fctx = SRF_FIRSTCALL_INIT();
 		mctx = MemoryContextSwitchTo(fctx->multi_call_memory_ctx);
@@ -122,8 +129,8 @@ heap_page_items(PG_FUNCTION_ARGS)
 		HeapTuple	resultTuple;
 		Datum		result;
 		ItemId		id;
-		Datum		values[13];
-		bool		nulls[13];
+		Datum		values[14];
+		bool		nulls[14];
 		uint16		lp_offset;
 		uint16		lp_flags;
 		uint16		lp_len;
@@ -155,6 +162,8 @@ heap_page_items(PG_FUNCTION_ARGS)
 		{
 			HeapTupleHeader tuphdr;
 			int			bits_len;
+			bytea		*tuple_data_bytea;
+			int			tuple_data_len;
 
 			/* Extract information from the tuple header */
 
@@ -168,6 +177,14 @@ heap_page_items(PG_FUNCTION_ARGS)
 			values[9] = UInt32GetDatum(tuphdr->t_infomask);
 			values[10] = UInt8GetDatum(tuphdr->t_hoff);
 
+			/* Copy raw tuple data into bytea attribute */
+			tuple_data_len = lp_len - tuphdr->t_hoff;
+			tuple_data_bytea = (bytea *) palloc(tuple_data_len + VARHDRSZ);
+			SET_VARSIZE(tuple_data_bytea, tuple_data_len + VARHDRSZ);
+			memcpy(VARDATA(tuple_data_bytea), (char *) tuphdr + tuphdr->t_hoff,
+	 tuple_data_len);
+			values[13] = PointerGetDatum(tuple_data_bytea);
+			nulls[13] = false;
 			/*
 			 * We already checked that the item is completely within the raw
 			 * page passed to us, with the length given in the line pointer.
@@ -208,7 +225,7 @@ heap_page_items(PG_FUNCTION_ARGS)
 			 */
 			int			i;
 
-			for (i = 4; i <= 12; i++)
+			for (i = 4; i <= 13; i++)
 nulls[i] = true;
 		}
 
@@ -223,3 +240,223 @@ heap_page_items(PG_FUNCTION_ARGS)
 	else
 		SRF_RETURN_DONE(fctx);
 }
+
+PG_FUNCTION_INFO_V1(tuple_data_split);
+Datum
+tuple_data_split(PG_FUNCTION_ARGS)
+{
+	Oid				rel_oid;
+	bytea		   *raw_data;
+	uint16			t_infomask;
+	uint16			t_infomask2;
+	text		   *t_bits_str;
+	RelationData   *rel;
+	TupleDesc		tuple_desc;
+	bool			do_detoast = false;
+	int				error_level = ERROR;
+
+	bits8		   *t_bits = NULL;
+	Datum			res;
+
+	rel_oid		= PG_GETARG_OID(0);
+	raw_data	= PG_GETARG_BYTEA_P(1);
+	t_infomask	= PG_GETARG_INT16(2);
+	t_infomask2	= PG_GETARG_INT16(3);
+	t_bits_str	= PG_ARGISNULL(4) ? NULL : PG_GETARG_TEXT_P(4);
+	if (PG_NARGS()>=6)
+		do_detoast = PG_GETARG_BOOL(5);
+	if (PG_NARGS()>=7)
+		error_level = PG_GETARG_BOOL(6) ? WARNING : ERROR;
+
+	if (!superuser())
+		ereport(ERROR,
+				(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
+				 errmsg("must be superuser to use raw page functions")));
+
+	/*
+	 * Here we converting t_bits string back to the bits8 array,
+	 * as it really is in the tuple header
+	 */
+	if (t_infomask & HEAP_HASNULL)
+	{
+		int		bits_str_len;
+		int		bits_len;
+		char   *p;
+		int		off;
+		char	byte = 0;
+
+		bits_len = (t_infomask2 & HEAP_NATTS_MASK) / 8 + 1;
+		if (!t_bits_str)
+			ereport(ERROR,
+	(errcode(ERRCODE_DATA_CORRUPTED),
+	 errmsg("t_bits argument is NULL, though we expect it to be NOT NULL and %i character long",
+			bits_len 

Re: [HACKERS] No Issue Tracker - Say it Ain't So!

2015-09-29 Thread Steve Crawford
On Tue, Sep 29, 2015 at 7:16 AM, David Fetter  wrote:

> ...What we're not fine with is depending on a proprietary system, no
> matter what type of license, as infrastructure...
>
>
Exactly. Which is why I was warning about latching onto features only
available in the closed enterprise version.

Cheers,
Steve


Re: [HACKERS] 9.5: Can't connect with PGSSLMODE=require on Windows

2015-09-29 Thread Tom Lane
Asif Naeem  writes:
> Thank you Tom. The issue no longer seems reproducible with the latest PG95
> source code (commit 60fcee9e5e77dc748a9787fae34328917683b95e) Windows build

Thanks for testing!  I've marked this issue resolved in the 9.5 open-items
list.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] No Issue Tracker - Say it Ain't So!

2015-09-29 Thread David Fetter
On Tue, Sep 29, 2015 at 07:01:16AM -0700, Steve Crawford wrote:
> On Mon, Sep 28, 2015 at 4:41 PM, Jim Nasby  wrote:
> 
> > Note that since they also offer a hosted solution we should use that to
> > play with instead of trying to install it at this point.
> >
> > Integrating the issue tracker looks like it's just a call to this API:
> > http://doc.gitlab.com/ce/api/issues.html#new-issue. I don't normally do
> > web development myself so I'd rather not figuring out how to setup a copy
> > of the website to hack on, but if no one else wants to try it I can take a
> > stab at it.
> >
> > Presumably mirroring our git repository would work the same as it does for
> > mirroring to GitHub. My guess is that would be enough to get the basic
> > git/issue tracker integration working.
> >
> > Commitfest could be tied in as well. Presumably each commitfest would be a
> > milestone (http://doc.gitlab.com/ce/api/milestones.html) and each
> > submission an issue.
> >
> >
> One of the issues identified with  Github is that it is closed and
> commercial which goes against the expressed desires of the PostgreSQL
> community. As such, it's important to note that Gitlab seems to be in the
> "freemium" model with a "community" and an "enterprise" version so for
> comparison purposes we should only look at the features in the open-source
> community version:
> https://about.gitlab.com/features/#compare

Pardon me for getting off in the weeds here, but the PostgreSQL
community is just fine with proprietary closed-source software.  There
are probably more billion-dollar proprietary closed-source forks of
PostgreSQL than of any other open source software, certainly if you
normalize to the number of contributors.  We fully intend to continue
spawning those proprietary closed-source forks.

What we're not fine with is depending on a proprietary system, no
matter what type of license, as infrastructure.  We've been burned
that way before, and we have no intention of getting burned that way
again.

Cheers,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] track_commit_timestamp and COMMIT PREPARED

2015-09-29 Thread Alvaro Herrera
Petr Jelinek wrote:
> On 2015-09-29 13:44, Fujii Masao wrote:
> >On Tue, Sep 29, 2015 at 12:05 PM, Alvaro Herrera
> > wrote:

> >-#define RecoveryRequiresBoolParameter(param_name, currValue, masterValue) \
> >-do { \
> >-bool _currValue = (currValue); \
> >-bool _masterValue = (masterValue); \
> >-if (_currValue != _masterValue) \
> >-ereport(ERROR, \
> >-(errcode(ERRCODE_INVALID_PARAMETER_VALUE), \
> >- errmsg("hot standby is not possible because it
> >requires \"%s\" to be same on master and standby (master has \"%s\",
> >standby has \"%s\")", \
> >-param_name, \
> >-_masterValue ? "true" : "false", \
> >-_currValue ? "true" : "false"))); \
> >-} while(0)
> >
> >This code should not be deleted because there is still the caller of
> >the macro function.
> 
> Looks like Alvaro didn't merge the second patch correctly; the only caller
> should have been removed as well.

filterdiff loses hunks once in a while, which is why I stopped using it
quite some time ago :-(  I eyeballed its output to ensure it hadn't
dropped any hunk, but apparently I missed that one.

I guess I should file a bug report.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PATCH: index-only scans with partial indexes

2015-09-29 Thread Tomas Vondra

Hello,

On 09/29/2015 12:27 PM, Kyotaro HORIGUCHI wrote:

Hi,

At Sat, 26 Sep 2015 18:00:33 +0200, Tomas Vondra  wrote 
in <5606c121.10...@2ndquadrant.com>

Hi,

On 09/26/2015 01:28 PM, Tomas Vondra wrote:


The patch does not change the check_index_only implementation - it
still needs to check the clauses, just like in v1 of the patch. To
make this re-check unnecessary, we'd have to stick the remaining
clauses somewhere, so that check_index_only can use only the filtered
list (instead of walking through the complete list of restrictions).


After thinking about this a bit more, I realized we already have a
good place for keeping this list - IndexClauseSet. So I've done that,


The place looks fine, but it might be better for rclauses to hold
baserestrictinfo itself when indpred == NIL. It would make the
semantics of rclauses more consistent.


and removed most of the code from check_index_only - it still needs to
decide whether to use the full list of restrictions (regular indexes)
or the filtered list (for partial indexes).


The change above would allow reducing the modifications still left in
check_index_only.


I'm not sure I understand what change you suggest? Could you explain?

The change in check_index_only is effectively just (a) comment update
and (b) choice of the right list of clauses. I'd say both changes are 
needed, although (b) could happen outside check_index_only (i.e. we 
could do the check elsewhere). Is that what you mean?




cost_index() seems to need to be fixed. It would otherwise count excluded
clauses in its estimate.


Hmm, good point. The problem is that extract_nonindex_conditions uses
baserel->baserestrictinfo again, i.e. it does not skip the implied
clauses. So we may either stick the filtered clauses somewhere (for
example in the IndexPath), or teach extract_nonindex_conditions to use
predicate_implied_by. I'd say the first option is better. Agreed?


regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] unclear about row-level security USING vs. CHECK

2015-09-29 Thread Adam Brightwell
> My vote would be to keep it as-is.

Same for me.

> It feels perfectly natural to me. USING clauses add to the query's
> WHERE clause controlling which existing rows you can SELECT, UPDATE or
> DELETE. WITH CHECK clauses control what new data you can add via
> INSERT or UPDATE. UPDATE allows both, but most of the time I expect
> you'll want them to be the same.

I agree.  In the current uses cases I have been experimenting with,
this approach has made the most sense.

> So having the WITH CHECK clause default to being the same as the USING
> clause for UPDATE matches what I expect to be the most common usage.

I agree.

> Users granted permission to update a subset of the table's rows
> probably don't want to give those rows away. More advanced use-cases
> are still supported, but the simplest/most common case is the default,
> which means that you don't have to supply the same expression twice.

Yes, I agree.  IMO, having to supply the same expression twice just
seems cumbersome and unnecessary.  While I'd certainly agree that
documentation could always be improved, I have found the current
behavior to be fairly intuitive and easily understood by most (if not
all) DBA's I have spoken with about it.

-Adam

-- 
Adam Brightwell - adam.brightw...@crunchydatasolutions.com
Database Engineer - www.crunchydatasolutions.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ON CONFLICT issues around whole row vars,

2015-09-29 Thread Andres Freund
On 2015-09-24 17:25:21 +0200, Andres Freund wrote:
> Stuff I want to fix by tomorrow:
> * Whole row var references to exclude
> * wrong offsets for columns after dropped ones
> * INSTEAD DO UPDATE for tables with oids
>
> Do you know of anything else?

So, it took a bit longer than "tomorrow". I fought for a long while with a
mysterious issue, which turned out to be a separate bug: the excluded
relation was affected by row-level security policies, which doesn't make
sense.

My proposal in this WIP patch is to make it a bit clearer that
'EXCLUDED' isn't a real relation. I played around with adding a
different rtekind, but that's too heavy a hammer. What I instead did was
to set relkind to composite - which seems to signal pretty well that
we're not dealing with a real relation. That immediately fixes the RLS
issue as fireRIRrules has the following check:
if (rte->rtekind != RTE_RELATION ||
rte->relkind != RELKIND_RELATION)
continue;
It also makes it relatively straightforward to fix the system column
issue by adding an additional relkind check to scanRTEForColumn's system
column handling.

WRT the wholerow issue: there are currently two reasons we need a
targetlist entry for excluded wholerow vars: 1) setrefs.c errors out
without one - that can relatively easily be worked around; 2) ruleutils.c
expects an entry in the child tlist - that could also be worked around,
but it's a bit more verbose.  I'm inclined not to go the pullup route
but instead simply unconditionally add a wholerow var to the excluded
tlist.
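
Roughly along these lines, reusing the variables already introduced above
(a sketch only; the exact makeWholeRowVar()/makeTargetEntry() arguments in
the final patch may well differ):

		/* sketch: append a whole-row Var for EXCLUDED as a junk tlist entry */
		var = makeWholeRowVar(exclRte, exclRelIndex, 0, false);
		te = makeTargetEntry((Expr *) var,
							 list_length(exclRelTlist) + 1,
							 "excluded",
							 true);
		exclRelTlist = lappend(exclRelTlist, te);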

Peter, what do you think?

Andres
>From f11fc12993beabf4d280b979c062682b6612c989 Mon Sep 17 00:00:00 2001
From: Andres Freund 
Date: Tue, 29 Sep 2015 17:08:36 +0200
Subject: [PATCH] wip-upsert

---
 src/backend/parser/analyze.c  |  76 +++
 src/backend/parser/parse_relation.c   |   7 +-
 src/test/regress/expected/insert_conflict.out | 101 ++
 src/test/regress/expected/rules.out   |  18 ++---
 src/test/regress/sql/insert_conflict.sql  |  56 ++
 src/test/regress/sql/rules.sql|   2 +-
 6 files changed, 234 insertions(+), 26 deletions(-)

diff --git a/src/backend/parser/analyze.c b/src/backend/parser/analyze.c
index a0dfbf9..528825a 100644
--- a/src/backend/parser/analyze.c
+++ b/src/backend/parser/analyze.c
@@ -891,33 +891,81 @@ transformOnConflictClause(ParseState *pstate,
 	/* Process DO UPDATE */
 	if (onConflictClause->action == ONCONFLICT_UPDATE)
 	{
+		Relation targetrel = pstate->p_target_relation;
+		Var	   *var;
+		TargetEntry *te;
+		int		attno;
+
 		/*
 		 * All INSERT expressions have been parsed, get ready for potentially
 		 * existing SET statements that need to be processed like an UPDATE.
 		 */
 		pstate->p_is_insert = false;
 
+		/*
+		 * Add range table entry for the EXCLUDED pseudo relation; relkind is
+		 * set to composite to signal that we're not dealing with an actual
+		 * relation.
+		 */
 		exclRte = addRangeTableEntryForRelation(pstate,
-pstate->p_target_relation,
+targetrel,
 makeAlias("excluded", NIL),
 false, false);
+		exclRte->relkind = RELKIND_COMPOSITE_TYPE;
 		exclRelIndex = list_length(pstate->p_rtable);
 
 		/*
-		 * Build a targetlist for the EXCLUDED pseudo relation. Out of
-		 * simplicity we do that here, because expandRelAttrs() happens to
-		 * nearly do the right thing; specifically it also works with views.
-		 * It'd be more proper to instead scan some pseudo scan node, but it
-		 * doesn't seem worth the amount of code required.
-		 *
-		 * The only caveat of this hack is that the permissions expandRelAttrs
-		 * adds have to be reset. markVarForSelectPriv() will add the exact
-		 * required permissions back.
+		 * Build a targetlist for the EXCLUDED pseudo relation. Have to be
+		 * careful to use resnos that correspond to attnos of the underlying
+		 * relation.
+		 */
+		Assert(pstate->p_next_resno == 1);
+		for (attno = 0; attno < targetrel->rd_rel->relnatts; attno++)
+		{
+			Form_pg_attribute attr = targetrel->rd_att->attrs[attno];
+			char		*name;
+
+			if (attr->attisdropped)
+			{
+/*
+ * can't use atttypid here, but it doesn't really matter
+ * what type the Const claims to be.
+ */
+var = (Var *) makeNullConst(INT4OID, -1, InvalidOid);
+name = "";
+			}
+			else
+			{
+var = makeVar(exclRelIndex, attno + 1,
+			  attr->atttypid, attr->atttypmod,
+			  attr->attcollation,
+			  0);
+var->location = -1;
+
+name = NameStr(attr->attname);
+			}
+
+			Assert(pstate->p_next_resno == attno + 1);
+			te = makeTargetEntry((Expr *) var,
+ pstate->p_next_resno++,
+ name,
+ false);
+
+			/* don't require select access yet */
+			exclRelTlist = lappend(exclRelTlist, te);
+		}
+
+		/*
+		 * Additionally add a whole row tlist entry for EXCLUDED. That's
+		 * really 

Re: [HACKERS] [PATCH v2] GSSAPI encryption support

2015-09-29 Thread Robbie Harwood
Robbie Harwood  writes:

 Michael Paquier  writes:

> Well, the issue is still here: login through gssapi fails with
> your patch, not with HEAD. This patch is next on my review list by
> the way so I'll see what I can do about it soon even if I am in
> the US for Postgres Open next week. Still, how did you test it? I
> am just creating by myself a KDC, setting up a valid credential
> with kinit, and after setting up Postgres for this purpose the
> protocol communication just fails.
>
> I have no issues, no sync loss; nothing is amiss as far as I can see.
> If there is actually a problem here, I need more information from you.
> At the very least, as previously mentioned, I need to know what
> messages went over the wire to/from the server before it occurred, and
> what command (if it made it to command processing) it was in the
> midst of sending.

Any follow-up on this?  I'd really like my code to be bug-free.


signature.asc
Description: PGP signature


Re: [HACKERS] Box type equality

2015-09-29 Thread Tom Lane
Stanislav Kelvich  writes:
> I've faced an issue with Box type comparison that has existed for almost
> five years.

Try twenty-five years.  The code's been like that since Berkeley.

>   That can be fixed by b-tree equality for boxes, but we need some
>   decisions there.

The problem with inventing a btree opclass for boxes is much more
fundamental than fuzzy comparisons, unfortunately.  Btree requires a
linear sort order, and there's no plausible linear ordering of boxes,
unless you compare areas which won't give the equality semantics you want.

We could perhaps invent an exact-equality operator and construct just
a hash opclass for it, no btree.

In any case I think it would be a mistake to consider only boxes; all
the built-in geometric types have related issues.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Box type equality

2015-09-29 Thread Jeff Anton

The Box type is the oldest non-linear type in the Postgres system.
We used it as the template for extensibility in the original system 
about thirty years ago.  We had R-trees for box indexing.  If you want 
fuzzy box matching, that seems possible with R-trees and some creativity -
say, matching against a slightly larger box using containment, and perhaps
also excluding anything contained by a smaller box.  This is the idea behind
strategies: that you can use existing operations to build a new operation.


If you have to force this onto B-trees, I think you will have to choose
one edge to index on (i.e. one of the four values), then fuzzy match that
through the index and have a secondary condition to further restrict the
matches.


As with all the geometric types, you can use containment boxes for them 
and have the secondary condition checks.


It's all just a few lines of code as Stonebraker used to say.

Jeff Anton


On 09/29/15 08:43, Tom Lane wrote:

Stanislav Kelvich  writes:

I've faced an issue with Box type comparison that has existed for almost
five years.


Try twenty-five years.  The code's been like that since Berkeley.


   That can be fixed by b-tree equality for boxes, but we need some
   decisions there.


The problem with inventing a btree opclass for boxes is much more
fundamental than fuzzy comparisons, unfortunately.  Btree requires a
linear sort order, and there's no plausible linear ordering of boxes,
unless you compare areas which won't give the equality semantics you want.

We could perhaps invent an exact-equality operator and construct just
a hash opclass for it, no btree.

In any case I think it would be a mistake to consider only boxes; all
the built-in geometric types have related issues.

regards, tom lane





--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] On-demand running query plans using auto_explain and signals

2015-09-29 Thread Simon Riggs
On 25 September 2015 at 12:13, Shulgin, Oleksandr <
oleksandr.shul...@zalando.de> wrote:


> I now believe that use of ProcessInterrupts() in the recently proposed
> design as well as manipulation of proc latch due to use of shared memory
> queue are major blockers.
>
> In order to simplify things up, I've taken a step back and looked again at
> the auto_explain contrib module.  I now propose the most simple thing
> possible in my opinion: a new GUC option for auto_explain.  It will make
> every backend in which the module is loaded via the *_preload_libraries
> mechanism or using CREATE EXTENSION/LOAD commands actively publish the
> plans of queries in dynamic shared memory as long as
> auto_explain.publish_plans is turned on.
>
> The greatest part for me is that this approach doesn't require handling
> of signals and is isolated in an external module, so it can be readily used
> with the current server versions, no need to wait for >= 9.6.
>

This is a major change of direction, so the thread title no longer applies
at all.

The requirement is to be able to see the output of EXPLAIN ANALYZE for a
running process. Ideally, we would dump the diagnostic output for any
process that runs longer than a specific timeout, so we can decide whether
to kill it, or just save that for later diagnostics.

I'm interested in helping the original goal happen. Dumping an EXPLAIN,
without ANALYZE info, is not at all interesting since it has no value for
immediate action or post-facto diagnosis, sorry to say - making it happen
for every backend just makes it even less useful.

-- 
Simon Riggshttp://www.2ndQuadrant.com/

PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: [HACKERS] GinPageIs* don't actually return a boolean

2015-09-29 Thread Alvaro Herrera
Andres Freund wrote:

> I went through all headers in src/include and checked for macros
> containing [^&]&[^&] and checked whether they have this hazard. Found a
> fair number.
> 
> That patch also changes !! tests into != 0 style.
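
(For context, the hazard looks roughly like this -- an illustrative sketch,
not the exact definitions from the GIN headers:

/* problematic pattern: evaluates to the raw masked bits, not strictly 0 or 1 */
#define GinPageIsLeaf(page)    ( GinPageGetOpaque(page)->flags & GIN_LEAF )

versus the != 0 style the patch switches to:

#define GinPageIsLeaf(page)    ( (GinPageGetOpaque(page)->flags & GIN_LEAF) != 0 )

The first form misbehaves when the result is stored into a single-byte bool
or compared directly against true.)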

I don't think you pushed this, did you?

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PATCH: index-only scans with partial indexes

2015-09-29 Thread Tomas Vondra

On 09/29/2015 04:57 PM, Tomas Vondra wrote:

Hello,

On 09/29/2015 12:27 PM, Kyotaro HORIGUCHI wrote:


...



cost_index() seems to need to be fixed. It would otherwise count excluded
clauses in its estimate.


Hmm, good point. The problem is that extract_nonindex_conditions uses
baserel->baserestrictinfo again, i.e. it does not skip the implied
clauses. So we may either stick the filtered clauses somewhere (for
example in the IndexPath), or teach extract_nonindex_conditions to use
predicate_implied_by. I'd say the first option is better. Agreed?


And the attached patch v4 should do the trick - it adds 'indexrinfos' to 
IndexPath and uses it in cost_index().


CREATE TABLE t AS SELECT i AS a, i AS b, i AS c
FROM generate_series(1,1000) s(i);

CREATE INDEX idx ON t(a) WHERE b > 1000;

Then

SELECT a FROM t WHERE b > 1000 AND a < 1000; /* size(qpquals) = 0 */
SELECT a FROM t WHERE b > 1000 AND c < 1000; /* size(qpquals) = 1 */
SELECT a FROM t WHERE c > 1000 AND c < 1000; /* size(qpquals) = 2 */

and so on. Which seems correct I believe.

regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c
index d107d76..5feb965 100644
--- a/src/backend/optimizer/path/costsize.c
+++ b/src/backend/optimizer/path/costsize.c
@@ -86,6 +86,7 @@
 #include "optimizer/placeholder.h"
 #include "optimizer/plancat.h"
 #include "optimizer/planmain.h"
+#include "optimizer/predtest.h"
 #include "optimizer/restrictinfo.h"
 #include "parser/parsetree.h"
 #include "utils/lsyscache.h"
@@ -345,7 +346,7 @@ cost_index(IndexPath *path, PlannerInfo *root, double loop_count)
 		path->path.rows = path->path.param_info->ppi_rows;
 		/* qpquals come from the rel's restriction clauses and ppi_clauses */
 		qpquals = list_concat(
-	   extract_nonindex_conditions(baserel->baserestrictinfo,
+	   extract_nonindex_conditions(path->indexrinfos,
    path->indexquals),
 			  extract_nonindex_conditions(path->path.param_info->ppi_clauses,
 		  path->indexquals));
@@ -354,7 +355,7 @@ cost_index(IndexPath *path, PlannerInfo *root, double loop_count)
 	{
 		path->path.rows = baserel->rows;
 		/* qpquals come from just the rel's restriction clauses */
-		qpquals = extract_nonindex_conditions(baserel->baserestrictinfo,
+		qpquals = extract_nonindex_conditions(path->indexrinfos,
 			  path->indexquals);
 	}
 
@@ -558,6 +559,7 @@ extract_nonindex_conditions(List *qual_clauses, List *indexquals)
 			continue;			/* simple duplicate */
 		if (is_redundant_derived_clause(rinfo, indexquals))
 			continue;			/* derived from same EquivalenceClass */
+
 		/* ... skip the predicate proof attempts createplan.c will try ... */
 		result = lappend(result, rinfo);
 	}
diff --git a/src/backend/optimizer/path/indxpath.c b/src/backend/optimizer/path/indxpath.c
index 9da5444..5f30aaa 100644
--- a/src/backend/optimizer/path/indxpath.c
+++ b/src/backend/optimizer/path/indxpath.c
@@ -59,6 +59,7 @@ typedef struct
 	bool		nonempty;		/* True if lists are not all empty */
 	/* Lists of RestrictInfos, one per index column */
 	List	   *indexclauses[INDEX_MAX_KEYS];
+	List	   *rclauses;		/* clauses not implied by predicate */
 } IndexClauseSet;
 
 /* Per-path data used within choose_bitmap_and() */
@@ -129,7 +130,7 @@ static PathClauseUsage *classify_index_clause_usage(Path *path,
 static Relids get_bitmap_tree_required_outer(Path *bitmapqual);
 static void find_indexpath_quals(Path *bitmapqual, List **quals, List **preds);
 static int	find_list_position(Node *node, List **nodelist);
-static bool check_index_only(RelOptInfo *rel, IndexOptInfo *index);
+static bool check_index_only(RelOptInfo *rel, IndexOptInfo *index, List *clauses);
 static double get_loop_count(PlannerInfo *root, Index cur_relid, Relids outer_relids);
 static double adjust_rowcount_for_semijoins(PlannerInfo *root,
 			  Index cur_relid,
@@ -866,6 +867,7 @@ build_index_paths(PlannerInfo *root, RelOptInfo *rel,
 	double		loop_count;
 	List	   *orderbyclauses;
 	List	   *orderbyclausecols;
+	List	   *rinfos;
 	List	   *index_pathkeys;
 	List	   *useful_pathkeys;
 	bool		found_lower_saop_clause;
@@ -1013,13 +1015,16 @@ build_index_paths(PlannerInfo *root, RelOptInfo *rel,
 		orderbyclausecols = NIL;
 	}
 
+	/* for partial indexes, use the non-implied RestrictInfo nodes */
+	rinfos = (index->indpred != NIL) ? clauses->rclauses : rel->baserestrictinfo;
+
 	/*
 	 * 3. Check if an index-only scan is possible.  If we're not building
 	 * plain indexscans, this isn't relevant since bitmap scans don't support
 	 * index data retrieval anyway.
 	 */
 	index_only_scan = (scantype != ST_BITMAPSCAN &&
-	   check_index_only(rel, index));
+	   check_index_only(rel, index, rinfos));
 
 	/*
 	 * 4. Generate an indexscan path if there are relevant restriction clauses
@@ -1033,6 +1038,7 @@ 

[HACKERS] Box type equality

2015-09-29 Thread Stanislav Kelvich
Hi.

I've faced an issue with Box type comparison that has existed for almost
five years.

> create table t (b box);
CREATE TABLE
> select count(*), b from t group by b;
ERROR:  could not identify an equality operator for type box

As mentioned in 
http://www.postgresql.org/message-id/pine.lnx.4.64.1012040051500.12...@sn.sai.msu.ru

  That can be fixed by b-tree equality for boxes, but we need some decisions
there. We can compare floats up to a certain threshold or strictly, and box
equality can be defined as coordinate equality or as equality of areas.
  In the case of threshold-based comparison we will not lose equality due to
rounding errors - for example, applying commutative functions in a different
order will preserve equality - but we can face non-transitive equalities, like
box1 == box2, box2 == box3, but box1 != box3, and GROUP BY can produce strange
results.
  With strict comparison, equality is transitive, but we can lose equality
due to rounding.

Now in geo_decls.h we have:

#define EPSILON 1.0E-06
#define FPeq(A,B)   (fabs((A) - (B)) <= EPSILON)

and equality means equal areas.

But for GROUP BY it had better be the opposite: equality of coordinates and
strict comparison.

The simplest workaround, from my perspective, is to add another operator for
the box type (e.g. ==) that performs strict comparison of coordinates and use
it in the b-tree ops.
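
A strict comparison function for such an operator could look roughly like
this (only a sketch, following the BOX/Point layout in geo_decls.h):

static bool
box_same_strict(BOX *box1, BOX *box2)
{
	/* exact coordinate equality, deliberately ignoring EPSILON/FPeq() */
	return box1->high.x == box2->high.x &&
		   box1->high.y == box2->high.y &&
		   box1->low.x  == box2->low.x  &&
		   box1->low.y  == box2->low.y;
}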

Any objections/suggestions?

-
Stas Kelvich
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] No Issue Tracker - Say it Ain't So!

2015-09-29 Thread Andrew Dunstan



On 09/29/2015 10:55 AM, Steve Crawford wrote:
On Tue, Sep 29, 2015 at 7:16 AM, David Fetter > wrote:


...What we're not fine with is depending on a proprietary system, no
matter what type of license, as infrastructure...


Exactly. Which is why I was warning about latching onto features only 
available in the closed enterprise version.






Like Tom, I more or less promised myself not to get terribly involved in 
this discussion. Oh, well.


I'm not a fan of the "free sample" model of software. What happens when 
you want a feature that's in the not-free edition of the software? I 
think gitlab simply doesn't suit us for a number of reasons, and that 
seems to be the emerging consensus.


The only viable possibilities seem to me to be bugzilla and debbugs. 
Both are dedicated trackers, unquestionably open source, have long 
pedigrees, are very likely to stay around, and are or can be integrated 
with email systems. I have not personally used debbugs, so I favour 
bugzilla simply on the ground of familiarity, but I know other people 
dislike it. I will tell a small story about it - about 14 years ago I 
was given responsibility for an extra team following a corporate merger. 
They had been using a proprietary bug tracker while we had been using 
bugzilla. We decided to switch them to bugzilla so they could integrate 
with the team from the company I had been working for, and they bitched 
and moaned about it something fierce. Later the company decided to 
standardize on the proprietary system, and the same people bitched and 
moaned far more loudly at being made to give up bugzilla, which they 
found much more friendly. And in those days it wasn't as nice or capable 
as it is now.


cheers

andrew



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Arguable RLS security bug, EvalPlanQual() paranoia

2015-09-29 Thread Stephen Frost
* Adam Brightwell (adam.brightw...@crunchydatasolutions.com) wrote:
> On Mon, Aug 3, 2015 at 6:21 PM, Peter Geoghegan  wrote:
> > On Mon, Aug 3, 2015 at 3:07 PM, Stephen Frost  wrote:
> >> Thoughts?  Trying to keep it straight-forward and provide a simple
> >> solution for users to be able to address the issue, if they're worried
> >> about it.  Perhaps this, plus an additional paragraph which goes into
> >> more detail about exactly what's going on?
> >
> > I'm still thinking about it, but I think you have the right idea here.
> 
> I have attached a patch for review that I believe addresses the
> documentation side of this issue.
> 
> Thoughts or comments?

I'm not convinced this is the right place, but at a minimum it should be
referenced from the RLS documentation.  Further, it should be noted that
users who have direct SQL access can control what the isolation level
is for their transaction.

Also, isn't it possible to avoid this by locking the records?  If the
locking fails or blocks then you know another user has those records
locked and you don't update or you wait until you hold the lock.
Assuming that works (I don't immediately see why it wouldn't..), we
should provide an example.

Thanks!

Stephen


signature.asc
Description: Digital signature


[HACKERS] [PATCH 6/6] pg_basebackup: add a single-tar output format.

2015-09-29 Thread Joshua Elsasser
This will write one single tar file containing all tablespaces, and
can be written to stdout.
---
 src/bin/pg_basebackup/pg_basebackup.c | 282 --
 1 file changed, 269 insertions(+), 13 deletions(-)

diff --git a/src/bin/pg_basebackup/pg_basebackup.c 
b/src/bin/pg_basebackup/pg_basebackup.c
index e29e466..b3534cb 100644
--- a/src/bin/pg_basebackup/pg_basebackup.c
+++ b/src/bin/pg_basebackup/pg_basebackup.c
@@ -29,6 +29,7 @@
 #include "getopt_long.h"
 #include "libpq-fe.h"
 #include "pqexpbuffer.h"
+#include "common/fe_memutils.h"
 #include "pgtar.h"
 #include "pgtime.h"
 #include "receivelog.h"
@@ -71,7 +72,9 @@ typedef struct TarParser {
 
 /* Global options */
 static char *basedir = NULL;
+static char *outpath = NULL;
 static TablespaceList tablespace_dirs = {NULL, NULL};
+static char *default_basetablespace = NULL;
 static char *xlog_dir = "";
 static char format = 'p';  /* p(lain)/t(ar) */
 static char *label = "pg_basebackup base backup";
@@ -117,15 +120,19 @@ static void usage(void);
 static void disconnect_and_exit(int code);
 static void verify_dir_is_empty_or_create(char *dirname);
 static void progress_report(int tablespacenum, const char *filename, bool 
force);
+static char *get_base_tablespace_path(void);
 
 static void OpenTarFile(TarStream *tarfile, const char *path);
 static void CloseTarFile(TarStream *tarfile);
-static void TarInsertRecoveryConf(TarStream *stream);
+static void TarInsertRecoveryConf(TarStream *stream, const char *prefix);
+static void TarInsertDirectory(TarStream *stream, const char *path, mode_t 
mode);
 static void IterateAndWriteTar(TarParser *tp, char *inbuf, int buflen,
   bool 
(*callback)(char *, void *), void *cbarg);
 static bool TarIterSkipRecoveryConf(char *h, void *arg);
+static bool TarIterRenameForTablespace(char *h, void *arg);
 
 static void ReceiveTarFile(PGconn *conn, PGresult *res, int rownum);
+static void ReceiveAndAppendTarFile(TarStream *tarfile, PGconn *conn, PGresult 
*res, int rownum);
 static void ReceiveAndUnpackTarFile(PGconn *conn, PGresult *res, int rownum);
 static void GenerateRecoveryConf(PGconn *conn);
 static void WriteRecoveryConf(void);
@@ -259,7 +266,7 @@ usage(void)
printf(_("  %s [OPTION]...\n"), progname);
printf(_("\nOptions controlling the output:\n"));
printf(_("  -D, --pgdata=DIRECTORY receive base backup into 
directory\n"));
-   printf(_("  -F, --format=p|t   output format (plain (default), 
tar)\n"));
+   printf(_("  -F, --format=p|t|s output format (plain (default), tar, 
single-tar)\n"));
printf(_("  -r, --max-rate=RATEmaximum transfer rate to transfer 
data directory\n"
  " (in kB/s, or use suffix \"k\" or 
\"M\")\n"));
printf(_("  -R, --write-recovery-conf\n"
@@ -746,6 +753,39 @@ parse_max_rate(char *src)
return (int32) result;
 }
 
+
+/*
+ * Returns the path of the server's data directory. The returned string is
+ * malloc'd.
+ */
+static char *
+get_base_tablespace_path(void)
+{
+   PGconn *sqlconn;
+   PGresult *res;
+   char *dir;
+
+   sqlconn = GetConnection(false);
+   if (!sqlconn)
+   /* Error message already written in GetConnection() */
+   disconnect_and_exit(1);
+
+   res = PQexec(sqlconn, "SELECT setting FROM pg_catalog.pg_settings "
+"WHERE name = 'data_directory';");
+   if (PQresultStatus(res) != PGRES_TUPLES_OK)
+   {
+   fprintf(stderr, _("%s: could not get server data_directory: 
%s"),
+   progname, PQerrorMessage(sqlconn));
+   disconnect_and_exit(1);
+   }
+
+   dir = pg_strdup(PQgetvalue(res, 0, 0));
+   PQclear(res);
+   PQfinish(sqlconn);
+   return dir;
+}
+
+
 /*
  * Write a piece of tar data
  */
@@ -891,7 +931,7 @@ CloseTarFile(TarStream *tarfile)
  * Write a recovery.conf file into the tar stream.
  */
 static void
-TarInsertRecoveryConf(TarStream *stream)
+TarInsertRecoveryConf(TarStream *stream, const char *prefix)
 {
static char zerobuf[512];
charheader[512];
@@ -901,6 +941,8 @@ TarInsertRecoveryConf(TarStream *stream)
recoveryconfcontents->len,
0600, 04000, 02000,
time(NULL));
+   if (prefix != NULL)
+   TarIterRenameForTablespace(header, (void *)prefix);
 
padding = ((recoveryconfcontents->len + 511) & ~511) - 
recoveryconfcontents->len;
 
@@ -912,6 +954,29 @@ TarInsertRecoveryConf(TarStream *stream)
 
 
 /*
+ * Write a directory into the tar stream.
+ */
+static void
+TarInsertDirectory(TarStream *stream, const char *path, mode_t mode)
+{
+   char   hdr[512];
+
+   /* set file type to directory */
+   mode = (mode & 0) | 04;
+
+   

Re: [HACKERS] No Issue Tracker - Say it Ain't So!

2015-09-29 Thread Tom Lane
Alvaro Herrera  writes:
> Merlin Moncure wrote:
>> I've read this email about three times now and it's not clear at all
>> to me what a issue/bug tracker brings to the table.

> In my opinion, this thread is about a bug tracker, not a patch tracker.
> We already have a patch tracking system which works very well.

Mumble ... the CF app works, but I'm not sure if it works "very well".
There are some ease-of-use issues, I think, though we've now damped them
down to the point where the only really major stumbling block is getting
a patch into the CF app initially.

One thing to think about if we do adopt a bug tracker is how will it
interact with the CF app.  Ideally, patches that represent responses
to issues in the BT would be tracked more or less automatically by
both apps, if indeed we don't merge them altogether.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Idea for improving buildfarm robustness

2015-09-29 Thread Joe Conway
On 09/29/2015 01:48 PM, Alvaro Herrera wrote:
> Joe Conway wrote:
>> On 09/29/2015 12:47 PM, Tom Lane wrote:
>>> We could possibly add additional checks, like trying to verify that
>>> pg_control has the same inode number it used to.  But I'm afraid that
>>> would add portability issues and false-positive hazards that would
>>> outweigh the value.
>>
>> Not sure you remember the incident, but I think years ago that would
>> have saved me some heartache ;-)
> 
> I remember it, but I'm not sure it would have helped you.  As I recall,
> your trouble was that after a reboot the init script decided to initdb
> the mount point -- postmaster wouldn't have been running at all ...

Right, which the init script no longer does as far as I'm aware, so
hopefully it will never happen again to anyone.

But it was still a case where the postmaster started on one copy of
PGDATA (the newly init'd copy), and then the contents of the real PGDATA
were swapped in (when the filesystem was finally mounted), causing
corruption to the production data.

Joe

-- 
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development



signature.asc
Description: OpenPGP digital signature


[HACKERS] [PATCH 5/6] pg_basebackup: allow GetConnection() to make non-replication connections.

2015-09-29 Thread Joshua Elsasser
---
 src/bin/pg_basebackup/pg_basebackup.c  | 4 ++--
 src/bin/pg_basebackup/pg_receivexlog.c | 4 ++--
 src/bin/pg_basebackup/pg_recvlogical.c | 4 ++--
 src/bin/pg_basebackup/streamutil.c | 6 +++---
 src/bin/pg_basebackup/streamutil.h | 2 +-
 5 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/src/bin/pg_basebackup/pg_basebackup.c 
b/src/bin/pg_basebackup/pg_basebackup.c
index ccd0890..e29e466 100644
--- a/src/bin/pg_basebackup/pg_basebackup.c
+++ b/src/bin/pg_basebackup/pg_basebackup.c
@@ -454,7 +454,7 @@ StartLogStreamer(char *startpos, uint32 timeline, char 
*sysidentifier)
 #endif
 
/* Get a second connection */
-   param->bgconn = GetConnection();
+   param->bgconn = GetConnection(true);
if (!param->bgconn)
/* Error message already written in GetConnection() */
exit(1);
@@ -1652,7 +1652,7 @@ BaseBackup(void)
/*
 * Connect in replication mode to the server
 */
-   conn = GetConnection();
+   conn = GetConnection(true);
if (!conn)
/* Error message already written in GetConnection() */
exit(1);
diff --git a/src/bin/pg_basebackup/pg_receivexlog.c 
b/src/bin/pg_basebackup/pg_receivexlog.c
index 0c322d1..3c61372 100644
--- a/src/bin/pg_basebackup/pg_receivexlog.c
+++ b/src/bin/pg_basebackup/pg_receivexlog.c
@@ -285,7 +285,7 @@ StreamLog(void)
 * Connect in replication mode to the server
 */
if (conn == NULL)
-   conn = GetConnection();
+   conn = GetConnection(true);
if (!conn)
/* Error message already written in GetConnection() */
return;
@@ -533,7 +533,7 @@ main(int argc, char **argv)
/*
 * Obtain a connection before doing anything.
 */
-   conn = GetConnection();
+   conn = GetConnection(true);
if (!conn)
/* error message already written in GetConnection() */
exit(1);
diff --git a/src/bin/pg_basebackup/pg_recvlogical.c 
b/src/bin/pg_basebackup/pg_recvlogical.c
index 93f61c3..faf7cbf 100644
--- a/src/bin/pg_basebackup/pg_recvlogical.c
+++ b/src/bin/pg_basebackup/pg_recvlogical.c
@@ -216,7 +216,7 @@ StreamLogicalLog(void)
 * Connect in replication mode to the server
 */
if (!conn)
-   conn = GetConnection();
+   conn = GetConnection(true);
if (!conn)
/* Error message already written in GetConnection() */
return;
@@ -856,7 +856,7 @@ main(int argc, char **argv)
 * helps to get more precise error messages about authentification,
 * required GUC parameters and such.
 */
-   conn = GetConnection();
+   conn = GetConnection(true);
if (!conn)
/* Error message already written in GetConnection() */
exit(1);
diff --git a/src/bin/pg_basebackup/streamutil.c 
b/src/bin/pg_basebackup/streamutil.c
index 2c963b6..74cfb5b 100644
--- a/src/bin/pg_basebackup/streamutil.c
+++ b/src/bin/pg_basebackup/streamutil.c
@@ -50,7 +50,7 @@ PGconn   *conn = NULL;
  * call exit(1) directly.
  */
 PGconn *
-GetConnection(void)
+GetConnection(bool replication)
 {
PGconn *tmpconn;
int argcount = 7;   /* dbname, replication, 
fallback_app_name,
@@ -104,10 +104,10 @@ GetConnection(void)
}
 
keywords[i] = "dbname";
-   values[i] = dbname == NULL ? "replication" : dbname;
+   values[i] = dbname == NULL ? (replication ? "replication" : "postgres") 
: dbname;
i++;
keywords[i] = "replication";
-   values[i] = dbname == NULL ? "true" : "database";
+   values[i] = replication ? (dbname == NULL ? "true" : "database") : 
"false";
i++;
keywords[i] = "fallback_application_name";
values[i] = progname;
diff --git a/src/bin/pg_basebackup/streamutil.h 
b/src/bin/pg_basebackup/streamutil.h
index b95f83f..21a6331 100644
--- a/src/bin/pg_basebackup/streamutil.h
+++ b/src/bin/pg_basebackup/streamutil.h
@@ -28,7 +28,7 @@ extern char *replication_slot;
 /* Connection kept global so we can disconnect easily */
 extern PGconn *conn;
 
-extern PGconn *GetConnection(void);
+extern PGconn *GetConnection(bool replication);
 
 /* Replication commands */
 extern bool CreateReplicationSlot(PGconn *conn, const char *slot_name,
-- 
2.3.0



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Arguable RLS security bug, EvalPlanQual() paranoia

2015-09-29 Thread Adam Brightwell
On Mon, Aug 3, 2015 at 6:21 PM, Peter Geoghegan  wrote:
> On Mon, Aug 3, 2015 at 3:07 PM, Stephen Frost  wrote:
>> Thoughts?  Trying to keep it straight-forward and provide a simple
>> solution for users to be able to address the issue, if they're worried
>> about it.  Perhaps this, plus an additional paragraph which goes into
>> more detail about exactly what's going on?
>
> I'm still thinking about it, but I think you have the right idea here.

I have attached a patch for review that I believe addresses the
documentation side of this issue.

Thoughts or comments?

Thanks,
Adam

--
Adam Brightwell - adam.brightw...@crunchydatasolutions.com
Database Engineer - www.crunchydatasolutions.com


transaction-isolation-rls-docs.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Idea for improving buildfarm robustness

2015-09-29 Thread Tom Lane
I wrote:
> Josh Berkus  writes:
>> Give me source with the change, and I'll put it on a cheap, low-bandwith
>> AWS instance and hammer the heck out of it.  That should raise any false
>> positives we can expect.

> Here's a draft patch against HEAD (looks like it will work on 9.5 or
> 9.4 without modifications, too).

BTW: in addition to whatever AWS testing Josh has in mind, it'd be good if
someone tried it on Windows.  AFAIK, the self-kill() should work in the
postmaster on Windows, but that should be checked.  Also, does the set of
errnos it checks cover typical deletion cases on Windows?  Try both
removal of $PGDATA in toto and removal of just pg_control or just global/.
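
For reference, the kind of check being discussed looks roughly like the
following (a sketch of the idea only; the actual patch's placement in the
postmaster loop and the exact errno list it accepts may differ):

	/* sketch: does the data directory still look like ours? */
	struct stat st;

	if (stat("global/pg_control", &st) != 0 &&
		(errno == ENOENT || errno == ENOTDIR))
	{
		elog(LOG, "pg_control has vanished; shutting down");
		kill(MyProcPid, SIGQUIT);	/* postmaster signals itself */
	}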

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] LISTEN denial of service with aborted transaction

2015-09-29 Thread Matt Newell
On Monday, September 28, 2015 07:22:26 PM Tom Lane wrote:
> Matt Newell  writes:
> > 1. When a connection issues it's first LISTEN command, in
> > Exec_ListenPreCommit QUEUE_BACKEND_POS(MyBackendId) = QUEUE_TAIL;
> > this causes the connection to iterate through every notify queued in the
> > slru, even though at that point I believe the connection can safely
> > ignore any notifications from transactions that are already committed,
> > and if I understand correctly notifications aren't put into the slru
> > until precommit, so wouldn't it be safe to do:
> > QUEUE_BACKEND_POS(MyBackendId) = QUEUE_HEAD;
> > inside Async_Listen?
> 
> Per the comment in Exec_ListenPreCommit, the goal here is to see entries
> that have been made and not yet committed.  If you don't do it like this,
> then a new listener has the possibility of receiving notifications from
> one transaction, while not seeing notifications from another one that
> committed later, even though longer-established listeners saw outputs from
> both transactions.  We felt that that sort of application-visible
> inconsistency would be a bad thing.
> 
Right,  QUEUE_HEAD can't be used, however...

> > If that's not safe, then could a new member be added to
> > AsyncQueueControl that points to the first uncommitted QueuePosition
> > (wouldn't have to be kept completely up to date).
> 
> Err ... isn't that QUEUE_TAIL already?  I've not studied this code in
> detail recently, though.
> 
QUEUE_TAIL is the oldest notification.  My idea is to keep a somewhat up-to-
date pointer to the oldest uncommitted notification, which can be used as a 
starting point when a connection issues it's first listen.  Just as the 
current code will advance a backend's pointer past any committed notifications 
when it calls asyncQueueReadAllNotifications in Exec_ListenPreCommit with no
registered listeners.

I came up with a proof of concept and it appears to work as expected. With 
500,000 notifications queued a listen is consistently under .5ms, while my 
9.4.4 install is taking 70ms, and the speedup should be much greater on a
busy server where there is more lock contention.

Attached is the patch and the ugly test script.


> > 2. Would it be possible when a backend is signaled to check if it is idle
> > inside an aborted transaction, and if so process the notifications and put
> > any that match what the backend is listening on into a local list.
> 
> Possibly.  sinval catchup notification works like that, so you could maybe
> invent a similar mechanism for notify.  Beware that we've had to fix
> performance issues with sinval catchup; you don't want to release a whole
> bunch of processes to do work at the same time.

I'll have to think about this more, but I don't envision it causing such a
scenario.

The extra processing that will happen at signaling time would have been done 
anyway when the aborted transaction is finally rolled back; the only extra 
work is copying the relevant notifications to local memory. 

Regardless it might not be worthwhile since my patch for #1 seems to address 
the symptom that bit me.

Matt Newell
diff --git a/src/backend/commands/async.c b/src/backend/commands/async.c
index 3b71174..e89ece5 100644
--- a/src/backend/commands/async.c
+++ b/src/backend/commands/async.c
@@ -196,12 +196,24 @@ typedef struct QueuePosition
 #define QUEUE_POS_EQUAL(x,y) \
 	 ((x).page == (y).page && (x).offset == (y).offset)
 
+/* Returns 1 if x is larger than y, 0 if equal, else -1 */
+#define QUEUE_POS_COMPARE(x,y) \
+	(((x).page > (y).page) ? 1 : \
+	  ((x).page < (y).page) ? -1 : \
+	((x).offset > (y).offset ? 1 : ((x).offset == (y).offset ? 0 : -1)))
+
 /* choose logically smaller QueuePosition */
 #define QUEUE_POS_MIN(x,y) \
 	(asyncQueuePagePrecedes((x).page, (y).page) ? (x) : \
 	 (x).page != (y).page ? (y) : \
 	 (x).offset < (y).offset ? (x) : (y))
 
+/* choose logically smaller QueuePosition */
+#define QUEUE_POS_MAX(x,y) \
+	(asyncQueuePagePrecedes((x).page, (y).page) ? (y) : \
+	 (x).page != (y).page ? (x) : \
+	 (x).offset < (y).offset ? (y) : (x))
+
 /*
  * Struct describing a listening backend's status
  */
@@ -217,12 +229,13 @@ typedef struct QueueBackendStatus
  * The AsyncQueueControl structure is protected by the AsyncQueueLock.
  *
  * When holding the lock in SHARED mode, backends may only inspect their own
- * entries as well as the head and tail pointers. Consequently we can allow a
- * backend to update its own record while holding only SHARED lock (since no
- * other backend will inspect it).
+ * entries as well as the head, tail, and firstUncommitted pointers. 
+ * Consequently we can allow a backend to update its own record while holding
+ * only SHARED lock (since no other backend will inspect it).
  *
  * When holding the lock in EXCLUSIVE mode, backends can inspect the entries
- * of other backends and also change the head and tail pointers.
+ * of other backends and also change the head, tail and 

Re: [HACKERS] Idea for improving buildfarm robustness

2015-09-29 Thread Joe Conway
On 09/29/2015 12:47 PM, Tom Lane wrote:
> We could possibly add additional checks, like trying to verify that
> pg_control has the same inode number it used to.  But I'm afraid that
> would add portability issues and false-positive hazards that would
> outweigh the value.

Not sure you remember the incident, but I think years ago that would
have saved me some heartache ;-)

Joe

-- 
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development



signature.asc
Description: OpenPGP digital signature


Re: [HACKERS] No Issue Tracker - Say it Ain't So!

2015-09-29 Thread Alvaro Herrera
Oleg Bartunov wrote:
 
> me too, but I have to mention one problem we should keep in mind - its
> independence from political games (sanctions).  Think about developers from
> Russia, who one day may be blocked by Github, for example.

That's a very good point.  I think Github and other sites are already
blocked in countries like India and Cuba.  This becomes more pressing as
commercial entities are formed in countries like Russia.  Surely we do
not want PostgresPro developers to be unable to interact with our
bugtracker ...

We've done an excellent job of keeping our servers far away from any
individual jurisdictions, going back many years to when Marc Fournier
decided to host our stuff in Panama for precisely this reason.  Nowadays
for us it is reasonably simple to move stuff around in case there's
trouble in any particular country.  I have a very hard time believing we
would tie ourselves down for a bug tracker; hosting whatever in our own
infrastructure is probably the only reasonable option for us at this
point.  As Tom said, lesser projects cannot afford this luxury, but
we're not giving that away in a jiffy.

IANAL

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] On-demand running query plans using auto_explain and signals

2015-09-29 Thread Simon Riggs
On 29 September 2015 at 20:52, Shulgin, Oleksandr <
oleksandr.shul...@zalando.de> wrote:

> On Tue, Sep 29, 2015 at 8:34 PM, Simon Riggs 
> wrote:
>
>> On 29 September 2015 at 12:52, Shulgin, Oleksandr <
>> oleksandr.shul...@zalando.de> wrote:
>>
>>
>>> Hitting a process with a signal and hoping it will produce a meaningful
>>> response in all circumstances without disrupting its current task was way
>>> too naive.
>>>
>>
>> Hmm, I would have to disagree, sorry. For me the problem was dynamically
>> allocating everything at the time the signal is received and getting into
>> problems when that caused errors.
>>
>
> What I mean is that we need to move the actual EXPLAIN run out of
> ProcessInterrupts().  It can be still fine to trigger the communication
> with a signal.
>

Good


> * INIT - Allocate N areas of memory for use by queries, which can be
>> expanded/contracted as needed. Keep a freelist of structures.
>> * OBSERVER - When requested, gain exclusive access to a diagnostic area,
>> then allocate the designated process to that area, then send a signal
>> * QUERY - When signal received dump an EXPLAIN ANALYZE to the allocated
>> diagnostic area, (set flag to show complete, set latch on observer)
>> * OBSERVER - process data in diagnostic area and then release area for
>> use by next observation
>>
>> If the EXPLAIN ANALYZE doesn't fit into the diagnostic chunk, LOG it as a
>> problem and copy data only up to the size defined. Any other ERRORs that
>> are caused by this process cause it to fail normally.
>>
>
> Do you envision problems if we do this with a newly allocated DSM every
> time instead of a pre-allocated area?  This will have to reverse the workflow,
> because only the QUERY knows the required segment size:
>

That's too fiddly; we need to keep it simple by using just fixed sizes.


> OBSERVER - sends a signal and waits for its proc latch to be set
> QUERY - when signal is received allocates a DSM just big enough to fit the
> EXPLAIN plan, then locates the OBSERVER(s) and sets its latch (or their
> latches)
>
> The EXPLAIN plan should already be produced somewhere in the executor, to
> avoid calling into explain.c from ProcessInterrupts().
>
> That allows the observer to be another backend, or it allows the query
>> process to perform self-observation based upon a timeout (e.g. >1 hour) or
>> a row limit (e.g. when an optimizer estimate is seen to be badly wrong).
>>
>
> Do you think there is one single best place in the executor code where
> such a check could be added?  I have very little idea about that.
>

 Fairly simple.

Main problem is knowing how to handle nested calls to the executor.

I'll look at the patch.
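
As a very rough illustration of the signal-then-DSM handoff discussed above (a
sketch only: segment lifetime, error handling, and how the observer's latch is
located are all glossed over, and the function name is made up):

#include "postgres.h"
#include "storage/dsm.h"
#include "storage/latch.h"

/* Sketch: QUERY side, after noticing the signal, outside ProcessInterrupts() */
static void
publish_plan_text(const char *plan, Latch *observer_latch)
{
    Size        len = strlen(plan) + 1;
    dsm_segment *seg;

    /*
     * Allocate a segment just big enough for the plan text and copy it in;
     * the observer would learn the dsm handle through some small fixed
     * shared-memory slot (not shown here).
     */
    seg = dsm_create(len, 0);
    memcpy(dsm_segment_address(seg), plan, len);

    /* Wake the observer so it can attach to the segment and read the text */
    SetLatch(observer_latch);
}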

-- 
Simon Riggshttp://www.2ndQuadrant.com/

PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


[HACKERS] Add pg_basebackup single tar output format

2015-09-29 Thread Josh Elsasser
Hi. I have a need to pipe the output from pg_basebackup for a
multi-tablespace cluster into another program without spooling to
disk. Seeing as the current -F tar output format can't do that, I've
made an attempt at implementing that myself.

As a side effect I've refactored some of the pg_basebackup code
for readability and reusability, as well as implemented support for
longer filenames in tar output (not used by default for
compatibility). There is also a fix for a bug where pg_basebackup will
drop a zero-length file at the end of a tablespace's tar stream.

I've put my changes up as a series of relatively small commits on this
branch of a github fork:

https://github.com/jre/postgres/commits/single-tar

Comments and suggestions are welcome.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] unclear about row-level security USING vs. CHECK

2015-09-29 Thread Stephen Frost
* Dean Rasheed (dean.a.rash...@gmail.com) wrote:
> On 28 September 2015 at 20:15, Stephen Frost  wrote:
> > I listed out the various alternatives but didn't end up getting any
> > responses to it.  I'm still of the opinion that the documentation is the
> > main thing which needs improving here, but we can also change CREATE
> > POLICY, et al, to require an explicit WITH CHECK clause for the commands
> > where that makes sense if that's the consensus.
> 
> My vote would be to keep it as-is.

That's my feeling on it as well, particularly as...

> It feels perfectly natural to me. USING clauses add to the query's
> WHERE clause controlling which existing rows you can SELECT, UPDATE or
> DELETE. WITH CHECK clauses control what new data you can add via
> INSERT or UPDATE. UPDATE allows both, but most of the time I expect
> you'll want them to be the same.

exactly this.  Many people are going to want them to be the same and not
supporting a single-expression syntax is going to frustrate them, to no
particularly good end, in my view.  The "USING AND WITH CHECK"
technically solves that but feels very odd to me.
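
A tiny example (mine, not from the docs or any patch) of the common case where
the two expressions are meant to be the same:

CREATE TABLE accounts (owner text, balance numeric);
ALTER TABLE accounts ENABLE ROW LEVEL SECURITY;

-- WITH CHECK is omitted, so the USING expression is applied to the
-- updated rows as well: you can only update your own rows, and you
-- cannot "give them away" by setting owner to someone else.
CREATE POLICY accounts_update ON accounts FOR UPDATE
    USING (owner = current_user);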

> So having the WITH CHECK clause default to being the same as the USING
> clause for UPDATE matches what I expect to be the most common usage.
> Users granted permission to update a subset of the table's rows
> probably don't want to give those rows away. More advanced use-cases
> are still supported, but the simplest/most common case is the default,
> which means that you don't have to supply the same expression twice.

Agreed.

> I agree that the documentation could be improved.
> 
> As things stand, you have to read quite a lot of text on the CREATE
> POLICY page before you get to the description of how the USING and
> WITH CHECK expressions interact. I'd suggest rewording the 2nd
> paragraph where these clauses are first introduced. Perhaps something
> like:
> 
> """
> A policy grants the ability to SELECT, INSERT, UPDATE, or DELETE rows
> which match the relevant policy expression. For SELECT, UPDATE and
> DELETE, the USING expression from the policy is combined with the
> query's WHERE clause to control which existing table rows can be
> retrieved, updated or deleted. For INSERT and UPDATE, the WITH CHECK
> expression is used to constrain what new data can be added to the
> table. A policy that applies to UPDATE may have both USING and WITH
> CHECK expressions, which may be different from one another, but if
> they are the same, the WITH CHECK expression can be omitted and the
> USING expression will be used automatically in its place.
> 
> Policy expressions may be any expressions that evaluate to give a
> result of type boolean. When a USING expression returns true for a
> given row then the query is allowed to act upon that row, while rows
> for which the expression returns false or null are skipped. When a
> WITH CHECK expression returns true for a new row then the system
> allows that row to be added to the table, but if the expression
> returns false or null an error is raised.
> """

I'm not convinced that this really helps, but I don't have anything
dramatically better yet either.  I'll try to come up with something
though.

Thanks!

Stephen


signature.asc
Description: Digital signature


[HACKERS] [PATCH 4/6] pg_basebackup: don't lose a zero-length file at the end of a tablespace.

2015-09-29 Thread Joshua Elsasser
After a complete tar header was buffered it would only be processed
during the next iteration of the read loop. A zero-length file such as
a directory had no data to read, so the loop would exit without ever
having processed the file.
---
 src/bin/pg_basebackup/pg_basebackup.c | 238 +++---
 1 file changed, 134 insertions(+), 104 deletions(-)

diff --git a/src/bin/pg_basebackup/pg_basebackup.c 
b/src/bin/pg_basebackup/pg_basebackup.c
index f73dd38..ccd0890 100644
--- a/src/bin/pg_basebackup/pg_basebackup.c
+++ b/src/bin/pg_basebackup/pg_basebackup.c
@@ -60,6 +60,15 @@ typedef struct TarStream {
bool keepopen;
 } TarStream;
 
+typedef struct TarParser {
+   chartarhdr[512];
+   TarStream   *stream;
+   boolin_tarhdr;
+   boolskip_file;
+   size_t  tarhdrsz;
+   size_t  filesz;
+} TarParser;
+
 /* Global options */
 static char *basedir = NULL;
 static TablespaceList tablespace_dirs = {NULL, NULL};
@@ -112,6 +121,10 @@ static void progress_report(int tablespacenum, const char 
*filename, bool force)
 static void OpenTarFile(TarStream *tarfile, const char *path);
 static void CloseTarFile(TarStream *tarfile);
 static void TarInsertRecoveryConf(TarStream *stream);
+static void IterateAndWriteTar(TarParser *tp, char *inbuf, int buflen,
+  bool 
(*callback)(char *, void *), void *cbarg);
+static bool TarIterSkipRecoveryConf(char *h, void *arg);
+
 static void ReceiveTarFile(PGconn *conn, PGresult *res, int rownum);
 static void ReceiveAndUnpackTarFile(PGconn *conn, PGresult *res, int rownum);
 static void GenerateRecoveryConf(PGconn *conn);
@@ -899,6 +912,120 @@ TarInsertRecoveryConf(TarStream *stream)
 
 
 /*
+ * Process the individual files inside the TAR stream and pass their headers
+ * to a callback which can modify or choose to skip them. The stream consists
+ * of a header and zero or more chunks, all 512 bytes long. The stream from
+ * the server is broken up into smaller pieces, so we have to track the size
+ * of the files to find the next header structure.
+ */
+static void
+IterateAndWriteTar(TarParser *tp, char *inbuf, int buflen,
+  bool (*callback)(char *, void *), void 
*cbarg)
+{
+   int rr = buflen;
+   int pos = 0;
+
+   while (rr > 0)
+   {
+   if (tp->in_tarhdr)
+   {
+   /*
+* We're currently reading a header structure inside the
+* TAR stream, i.e. the file metadata.
+*/
+   if (tp->tarhdrsz < 512)
+   {
+   /*
+* Copy the header structure into tarhdr in 
case the
+* header is not aligned to 512 bytes or it's 
not
+* returned in whole by the last PQgetCopyData 
call.
+*/
+   int hdrleft;
+   int bytes2copy;
+
+   hdrleft = 512 - tp->tarhdrsz;
+   bytes2copy = (rr > hdrleft ? hdrleft : rr);
+
+   memcpy(&tp->tarhdr[tp->tarhdrsz], inbuf + pos, bytes2copy);
+
+   rr -= bytes2copy;
+   pos += bytes2copy;
+   tp->tarhdrsz += bytes2copy;
+   }
+
+   if (tp->tarhdrsz == 512)
+   {
+   /*
+* We have the complete header structure in 
tarhdr, let the
+                * callback possibly modify it or choose to skip 
the file. Find
+* out the size of the file padded to the next 
multiple of 512
+*/
+   int padding;
+
+   tp->skip_file = callback(tp->tarhdr, cbarg);
+
+   sscanf(&tp->tarhdr[124], "%11o", (unsigned int *) &tp->filesz);
+
+   padding = ((tp->filesz + 511) & ~511) - 
tp->filesz;
+   tp->filesz += padding;
+
+   /* Next part is the file, not the header */
+   tp->in_tarhdr = false;
+
+   /*
+* If we're not skipping the file, write the tar
+* header unmodified.
+*/
+   if (!tp->skip_file)
+   writeTarData(tp->stream, tp->tarhdr, 
512);
+   }
+   }

Re: [HACKERS] PATCH: use foreign keys to improve join estimates v1

2015-09-29 Thread David Rowley
On 29 September 2015 at 01:59, Tomas Vondra 
wrote:

>
> CREATE TABLE f (id1 INT, id2 INT, PRIMARY KEY (id1, id2));
>
> CREATE TABLE d1 (id1 INT, id2 INT, FOREIGN KEY (id1, id2) REFERENCES
> f(id1, id2));
> CREATE TABLE d2 (id1 INT, id2 INT, FOREIGN KEY (id1, id2) REFERENCES
> f(id1, id2));
>
> INSERT INTO f SELECT i, i FROM generate_series(1,100) s(i);
>
> INSERT INTO d1 SELECT i, i FROM generate_series(1,10) s(i);
> INSERT INTO d2 SELECT i, i FROM generate_series(1,30) s(i);
>
> now, both pair-wise joins (f JOIN d1) and (f JOIN d2) are estimated
> perfectly accurately, but as soon as the query involves both of them, this
> happens:
>
> SELECT * FROM f JOIN d1 ON (f.id1 = d1.id1 AND f.id2 = d1.id2)
> JOIN d2 ON (f.id1 = d2.id1 AND f.id2 = d2.id2);
>
>
>
This is a near-perfect example of what I was trying to explain about being
unable to have guarantees that there are exactly 1.0 matching tuples in the
referenced relation.

If we run the above with join_collapse_limit = 1, then we'll first join f
to d1, which will give us 10 tuples (with IDs ranging from 1 to 10).

If we now perform estimates to join d2 to (f, d1), we don't have all of the
referenced tuples in (f, d1), so how do we estimate that? Remember that d2
has IDs 11 to 30 that won't be found in the "referenced" relation.
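
(To reproduce that forced join order with Tomas's tables from above, something
along these lines should do; plan output omitted:)

SET join_collapse_limit = 1;

EXPLAIN SELECT *
FROM f
JOIN d1 ON (f.id1 = d1.id1 AND f.id2 = d1.id2)
JOIN d2 ON (f.id1 = d2.id1 AND f.id2 = d2.id2);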

What if we had populated d1 with:

INSERT INTO d1 SELECT i, i FROM generate_series(91,100) s(i);

The join of (f, d1) -> d2 will then find exactly 0 tuples.

Is my logic here wrong somehow?

-- 
 David Rowley   http://www.2ndQuadrant.com/

 PostgreSQL Development, 24x7 Support, Training & Services


[HACKERS] Re: BRIN Scan: Optimize memory allocation in function 'bringetbitmap'

2015-09-29 Thread zhangjinyu
I forgot to disable C asserts in the last test.
The following shows the test result with C asserts disabled.
After optimizing code (warm run)
postgres=# select count(*) from lineitem where l_orderkey=1;
 count
---
 6
(1 row)

Time: 327.143 ms
Before optimizing code (warm run)
postgres=# select count(*) from lineitem where l_orderkey=1;
 count
---
 6
(1 row)

Time: 404.323 ms

Here is the patch.
  patch_optimize_mem.patch

Jinyu Zhang



--
View this message in context: 
http://postgresql.nabble.com/BRIN-Scan-Optimize-memory-allocation-in-function-bringetbitmap-tp5867536p5867980.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Add pg_basebackup single tar output format

2015-09-29 Thread Michael Paquier
On Wed, Sep 30, 2015 at 7:16 AM, Joshua Elsasser  wrote:
> On Tue, Sep 29, 2015 at 11:50:37PM +0200, Andres Freund wrote:
>> On 2015-09-29 14:38:11 -0700, Josh Elsasser wrote:
>> > I've put my changes up as a series of relatively small commits on this
>> > branch of a github fork:
>> >
>> > https://github.com/jre/postgres/commits/single-tar
>> >
>> > Comments and suggestions are welcome.

Great!

>>
>> Please post them to the list, we want patches to be in our archives.
>
> Sure, let's see if I can figure out git send-email

Usually it is recommended to post them directly to their dedicated
thread so that it is easier to follow the review flow, and you have sent
six patches, which is that many threads to follow... Here are some general
guidelines:
https://wiki.postgresql.org/wiki/Submitting_a_Patch

You should also register your patch here so that we do not lose track of it:
https://commitfest.postgresql.org/7/
Regards,
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] No Issue Tracker - Say it Ain't So!

2015-09-29 Thread Oleg Bartunov
On Tue, Sep 29, 2015 at 6:36 PM, Andrew Dunstan  wrote:

>
>
> On 09/29/2015 10:55 AM, Steve Crawford wrote:
>
>> On Tue, Sep 29, 2015 at 7:16 AM, David Fetter <da...@fetter.org> wrote:
>>
>> ...What we're not fine with is depending on a proprietary system, no
>> matter what type of license, as infrastructure...
>>
>>
>> Exactly. Which is why I was warning about latching onto features only
>> available in the closed enterprise version.
>>
>>
>>
>>
> Like Tom, I more or less promised myself not to get terribly involved in
> this discussion. Oh, well.
>

me too, but I have to mention one problem we should have in mind - its
independence from political games (sanctions).  Think about developers from
Russia, who one day may be blocked by Github, for example.


Re: [HACKERS] Less than ideal error reporting in pg_stat_statements

2015-09-29 Thread Peter Geoghegan
On Fri, Sep 25, 2015 at 8:51 AM, Marti Raudsepp  wrote:
> I've also been seeing lots of log messages saying "LOG:  out of
> memory" on a server that's hosting development databases. I put off
> debugging this until now because it didn't seem to have any adverse
> effects on the system.

That's unfortunate.

> It's very frustrating when debugging aids cause further problems on a
> system. If the in-line compaction doesn't materialize (or it's decided
> not to backport it), I would propose instead to add a test to
> pgss_store() to avoid growing the file beyond MaxAlloc (or perhaps
> even a lower limit). Surely dropping some statistics is better than
> this pathology.

I heard a customer complaint today that seems similar. A Heroku
customer attempted a migration from MySQL to PostgreSQL in this
manner:

mysqldump | psql

This at least worked well enough to cause problems for
pg_stat_statements (some queries were not rejected by the parser, I
suppose).

While I'm opposed to arbitrary limits for tools like
pg_stat_statements, I think the following defensive measure might be
useful on top of better OOM defenses:

Test for query text length within pgss_store() where a pgssJumbleState
is passed by caller (the post-parse-analysis hook caller -- not
executor hook caller that has query costs to store). If it is greater
than, say, 10 * Max(ASSUMED_MEDIAN_INIT, pgss->cur_median_usage), do
not bother to normalize the query text, or store anything at all.
Simply return.
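
In code form, the guard would look roughly like this near the top of
pgss_store() (a sketch only; the cutoff simply follows the expression above,
and the factor of ten and exact placement are illustrative):

    /*
     * Sketch of the proposed guard: skip normalization/storage of very long
     * texts in the post-parse-analysis call (jstate != NULL means there are
     * no execution costs to store yet).
     */
    if (jstate != NULL &&
        strlen(query) > 10 * Max(ASSUMED_MEDIAN_INIT, pgss->cur_median_usage))
        return;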

Any entry we create at that point will be a sticky entry (pending
actually completing execution without the entry being evicted), and it
doesn't seem worth worrying about normalizing very large query texts,
which tend to be qualitatively similar to utility statements from the
user's perspective anyway. Besides, query text normalization always
occurred on a best-effort basis. It's not very uncommon for
pg_stat_statements to fail to normalize query texts today for obscure
reasons.

This would avoid storing very large query texts again and again when a
very large DML statement repeatedly fails (e.g. due to a data
integration task that can run into duplicate violations) and is
repeatedly rerun with tweaks. Maybe there is a substantively distinct
table in each case, because a CREATE TABLE is performed as part of the
same transaction, so each failed query gets a new pg_stat_statements
entry, and a new, large query text.

I should probably also assume that sticky entries have a generic
length (existing pgss->mean_query_len) for the purposes of
accumulating query text length within entry_dealloc(). They should not
get to contribute to median query length (which throttles garbage
collection to prevent thrashing).

Anyone have an opinion on that?

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] No Issue Tracker - Say it Ain't So!

2015-09-29 Thread Merlin Moncure
On Tue, Sep 29, 2015 at 12:42 PM, Joshua D. Drake  
wrote:
> On 09/29/2015 07:25 AM, Merlin Moncure wrote:
>>
>> On Wed, Sep 23, 2015 at 1:18 PM, Kam Lasater  wrote:
>>>
>>> Hello,
>>>
>>> Last night I heard that Postgres had no issue/bug tracker. At first I
>>> thought the guy was trolling me. Seriously, how could this be. Certainly
>>> a
>>> mature open source project that is the database for a measurable
>>> percentage
>>> of the internet would have an issue tracker.
>>>
>>> Sadly, I was not being trolled. I'm new around here so I will keep the
>>> preaching to a minimum and cut right to the chase...
>>>
>>> ___It is time for an issue tracker___
>>
>>
>> This thread is depressing me.  We use all these fancy tools at $work
>> and I'm not sure they are much of an improvement over informal
>> processes run by capable people.   I regularly advocate, pretty much
>> to the wind, to use JIRA less and email *more*.   The main benefit of
>> the system, various reports to corporate taskmasters, seems pretty
>> much irrelevant here.  If you're advocating introduction of new
>> tooling with all the associated processes and complexities, can you
>> point to specific breakdowns in the project and exactly how said
>> tooling would have helped the situation?
>
>
> From my perspective this is about more than bugs; it is about development as
> a whole. Here is a very simple problem we have:
>
> I come to hackers to discuss an idea
> idea gets sign off
> I submit a patch
> *I am told to go over to this commitfest app thing and upload it there*
>
> Why? That's stupid. I should have submitted the patch to hackers and it
> should have been automatically part of my existing thread.
>
> Someone reviews the patch, decides it needs to be pushed to the next
> commitfest
>
> In theory, at some point I will be informed of that or I will see that it
> was pushed to the next commitfest.
>
> If we were running a properly integrated tracker, I would know that
> instantly because the issue would have been updated to that effect and marked
> for the next commitfest.
>
> The next commitfest comes around, and I can't get to the patch changes in
> time so it gets pushed to the next release. With a properly integrated issue
> tracker, I would immediately see that, be able to comment on it and be able
> to see the entire history of the patch.
>
> Oh... and this can be done all from email as long as the tracker is properly
> integrated.
>
> Bugs can certainly be handled the same way and in some ways better.

I've read this email about three times now and it's not clear at all
to me what an issue/bug tracker brings to the table.  First, I find it
pretty hard to believe that a hypothetical patch author would not know
that a patch was or was not submitted in the current framework.
Putting that aside, if you insisted on automatic status change
notifications, those could be worked into the current commit fest
application, right? Note, once done, you'd get that notification with
zero extra process effort beyond what we currently do.

I also openly wonder how big this problem really is.  I haven't
submitted all that many patches, but for those I have it's never
really been in doubt what status they were in at the time.
So, I'll repeat the question: if you want to accomplish something with this
thread, let's give *specific examples* of project breakdowns we're
trying to solve with this tooling.  Did a user lose track of the status of a
patch?  If so, which user?...what patch?

merlin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Idea for improving buildfarm robustness

2015-09-29 Thread Alvaro Herrera
Joe Conway wrote:
> On 09/29/2015 01:48 PM, Alvaro Herrera wrote:

> > I remember it, but I'm not sure it would have helped you.  As I recall,
> > your trouble was that after a reboot the init script decided to initdb
> > the mount point -- postmaster wouldn't have been running at all ...
> 
> Right, which the init script non longer does as far as I'm aware, so
> hopefully will never happen again to anyone.

Yeah.

> But it was still a case where the postmaster started on one copy of
> PGDATA (the newly init'd copy), and then the contents of the real PGDATA
> was swapped in (when the filesystem was finally mounted), causing
> corruption to the production data.

Ah, I didn't remember that part of it, but it makes sense.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Add pg_basebackup single tar output format

2015-09-29 Thread Andres Freund
Hi Josh,

On 2015-09-29 14:38:11 -0700, Josh Elsasser wrote:
> As a side effect I've refactored the some of the pg_basebackup code
> for readability and reusability

Cool, that's desperately needed. I've been trying to bug Magnus into
doing that for a bunch of conferences now ;)

> I've put my changes up as a series of relatively small commits on this
> branch of a github fork:
> 
> https://github.com/jre/postgres/commits/single-tar
> 
> Comments and suggestions are welcome.

Please post them to the list, we want patches to be in our archives.

Regards

Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] No Issue Tracker - Say it Ain't So!

2015-09-29 Thread Alvaro Herrera
Merlin Moncure wrote:

> I've read this email about three times now and it's not clear at all
> to me what an issue/bug tracker brings to the table.

In my opinion, this thread is about a bug tracker, not a patch tracker.
We already have a patch tracking system which works very well.  (We are
not able to process all the things we get, but that's a problem of
workload, not tooling).

Let's not mix things, as that only adds to the confusion.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Idea for improving buildfarm robustness

2015-09-29 Thread Josh Berkus
On 09/29/2015 12:47 PM, Tom Lane wrote:
> Josh Berkus  writes:
>> In general, having the postmaster survive deletion of PGDATA is
>> suboptimal.  In rare cases of having it survive installation of a new
>> PGDATA (via PITR restore, for example), I've even seen the zombie
>> postmaster corrupt the data files.
> 
> However ... if you'd simply deleted everything *under* $PGDATA but not
> that directory itself, then this type of failure mode is 100% plausible.
> And that's not an unreasonable thing to do, especially if you've set
> things up so that $PGDATA's parent is not a writable directory.

I don't remember the exact setup, but this is likely the case.  Probably
1/3 of the systems I monitor have a root-owned mount point for PGDATA's
parent directory.

> Testing accessibility of "global/pg_control" would be enough to catch this
> case, but only if we do it before you create a new one.  So that seems
> like an argument for making the test relatively often.  The once-a-minute
> option is sounding better and better.
> 
> We could possibly add additional checks, like trying to verify that
> pg_control has the same inode number it used to.  But I'm afraid that
> would add portability issues and false-positive hazards that would
> outweigh the value.

It's not worth doing extra stuff for this.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] No Issue Tracker - Say it Ain't So!

2015-09-29 Thread Josh Berkus
On 09/29/2015 03:08 PM, Merlin Moncure wrote:
> I've read this email about three times now and it's not clear at all
> to me what an issue/bug tracker brings to the table.

Here are the problems I'd like to solve:

1. "Was this issue fixed in a Postgres update?  Which one?"

2. Not losing track of minor bugs.

3. Having a better way to track bugs which require multi-part solutions
(e.g. multixact).

4. Having a place for downstream projects/packagers to report bugs.

5. Not answering this question ever again: "Why doesn't your project
have a bug tracker?"

Note that all of the above requires a bug *tracker*, that is, a tool
which tracks the bug activity which was happening anyway, just makes it
more visible.  Rather than an Issue Resolution System, which would be
intended to remake our workflow.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Idea for improving buildfarm robustness

2015-09-29 Thread Tom Lane
Josh Berkus  writes:
> Give me source with the change, and I'll put it on a cheap, low-bandwith
> AWS instance and hammer the heck out of it.  That should raise any false
> positives we can expect.

Here's a draft patch against HEAD (looks like it will work on 9.5 or
9.4 without modifications, too).

regards, tom lane

diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index baa43b2..52c9acd 100644
*** a/src/backend/postmaster/postmaster.c
--- b/src/backend/postmaster/postmaster.c
*** static void CloseServerPorts(int status,
*** 373,378 
--- 373,379 
  static void unlink_external_pid_file(int status, Datum arg);
  static void getInstallationPaths(const char *argv0);
  static void checkDataDir(void);
+ static bool recheckDataDir(void);
  static Port *ConnCreate(int serverFd);
  static void ConnFree(Port *port);
  static void reset_shared(int port);
*** checkDataDir(void)
*** 1490,1495 
--- 1491,1539 
  }
  
  /*
+  * Revalidate the data directory; return TRUE if OK, FALSE if not
+  *
+  * We don't try to check everything that checkDataDir() does.  Ideally, we'd
+  * return FALSE *only* if the data directory has been deleted.  As a proxy
+  * for that that matches a condition that checkDataDir() checked, verify that
+  * pg_control still exists.  Because the postmaster will quit if we return
+  * FALSE, do not do so if there is any doubt or possibly-transient failure.
+  * Better to wait till we're sure.
+  *
+  * Unlike checkDataDir(), we assume we've chdir'd into the data directory.
+  */
+ static bool
+ recheckDataDir(void)
+ {
+ 	const char *path = "global/pg_control";
+ 	FILE	   *fp;
+ 
+ 	fp = AllocateFile(path, PG_BINARY_R);
+ 	if (fp != NULL)
+ 	{
+ 		FreeFile(fp);
+ 		return true;
+ 	}
+ 
+ 	/*
+ 	 * There are many foreseeable false-positive error conditions, for example
+ 	 * EINTR or ENFILE should not cause us to fail.  For safety, fail only on
+ 	 * enumerated clearly-something-is-wrong conditions.
+ 	 */
+ 	switch (errno)
+ 	{
+ 		case ENOENT:
+ 		case ENOTDIR:
+ 			elog(LOG, "could not open file \"%s\": %m", path);
+ 			return false;
+ 		default:
+ 			elog(LOG, "could not open file \"%s\": %m; continuing anyway",
+  path);
+ 			return true;
+ 	}
+ }
+ 
+ /*
   * Determine how long should we let ServerLoop sleep.
   *
   * In normal conditions we wait at most one minute, to ensure that the other
*** ServerLoop(void)
*** 1602,1610 
  	fd_set		readmask;
  	int			nSockets;
  	time_t		now,
  last_touch_time;
  
! 	last_touch_time = time(NULL);
  
  	nSockets = initMasks();
  
--- 1646,1655 
  	fd_set		readmask;
  	int			nSockets;
  	time_t		now,
+ last_dir_recheck_time,
  last_touch_time;
  
! 	last_dir_recheck_time = last_touch_time = time(NULL);
  
  	nSockets = initMasks();
  
*** ServerLoop(void)
*** 1754,1772 
  		if (StartWorkerNeeded || HaveCrashedWorker)
  			maybe_start_bgworker();
  
- 		/*
- 		 * Touch Unix socket and lock files every 58 minutes, to ensure that
- 		 * they are not removed by overzealous /tmp-cleaning tasks.  We assume
- 		 * no one runs cleaners with cutoff times of less than an hour ...
- 		 */
- 		now = time(NULL);
- 		if (now - last_touch_time >= 58 * SECS_PER_MINUTE)
- 		{
- 			TouchSocketFiles();
- 			TouchSocketLockFiles();
- 			last_touch_time = now;
- 		}
- 
  #ifdef HAVE_PTHREAD_IS_THREADED_NP
  
  		/*
--- 1799,1804 
*** ServerLoop(void)
*** 1793,1798 
--- 1825,1868 
  			/* reset flag so we don't SIGKILL again */
  			AbortStartTime = 0;
  		}
+ 
+ 		/*
+ 		 * Lastly, check to see if it's time to do some things that we don't
+ 		 * want to do every single time through the loop, because they're a
+ 		 * bit expensive.  Note that there's up to a minute of slop in when
+ 		 * these tasks will be performed, since DetermineSleepTime() will let
+ 		 * us sleep at most that long.
+ 		 */
+ 		now = time(NULL);
+ 
+ 		/*
+ 		 * Once a minute, verify that $PGDATA hasn't been removed.  If it has,
+ 		 * we want to commit hara-kiri.  This avoids having postmasters and
+ 		 * child processes hanging around after their database is gone, and
+ 		 * maybe causing problems if a new database cluster is created in the
+ 		 * same place.
+ 		 */
+ 		if (now - last_dir_recheck_time >= 1 * SECS_PER_MINUTE)
+ 		{
+ 			if (!recheckDataDir())
+ 			{
+ elog(LOG, "shutting down because data directory is gone");
+ kill(MyProcPid, SIGQUIT);
+ 			}
+ 			last_dir_recheck_time = now;
+ 		}
+ 
+ 		/*
+ 		 * Touch Unix socket and lock files every 58 minutes, to ensure that
+ 		 * they are not removed by overzealous /tmp-cleaning tasks.  We assume
+ 		 * no one runs cleaners with cutoff times of less than an hour ...
+ 		 */
+ 		if (now - last_touch_time >= 58 * SECS_PER_MINUTE)
+ 		{
+ 			TouchSocketFiles();
+ 			TouchSocketLockFiles();
+ 			last_touch_time = now;
+ 		}
  	}
  }
  

-- 
Sent 

[HACKERS] [PATCH 2/6] pg_basebackup: factor out tar open/close code for reusability.

2015-09-29 Thread Joshua Elsasser
This adds a simple struct and open and close functions to abstract and
isolate the zlib vs. stdio output logic and allow it to be reused.
---
 src/bin/pg_basebackup/pg_basebackup.c | 300 +-
 1 file changed, 154 insertions(+), 146 deletions(-)

diff --git a/src/bin/pg_basebackup/pg_basebackup.c 
b/src/bin/pg_basebackup/pg_basebackup.c
index 80de882..fa942ab 100644
--- a/src/bin/pg_basebackup/pg_basebackup.c
+++ b/src/bin/pg_basebackup/pg_basebackup.c
@@ -51,6 +51,15 @@ typedef struct TablespaceList
TablespaceListCell *tail;
 } TablespaceList;
 
+typedef struct TarStream {
+   char path[MAXPGPATH];
+   FILE*file;
+#ifdef HAVE_LIBZ
+   gzFile   zfile;
+#endif
+   bool keepopen;
+} TarStream;
+
 /* Global options */
 static char *basedir = NULL;
 static TablespaceList tablespace_dirs = {NULL, NULL};
@@ -100,6 +109,8 @@ static void disconnect_and_exit(int code);
 static void verify_dir_is_empty_or_create(char *dirname);
 static void progress_report(int tablespacenum, const char *filename, bool 
force);
 
+static void OpenTarFile(TarStream *tarfile, const char *path);
+static void CloseTarFile(TarStream *tarfile);
 static void ReceiveTarFile(PGconn *conn, PGresult *res, int rownum);
 static void ReceiveAndUnpackTarFile(PGconn *conn, PGresult *res, int rownum);
 static void GenerateRecoveryConf(PGconn *conn);
@@ -725,169 +736,194 @@ parse_max_rate(char *src)
  * Write a piece of tar data
  */
 static void
-writeTarData(
-#ifdef HAVE_LIBZ
-gzFile ztarfile,
-#endif
-FILE *tarfile, char *buf, int r, char *current_file)
+writeTarData(TarStream *stream, char *buf, int r)
 {
 #ifdef HAVE_LIBZ
-   if (ztarfile != NULL)
+   if (stream->zfile != NULL)
{
-   if (gzwrite(ztarfile, buf, r) != r)
+   if (gzwrite(stream->zfile, buf, r) != r)
{
fprintf(stderr,
_("%s: could not write to compressed 
file \"%s\": %s\n"),
-   progname, current_file, 
get_gz_error(ztarfile));
+   progname, stream->path, 
get_gz_error(stream->zfile));
disconnect_and_exit(1);
}
}
else
 #endif
{
-   if (fwrite(buf, r, 1, tarfile) != 1)
+   if (fwrite(buf, r, 1, stream->file) != 1)
{
fprintf(stderr, _("%s: could not write to file \"%s\": 
%s\n"),
-   progname, current_file, 
strerror(errno));
+   progname, stream->path, 
strerror(errno));
disconnect_and_exit(1);
}
}
 }
 
-#ifdef HAVE_LIBZ
-#define WRITE_TAR_DATA(buf, sz) writeTarData(ztarfile, tarfile, buf, sz, 
filename)
-#else
-#define WRITE_TAR_DATA(buf, sz) writeTarData(tarfile, buf, sz, filename)
-#endif
 
 /*
- * Receive a tar format file from the connection to the server, and write
- * the data from this file directly into a tar file. If compression is
- * enabled, the data will be compressed while written to the file.
- *
- * The file will be named base.tar[.gz] if it's for the main data directory
- * or .tar[.gz] if it's for another tablespace.
- *
- * No attempt to inspect or validate the contents of the file is done.
+ * Open a (possibly zlib-compressed) tar file for writing.  A filename of -
+ * will cause stdout to be used.
  */
 static void
-ReceiveTarFile(PGconn *conn, PGresult *res, int rownum)
+OpenTarFile(TarStream *tarfile, const char *path)
 {
-   charfilename[MAXPGPATH];
-   char   *copybuf = NULL;
-   FILE   *tarfile = NULL;
-   chartarhdr[512];
-   boolbasetablespace = PQgetisnull(res, rownum, 0);
-   boolin_tarhdr = true;
-   boolskip_file = false;
-   size_t  tarhdrsz = 0;
-   size_t  filesz = 0;
+   booluse_stdout;
 
-#ifdef HAVE_LIBZ
-   gzFile  ztarfile = NULL;
-#endif
+   MemSet(tarfile, 0, sizeof(*tarfile));
+   use_stdout = (strcmp(path, "-") == 0);
+   strlcpy(tarfile->path, path, sizeof(tarfile->path));
 
-   if (basetablespace)
+#ifdef HAVE_LIBZ
+   if (compresslevel != 0)
{
-   /*
-* Base tablespaces
-*/
-   if (strcmp(basedir, "-") == 0)
+   /* Compression is in use */
+   if (use_stdout)
+   tarfile->zfile = gzdopen(dup(fileno(stdout)), "wb");
+   else
+   tarfile->zfile = gzopen(tarfile->path, "wb");
+   if (!tarfile->zfile)
{
-#ifdef HAVE_LIBZ
-   if (compresslevel != 0)
-   {
-   ztarfile = 

Re: [HACKERS] Add pg_basebackup single tar output format

2015-09-29 Thread Joshua Elsasser
On Tue, Sep 29, 2015 at 11:50:37PM +0200, Andres Freund wrote:
> On 2015-09-29 14:38:11 -0700, Josh Elsasser wrote:
> > I've put my changes up as a series of relatively small commits on this
> > branch of a github fork:
> > 
> > https://github.com/jre/postgres/commits/single-tar
> > 
> > Comments and suggestions are welcome.
> 
> Please post them to the list, we want patches to be in our archives.

Sure, let's see if I can figure out git send-email



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] [PATCH 1/6] Add support for longer filenames in tar headers (up to 254 bytes).

2015-09-29 Thread Joshua Elsasser
New functions tarHeaderRename() and tarHeaderGetName() are exposed to
store and retrieve the longer filenames.

tarCreateHeader() continues to limit filenames to 99 bytes to preserve
compatibility with existing consumers.
---
 src/include/pgtar.h |   2 +
 src/port/tar.c  | 134 ++--
 2 files changed, 121 insertions(+), 15 deletions(-)

diff --git a/src/include/pgtar.h b/src/include/pgtar.h
index 906db7c..b1c68fc 100644
--- a/src/include/pgtar.h
+++ b/src/include/pgtar.h
@@ -20,4 +20,6 @@ enum tarError
 };
 
 extern enum tarError tarCreateHeader(char *h, const char *filename, const char 
*linktarget, size_t size, mode_t mode, uid_t uid, gid_t gid, time_t mtime);
+extern enum tarError tarHeaderRename(char *h, const char *filename);
 extern int tarChecksum(char *header);
+extern size_t tarHeaderGetName(const char *h, char *buf, size_t buflen);
diff --git a/src/port/tar.c b/src/port/tar.c
index 72fd4e1..23f0201 100644
--- a/src/port/tar.c
+++ b/src/port/tar.c
@@ -45,6 +45,122 @@ tarChecksum(char *header)
 
 
 /*
+ * Split a file path for use in a tar header. The return value is the second
+ * half of the path if a split is required and possible, NULL otherwise.
+ */
+static const char *
+tarSplitName(const char *filename, size_t len)
+{
+   const char *sep;
+
+   if (len <= 99)
+   sep = NULL;
+   else if ((sep = strchr(&filename[len-99], '/')) != NULL)
+       sep++;
+
+   return sep;
+}
+
+
+/*
+ * Fill in the name and prefix fields of the tar format header pointed to by
+ * h. This should only be used when initially filling out the header.
+ */
+static enum tarError
+tarHeaderSetName(char *h, const char *filename, bool isdir, bool longname)
+{
+   const char *prefix, *base;
+   size_t len, baselen, prelen;
+
+   len = strlen(filename);
+   if (longname)
+   base = tarSplitName(filename, len);
+   else
+   base = NULL;
+   if (base == NULL)
+   {
+   prefix = "";
+   prelen = 0;
+   base = filename;
+   baselen = len;
+   }
+   else
+   {
+   prefix = filename;
+   prelen = (base - filename) - 1;
+   baselen = len - (base - filename);
+   }
+
+   /*
+* Save room for a trailing slash if this is a directory or symlink to a
+* directory
+*/
+   if (isdir && base[baselen-1] != '/')
+   baselen++;
+
+   if (baselen > 99 || prelen > 154)
+   return TAR_NAME_TOO_LONG;
+
+   memcpy(&h[345], prefix, prelen);
+   memset(&h[345+prelen], 0, 155 - prelen);
+   memcpy(&h[0], base, baselen);
+   memset(&h[0+baselen], 0, 100 - baselen);
+
+   /*
+* We only support symbolic links to directories, and this is
+* indicated in the tar format by adding a slash at the end of the
+* name, the same as for regular directories.
+*/
+   if (isdir && base[baselen-1] != '/')
+   h[0+baselen-1] = '/';
+
+   return TAR_OK;
+}
+
+
+/*
+ * Wrapper around tarHeaderSetName() to be used when changing the name in an
+ * existing header.
+ */
+enum tarError
+tarHeaderRename(char *h, const char *filename)
+{
+   size_t  len;
+   boolisdir;
+   enum tarError   err;
+
+   /* If the existing name ends with a slash then this must be preserved */
+   len = strnlen(h, 100);
+   isdir = (len > 0 && h[len-1] == '/');
+
+   err = tarHeaderSetName(h, filename, isdir, true);
+
+   if (err == TAR_OK)
+   /* Recalculate checksum as per tarCreateHeader() */
+   sprintf(&h[148], "%06o ", tarChecksum(h));
+
+   return err;
+}
+
+
+/*
+ * Copy the full pathname from the tar header pointed to by h into
+ * buf. Returns the total length of the path, if this is >= len then the path
+ * has been truncated.
+ */
+size_t
+tarHeaderGetName(const char *h, char *buf, size_t buflen)
+{
+   strlcpy(buf, &h[345], buflen);
+   if (buflen && buf[0] != '\0')
+       strlcat(buf, "/", buflen);
+   strlcat(buf, &h[0], buflen);
+
+   return (strlen(&h[345]) + 1 + strlen(&h[0]));
+}
+
+
+/*
  * Fill in the buffer pointed to by h with a tar format header. This buffer
  * must always have space for 512 characters, which is a requirement by
  * the tar format.
@@ -68,20 +184,8 @@ tarCreateHeader(char *h, const char *filename, const char 
*linktarget,
memset(h, 0, 512);  /* assume tar header size */
 
/* Name 100 */
-   strlcpy(&h[0], filename, 100);
-   if (linktarget != NULL || S_ISDIR(mode))
-   {
-   /*
-* We only support symbolic links to directories, and this is
-* indicated in the tar format by adding a slash at the end of 
the
-* name, the same as for regular directories.
-*/
-   int

[HACKERS] [PATCH 3/6] pg_basebackup: factor out code to add a recovery.conf file to the tar file.

2015-09-29 Thread Joshua Elsasser
This is just a simple refactor for readability and reusability.
---
 src/bin/pg_basebackup/pg_basebackup.c | 46 ---
 1 file changed, 26 insertions(+), 20 deletions(-)

diff --git a/src/bin/pg_basebackup/pg_basebackup.c 
b/src/bin/pg_basebackup/pg_basebackup.c
index fa942ab..f73dd38 100644
--- a/src/bin/pg_basebackup/pg_basebackup.c
+++ b/src/bin/pg_basebackup/pg_basebackup.c
@@ -111,6 +111,7 @@ static void progress_report(int tablespacenum, const char 
*filename, bool force)
 
 static void OpenTarFile(TarStream *tarfile, const char *path);
 static void CloseTarFile(TarStream *tarfile);
+static void TarInsertRecoveryConf(TarStream *stream);
 static void ReceiveTarFile(PGconn *conn, PGresult *res, int rownum);
 static void ReceiveAndUnpackTarFile(PGconn *conn, PGresult *res, int rownum);
 static void GenerateRecoveryConf(PGconn *conn);
@@ -874,6 +875,30 @@ CloseTarFile(TarStream *tarfile)
 
 
 /*
+ * Write a recovery.conf file into the tar stream.
+ */
+static void
+TarInsertRecoveryConf(TarStream *stream)
+{
+   static char zerobuf[512];
+   charheader[512];
+   int padding;
+
+   tarCreateHeader(header, "recovery.conf", NULL,
+   recoveryconfcontents->len,
+   0600, 04000, 02000,
+   time(NULL));
+
+   padding = ((recoveryconfcontents->len + 511) & ~511) - 
recoveryconfcontents->len;
+
+   writeTarData(stream, header, sizeof(header));
+   writeTarData(stream, recoveryconfcontents->data, 
recoveryconfcontents->len);
+   if (padding)
+   writeTarData(stream, zerobuf, padding);
+}
+
+
+/*
  * Open a (possibly zlib-compressed) tar file for writing. The filebase
  * argument should be the desired filename relative to basedir, without a .tar
  * or .tar.gz file extension. If the user specified a basedir of - then stdout
@@ -957,27 +982,8 @@ ReceiveTarFile(PGconn *conn, PGresult *res, int rownum)
 * Also, write two completely empty blocks at the end 
of the tar
 * file, as required by some tar programs.
 */
-   charzerobuf[1024];
-
-   MemSet(zerobuf, 0, sizeof(zerobuf));
-
if (basetablespace && writerecoveryconf)
-   {
-   charheader[512];
-   int padding;
-
-   tarCreateHeader(header, "recovery.conf", NULL,
-   
recoveryconfcontents->len,
-   0600, 04000, 
02000,
-   time(NULL));
-
-   padding = ((recoveryconfcontents->len + 511) & 
~511) - recoveryconfcontents->len;
-
-   writeTarData(, header, sizeof(header));
-   writeTarData(, 
recoveryconfcontents->data, recoveryconfcontents->len);
-   if (padding)
-   writeTarData(, zerobuf, padding);
-   }
+   TarInsertRecoveryConf();
 
CloseTarFile();
break;
-- 
2.3.0



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Idea for improving buildfarm robustness

2015-09-29 Thread Alvaro Herrera
Joe Conway wrote:
> On 09/29/2015 12:47 PM, Tom Lane wrote:
> > We could possibly add additional checks, like trying to verify that
> > pg_control has the same inode number it used to.  But I'm afraid that
> > would add portability issues and false-positive hazards that would
> > outweigh the value.
> 
> Not sure you remember the incident, but I think years ago that would
> have saved me some heartache ;-)

I remember it, but I'm not sure it would have helped you.  As I recall,
your trouble was that after a reboot the init script decided to initdb
the mount point -- postmaster wouldn't have been running at all ...

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Foreign join pushdown vs EvalPlanQual

2015-09-29 Thread Robert Haas
On Mon, Sep 28, 2015 at 11:15 PM, Etsuro Fujita
 wrote:
> I thought the same thing [1].  While I thought it was relatively easy to
> make changes to RefetchForeignRow that way for the foreign table case
> (scanrelid>0), I was not sure how hard it would be to do so for the foreign
> join case (scanrelid==0).  So, I proposed to leave that changes for 9.6.
> I'll have a rethink on this issue along the lines of that approach.

Well, I spent some more time looking at this today, and testing it out
using a fixed-up version of your foreign_join_v16 patch, and I decided
that RefetchForeignRow is basically a red herring.  That's only used
for FDWs that do late row locking, but postgres_fdw (and probably many
others) do early row locking, in which case RefetchForeignRow never
gets called. Instead, the row is treated as a "non-locked source row"
by ExecLockRows (even though it is in fact locked) and is re-fetched
by EvalPlanQualFetchRowMarks.  We should probably update the comment
about non-locked source rows to mention the case of FDWs that do early
row locking.

Anyway, everything appears to work OK up to this point: we correctly
retrieve the saved whole-rows from the foreign side and call
EvalPlanQualSetTuple on each one, setting es_epqTuple[rti - 1] and
es_epqTupleSet[rti - 1].  So far, so good.  Now we call
EvalPlanQualNext, and that's where we get into trouble.  We've got the
already-locked tuples from the foreign side and those tuples CANNOT
have gone away or been modified because we have already locked them.
So, all the foreign join needs to do is return the same tuple that it
returned before: the EPQ recheck was triggered by some *other* table
involved in the plan, not our table.  A local table also involved in
the query, or conceivably a foreign table that does late row locking,
could have had something change under it after the row was fetched,
but in postgres_fdw that can't happen because we locked the row up
front.  And thus, again, all we need to do is re-return the same
tuple.  But we don't have that.  Instead, the ROW_MARK_COPY logic has
caused us to preserve a copy of each *baserel* tuple.

Now, this is as sad as can be.  Early row locking has huge advantages
for FDWs, both in terms of minimizing server round trips and also
because the FDW doesn't really need to do anything about EPQ.  Sure,
it's inefficient to carry around whole-row references, but it makes
life easy for the FDW author.

So, if we wanted to fix this in a way that preserves the spirit of
what's there now, it seems to me that we'd want the FDW to return
something that's like a whole row reference, but represents the output
of the foreign join rather than some underlying base table.  And then
get the EPQ machinery to have the evaluation of the ForeignScan for
the join, when it happens in an EPQ context, to return that tuple.
But I don't really have a good idea how to do that.

More thought seems needed here...

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] CustomScan support on readfuncs.c

2015-09-29 Thread Kouhei Kaigai
> On Mon, Sep 28, 2015 at 8:31 PM, Kouhei Kaigai  wrote:
> >> Instead of doing this:
> >>
> >> +/* Dump library and symbol name instead of raw pointer */
> >>  appendStringInfoString(str, " :methods ");
> >> -_outToken(str, node->methods->CustomName);
> >> +_outToken(str, node->methods->methods_library_name);
> >> +appendStringInfoChar(str, ' ');
> >> +_outToken(str, node->methods->methods_symbol_name);
> >>
> >> Suppose we just make library_name and symbol_name fields in the node
> >> itself, that are dumped and loaded like any others.
> >>
> >> Would that be better?
> >>
> > I have no preference here.
> >
> > Even if we dump library_name/symbol_name as if they were fields of CustomScan,
> > not CustomScanMethods, in a similar way, we cannot use WRITE_STRING_FIELD
> > here, because its 'fldname' assumes these members are direct fields of
> > CustomScan.
> >
> >   /* Write a character-string (possibly NULL) field */
> >   #define WRITE_STRING_FIELD(fldname) \
> >   (appendStringInfo(str, " :" CppAsString(fldname) " "), \
> >_outToken(str, node->fldname))
> 
> Well that's exactly what I was suggesting: making them a direct field
> of CustomScan.
>
Let me confirm: are you suggesting having library_name/symbol_name
in CustomScan, not CustomScanMethods?
I would prefer these fields to be in CustomScanMethods, because then we don't
need to set them up again once PG_init has set up these symbols.  A CustomScan
has to be created every time it is chosen by the planner.

> > One other question I have. Do we have a portable way to lookup
> > a pair of library and symbol by address?
> > Glibc has dladdr() functions that returns these information,
> > however, manpage warned it is not defined by POSIX.
> > If we would be able to have any portable way, it may make the
> > interface simpler.
> 
> Yes: load_external_function.
>
That looks up an address from a pair of library and symbol names.
What I'm looking for is a portable function that looks up a pair
of library and symbol names from an address, like glibc's dladdr().
I don't know whether other *nixes or Windows have this infrastructure.
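
For reference, the glibc/BSD interface looks roughly like this (not POSIX, so
any use would need a configure check; the function name in this sketch is mine):

#define _GNU_SOURCE
#include <dlfcn.h>
#include <stdio.h>

/* Sketch: resolve the shared object and symbol containing "addr" */
static void
print_symbol_of(void *addr)
{
    Dl_info info;

    if (dladdr(addr, &info) != 0 && info.dli_sname != NULL)
        printf("library: %s, symbol: %s\n", info.dli_fname, info.dli_sname);
    else
        printf("no symbol information for %p\n", addr);
}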

Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei 


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Comment update to pathnode.c

2015-09-29 Thread Etsuro Fujita

On 2015/09/29 20:51, Robert Haas wrote:

On Tue, Sep 29, 2015 at 1:55 AM, Etsuro Fujita
 wrote:

Thanks for the comments!  Attached is an updated version of the patch.


Committed and back-patched to 9.5.


Thanks!

Best regards,
Etsuro Fujita



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH 5/6] pg_basebackup: allow GetConnection() to make non-replication connections.

2015-09-29 Thread Robert Haas
On Tue, Sep 29, 2015 at 6:16 PM, Joshua Elsasser  wrote:
> ---

Hi!

Thanks for submitting patches to the PostgreSQL community.  We need
more developers, and appreciate contributions.  However, I'm somewhat
flummoxed by this particular patch series, because (1) there's no real
explanation of what these patches are trying to accomplish (e.g. this
one has zero explanation other than the subject line), and (2) they
seem to be a mix of bug fixes and new features and general
refactoring.  Bug fixes we will want to back-patch to all supported
releases, but features should just go into master.  It's helpful if
you can explain what falls into which category.

Do some of these patches have dependencies on others?  If so, it would
be helpful to explain that.  I'm not inclined to bother with the
refactoring patches unless they enable some other thing that we agree
is a worthwhile goal.  Moving 20 lines of code into a function is
something that could be done in a lot of places in our source base,
but it doesn't seem like a worthy goal unto itself.

It would probably be a good idea to review
https://wiki.postgresql.org/wiki/Submitting_a_Patch -- the Linux style
of patch submission is generally not preferred here; we like to
discuss closely-related patches as a group, on a single thread, and
separate patches on completely separate threads.

Thanks,

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] LISTEN denial of service with aborted transaction

2015-09-29 Thread Matt Newell
On Tuesday, September 29, 2015 09:58:35 PM Tom Lane wrote:
> Matt Newell  writes:
> > On Monday, September 28, 2015 07:22:26 PM Tom Lane wrote:
> >> Possibly.  sinval catchup notification works like that, so you could
> >> maybe
> >> invent a similar mechanism for notify.  Beware that we've had to fix
> >> performance issues with sinval catchup; you don't want to release a whole
> >> bunch of processes to do work at the same time.
> > 
> > I'll have to think about this more but i don't envision it causing such as
> > scenario.
> > The extra processing that will happen at signaling time would have been
> > done anyway when the aborted transaction is finally rolled back, the only
> > extra work is copying the relevant notifications to local memory.
> 
> Hm.  An idle-in-transaction listening backend will eventually cause a more
> serious form of denial of service: once the pg_notify SLRU area fills up
> to its maximum of 8GB, no more new messages can be inserted, and every
> transaction that tries to do a NOTIFY will abort.  However, that's a
> documented hazard and we've not really had complaints about it.  In any
> case, trying to fix that by having such a backend absorb messages into
> local memory doesn't seem like a great idea: if it sits for long enough,
> its local memory usage will bloat.  Eventually you'd have a failure
> anyway.
> 
> > Regardless it might not be worthwhile since my patch for #1 seems to
> > address the symptom that bit me.
> 
> I looked at this patch for a bit and found it kind of ugly, particularly
> the business about "we allow one process at a time to advance
> firstUncommitted by using advancingFirstUncommitted as a mutex".
> That doesn't sound terribly safe or performant.
> 
It can be implemented without that, but then you'll have multiple backends
trying to do the same work.  This might not be an issue at all, but I couldn't
tell at first glance what the potential cost of extra calls to
TransactionIdIsInProgress would be.  Since there are no extra locks taken, I
figured ensuring the work is only done once is good for performance.

If the cluster only has one database generating notifies then there is 
practically no extra work with the patch.

As far as safety is concerned I think the worst case is that behavior returns 
to what it is now, where firstUncommitted ends up tracking QUEUE_TAIL.

I originally used a boolean then realized I could get some extra safety by 
using the backendId so that the mutex would be released automatically if the 
backend dies.

> After further thought, I wonder whether instead of having an incoming
> listener initialize its "pos" to QUEUE_TAIL, we couldn't have it
> initialize to the maximum "pos" among existing listeners (with a floor of
> QUEUE_TAIL of course).  If any other listener has advanced over a given
> message X, then X was committed (or aborted) earlier and the new listener
> is not required to return it; so it should be all right to take that
> listener's "pos" as a starting point.

That's a great idea and I will try it.  It does need to be the maximum 
position of existing listeners *with matching db oid*, since
asyncQueueReadAllNotifications doesn't check transaction status if the db oid
doesn't match its own.  Another advantage of this approach is that a queued
notify from an open transaction in one database won't incur additional cost on 
listens coming from other databases, whereas with my patch such a situation 
will prevent firstUncommitted from advancing.

> 
> The minimum-code-change way to do that would be to compute the max pos
> the hard way while Exec_ListenPreCommit holds the AsyncQueueLock, which
> it would now have to do in exclusive mode.  That's a little bit annoying
> but it's surely not much worse than what SignalBackends has to do after
> every notification.
Yeah I think it's going to be a win regardless.  Could also skip the whole 
thing if QUEUE_HEAD - QUEUE_TAIL is under a fixed amount, which would probably 
cover a lot of use cases.
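
A very rough sketch of that scan in Exec_ListenPreCommit, while holding
AsyncQueueLock exclusively (this assumes async.c's existing QUEUE_BACKEND_*
macros plus a QUEUE_POS_MAX helper analogous to the existing QUEUE_POS_MIN,
assumes our own entry isn't registered yet, and ignores the per-database-oid
filtering mentioned above):

    QueuePosition max = QUEUE_TAIL;
    int           i;

    /* Start from the furthest-ahead existing listener, never behind the tail */
    for (i = 1; i <= MaxBackends; i++)
    {
        if (QUEUE_BACKEND_PID(i) != InvalidPid)
            max = QUEUE_POS_MAX(max, QUEUE_BACKEND_POS(i));  /* assumed helper */
    }
    QUEUE_BACKEND_POS(MyBackendId) = max;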

> 
> Alternatively, we could try to maintain a shared pointer that is always
> equal to the frontmost backend's "pos".  But I think that would require
> more locking during queue reading than exists now; so it's far from clear
> it would be a win, especially in systems where listening sessions last for
> a reasonable amount of time (so that Exec_ListenPreCommit is infrequent).
And that would be even more complicated since it has to be per db oid.

Matt Newell


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Idea for improving buildfarm robustness

2015-09-29 Thread Michael Paquier
On Wed, Sep 30, 2015 at 7:19 AM, Tom Lane  wrote:
> I wrote:
>> Josh Berkus  writes:
>>> Give me source with the change, and I'll put it on a cheap, low-bandwith
>>> AWS instance and hammer the heck out of it.  That should raise any false
>>> positives we can expect.
>
>> Here's a draft patch against HEAD (looks like it will work on 9.5 or
>> 9.4 without modifications, too).
>
> BTW: in addition to whatever AWS testing Josh has in mind, it'd be good if
> someone tried it on Windows.  AFAIK, the self-kill() should work in the
> postmaster on Windows, but that should be checked.  Also, does the set of
> errnos it checks cover typical deletion cases on Windows?  Try both
> removal of $PGDATA in toto and removal of just pg_control or just global/.

Just tested on Windows, and this is working fine for me. It also seems to
me that looking only for ENOENT and ENOTDIR is fine (here is what I
looked at for reference; note, for example, the extra EXDEV or STRUNCATE
with MS 2015):
https://msdn.microsoft.com/en-us/library/5814770t.aspx
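
For what it's worth, here is a tiny standalone illustration of the kind
of errno handling being discussed; this is not the draft patch itself,
just a sketch, and the probed path (e.g. "global/pg_control" relative to
the data directory) is an assumption:

#include <stdbool.h>
#include <errno.h>
#include <sys/stat.h>

/* Sketch only: treat the path as "deleted" on the two errnos discussed
 * above; anything else (e.g. EACCES) is not taken as deletion. */
static bool
path_looks_deleted(const char *path)
{
    struct stat st;

    if (stat(path, &st) == 0)
        return false;           /* still there */
    return (errno == ENOENT || errno == ENOTDIR);
}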
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PATCH: use foreign keys to improve join estimates v1

2015-09-29 Thread David Rowley
On 29 September 2015 at 01:59, Tomas Vondra 
wrote:

> Hi,
>
> On 09/27/2015 02:00 PM, David Rowley wrote:
>
>> I've been working on this again. I've put back the code that you wrote
>> for the looping over each combination of relations from either side of
>> the join.
>>
>> I've also added some code to get around the problem with eclass joins
>> and the RestrictInfo having some alternative Vars that don't belong to
>> the foreign key. Basically I'm just checking if the RestrictInfo has a
>> parent_ec, and if it does just loop over the members to try and find the
>> Vars that belong to the foreign key. I've tested it with the following,
>> and it seems to work:
>>
>
> I didn't have time to look into the code yet, but this seems like an
> interesting idea.
>
>
>
>> create table a as select i as a_id1, i as a_id2, i as dummy1 from
>> generate_series(0,999) s(i);
>> alter table a add unique (a_id1, a_id2);
>> create table b as select i as b_id1, i as b_id2 from
>> generate_series(0,332) s(i);
>>
>> analyze a;
>> analyze b;
>>
>> alter table b add foreign key (b_id1, b_id2) references a (a_id1, a_id2);
>>
>> explain analyze select * from a inner join b on a.dummy1 = b.b_id1 and
>> a.a_id2 = b.b_id2 where a.a_id1 = a.dummy1;
>>
>>   QUERY PLAN
>>
>> ---
>>   Hash Join  (cost=18.57..26.41 rows=2 width=20) (actual
>> time=0.775..1.046 rows=333 loops=1)
>> Hash Cond: ((b.b_id1 = a.dummy1) AND (b.b_id2 = a.a_id2))
>> ->  Seq Scan on b  (cost=0.00..5.33 rows=333 width=8) (actual
>> time=0.013..0.046 rows=333 loops=1)
>> ->  Hash  (cost=18.50..18.50 rows=5 width=12) (actual
>> time=0.737..0.737 rows=1000 loops=1)
>>   Buckets: 1024  Batches: 1  Memory Usage: 51kB
>>   ->  Seq Scan on a  (cost=0.00..18.50 rows=5 width=12) (actual
>> time=0.014..0.389 rows=1000 loops=1)
>> Filter: (dummy1 = a_id1)
>>
>> The non-patched version estimates 1 row. The patched estimates 2 rows,
>> but that's due to the bad estimate on dummy1 = a_id1.
>>
>> The 2 comes from ceil(5 * 0.333).
>>
>> Perhaps you have a better test case to for this?
>>
>
> I think the additional WHERE clause is needlessly confusing. I've been
> able to come up with an example - pretty much a normalized with a "main"
> table and auxiliary tables (referencing the main one using FK) with
> additional info. So not unlikely to happen in practice (except maybe for
> the multi-column foreign key bit).
>
>
> CREATE TABLE f (id1 INT, id2 INT, PRIMARY KEY (id1, id2));
>
> CREATE TABLE d1 (id1 INT, id2 INT, FOREIGN KEY (id1, id2) REFERENCES
> f(id1, id2));
> CREATE TABLE d2 (id1 INT, id2 INT, FOREIGN KEY (id1, id2) REFERENCES
> f(id1, id2));
>
> INSERT INTO f SELECT i, i FROM generate_series(1,100) s(i);
>
> INSERT INTO d1 SELECT i, i FROM generate_series(1,10) s(i);
> INSERT INTO d2 SELECT i, i FROM generate_series(1,30) s(i);
>
> now, both pair-wise joins (f JOIN d1) and (f JOIN d2) are estimated
> perfectly accurately, but as soon as the query involves both of them, this
> happens:
>
> SELECT * FROM f JOIN d1 ON (f.id1 = d1.id1 AND f.id2 = d1.id2)
> JOIN d2 ON (f.id1 = d2.id1 AND f.id2 = d2.id2);
>
>   QUERY PLAN
> -
>  Nested Loop  (cost=3334.43..12647.57 rows=3 width=24)
>   (actual time=221.086..1767.206 rows=10 loops=1)
>Join Filter: ((d1.id1 = f.id1) AND (d1.id2 = f.id2))
>->  Hash Join  (cost=3334.00..12647.01 rows=1 width=16)
>   (actual time=221.058..939.482 rows=10 loops=1)
>  Hash Cond: ((d2.id1 = d1.id1) AND (d2.id2 = d1.id2))
>  ->  Seq Scan on d2  (cost=0.00..4328.00 rows=30 width=8)
>  (actual time=0.038..263.356 rows=30 loops=1)
>  ->  Hash  (cost=1443.00..1443.00 rows=10 width=8)
>(actual time=220.721..220.721 rows=10 loops=1)
>Buckets: 131072  Batches: 2  Memory Usage: 2982kB
>->  Seq Scan on d1  (cost=0.00..1443.00 rows=10 ...)
>(actual time=0.033..101.547 rows=10 loops=1)
>->  Index Only Scan using f_pkey on f  (cost=0.42..0.54 rows=1 ...)
> (actual time=0.004..0.004 rows=1 loops=10)
>  Index Cond: ((id1 = d2.id1) AND (id2 = d2.id2))
>  Heap Fetches: 10
>
> Clearly, the inner join (d1 JOIN d2) is poorly estimated (1 vs. 10). I
> assume that's only because we find FK only on the second join with f.
>
> So it seems like s a clear improvement, both compared to master and the
> previous versions of the patch.


I've been experimenting with this example. Of course, the reason why we get
the 1 row estimate on the join between d1 and d2 is that there's no foreign
key 

Re: [HACKERS] Parallel Seq Scan

2015-09-29 Thread Robert Haas
On Thu, Sep 24, 2015 at 2:31 PM, Amit Kapila  wrote:
> [ parallel_seqscan_partialseqscan_v18.patch ]

I spent a bit of time reviewing the heapam.c changes in this patch
this evening, and I think that your attempt to add support for
synchronized scans has some problems.

- In both heapgettup() and heapgettup_pagemode(), you call
ss_report_location() on discovering that we're trying to initialize
after the scan is already complete.  This seems wrong.  For the
reasons noted in the existing comments, it's good for the backend that
finishes the scan to report the starting position as the current
position hint, but you don't want every parallel backend to do it
turn.  Unrelated, overlapping scans might be trying to continue
advancing the scan, and you don't want to drag the position hint
backward for no reason.

- heap_parallelscan_initialize_startblock() calls ss_get_location()
while holding a spinlock.  This is clearly no good, because spinlocks
can only be held while executing straight-line code that does not
itself acquire locks - and ss_get_location() takes an *LWLock*.  Among
other problems, an error anywhere inside ss_get_location() would leave
behind a stuck spinlock.

- There's no point that I can see in initializing rs_startblock at all
when a ParallelHeapScanDesc is in use.  The ParallelHeapScanDesc, not
rs_startblock, is going to tell you what page to scan next.

I think heap_parallelscan_initialize_startblock() should basically do
this, in the synchronized scan case:

SpinLockAcquire(&parallel_scan->phs_mutex);
page = parallel_scan->phs_startblock;
SpinLockRelease(&parallel_scan->phs_mutex);
if (page != InvalidBlockNumber)
    return;     /* some other process already did this */
page = ss_get_location(scan->rs_rd, scan->rs_nblocks);
SpinLockAcquire(&parallel_scan->phs_mutex);
/* even though we checked before, someone might have beaten us here */
if (parallel_scan->phs_startblock == InvalidBlockNumber)
{
    parallel_scan->phs_startblock = page;
    parallel_scan->phs_cblock = page;
}
SpinLockRelease(&parallel_scan->phs_mutex);

- heap_parallelscan_nextpage() seems to have gotten unnecessarily
complicated.  I particularly dislike the way you increment phs_cblock
and then sometimes try to back it out later.  Let's decide that
phs_cblock == InvalidBlockNumber means that the scan is finished,
while phs_cblock == phs_startblock means that we're just starting.  We
then don't need phs_firstpass at all, and can write:

SpinLockAcquire(&parallel_scan->phs_mutex);
page = parallel_scan->phs_cblock;
if (page != InvalidBlockNumber)
{
    parallel_scan->phs_cblock++;
    if (parallel_scan->phs_cblock >= scan->rs_nblocks)
        parallel_scan->phs_cblock = 0;
    if (parallel_scan->phs_cblock == parallel_scan->phs_startblock)
    {
        parallel_scan->phs_cblock = InvalidBlockNumber;
        report_scan_done = true;
    }
}
SpinLockRelease(&parallel_scan->phs_mutex);

At this point, if page contains InvalidBlockNumber, then the scan is
done, and if it contains anything else, that's the next page that the
current process should scan.  If report_scan_done is true, we are the
first to observe that the scan is done and should call
ss_report_location().
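
For reference, here is a sketch of the parallel-scan fields the two
snippets above assume.  The member names are just the ones used in this
mail; the actual ParallelHeapScanDesc in the patch has more to it and
may differ:

/* Sketch only: just the members referenced above, not the real struct. */
typedef struct ParallelHeapScanDescData
{
    BlockNumber phs_startblock; /* page the scan started at, or InvalidBlockNumber */
    BlockNumber phs_cblock;     /* next page to hand out; InvalidBlockNumber = done */
    slock_t     phs_mutex;      /* protects the two fields above */
} ParallelHeapScanDescData;

typedef ParallelHeapScanDescData *ParallelHeapScanDesc;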

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Removing max_connection requirement on hot_standby

2015-09-29 Thread Chris Winslett
I'm orchestrating Postgres to behave as a leader-follower cluster. I've run
into issues when I am scaling down a connection count for a cluster
(scaling up is fine -- scaling down results in fatal errors).  I use an
open source tool I've written to orchestrate the cluster called Governor (
http://github.com/compose/governor).  It would work if I weren't running
hot_standby, but alas…I'm running with it.

I found that the code which throws the FATAL is actually a pre-flight test for
hot_standby, written in 2009 (i.e. battle tested):

https://github.com/postgres/postgres/blob/efc16ea520679d713d98a2c7bf1453c4ff7b91ec/src/backend/access/transam/xlog.c#L5312-L5321

I've tested changing this value from a FATAL to a WARN.  I've compiled and
tested my scenario and all appears to be correct:

https://github.com/compose/postgres/commit/2bdf6b36821987aadb401e1b8590ecc5b02126d8

In researching these lines of code, it appears the original FATAL code was
put in place to ensure that a hot_standby is configured as closely as
possible to the leader.

This change will also allow backups taken using `pg_basebackup` to work
with settings that differ from the original host.

Am I missing something with this change?

Cheers,
Chris


Re: [HACKERS] LISTEN denial of service with aborted transaction

2015-09-29 Thread Tom Lane
Matt Newell  writes:
> On Monday, September 28, 2015 07:22:26 PM Tom Lane wrote:
>> Possibly.  sinval catchup notification works like that, so you could maybe
>> invent a similar mechanism for notify.  Beware that we've had to fix
>> performance issues with sinval catchup; you don't want to release a whole
>> bunch of processes to do work at the same time.

> I'll have to think about this more but i don't envision it causing such as 
> scenario.
> The extra processing that will happen at signaling time would have been done 
> anyway when the aborted transaction is finally rolled back, the only extra 
> work is copying the relevant notifications to local memory. 

Hm.  An idle-in-transaction listening backend will eventually cause a more
serious form of denial of service: once the pg_notify SLRU area fills up
to its maximum of 8GB, no more new messages can be inserted, and every
transaction that tries to do a NOTIFY will abort.  However, that's a
documented hazard and we've not really had complaints about it.  In any
case, trying to fix that by having such a backend absorb messages into
local memory doesn't seem like a great idea: if it sits for long enough,
its local memory usage will bloat.  Eventually you'd have a failure
anyway.

> Regardless it might not be worthwhile since my patch for #1 seems to address 
> the symptom that bit me.

I looked at this patch for a bit and found it kind of ugly, particularly
the business about "we allow one process at a time to advance
firstUncommitted by using advancingFirstUncommitted as a mutex".
That doesn't sound terribly safe or performant.

After further thought, I wonder whether instead of having an incoming
listener initialize its "pos" to QUEUE_TAIL, we couldn't have it
initialize to the maximum "pos" among existing listeners (with a floor of
QUEUE_TAIL of course).  If any other listener has advanced over a given
message X, then X was committed (or aborted) earlier and the new listener
is not required to return it; so it should be all right to take that
listener's "pos" as a starting point.

The minimum-code-change way to do that would be to compute the max pos
the hard way while Exec_ListenPreCommit holds the AsyncQueueLock, which
it would now have to do in exclusive mode.  That's a little bit annoying
but it's surely not much worse than what SignalBackends has to do after
every notification.

Alternatively, we could try to maintain a shared pointer that is always
equal to the frontmost backend's "pos".  But I think that would require
more locking during queue reading than exists now; so it's far from clear
it would be a win, especially in systems where listening sessions last for
a reasonable amount of time (so that Exec_ListenPreCommit is infrequent).

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Arguable RLS security bug, EvalPlanQual() paranoia

2015-09-29 Thread Adam Brightwell
> I'm not convinced this is the right place, but at a minimum it should be
> referenced from the RLS documentation.  Further, it should be noted that
> users who have direct SQL access can control what the isolation level
> is for their transaction.

I agree that it should be referenced by the RLS docs.  However, I'm
not sure I can think of a better place for it.  My reason for
choosing this location was that the behavior seems specific to the
READ COMMITTED isolation level and is an accepted MVCC anomaly, not
necessarily specific only to RLS and SBV.  But, again, I'd agree that
referencing it in the other locations would be desirable.  Of course,
I'm willing to accept that I may be making the wrong assumptions.

> Also, isn't it possible to avoid this by locking the records?  If the
> locking fails or blocks then you know another user has those records
> locked and you don't update or you wait until you hold the lock.
> Assuming that works (I don't immediately see why it wouldn't..), we
> should provide an example.

I haven't found that to work, at least not with the specific case
presented by Peter.  Based on the following (output from Peter's
isolation test), I would understand that the 'malicious' UPDATE is
waiting for the previous update to be committed before it continues,
even without the FOR UPDATE lock on the rows.

step no_trust_mallory: update users set group_id = 1 where user_name =
'mallory';
step update_hc: update information set info = 'secret' where group_id = 2;
step updatem: update information set info = info where group_id = 2
returning 'mallory update: ' m, *; 
step commit_hc: commit;
step updatem: <... completed>

As well, due to this, as described by the READ COMMITTED documentation:

"it is possible for an updating command to see an inconsistent
snapshot: it can see the effects of concurrent updating commands on
the same rows it is trying to update"

I'm not convinced that this is something that FOR UPDATE can address
for this specific case.  If inconsistencies in the 'snapshot' can be
expected and are accepted at this isolation level, then I'm not sure
how we can reasonably expect locking the rows to have any effect.
Though, again, I'm willing to accept that I am not fully understanding
this behavior and that I am completely wrong.

-Adam

-- 
Adam Brightwell - adam.brightw...@crunchydatasolutions.com
Database Engineer - www.crunchydatasolutions.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Doubt in pgbench TPS number

2015-09-29 Thread Tatsuo Ishii
>> Here conn_total_time is the sum of time to establish connection to
>> PostgreSQL. Since establishing connections to PostgreSQL is done in
>> serial rather than in parallel, conn_total_time should have been
>> divided by nclients.
> 
> After some more thinking and looking again at the connection code, I
> revise slightly my diagnostic:
> 
>  - the amount of parallelism is "nclients", as discussed above, when
>  - reconnecting on each transaction (-C) because the connections are
>  - managed in parallel from doCustom,
> 
> * BUT *
> 
>  - if there is no reconnections (not -C) the connections are performed in
>  - threadRun in a sequential way, all clients wait for the connections of
>  - other clients in the same thread before starting processing
>  - transactions, so "nthreads" is the right amount of parallelism in this
>  - case.
> 
> So on second thought the formula should rather be:
> 
>   ...  / (is_connect? nclients: nthreads)

I don't think this is quite correct.

If is_connect is false, then the following loop is executed in threadRun():

/* make connections to the database */
for (i = 0; i < nstate; i++)
{
    if ((state[i].con = doConnect()) == NULL)
        goto done;
}

Here, nstate is nclients/nthreads. Suppose nclients = 16 and nthreads
= 2; then 2 threads run in parallel, and each thread connects 8
times (nstate = 8) in *serial*. The total connection time for the
thread is calculated as "the time the loop ends" - "the time the loop
starts". So if the time to establish a connection is 1 second, the
total connection time for a thread will be 8 seconds. Thus the grand total
of connection time will be 2 * 8 = 16 seconds.

If is_connect is true, the following loop is executed.

/* send start up queries in async manner */
for (i = 0; i < nstate; i++)
{
    CState     *st = &state[i];
    Command   **commands = sql_files[st->use_file];
    int         prev_ecnt = st->ecnt;

    st->use_file = getrand(thread, 0, num_files - 1);
    if (!doCustom(thread, st, &thread->conn_time, logfile, &aggs))

In the loop, exactly the same thing happens as in the is_connect = false case.
If the time to establish a connection is 1 second, the total connection time
will be the same as in the is_connect = false case, i.e. 16 seconds.
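
To put numbers on that, here is a small standalone sketch (not pgbench
code) of the corrected formula using the figures above; the transaction
count and total run time are made up purely to show the arithmetic:

#include <stdio.h>

int
main(void)
{
    int     nclients = 16;          /* -c 16 */
    double  conn_total_time = 16.0; /* summed over all clients: 16 * 1s */
    double  time_include = 20.0;    /* assumed total run time, seconds */
    long    normal_xacts = 10000;   /* assumed transaction count */

    double  tps_include = normal_xacts / time_include;
    /* the fix: divide the summed connection time by nclients, not nthreads */
    double  tps_exclude = normal_xacts /
        (time_include - conn_total_time / nclients);

    printf("tps_include = %.1f, tps_exclude = %.1f\n",
           tps_include, tps_exclude);
    return 0;
}

This prints tps_include = 500.0 and tps_exclude = 526.3 for the example.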

In summary, I see no reason to change the v1 patch.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pageinspect patch, for showing tuple data

2015-09-29 Thread Michael Paquier
On Tue, Sep 29, 2015 at 11:39 PM, Nikolay Shaplov wrote:
> But since now we actually parse data with tuple_data_split, we can provide a
> precisely formed fake information, so you are not limited with how it is
> actually stored in page. You just pass any arguments you want. So you does not
> need warning mode anymore.

Yeah, I agree with you here, let's simplify it then. One could as well
catch the error in a plpgsql wrapper function if that's really necessary
and log the failed events at the same time in a custom way.

- errmsg("input page too small (%d bytes)",
raw_page_size)));
+errmsg("input page too small (%d
bytes)", raw_page_size)));
Please be careful of spurious diffs. Those just make the life of committers
more difficult than it already is.

+ 
+ General idea about output columns: lp_*
attributes
+ are about where tuple is located inside the page;
+ t_xmin, t_xmax,
+ t_field3, t_ctid are about
+ visibility of this tuplue inside or outside of the transaction;
+ t_infomask2, t_infomask has
some
+ information about properties of attributes stored in tuple data.
+ t_hoff sais where tuple data begins and
+ t_bits sais which attributes are NULL and which
are
+ not. Please notice that t_bits here is not an actual value that is
stored
+ in tuple data, but it's text representation  with '0' and '1'
charactrs.
+ 
I would remove that as well. htup_details.h contains all this information.

+ 
+ For more detailed information see documentation:
+ , 
+ and source code: src/include/storage/itemid.h,
+ src/include/access/htup_details.h.
+ 
This looks cool to me though.

+
+test=# select * from heap_page_item_attrs(get_raw_page('pg_range',
0),'pg_range'::regclass);
+[loong tuple data]
This example has lines that are too long; I think you should cut a major
part of the fields. Why not keep just lp and t_attrs, for
example?

+  
+   
+rel_oid
+oid
+OID of the relation, of the tuple we want to split
+   
+
+   
+tuple_data
+bytea
+tuple raw data to split
+
+   
In the description of tuple_data_split, I am not sure it is worth listing
all the arguments of the function like that. IMO, we should just say: those
are the fields returned by "heap_page_items". tuple_data should also be
renamed to t_data.

+  tuple attributes instead of one peace of raw tuple data. All other
return
This is not that "peaceful" to me. It should be "piece" :)

+   values[13] = PointerGetDatum(tuple_data_bytea);
+   nulls[13] = false;
There is no point to set nulls[13] here.

+
+test=# select tuple_data_split('pg_range'::regclass,
'\x400f1700ba074a0f520f'::bytea, 2304, 6, null);
+   tuple_data_split
+---
+
{"\\x400f","\\x1700","\\x","\\xba07","\\x4a0f","\\x520f"}
+(1 row)
This would be more demonstrative if combined with heap_page_items, like
that for example:
SELECT tuple_data_split('pg_class'::regclass, t_data, t_infomask,
t_infomask2, t_bits) FROM heap_page_items(get_raw_page('pg_class', 0));
And actually this query crashes.
-- 
Michael


Re: [HACKERS] No Issue Tracker - Say it Ain't So!

2015-09-29 Thread Joshua D. Drake

On 09/29/2015 03:46 PM, Tom Lane wrote:

Alvaro Herrera  writes:

Merlin Moncure wrote:

I've read this email about three times now and it's not clear at all
to me what a issue/bug tracker brings to the table.



In my opinion, this thread is about a bug tracker, not a patch tracker.
We already have a patch tracking system which works very well.


Mumble ... the CF app works, but I'm not sure if it works "very well".
There's some ease-of-use issues I think, though we've now damped them
down to the point where the only really major stumbling block is getting
a patch into the CF app initially.

One thing to think about if we do adopt a bug tracker is how will it
interact with the CF app.  Ideally, patches that represent responses
to issues in the BT would be tracked more or less automatically by
both apps, if indeed we don't merge them altogether.


That was kind of my point, although I obviously am not making it clear. 
The idea that we need just a bug tracker is flawed and narrow in 
thinking. We want something that will integrate into our entire 
development workflow with minimal disruption. Otherwise we are just 
building a lot of infrastructure in different places for no reason. 
Properly configured, an issue tracker would serve our existing workflow, 
including enhancing the commitfest process as well as helping us 
track bugs.


JD




regards, tom lane




--
Command Prompt, Inc. - http://www.commandprompt.com/  503-667-4564
PostgreSQL Centered full stack support, consulting and development.
Announcing "I'm offended" is basically telling the world you can't
control your own emotions, so everyone else should do it for you.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH 5/6] pg_basebackup: allow GetConnection() to make non-replication connections.

2015-09-29 Thread Michael Paquier
On Wed, Sep 30, 2015 at 10:47 AM, Robert Haas  wrote:
> It would probably be a good idea to review
> https://wiki.postgresql.org/wiki/Submitting_a_Patch -- the Linux style
> of patch submission is generally not preferred here; we like to
> discuss closely-related patches as a group, on a single thread, and
> separate patches on completely separate threads.

FWIW base thread is here:
http://www.postgresql.org/message-id/20150929213811.gf22...@idealist.org
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PATCH: index-only scans with partial indexes

2015-09-29 Thread Kyotaro HORIGUCHI
Hello,

At Tue, 29 Sep 2015 16:57:03 +0200, Tomas Vondra  
wrote in <560aa6bf.5030...@2ndquadrant.com>
> >>> The patch does not change the check_index_only implementation - it
> >>> still needs to check the clauses, just like in v1 of the patch. To
> >>> make this re-check unnecessary, we'd have to stick the remaining
> >>> clauses somewhere, so that check_index_only can use only the filtered
> >>> list (instead of walking through the complete list of restrictions).
> >>
> >> After thinking about this a bit more, I realized we already have a
> >> good place for keeping this list - IndexClauseSet. So I've done that,
> >
> > The place looks fine but it might be better that rclauses have
> > baserestrictinfo itself when indpred == NIL. It would make the
> > semantics of rclauses more consistent.
> >
> >> and removed most of the code from check_index_only - it still needs to
> >> decide whether to use the full list of restrictions (regular indexes)
> >> or the filtered list (for partial indexes).
> >
> > The change above allows to reduce the modification still left in
> > check_index_only.
> 
> I'm not sure I understand what change you suggest? Could you explain?
> 
> The change in check_index_only is effectively just (a) comment update
> and (b) choice of the right list of clauses. I'd say both changes are
> needed, although (b) could happen outside check_index_only (i.e. we
> could do the check elsewhere). Is that what you mean?

I meant that (b) could be done earlier, in
match_clause_to_index. Currently clauseset->rclauses stores the given
(restriction) clauses that are not implied by indpred *only when*
indpred is present. But there is no problem with rclauses *always*
storing such clauses; rclauses would still hold the clauses "not
implied by the index" in that case. That is what I meant by "more
consistent".

The following codelet should be clearer than my clumsy
words :)

in match_clause_to_index()
>   if (index->indpred != NIL &&
>   predicate_implied_by(list_make1(rinfo->clause), index->indpred))
>   return;  /* ignore clauses implied by index */
>
>   /* track all clauses not implied by indpred */
>   clauseset->rclauses = lappend(clauseset->rclauses, rinfo);
>
>   for (indexcol = 0; indexcol < index->ncolumns; indexcol++)

in check_index_only()
-* For partial indexes use the filtered clauses, otherwise use the
-* baserestrictinfo directly for non-partial indexes.
-*/
-   rclauses = (index->indpred != NIL) ? clauses : rel->baserestrictinfo;
-
-   /*
 * Add all the attributes used by restriction clauses (only those not
 * implied by the index predicate for partial indexes).
 */
-   foreach(lc, rclauses)
+   foreach(lc, clauses)

> > cost_index() seems to need to be fixed. It would count excluded
> > clauses in estimate.
> 
> Hmm, good point. The problem is that extract_nonindex_conditions uses
> baserel->baserestrictinfo again, i.e. it does not skip the implied
> clauses. So we may either stick the filtered clauses somewhere (for
> example in the IndexPath), teach extract_nonindex_conditions to use
> predicate_implied_by. I'd say the first option is better. Agreed?

I'll mention this in a separate reply for the following mail.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] No Issue Tracker - Say it Ain't So!

2015-09-29 Thread Pavan Deolasee
On Wed, Sep 30, 2015 at 2:16 AM, Alvaro Herrera 
wrote:

>
>
> That's a very good point.  I think Github and other sites are already
> blocked in countries like India and Cuba.


Github is not blocked in India and never was, as far as I know. Well, our
government recently blocked some porn sites, but (thankfully) they went
only that far. But I see Oleg's point. Many things, including Google, are
blocked in China, for example.

Thanks,
Pavan

-- 
 Pavan Deolasee   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] Doubt in pgbench TPS number

2015-09-29 Thread Tatsuo Ishii
>> FWIW, I vote with Tatsuo-san.  Such a change will break comparability of
>> results with all previous versions, which means it's not something to do
>> in minor releases, even if we now believe the previous results were
>> somewhat bogus.  Arguing that it's "not a behavioral change" seems
>> quite loony to me: for most people, the TPS numbers are the only
>> interesting output of pgbench.
>> 
>> I think we should fix it in 9.5 and document it as an incompatible change.
> 
> Plus if 9.4 or earlier version of PostgreSQL user wants to use fixed
> version of pgbench, he/she can always use pgbench coming with 9.5 or
> later version of PostgreSQL.
> 
>> (Note: I've not read the patch, so this is not an opinion about its
>> correctness.)
> 
> As Fabien anayzed the problem was that pgbench simply uses wrong
> variable: nthreads (number of threads, specified by "-j" option)
> vs. nclients (number of clients, specified by "-c" option).
> 
> @@ -2616,7 +2616,7 @@ printResults(int ttype, int64 normal_xacts, int nclients,
>   	time_include = INSTR_TIME_GET_DOUBLE(total_time);
>   	tps_include = normal_xacts / time_include;
>   	tps_exclude = normal_xacts / (time_include -
> -					(INSTR_TIME_GET_DOUBLE(conn_total_time) / nthreads));
> +					(INSTR_TIME_GET_DOUBLE(conn_total_time) / nclients));
> 
> Here conn_total_time is the sum of time to establish connection to
> PostgreSQL. Since establishing connections to PostgreSQL is done in
> serial rather than in parallel, conn_total_time should have been
> divided by nclients.

Fix committed to master and 9.5 stable branches.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Parallel Seq Scan

2015-09-29 Thread Robert Haas
On Tue, Sep 29, 2015 at 12:39 AM, Amit Kapila  wrote:
> Attached patch is a rebased patch based on latest commit (d1b7c1ff)
> for Gather node.
>
> - I have to reorganize the defines in execParallel.h and .c. To keep
> ParallelExecutorInfo, in GatherState node, we need to include execParallel.h
> in execnodes.h which was creating compilation issues as execParallel.h
> also includes execnodes.h, so for now I have defined ParallelExecutorInfo
> in execnodes.h and instrumentation related structures in instrument.h.
> - Renamed parallel_seqscan_degree to degree_of_parallelism
> - Rename Funnel to Gather
> - Removed PARAM_EXEC parameter handling code, I think we can do this
> separately.
>
> I have to work more on partial seq scan patch for rebasing it and handling
> review comments for the same, so for now I am sending the first part of
> patch (which included Gather node functionality and some general support
> for parallel-query execution).

Thanks for the fast rebase.

This patch needs a bunch of cleanup:

- The formatting for the GatherState node's comment block is unlike
that of surrounding comment blocks.  It lacks the --- dividers,
and the indentation is not the same.  Also, it refers to
ParallelExecutorInfo by the type name, but the other members by
structure member name.  The convention is to refer to them by
structure member name, so please do that.

- The naming of fs_workersReady is inconsistent with the other
structure members.  The other members use all lower-case names,
separating words with dashes, but this one uses a capital letter.  The
other members also don't prefix the names with anything, but this uses
a "fs_" prefix which I assume is left over from when this was called
FunnelState.  Finally, this doesn't actually tell you when workers are
ready, just whether they were launched.  I suggest we rename this to
"any_worker_launched".

- Instead of moving the declaration of ParallelExecutorInfo, please
just refer to it as "struct ParallelExecutorInfo" in execnodes.h.
That way, you're not sucking these includes into all kinds of places
they don't really need to be.

- Let's not create a new PARALLEL_QUERY category of GUC.  Instead,
let's put the GUC for the number of workers under resource usage ->
asynchronous behavior.

- I don't believe that shm_toc *toc has any business being part of a
generic PlanState node.  At most, it should be part of an individual
type of PlanState, like a GatherState or PartialSeqScanState.  But
really, I don't see why we need it there at all.  It should, I think,
only be needed during startup to dig out the information we need.  So
we should just dig that stuff out and keep pointers to whatever we
actually need - in this case the ParallelExecutorInfo, I think - in
the particular type of PlanState node that's at issue - here
GatherState.  After that we don't need a pointer to the toc any more.

- I'd like to do some renaming of the new GUCs.  I suggest we rename
cpu_tuple_comm_cost to parallel_tuple_cost and degree_of_parallelism
to max_parallel_degree.

- I think that a Gather node should inherit from Plan, not Scan.  A
Gather node really shouldn't have a scanrelid.  Now, admittedly, if
the only thing under the Gather is a Partial Seq Scan, it wouldn't be
totally bonkers to think of the Gather as scanning the same relation
that the Partial Seq Scan is scanning.  But in any more complex case,
like where it's scanning a join, you're out of luck.  You'll have to
set scanrelid == 0, I suppose, but then, for example, ExecScanReScan
is not going to work.  In fact, as far as I can see, the only way
nodeGather.c is actually using any of the generic scan stuff is by
calling ExecInitScanTupleSlot, which is all of one line of code.
ExecEndGather fetches node->ss.ss_currentRelation but then does
nothing with it.  So I think this is just a holdover from early
version of this patch where what's now Gather and PartialSeqScan were
a single node, and I think we should rip it out.

- On a related note, the assertions in cost_gather() are both bogus
and should be removed. Similarly with create_gather_plan().  As
previously mentioned, the Gather node should not care what sort of
thing is under it; I am not interested in restricting it to baserels
and then undoing that later.

- For the same reason, cost_gather() should refer to its third
argument as "rel", not "baserel".

- Also, I think this stuff about physical tlists in
create_gather_plan() is bogus.  use_physical_tlist is ignorant of the
possibility that the RelOptInfo passed to it might be anything other
than a baserel, and I think it won't be happy if it gets a joinrel.
Moreover, I think our plan here is that, at least for now, the
Gather's tlist will always match the tlist of its child.  If that's
so, there's no point to this: it will end up with the same tlist
either way.  If any projection is needed, it should be done by the
Gather node's child, not the Gather node itself.

- Let's rename 

Re: [HACKERS] No Issue Tracker - Say it Ain't So!

2015-09-29 Thread Jeff Anton

Seems to me that there are a bunch of agendas here.

I read about not wanting to be trapped in a proprietary system.  You 
can be trapped by any software you depend upon.  Compilers, platforms, 
SCM, and issue tracking are all places to be trapped.


Postgres and Postgresql have been around a very long time for the 
software world.  It has outlived several of the systems it has depended 
upon over those many years.  I hope and expect that Postgres will 
continue to outlive some of these platforms.


So do not get hung up on having been 'burned' in the past.  Expect to be 
'burned' again.  Take steps to minimize that pain in the future.


For an issue tracker, open source or proprietary, I would want raw 
database dumps and schema information.  Postgres is a database after 
all.  If you truly have the data, you should be able to migrate it.


Also, does the system you adopt use Postgres?  You are your best open 
source software.  If performance starts to go downhill, you are in the 
best position to fix it if you understand and control it.  How 
responsive will whatever system be to your changing needs?  If you 
partner with an external group, the two groups will benefit from each 
other if they are truly sharing the technologies.


Jeff Anton


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] No Issue Tracker - Say it Ain't So!

2015-09-29 Thread Andres Freund
On 2015-09-29 13:27:15 -0400, Tom Lane wrote:
> Corey Huinker  writes:
> >>> And they'd sure love to be in charge of our code repo.
> 
> >> Mh - i'm not a native speaker. I didn't understand this line.
> 
> > Tom was saying that the JIRA/Atlassian people would happily volunteer to
> > host our code repository for no cost (in money) to us.
> 
> Not that so much as that the gitlab code really wants to be connected up
> to our code repo.  That makes complete sense in terms of its goals as
> stated by Torsten upthread, namely to be a git repo manager.  But it's
> not what we're looking for here.  We want an issue tracker, and we don't
> particularly need it to decide that it's in charge of repo access
> authorization for instance.  There's a whole bunch of functionality there
> that at best is useless to us, and at worst will get in the way.

I don't have any opinion WRT gitlab, but I'm fairly certain it'd be
unproblematic to configure automatic mirroring into it from
gitmaster.

It imo can make a fair amount of sense to give the bugtracker
et al access to the commits so it can automatically track mentions of
individual bugs and such.

Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] No Issue Tracker - Say it Ain't So!

2015-09-29 Thread Tom Lane
Andres Freund  writes:
> On 2015-09-29 13:27:15 -0400, Tom Lane wrote:
>> Not that so much as that the gitlab code really wants to be connected up
>> to our code repo.  That makes complete sense in terms of its goals as
>> stated by Torsten upthread, namely to be a git repo manager.  But it's
>> not what we're looking for here.  We want an issue tracker, and we don't
>> particularly need it to decide that it's in charge of repo access
>> authorization for instance.  There's a whole bunch of functionality there
>> that at best is useless to us, and at worst will get in the way.

> I don't have any opinion WRT gitlab, but I'm fairly certain it'd be
> unproblematic to configure automatic mirroring into it from
> gitmaster.

I think you missed my point: gitlab would then believe it's in charge of,
eg, granting write access to that repo.  We could perhaps whack it over
the head till it only does what we want and not ten other things, but
we'd be swimming upstream.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Foreign join pushdown vs EvalPlanQual

2015-09-29 Thread Robert Haas
On Tue, Sep 29, 2015 at 4:49 AM, Kouhei Kaigai  wrote:
> Also note that EvalPlanQualFetchRowMarks() will raise an error
> if RefetchForeignRow callback returned NULL tuple.
> Is it right or expected behavior?

That's not how I read the code.  If RefetchForeignRow returns NULL, we
just ignore the row and continue on to the next one:

if (copyTuple == NULL)
{
    /* couldn't get the lock, so skip this row */
    goto lnext;
}

And that seems exactly right: RefetchForeignRow needs to test that the
tuple is still present on the remote side, and that any remote quals
are matched.  If either of those is false, it can return NULL.
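
For illustration, here is a sketch of the shape of a RefetchForeignRow
callback that behaves as described above.  fetch_remote_row() and
remote_quals_hold() are hypothetical helpers standing in for whatever
the FDW does to re-lock the row and re-check pushed-down quals:

/* Sketch only; the helper functions are hypothetical, not FDW API. */
static HeapTuple
example_RefetchForeignRow(EState *estate, ExecRowMark *erm,
                          Datum rowid, bool *updated)
{
    HeapTuple   tuple;

    *updated = false;
    tuple = fetch_remote_row(estate, erm, rowid, updated);
    if (tuple == NULL)
        return NULL;        /* row is gone; caller skips it */
    if (!remote_quals_hold(estate, erm, tuple))
        return NULL;        /* pushed-down quals no longer match */
    return tuple;           /* caller treats this as the locked row */
}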

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] No Issue Tracker - Say it Ain't So!

2015-09-29 Thread Corey Huinker
>
>
> And they'd sure love to be in charge of our code repo.
>>
>
> Mh - i'm not a native speaker. I didn't understand this line.
>
>
Tom was saying that the JIRA/Atlassian people would happily volunteer to
host our code repository for no cost (in money) to us. The implication
being that they have their own interests for doing so. The obvious reason
is the free advertising, but the reader might infer less altruistic
motivations, which Tom referenced with his next sentence:

And the main, possibly only, metadata they can track about an issue is
"assignee" and "milestone".


Having spent a large portion of my life in the Midwest US and Texas, I'm
painfully aware of how often native English speakers speak in idioms and
hyperbole.

(example: "painfully aware" - I am not in actual pain, I'm mildly
embarrassed. I'm also not that "aware", as I managed to use hyperbole in my
sentence lamenting the use of it.)


Re: [HACKERS] No Issue Tracker - Say it Ain't So!

2015-09-29 Thread Tom Lane
Corey Huinker  writes:
>>> And they'd sure love to be in charge of our code repo.

>> Mh - i'm not a native speaker. I didn't understand this line.

> Tom was saying that the JIRA/Atlassian people would happily volunteer to
> host our code repository for no cost (in money) to us.

Not that so much as that the gitlab code really wants to be connected up
to our code repo.  That makes complete sense in terms of its goals as
stated by Torsten upthread, namely to be a git repo manager.  But it's
not what we're looking for here.  We want an issue tracker, and we don't
particularly need it to decide that it's in charge of repo access
authorization for instance.  There's a whole bunch of functionality there
that at best is useless to us, and at worst will get in the way.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] No Issue Tracker - Say it Ain't So!

2015-09-29 Thread Andres Freund
On 2015-09-29 13:40:28 -0400, Tom Lane wrote:
> Andres Freund  writes:
> > I don't have any opinion WRT gitlab, but I'm fairly certain it'd be
> > unproblematic to configure automatic mirroring into it from
> > gitmaster.
> 
> I think you missed my point: gitlab would then believe it's in charge of,
> eg, granting write access to that repo.  We could perhaps whack it over
> the head till it only does what we want and not ten other things, but
> we'd be swimming upstream.

We today already have a github mirror, where exactly the same thing
exists, no?

Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] No Issue Tracker - Say it Ain't So!

2015-09-29 Thread Tom Lane
Andres Freund  writes:
> On 2015-09-29 13:40:28 -0400, Tom Lane wrote:
>> I think you missed my point: gitlab would then believe it's in charge of,
>> eg, granting write access to that repo.  We could perhaps whack it over
>> the head till it only does what we want and not ten other things, but
>> we'd be swimming upstream.

> We today already have a github mirror, where exactly the same thing
> exists, no?

Sure, there's a mirror out there somewhere.  It has nothing to do with
our core development processes.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

