date:20160505

Re: [HACKERS] old_snapshot_threshold's interaction with hash index

2016-05-05 Thread Amit Kapila

On Wed, May 4, 2016 at 7:48 PM, Kevin Grittner  wrote:
>
> On Tue, May 3, 2016 at 11:48 AM, Robert Haas 
wrote:
>
> > OK, I see now: the basic idea here is that we can't prune based on the
> > newer XID unless the page LSN is guaranteed to advance whenever data
> > is removed.  Currently, we attempt to limit bloat in non-unlogged,
> > non-catalog tables.  You're saying we can instead attempt to limit
> > bloat only in non-unlogged, non-catalog tables without hash indexes,
> > and that will fix this issue.  Am I right?
>
> As a first cut, something like the attached.
>

Patch looks good to me.  I have done some testing with hash and btree
indexes and it works as expected.


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: [HACKERS] [BUGS] Breakage with VACUUM ANALYSE + partitions

2016-05-05 Thread Piotr Stefaniak


On 2016-05-05 09:32, Fabien COELHO wrote:

I note that C99 specifically mentions this as something a compiler
might warn about: [...]


Indeed. Neither gcc nor clang emit such warnings... but they might some
day, which would be a blow for my suggestion!


For what it's worth, newer versions of clang can emit useful warnings 
for cases like this:


int main(void) {
enum test { FOUR = 4 };
enum incompatible { INCOMPATIBLE_FOUR = 4 };
enum test variable;
variable = INCOMPATIBLE_FOUR;
variable = 5;
variable = 4;
variable = 3;

return 0;
}

enum.c:5:13: warning: implicit conversion from enumeration type 'enum 
incompatible' to different enumeration type 'enum test' [-Wenum-conversion]

variable = INCOMPATIBLE_FOUR;
 ~ ^
enum.c:6:13: warning: integer constant not in range of enumerated type 
'enum test' [-Wassign-enum]

variable = 5;
   ^
enum.c:8:13: warning: integer constant not in range of enumerated type 
'enum test' [-Wassign-enum]

variable = 3;
   ^
3 warnings generated.

So with -Wenum-conversion -Wassign-enum you could treat enum types as 
distinct and incompatible with each other.




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Initial release notes created for 9.6

2016-05-05 Thread Amit Kapila

On Fri, May 6, 2016 at 12:11 AM, Robert Haas  wrote:
>
> On Thu, May 5, 2016 at 1:32 PM, Tom Lane  wrote:
> > * As is somewhat customary for early drafts of the notes, I've made no
> > attempt to call out which are the most significant changes.  I've not
> > tried to isolate the non-backwards-compatible items, either.
>
> There was quite a bit of discussion of this on pgsql-advocacy, and the
> press release we're going to put out surely must say something.  It
> wouldn't hurt if the release notes at least somewhat matched that.
>

I think with PostgreSQL 9.6, the OLTP workloads (short queries) will see a
major performance improvement for both read-write as well as read-only
workloads mainly due to below features (there could be others as well, but
this is what comes to my mind first):

Replace shared-buffer header spinlocks with atomic operations to improve
scalability
Partition the freelist for shared hash tables, to reduce contention on
many-CPU servers
Reduce contention for the ProcArrayLock
Where feasible, trigger kernel writeback after a configurable number of
writes
Increase the number of clog buffers for better scalability

I think we should add that as a significant change.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: [HACKERS] Initial release notes created for 9.6

2016-05-05 Thread Alvaro Herrera

Andres Freund wrote:

> On 2016-05-05 13:32:55 -0400, Tom Lane wrote:

> +
> +   
> +Add pg_config
> +system view to expose the same information available from
> +the pg_config utility (Joe Conway)
> +   
> +  
> 
> Hm. Rereading this I'm wondering whether pg_config isn't going to be
> confused with pg_settings all the time.

Hmm, yeah, that's a good point ...

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Initial release notes created for 9.6

2016-05-05 Thread Peter Geoghegan

On Thu, May 5, 2016 at 7:42 PM, Tom Lane  wrote:
> Peter Geoghegan  writes:
>> I think that there could stand to be some consolidation among the
>> items that I authored.
>
> After thinking a bit, I merged all the abbreviated-keys stuff including
> the ordered-set-aggregate item.  Let me know if that seems wrong.

I didn't imagine going so far as putting the ordered-set aggregate
item in there too, but I actually think that that's an improvement.

>> If it were up to me, I'd consolidate the two, and provide a
>> higher-level description. I'd probably say something about CPU cache
>> efficiency, and how the distinction between external sorts and
>> internal sorts has been significantly softened. I'd also mention that
>> the new approach can make better use of larger work_mem settings, and
>> great temp_tablespaces I/O capacity, which Bruce agreed warranted
>> notice in the release notes [1].
>
> Meh.  The release notes are not the place for that kind of detail,
> mainly because nobody will look at old release notes when searching
> for info.

I agree with this, but was concerned that better advice around sizing
work_mem in the documentation would not be accepted.

> Also, I saw that you already had a rather long discussion
> about this associated with the replacement_sort_tuples GUC (which
> *is* a reasonable place for it).  My intention in providing the link
> was so people could consult that info easily --- but I added a few
> more words to point out explicitly that there was more info there.

The documentation provides only a very weak endorsement of the idea
that increasing maintenance_work_mem improves the performance of
*anything*, and it only does so once (work_mem doesn't even get that
much). It's easy to fail to notice that as an expert.

I actually think of the 9.6 work on external sorting as fixing a
problem peculiar to one particular consumer of work_mem as much as
anything else (the problem of weird CPU cache sensitivity). So, my
concern is fairly high-level. I want to communicate two ideas to
users:

1. Consider that your previous experiences with sizing work_mem might
be made obsolete by 9.6. You should be able to safely increase
work_mem with the expectation that doing so will more or less improve
performance across the board, or at least not hurt things. There are
some caveats, of course, but they're fairly limited, and even obvious
(e.g., don't let Postgres do a significant amount of swapping). We
don't need to equivocate, which ISTM is what we did when discussing
these settings before now.

If you think that's unfair, consider how terse the docs are when
discussing sizing work_mem compared to settings like commit_delay and
effective_io_concurrency.

2. A fast temp_tablespaces setting becomes more useful, particularly
when adding more memory stops helping.

I'm not sure that this even needs to be made about the external
sorting stuff. I'm not attached to communicating this to users in any
particular way.

-- 
Peter Geoghegan

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [sqlsmith] Failed assertion in parallel worker (ExecInitSubPlan)

2016-05-05 Thread Tom Lane

Andreas Seltenreich  writes:
> when fuzz testing master as of c1543a8, parallel workers trigger the
> following assertion in ExecInitSubPlan every couple hours.
> TRAP: FailedAssertion("!(list != ((List *) ((void *)0)))", File: 
> "list.c", Line: 390)
> Sample backtraces of a worker and leader below, plan of leader attached.
> The collected queries don't seem to reproduce it.

Odd.  My understanding of the restrictions on parallel query is that
anything involving a SubPlan ought not be parallelized; and this query
doesn't seem to get parallelized when I try it.  It seems like what you're
seeing is that sometimes that restriction fails to be checked, but how
could that happen?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Initial release notes created for 9.6

2016-05-05 Thread Tom Lane

Peter Geoghegan  writes:
> I think that there could stand to be some consolidation among the
> items that I authored.

After thinking a bit, I merged all the abbreviated-keys stuff including
the ordered-set-aggregate item.  Let me know if that seems wrong.

> Also, I personally don't really think of these two as separate items,
> even though technically they're independently useful:

OK, also merged.

> If it were up to me, I'd consolidate the two, and provide a
> higher-level description. I'd probably say something about CPU cache
> efficiency, and how the distinction between external sorts and
> internal sorts has been significantly softened. I'd also mention that
> the new approach can make better use of larger work_mem settings, and
> great temp_tablespaces I/O capacity, which Bruce agreed warranted
> notice in the release notes [1].

Meh.  The release notes are not the place for that kind of detail,
mainly because nobody will look at old release notes when searching
for info.  Also, I saw that you already had a rather long discussion
about this associated with the replacement_sort_tuples GUC (which
*is* a reasonable place for it).  My intention in providing the link
was so people could consult that info easily --- but I added a few
more words to point out explicitly that there was more info there.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Initial release notes created for 9.6

2016-05-05 Thread Tom Lane

Andres Freund  writes:
> wal-writer-flush-after doesn't really fit into this section, it wasn't
> affected by any of the above commits, and the change in 9.6 is to make
> it *less* aggressive in flushing (as you listed in a separate entry).

I hadn't focused on this before, but wal_writer_flush_after is new in 9.6.
I think the right thing to do here is to remove the separate entry for
7975c5e0a and just treat it as part of this group.

> Hm. Kernel traffic is maybe a bit hard to understand (guess you're
> referring to the number of syscalls)? Isn't that also affecting actual
> IO? But mostly it's about our own locking around relation extension?

Right, I was thinking about syscalls.  But given that this only happens
under contention, maybe best to just take that part out.

> An important benefit here is that after that patch we can increase
> the padding of the locks remaining lwlocks; which we previously
> avoided out of memory usage concerns.

I doubt it's necessary to explain that in the release notes...

> Hm, I guess we need a warning about reindexing such indexes after a 
> pg_upgrade somwhere?

See discussion with Noah yesterday.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] modifying WaitEventSets (was: Performance degradation in commit ac1d794)

2016-05-05 Thread Andres Freund

On 2016-05-04 16:05:04 -0400, Robert Haas wrote:
> I'm more than happy to rip it out, either now or after the tree opens
> for 9.7 development.

Let's rip the select support out in 9.7 then; given the relevant code
was already written and tested there's no hurry.  But if you'd rather do
so earlier (as in *now*) I'm not going to protest either.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Initial release notes created for 9.6

2016-05-05 Thread Andres Freund

Hi,

On 2016-05-05 13:32:55 -0400, Tom Lane wrote:
> * Bruce usually likes to sprinkle the notes with a whole lot of links
> to the main docs.  I've only bothered with links for new GUCs and system
> views.

I guess it'd be worthwhile to add a links for new SQL functions as well.

> Please review and comment before Monday, if you can.

Some minor comments:

+
+   
+Where feasible, trigger kernel writeback after a configurable number
+of writes, to prevent accumulation of dirty data in kernel disk
+buffers (Fabien Coelho, Andres Freund)
+   
+
+   
+The new configuration parameters
+,
+,
+, and
+ control this behavior.
+   
+  

wal-writer-flush-after doesn't really fit into this section, it wasn't
affected by any of the above commits, and the change in 9.6 is to make
it *less* aggressive in flushing (as you listed in a separate entry).

+
+   
+Extend relations multiple blocks at a time (Dilip Kumar)
+   
+
+   
+This reduces kernel traffic, and improves scalability when multiple
+processes are inserting into the same relation.
+   
+  

Hm. Kernel traffic is maybe a bit hard to understand (guess you're
referring to the number of syscalls)? Isn't that also affecting actual
IO? But mostly it's about our own locking around relation extension?

+
+   
+Improve performance by moving buffer content locks into the buffer
+descriptors (Andres Freund, Simon Riggs)
+   
+  

An important benefit here is that after that patch we can increase
the padding of the locks remaining lwlocks; which we previously
avoided out of memory usage concerns.

+
+   
+Add pg_config
+system view to expose the same information available from
+the pg_config utility (Joe Conway)
+   
+  

Hm. Rereading this I'm wondering whether pg_config isn't going to be
confused with pg_settings all the time.

+  
+
+   
+Ensure that invalidation messages are recorded in WAL even when
+issued by a transaction that has no XID assigned (Andres Freund)
+   
+
+   
+This fixes some corner cases in which transactions on standby
+servers failed to notice changes such as new indexes.
+   
+  

FWIW, I'm getting more and more doubtful about backpatching this - the
risk of breaking people's setup seems a lot more severe than any of
the known symptoms - but will start that conversation in the relevant
thread.

+  
+
+   
+If a CHECK constraint is declared NOT VALID in
+a table creation command, automatically mark it valid (Amit Langote,
+Amul Sul)
+   
+
+   
+This matches the longstanding behavior of FOREIGN KEY
+constraints.
+   
+  

Heh. I was about to say that NOT VALID for constraint is relatively
new, just to find out it's been introduced in 9.1...

+
+  
+
+   
+Treat role names beginning with pg_ as reserved
+(Stephen Frost)
+   
+
+   
+User creation of such role names is now disallowed.  This prevents
+conflicts with built-in roles created by initdb.
+   
+  

Maybe we should mention that that's similar to restrictions around
naming schemas?

+  
+
+   
+Fix text search parser to allow leading digits in email
+and host tokens (Artur Zakirov)
+   
+
+   
+In most cases this will result in few changes in the parsing of text.
+But if you have data where such addresses occur frequently, it may be
+worth rebuilding dependent tsvector columns and indexes, so
+that addresses of this form will be found properly by text searches.
+   
+  

Hm, I guess we need a warning about reindexing such indexes after a pg_upgrade 
somwhere?

- Andres

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Initial release notes created for 9.6

2016-05-05 Thread Peter Geoghegan

On Thu, May 5, 2016 at 10:32 AM, Tom Lane  wrote:
> Please review and comment before Monday, if you can.

I think that there could stand to be some consolidation among the
items that I authored.

Firstly, there's the abbreviated key stuff. The 9.5 notes described
the abbreviated keys feature as follows:

"""
Improve the speed of sorting of varchar, text, and numeric fields via
"abbreviated" keys (Peter Geoghegan, Andrew Gierth, Robert Haas)

"""

Users only cared that there was a nice optimization added to sorting
that affects only the types listed. The main point of this
optimization is that most comparisons can elide a memory access,
anyway. So, I suggest that you consolidate the two new 9.6 items as
follows:

"""
Improve the speed of sorting of UUID, bytea, and char(n) fields via
"abbreviated" keys.

Support for the optimization has also been added to the non-default
operator classes text_pattern_ops, varchar_pattern_ops, and
bpchar_pattern_ops, which support B-tree indexes on the types text,
varchar, and char respectively.

"""

So, that's just one final item instead of two.

Also, I personally don't really think of these two as separate items,
even though technically they're independently useful:

"""
Improve sorting performance by using quicksort, not replacement
selection, within steps of an external sort (Peter Geoghegan)

This behavior can be adjusted via the new configuration parameter
replacement_sort_tuples.

"""

and:

"""
Improve memory management for external sorts (Peter Geoghegan)

"""

If it were up to me, I'd consolidate the two, and provide a
higher-level description. I'd probably say something about CPU cache
efficiency, and how the distinction between external sorts and
internal sorts has been significantly softened. I'd also mention that
the new approach can make better use of larger work_mem settings, and
great temp_tablespaces I/O capacity, which Bruce agreed warranted
notice in the release notes [1].

I can make a suggestion here, too, if you're interested.

[1] http://www.postgresql.org/message-id/20160430234133.gh...@momjian.us
-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Initial release notes created for 9.6

2016-05-05 Thread Tom Lane

Masahiko Sawada  writes:
> Very minor comment but I'd like to unify my name to First Last (i.g.,
> Masahiko Sawada).

Will fix, thanks.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Initial release notes created for 9.6

2016-05-05 Thread Tom Lane

Peter Geoghegan  writes:
> On Thu, May 5, 2016 at 5:54 PM, Tom Lane  wrote:
>> Oh, now I see why it's not here: it was back-patched into 9.5, so it
>> will not be a new feature in 9.6.0.  It will be listed in the 9.5.3
>> release notes, instead.

> I was really hoping that the OpenSSL bugfix patch would receive the
> same treatment (commit 7c7d4fddab82dc756d8caa67b1b31fcdde355aab).
> Should I take its inclusion here in the 9.6 release notes as
> portending a backpatch never happening?

No, it just means that hasn't happened yet.

> There seems to be a lack of urgency about it,

Perhaps, or maybe it's just a lack of cycles.  I certainly haven't
had any time to think about it.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Initial release notes created for 9.6

2016-05-05 Thread Masahiko Sawada

On Fri, May 6, 2016 at 2:32 AM, Tom Lane  wrote:
> I've pushed a first cut at release notes for 9.6.  There's a good deal
> of work to do yet:
>
> * The section about parallel query could probably stand to be fleshed out,
> but I'm unsure what to say.  Somebody who's worked on that should provide
> some text.
>
> * Bruce usually likes to sprinkle the notes with a whole lot of links
> to the main docs.  I've only bothered with links for new GUCs and system
> views.
>
> * As is somewhat customary for early drafts of the notes, I've made no
> attempt to call out which are the most significant changes.  I've not
> tried to isolate the non-backwards-compatible items, either.
>
> Please review and comment before Monday, if you can.
>

+Avoid re-vacuuming pages containing only frozen tuples
+(Masahiko Sawada, Robert Haas)

+Support synchronous replication with multiple synchronous standby
+servers, not just one (Sawada Masahiko, Beena Emerson, Michael
+Paquier, Fujii Masao, Kyotaro Horiguchi)

Very minor comment but I'd like to unify my name to First Last (i.g.,
Masahiko Sawada).

Regards,

--
Masahiko Sawada


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] Feature request: make cluster_name GUC useful for psql prompts

2016-05-05 Thread Steve Crawford

It's great that 9.5 has the new cluster_name variable as an available GUC.

It would be even better to make that GUC available for use in psql
prompting escape sequences.

Prompting via sequences utilizing %M, %m and %> means the same cluster
could be identified numerous ways (local, 127.0.0.1, 10.1.2.3, localhost,
myserver.example.com, myserver, etc.) which is further exacerbated when
pooling or port-forwarding is in play.

In the inverse case, when logging into a multiple servers and running psql,
all the prompts might just say "local" despite all being different clusters.

Adding an escape sequence that references cluster_name would enable prompts
to identify the cluster in a manner that is both consistent and distinct
regardless of access path.

Potential issues/improvements:

What should the escape-sequence display if cluster_name is not set or the
cluster is a pre-9.5 version. %M? %m?

In future server versions should there be a default for cluster_name if it
is not set? If so, what should it be? Would the server canonical hostname +
listen-port be reasonable?

Cheers,
Steve

Re: [HACKERS] Initial release notes created for 9.6

2016-05-05 Thread Peter Geoghegan

On Thu, May 5, 2016 at 5:54 PM, Tom Lane  wrote:
>> Hmm, I had decided that wasn't worth listing, but now I can't think
>> why :-(.  Will add it.
>
> Oh, now I see why it's not here: it was back-patched into 9.5, so it
> will not be a new feature in 9.6.0.  It will be listed in the 9.5.3
> release notes, instead.

I was really hoping that the OpenSSL bugfix patch would receive the
same treatment (commit 7c7d4fddab82dc756d8caa67b1b31fcdde355aab).
Should I take its inclusion here in the 9.6 release notes as
portending a backpatch never happening?

There seems to be a lack of urgency about it, and given that it's
moderately complicated, that tends to mean it will keep getting put
off. I did notice that you have an sgml comment about it, but the
wording isn't optimistic.

-- 
Peter Geoghegan

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Initial release notes created for 9.6

2016-05-05 Thread Tom Lane

I wrote:
> Michael Paquier  writes:
>> Are you not adding VS2015 support in those notes?

> Hmm, I had decided that wasn't worth listing, but now I can't think
> why :-(.  Will add it.

Oh, now I see why it's not here: it was back-patched into 9.5, so it
will not be a new feature in 9.6.0.  It will be listed in the 9.5.3
release notes, instead.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Initial release notes created for 9.6

2016-05-05 Thread Tom Lane

Vitaly Burovoy  writes:
> 1. "YUriy Zhuravlev" should be "Yury Zhuravlev"
> Previously[1] he had the first version in his signature, but I guess
> it was misconfiguring, now[2] hi has second version.

Ah.  Now that I look, I see we've got three different ASCII-izations
of his name in the notes, none of them right :-(

> 2. I'd add Dean Rasheed as co-author to "Add pg_size_bytes()" since
> being a committer he made more than cosmetic changes[3] but he was
> humble enough not to mention himself in the commit message.

OK.

Thanks for the info!

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Initial release notes created for 9.6

2016-05-05 Thread Vitaly Burovoy

On 5/5/16, Tom Lane  wrote:
> Please review and comment before Monday, if you can.
>
>   regards, tom lane

1. "YUriy Zhuravlev" should be "Yury Zhuravlev"
Previously[1] he had the first version in his signature, but I guess
it was misconfiguring, now[2] hi has second version.

2. I'd add Dean Rasheed as co-author to "Add pg_size_bytes()" since
being a committer he made more than cosmetic changes[3] but he was
humble enough not to mention himself in the commit message.

[1] http://www.postgresql.org/message-id/2723429.ZaCixaFn1x@dinodell
[2] 
http://www.postgresql.org/message-id/dd701b62-008d-4048-882e-20df0e4b5...@postgrespro.ru
[3] 
http://www.postgresql.org/message-id/caezatcxhz5ggfrfcf9_yw5h6wdxr68qdfiwhvmgfr3ascnq...@mail.gmail.com
-- 
Best regards,
Vitaly Burovoy

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Initial release notes created for 9.6

2016-05-05 Thread Tom Lane

Michael Paquier  writes:
> Are you not adding VS2015 support in those notes?

Hmm, I had decided that wasn't worth listing, but now I can't think
why :-(.  Will add it.

> Petr Jelinek is a
> co-author btw, he's missing from the credits in 0fb54de.

OK, thanks.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Initial release notes created for 9.6

2016-05-05 Thread Michael Paquier

On Fri, May 6, 2016 at 7:53 AM, Tom Lane  wrote:
> Andres Freund  writes:
>> On 2016-05-05 16:25:38 -0400, Robert Haas wrote:
>>> This was basically an attempt to cure a defect in 48354581a and could
>>> perhaps be lumped under that item.
>
>> It's also an independent performance improvement (sadly), and has the
>> potential for issues; so there's *some* benefits on keeping this as its
>> own entry.
>
> I left that as-is, but otherwise adopted Robert's suggestions.
> Thanks for the comments!

Are you not adding VS2015 support in those notes? Petr Jelinek is a
co-author btw, he's missing from the credits in 0fb54de.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pg9.6 segfault using simple query (related to use fk for join estimates)

2016-05-05 Thread David Rowley

On 6 May 2016 at 02:48, David Rowley  wrote:
> In the attached I've left the GUC remaining. The reason for the GUC is
> for testing purposes and it should be removed before release. It
> should likely be documented though, even if we're planning to remove
> it later. I've not gotten around to that in this patch though.

I've attached a patch which is the bear minimum fix for the crash
reported by Stefan. I don't think we need to debate any of whether
this goes in.

I also included removing that bogus function declaration from paths.h
as it was upsetting me a bit, and also there should be no debate on
that.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

fkest_crash_fix.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Naming of new tsvector functions

2016-05-05 Thread Tom Lane

Stas Kelvich  writes:
> On 06 May 2016, at 00:46, Gavin Flower  wrote:
>> On 06/05/16 07:44, Tom Lane wrote:
>>> Yeah, I see we're already a bit inconsistent here.  The problem with using
>>> a ts_ prefix, to my mind, is that it offers no option for distinguishing
>>> tsvector from tsquery, should you need to do that.  Maybe this isn't a
>>> problem for functions that have tsvector as input.

>> use tsv_ and tsq_?

> That would be a good convention if we were able to easily rename old 
> functions.
> But now that will just create another pattern on top of three existing (no 
> prefix, ts_*, tsvector_*).

Yeah :-(.  Well, time grows short, so let's go with ts_ for these.
I'll go make it happen.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pg_shmem_allocations view

2016-05-05 Thread Andres Freund

On 2014-09-19 23:07:07 -0500, Michael Paquier wrote:
> On Mon, Aug 18, 2014 at 1:12 PM, Robert Haas  wrote:
> > On Mon, Aug 18, 2014 at 1:27 PM, Tom Lane  wrote:
> >> Robert Haas  writes:
> >>> I thought you were printing actual pointer addresses.  If you're just
> >>> printing offsets relative to wherever the segment happens to be
> >>> mapped, I don't care about that.
> >>
> >> Well, that just means that it's not an *obvious* security risk.
> >>
> >> I still like the idea of providing something comparable to
> >> MemoryContextStats, rather than creating a SQL interface.  The problem
> >> with a SQL interface is you can't interrogate it unless (1) you are not
> >> already inside a query and (2) the client is interactive and under your
> >> control.  Something you can call easily from gdb is likely to be much
> >> more useful in practice.
> >
> > Since the shared memory segment isn't changing at runtime, I don't see
> > this as being a big problem.  It could possibly be an issue for
> > dynamic shared memory segments, though.
> Patch has been reviewed some time ago, extra ideas as well as
> potential security risks discussed as well but no new version has been
> sent, hence marking it as returned with feedback.

Here's a rebased version.  I remember why I didn't call the column
"offset" (as Michael complained about upthread), it's a keyword...

Regards,

Andres
>From 719f7e2493832564c58c3ab319344f31abef1653 Mon Sep 17 00:00:00 2001
From: Andres Freund 
Date: Tue, 6 May 2014 19:42:36 +0200
Subject: [PATCH 1/2] Associate names to created dynamic shared memory
 segments.

This unfortunately implies an API breakage.
---
 src/backend/access/transam/parallel.c |  3 +-
 src/backend/storage/ipc/dsm.c | 61 ++-
 src/include/storage/dsm.h |  2 +-
 src/test/modules/test_shm_mq/setup.c  |  2 +-
 4 files changed, 42 insertions(+), 26 deletions(-)

diff --git a/src/backend/access/transam/parallel.c b/src/backend/access/transam/parallel.c
index 0bba9a7..c1473c1 100644
--- a/src/backend/access/transam/parallel.c
+++ b/src/backend/access/transam/parallel.c
@@ -268,7 +268,8 @@ InitializeParallelDSM(ParallelContext *pcxt)
 	 */
 	segsize = shm_toc_estimate(>estimator);
 	if (pcxt->nworkers != 0)
-		pcxt->seg = dsm_create(segsize, DSM_CREATE_NULL_IF_MAXSEGMENTS);
+		pcxt->seg = dsm_create("parallel worker", segsize,
+			   DSM_CREATE_NULL_IF_MAXSEGMENTS);
 	if (pcxt->seg != NULL)
 		pcxt->toc = shm_toc_create(PARALLEL_MAGIC,
    dsm_segment_address(pcxt->seg),
diff --git a/src/backend/storage/ipc/dsm.c b/src/backend/storage/ipc/dsm.c
index 573c54d..7166328 100644
--- a/src/backend/storage/ipc/dsm.c
+++ b/src/backend/storage/ipc/dsm.c
@@ -80,8 +80,10 @@ struct dsm_segment
 /* Shared-memory state for a dynamic shared memory segment. */
 typedef struct dsm_control_item
 {
-	dsm_handle	handle;
+	dsm_handle	handle;			/* segment identifier */
 	uint32		refcnt;			/* 2+ = active, 1 = moribund, 0 = gone */
+	Size		size;			/* current size */
+	char		name[SHMEM_INDEX_KEYSIZE]; /* informational name */
 } dsm_control_item;
 
 /* Layout of the dynamic shared memory control segment. */
@@ -454,15 +456,18 @@ dsm_set_control_handle(dsm_handle h)
  * Create a new dynamic shared memory segment.
  */
 dsm_segment *
-dsm_create(Size size, int flags)
+dsm_create(const char *name, Size size, int flags)
 {
 	dsm_segment *seg;
-	uint32		i;
-	uint32		nitems;
+	dsm_control_item *item;
+	uint32		slot;
 
 	/* Unsafe in postmaster (and pointless in a stand-alone backend). */
 	Assert(IsUnderPostmaster);
 
+	Assert(name != NULL && strlen(name) > 0 &&
+		   strlen(name) < SHMEM_INDEX_KEYSIZE - 1);
+
 	if (!dsm_init_done)
 		dsm_backend_startup();
 
@@ -482,23 +487,18 @@ dsm_create(Size size, int flags)
 	/* Lock the control segment so we can register the new segment. */
 	LWLockAcquire(DynamicSharedMemoryControlLock, LW_EXCLUSIVE);
 
-	/* Search the control segment for an unused slot. */
-	nitems = dsm_control->nitems;
-	for (i = 0; i < nitems; ++i)
+	/*
+	 * Search the control segment for an unused slot that's previously been
+	 * used. If we don't find one initialize a new one if there's still space.
+	 */
+	for (slot = 0; slot < dsm_control->nitems; ++slot)
 	{
-		if (dsm_control->item[i].refcnt == 0)
-		{
-			dsm_control->item[i].handle = seg->handle;
-			/* refcnt of 1 triggers destruction, so start at 2 */
-			dsm_control->item[i].refcnt = 2;
-			seg->control_slot = i;
-			LWLockRelease(DynamicSharedMemoryControlLock);
-			return seg;
-		}
+		if (dsm_control->item[slot].refcnt == 0)
+			break;
 	}
 
-	/* Verify that we can support an additional mapping. */
-	if (nitems >= dsm_control->maxitems)
+	/* Verify that we can support the mapping. */
+	if (slot >= dsm_control->maxitems)
 	{
 		if ((flags & DSM_CREATE_NULL_IF_MAXSEGMENTS) != 0)
 		{
@@ -516,12 +516,23 @@ dsm_create(Size size, int flags)

Re: [HACKERS] pg9.6 segfault using simple query (related to use fk for join estimates)

2016-05-05 Thread Tomas Vondra

Hi,

On 05/05/2016 04:48 PM, David Rowley wrote:

On 5 May 2016 at 16:04, David Rowley  wrote:

I've started making some improvements to this, but need to talk to
Tomas. It's currently in the middle of his night, but will try to
catch him in his morning to discuss this with him.

Ok, so I spoke to Tomas about this briefly, and he's asked me to
send in this patch. He didn't get time to look over it due to some
other commitments he has today.

Thanks!

>

I do personally feel that if the attached is not good enough, or not
very close to good enough then probably the best course of action is
to revert the whole thing. I'd much rather have this out than to
feel some responsibility for someone's performance problems with
this feature.

>

I share this opinion. If the approach proposed here is deemed not good 
enough, then +1 to revert.

I think we can further limit the impact by ignoring foreign keys on a 
single column, because the primary goal of the patch is improving 
estimates with multi-column FKs (and matching joins). I'd argue that 99% 
of the FKs in practice is single-column ones, but sure - if you have a 
database with many multi-column FKs this won't help.

I think it's also important to point out that whatever solution we 
choose, it should probably allow us to relax some of the restrictions in 
the future. For example, with a FK on 3 columns, the current patch only 
handles a "full match" joins, with conditions on all three columns. But 
it may be possible to also improve estimates on just two columns - the 
current patch does not do that, as that needs definitely further more 
thought and discussion.

>

But I also do feel that the patch allows a solution to a big problem
that we currently have with PostgreSQL, which there's any easy
workarounds for... that's multi-column join estimates.

In a sense, yes. FKs allow us to improve estimates for a fairly narrow 
case of multi-column estimates - joins matching multi-column FKs. And 
for this (not entirely narrow) case they are actually more accurate and 
way cheaper than e.g. histograms or MCV lists (at least in the 
multi-column case).

FWIW the current multivariate stats patch does not even try to mess with 
joins, as it's quite complicated in the multi-column case.

In the attached I've left the GUC remaining. The reason for the GUC
is for testing purposes and it should be removed before release. It
should likely be documented though, even if we're planning to remove
it later. I've not gotten around to that in this patch though.

Yes, removal of the GUC should go to the Open Items list.

I'd also like to make it clear that all the shitty parts of the patch 
are my fault, not David's. His name is on the patch as he provided a lot 
of useful ideas, pieces of code and feedback in general. But it was up 
to me to craft the final patch - and I clearly failed to do so. I think 
David still feels responsible for the patch as he made a lot of effort 
to salvage it over the last few days, so I'd like to set the record 
straight.

regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Initial release notes created for 9.6

2016-05-05 Thread Tom Lane

Andres Freund  writes:
> On 2016-05-05 16:25:38 -0400, Robert Haas wrote:
>> This was basically an attempt to cure a defect in 48354581a and could
>> perhaps be lumped under that item.

> It's also an independent performance improvement (sadly), and has the
> potential for issues; so there's *some* benefits on keeping this as its
> own entry.

I left that as-is, but otherwise adopted Robert's suggestions.
Thanks for the comments!

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Naming of new tsvector functions

2016-05-05 Thread Stas Kelvich


> On 06 May 2016, at 00:46, Gavin Flower  wrote:
> 
> On 06/05/16 07:44, Tom Lane wrote:
>> 
>> Yeah, I see we're already a bit inconsistent here.  The problem with using
>> a ts_ prefix, to my mind, is that it offers no option for distinguishing
>> tsvector from tsquery, should you need to do that.  Maybe this isn't a
>> problem for functions that have tsvector as input.
>> 
>>  regards, tom lane
>> 
>> 
> use tsv_ and tsq_?
> 
> 
> Cheers,
> Gavin
> 

That would be a good convention if we were able to easily rename old functions.
But now that will just create another pattern on top of three existing (no 
prefix, ts_*, tsvector_*).

Stas Kelvich
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Is pg_control file crashsafe?

2016-05-05 Thread Tom Lane

Greg Stark  writes:
> One thing we could do without much worry of being less reliable would be to
> keep two copies of pg_control. Write one, fsync, then write to the other
> and fsync that one.

Hmm, interesting thought.  Without knowing more about the filesystem
problem that the OP had, it's hard to tell whether this would have saved
us; but in principle it sounds like it would be more reliable.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Naming of new tsvector functions

2016-05-05 Thread Gavin Flower


On 06/05/16 07:44, Tom Lane wrote:

Stas Kelvich  writes:

On 04 May 2016, at 20:15, Tom Lane  wrote:
Also, I'd supposed that we'd rename to tsvector_something, since
the same patch also introduced tsvector_to_array() and
array_to_tsvector().  What's the motivation for using ts_ as the
prefix?

There is already several functions named ts_* (ts_rank, ts_headline, ts_rewrite)
and two named starting from tsvector_* (tsvector_update_trigger, 
tsvector_update_trigger_column).
Personally Iâd prefer ts_ over tsvector_ since it is shorter, and still keeps 
semantics.

Yeah, I see we're already a bit inconsistent here.  The problem with using
a ts_ prefix, to my mind, is that it offers no option for distinguishing
tsvector from tsquery, should you need to do that.  Maybe this isn't a
problem for functions that have tsvector as input.

regards, tom lane



use tsv_ and tsq_?


Cheers,
Gavin



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Poorly-thought-out handling of double variables in pgbench

2016-05-05 Thread Tom Lane

Fabien COELHO  writes:
>> A better answer, perhaps, would be to store double-valued variables in 
>> double format to begin with, coercing to text only when and if the value 
>> is interpolated into a string.

> Yep, but that was yet more changes for a limited benefit and would have 
> increase the probability that the patch would have been rejected.

>> That's probably a bigger change than we want to be putting in right now, 
>> though I'm a bit tempted to go try it.

> I definitely agree that the text variable solution is pretty ugly, but it 
> was the minimum change solution, and I do not have much time available.

Well, I felt like doing some minor hacking, so I went and adjusted the
code to work this way.  I'm pretty happy with the result, what do you
think?

regards, tom lane

diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index 2a9063a..a484165 100644
*** a/src/bin/pgbench/pgbench.c
--- b/src/bin/pgbench/pgbench.c
***
*** 38,43 
--- 38,44 
  #include "portability/instr_time.h"
  
  #include 
+ #include 
  #include 
  #include 
  #include 
*** const char *progname;
*** 185,195 
  
  volatile bool timer_exceeded = false;	/* flag from signal handler */
  
! /* variable definitions */
  typedef struct
  {
! 	char	   *name;			/* variable name */
! 	char	   *value;			/* its value */
  } Variable;
  
  #define MAX_SCRIPTS		128		/* max number of SQL scripts allowed */
--- 186,205 
  
  volatile bool timer_exceeded = false;	/* flag from signal handler */
  
! /*
!  * Variable definitions.  If a variable has a string value, "value" is that
!  * value, is_numeric is false, and num_value is undefined.  If the value is
!  * known to be numeric, is_numeric is true and num_value contains the value
!  * (in any permitted numeric variant).  In this case "value" contains the
!  * string equivalent of the number, if we've had occasion to compute that,
!  * or NULL if we haven't.
!  */
  typedef struct
  {
! 	char	   *name;			/* variable's name */
! 	char	   *value;			/* its value in string form, if known */
! 	bool		is_numeric;		/* is numeric value known? */
! 	PgBenchValue num_value;		/* variable's value in numeric form */
  } Variable;
  
  #define MAX_SCRIPTS		128		/* max number of SQL scripts allowed */
*** typedef struct
*** 237,243 
  	bool		throttling;		/* whether nap is for throttling */
  	bool		is_throttled;	/* whether transaction throttling is done */
  	Variable   *variables;		/* array of variable definitions */
! 	int			nvariables;
  	int64		txn_scheduled;	/* scheduled start time of transaction (usec) */
  	instr_time	txn_begin;		/* used for measuring schedule lag times */
  	instr_time	stmt_begin;		/* used for measuring statement latencies */
--- 247,254 
  	bool		throttling;		/* whether nap is for throttling */
  	bool		is_throttled;	/* whether transaction throttling is done */
  	Variable   *variables;		/* array of variable definitions */
! 	int			nvariables;		/* number of variables */
! 	bool		vars_sorted;	/* are variables sorted by name? */
  	int64		txn_scheduled;	/* scheduled start time of transaction (usec) */
  	instr_time	txn_begin;		/* used for measuring schedule lag times */
  	instr_time	stmt_begin;		/* used for measuring statement latencies */
*** static const BuiltinScript builtin_scrip
*** 363,368 
--- 374,381 
  
  
  /* Function prototypes */
+ static void setIntValue(PgBenchValue *pv, int64 ival);
+ static void setDoubleValue(PgBenchValue *pv, double dval);
  static bool evaluateExpr(TState *, CState *, PgBenchExpr *, PgBenchValue *);
  static void doLog(TState *thread, CState *st, instr_time *now,
  	  StatsData *agg, bool skipped, double latency, double lag);
*** discard_response(CState *state)
*** 836,868 
  	} while (res);
  }
  
  static int
! compareVariables(const void *v1, const void *v2)
  {
  	return strcmp(((const Variable *) v1)->name,
    ((const Variable *) v2)->name);
  }
  
! static char *
! getVariable(CState *st, char *name)
  {
! 	Variable	key,
! 			   *var;
  
  	/* On some versions of Solaris, bsearch of zero items dumps core */
  	if (st->nvariables <= 0)
  		return NULL;
  
  	key.name = name;
! 	var = (Variable *) bsearch((void *) ,
! 			   (void *) st->variables,
! 			   st->nvariables,
! 			   sizeof(Variable),
! 			   compareVariables);
! 	if (var != NULL)
! 		return var->value;
  	else
! 		return NULL;
  }
  
  /* check whether the name consists of alphabets, numerals and underscores. */
--- 849,945 
  	} while (res);
  }
  
+ /* qsort comparator for Variable array */
  static int
! compareVariableNames(const void *v1, const void *v2)
  {
  	return strcmp(((const Variable *) v1)->name,
    ((const Variable *) v2)->name);
  }
  
! /* Locate a variable by name; returns NULL if unknown */
! static Variable *
! lookupVariable(CState *st, char *name)
  {
! 	Variable	key;

Re: [HACKERS] Is pg_control file crashsafe?

2016-05-05 Thread Greg Stark

On 5 May 2016 12:32 am, "Tom Lane"  wrote:
>
> To repeat, I'm pretty hesitant to change this logic.  While this is not
> the first report we've ever heard of loss of pg_control, I believe I could
> count those reports without running out of fingers on one hand --- and
> that's counting since the last century. It will take quite a lot of
> evidence to convince me that some other implementation will be more
> reliable.  If you just come and present a patch to use direct write, or
> rename, or anything else for that matter, I'm going to reject it out of
> hand unless you provide very strong evidence that it's going to be more
> reliable than the current code across all the systems we support.

One thing we could do without much worry of being less reliable would be to
keep two copies of pg_control. Write one, fsync, then write to the other
and fsync that one.

Oracle keeps a copy of the old control file so that you can always go back
to an older version if a hardware or software bug currupts it. But they
keep a lot more data in their control file and they can be quite large.

Re: [HACKERS] pg_dump dump catalog ACLs

2016-05-05 Thread Stephen Frost

* Stephen Frost (sfr...@snowman.net) wrote:
> In any case, as I was saying, that's far closer to 9.5 run-time.  I've
> not measured the time added when things like TRANSFORMs were added, but
> it wouldn't surprise me if adding a new query for every database to
> pg_dump adds something similar to this difference.

Updated patch attached.

This adds the same kind of test suite to a new
'src/test/modules/test_pg_dump' for testing pg_dump with extensions.
I'm going to flesh that out tomorrow with some real tests.  Today was
spent mostly setting up that test suite and spending quite a bit of time
cleaning up the src/bin/pg_dump/t tests, to make it easier for others to
review and make sense of the regexp's and whatnot that are included.

Also updated the skipping of LOCK TABLE, per comments from Tom.

I should be able to get the testing that I want added to test_pg_dump
tomorrow and will push all of this once that's included.  I'd really
like to make sure that we've got coverage of initial privileges in
extensions with pg_dump, which is why I've been working on this today
and haven't pushed the patch-set yet.  I'm pretty sure they work as
expected, but we should be testing it through the buildfarm.

Thanks!

Stephen
From 1e3d3053d9cbb309af0307e31ddea92999d9ed15 Mon Sep 17 00:00:00 2001
From: Stephen Frost 
Date: Sun, 24 Apr 2016 23:59:23 -0400
Subject: [PATCH 1/4] Correct pg_dump WHERE clause for functions/aggregates

The query to grab the function/aggregate information is now joining
to pg_init_privs, so we can simplify (and correct) the WHERE clause
used to determine if a given function's ACL has changed from the
initial ACL on the function.

Bug found by Noah, patch by me.
---
 src/bin/pg_dump/pg_dump.c | 12 ++--
 1 file changed, 2 insertions(+), 10 deletions(-)

diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index d3f5157..bb33075 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4673,11 +4673,7 @@ getAggregates(Archive *fout, int *numAggs)
 		  "p.pronamespace != "
 		  "(SELECT oid FROM pg_namespace "
 		  "WHERE nspname = 'pg_catalog') OR "
-		  "EXISTS (SELECT * FROM pg_init_privs pip "
-		  "WHERE p.oid = pip.objoid AND pip.classoid = "
-		  "(SELECT oid FROM pg_class "
-		  "WHERE relname = 'pg_proc') "
-		  "AND p.proacl IS DISTINCT FROM pip.initprivs)",
+		  "p.proacl IS DISTINCT FROM pip.initprivs",
 		  username_subquery,
 		  acl_subquery->data,
 		  racl_subquery->data,
@@ -4923,11 +4919,7 @@ getFuncs(Archive *fout, int *numFuncs)
 		  "pronamespace != "
 		  "(SELECT oid FROM pg_namespace "
 		  "WHERE nspname = 'pg_catalog') OR "
-		  "EXISTS (SELECT * FROM pg_init_privs pip "
-		  "WHERE p.oid = pip.objoid AND pip.classoid = "
-		  "(SELECT oid FROM pg_class "
-		  "WHERE relname = 'pg_proc') "
-		  "AND p.proacl IS DISTINCT FROM pip.initprivs)",
+		  "p.proacl IS DISTINCT FROM pip.initprivs",
 		  acl_subquery->data,
 		  racl_subquery->data,
 		  initacl_subquery->data,
-- 
2.5.0

From 374d17371aff04f3887fa980337154b5c2e8db22 Mon Sep 17 00:00:00 2001
From: Stephen Frost 
Date: Mon, 25 Apr 2016 00:00:15 -0400
Subject: [PATCH 2/4] pg_dump performance and other fixes

Do not try to dump objects which do not have ACLs when only ACLs are
being requested.  This results in a significant performance improvement
as we can avoid querying for further information on these objects when
we don't need to.

When limiting the components to dump for an extension, consider what
components have been requested.  Initially, we incorrectly hard-coded
the components of the extension objects to dump, which would mean that
we wouldn't dump some components even with they were asked for and in
other cases we would dump components which weren't requested.

Correct defaultACLs to use 'dump_contains' instead of 'dump'.  The
defaultACL is considered a member of the namespace and should be
dumped based on the same set of components that the other objects in
the schema are, not based on what we're dumping for the namespace
itself (which might not include ACLs, if the namespace has just the
default or initial ACL).

Use DUMP_COMPONENT_ACL for from-initdb objects, to allow users to
change their ACLs, should they wish to.  This just extends what we
are doing for the pg_catalog namespace to objects which are not
members of namespaces.
---
 src/bin/pg_dump/pg_dump.c | 176 +++---
 1 file changed, 149 insertions(+), 27 deletions(-)

diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index bb33075..d826b4d 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -1297,14 +1297,17 @@ checkExtensionMembership(DumpableObject *dobj, Archive *fout)
 	 * contents rather than replace the extension contents with something
 	 * different.
 	 */
-	if (!fout->dopt->binary_upgrade &&

Re: [HACKERS] Missing error handling for FATALs in checkpointer/bgwriter

2016-05-05 Thread Andres Freund

On 2016-05-05 11:52:46 -0700, Andres Freund wrote:
> Hi Jeff,
> 
> On 2016-04-29 10:38:55 -0700, Jeff Janes wrote:
> > I don't see the problem with an cassert-enabled, probably because it
> > is just too slow to ever reach the point where the problem occurs.
> 
> Running the test with cassert enabled I actually get assertion failures,
> due to the FATAL you added.
> 
> #1  0x00958dde in ExceptionalCondition (conditionName=0xb36c2a 
> "!(RefCountErrors == 0)", errorType=0xb361af "FailedAssertion", 
> fileName=0xb36170 
> "/home/admin/src/postgresql/src/backend/storage/buffer/bufmgr.c", 
> lineNumber=2506) at 
> /home/admin/src/postgresql/src/backend/utils/error/assert.c:54
> #2  0x007c9fc9 in CheckForBufferLeaks () at 
> /home/admin/src/postgresql/src/backend/storage/buffer/bufmgr.c:2506
> #3  0x007c9f09 in AtProcExit_Buffers (code=1, arg=0) at 
> /home/admin/src/postgresql/src/backend/storage/buffer/bufmgr.c:2459
> #4  0x007d927f in shmem_exit (code=1) at 
> /home/admin/src/postgresql/src/backend/storage/ipc/ipc.c:261
> #5  0x007d90dd in proc_exit_prepare (code=1) at 
> /home/admin/src/postgresql/src/backend/storage/ipc/ipc.c:185
> #6  0x007d904b in proc_exit (code=1) at 
> /home/admin/src/postgresql/src/backend/storage/ipc/ipc.c:102
> #7  0x0095958d in errfinish (dummy=0) at 
> /home/admin/src/postgresql/src/backend/utils/error/elog.c:543
> #8  0x0080214b in mdwrite (reln=0x2e8b4a8, forknum=MAIN_FORKNUM, 
> blocknum=154, buffer=0x2e8e5a8 "", skipFsync=0 '\000')
> at /home/admin/src/postgresql/src/backend/storage/smgr/md.c:832
> #9  0x00804633 in smgrwrite (reln=0x2e8b4a8, forknum=MAIN_FORKNUM, 
> blocknum=154, buffer=0x2e8e5a8 "", skipFsync=0 '\000')
> at /home/admin/src/postgresql/src/backend/storage/smgr/smgr.c:650
> #10 0x007ca548 in FlushBuffer (buf=0x7f0285955330, reln=0x2e8b4a8) at 
> /home/admin/src/postgresql/src/backend/storage/buffer/bufmgr.c:2734
> #11 0x007c9d5a in SyncOneBuffer (buf_id=2503, skip_recently_used=0 
> '\000', wb_context=0x7ffe7305d290) at 
> /home/admin/src/postgresql/src/backend/storage/buffer/bufmgr.c:2377
> #12 0x007c964e in BufferSync (flags=64) at 
> /home/admin/src/postgresql/src/backend/storage/buffer/bufmgr.c:1967
> #13 0x007ca185 in CheckPointBuffers (flags=64) at 
> /home/admin/src/postgresql/src/backend/storage/buffer/bufmgr.c:2561
> #14 0x0052d497 in CheckPointGuts (checkPointRedo=382762776, flags=64) 
> at /home/admin/src/postgresql/src/backend/access/transam/xlog.c:8644
> #15 0x0052cede in CreateCheckPoint (flags=64) at 
> /home/admin/src/postgresql/src/backend/access/transam/xlog.c:8430
> #16 0x007706ac in CheckpointerMain () at 
> /home/admin/src/postgresql/src/backend/postmaster/checkpointer.c:488
> #17 0x0053e0d5 in AuxiliaryProcessMain (argc=2, argv=0x7ffe7305ea40) 
> at /home/admin/src/postgresql/src/backend/bootstrap/bootstrap.c:429
> #18 0x0078099f in StartChildProcess (type=CheckpointerProcess) at 
> /home/admin/src/postgresql/src/backend/postmaster/postmaster.c:5227
> #19 0x0077dcc3 in reaper (postgres_signal_arg=17) at 
> /home/admin/src/postgresql/src/backend/postmaster/postmaster.c:2781
> #20 
> #21 0x7f028ebbdac3 in __select_nocancel () at 
> ../sysdeps/unix/syscall-template.S:81
> #22 0x0077c049 in ServerLoop () at 
> /home/admin/src/postgresql/src/backend/postmaster/postmaster.c:1654
> #23 0x0077b7a9 in PostmasterMain (argc=4, argv=0x2e49f20) at 
> /home/admin/src/postgresql/src/backend/postmaster/postmaster.c:1298
> #24 0x006c5849 in main (argc=4, argv=0x2e49f20) at 
> /home/admin/src/postgresql/src/backend/main/main.c:228
> 
> You didn't see those?
> 
> 
> The trigger here appears to be that the checkpointer doesn't have
> on-exit callback similar to a normal backend's ShutdownPostgres() et al,
> and thus doesn't trigger a resource owner release.  The normal ERROR
> path has
>   /* buffer pins are released here: */
>   ResourceOwnerRelease(CurrentResourceOwner,
>
> RESOURCE_RELEASE_BEFORE_LOCKS,
>false, true);
>   /* we needn't bother with the other ResourceOwnerRelease phases 
> */
> 
> That clearly is a bug. But I'm not immediately seing how this could
> trigger the corruption issue you observed.


The same issue exists in bgwriter afaics. ISTM that we need to provide
an before_shmem_exit (or on_shmem_exit?) handler for both which essentially does
/*
 * These operations are really just a minimal subset of
 * AbortTransaction().  We don't have very many resources to worry
 * about in bgwriter, but we do have LWLocks, buffers, and temp files.
 */
LWLockReleaseAll();
AbortBufferIO();
UnlockBuffers();
/* buffer pins are released here: */

Re: [HACKERS] [sqlsmith] Failed assertion in BecomeLockGroupLeader

2016-05-05 Thread Andreas Seltenreich

Alvaro Herrera writes:

> Robert Haas wrote:
>> On Thu, May 5, 2016 at 4:11 PM, Andreas Seltenreich  
>> wrote:
>> > I don't see these crashes anymore in c1543a8.  By the amount of fuzzing
>> > done it should have happened a dozen times, so it's highly likely
>> > something in 23b09e15..c1543a8 fixed it.
>> 
>> Hmm, I'd guess c45bf5751b6338488bd79ce777210285531da373 to be the most
>> likely candidate.
>
> I thought so too, but then that patch change things in the planner side,
> but it seems to me that the reported crash is in the executor, unless I'm
> misreading.

Tom had a theory in Message-ID: <12751.1461937...@sss.pgh.pa.us> on how
the planner bug could cause the executor crash.

regards,
Andreas


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Is pg_control file crashsafe?

2016-05-05 Thread Andres Freund

On 2016-05-05 00:32:29 -0400, Tom Lane wrote:
> To repeat, I'm pretty hesitant to change this logic.  While this is not
> the first report we've ever heard of loss of pg_control, I believe I could
> count those reports without running out of fingers on one hand --- and
> that's counting since the last century. It will take quite a lot of
> evidence to convince me that some other implementation will be more
> reliable.  If you just come and present a patch to use direct write, or
> rename, or anything else for that matter, I'm going to reject it out of
> hand unless you provide very strong evidence that it's going to be more
> reliable than the current code across all the systems we support.

https://lwn.net/SubscriberLink/686150/9697c313930fbe99/ :

"Jeff Moyer pointed out that sector tearing can happen on block devices
like SSDs, which is not what users expect. "
"Actually, what I said was that sector tearing doesn't usually happen on
SSDs due to the nature of the FTL. Traditional storage, however, never
guaranteed sector atomicity, but it usually does provide it."

FWIW, at the LSF/MM session Robert and I attended I talked to a Seagate
and a WD (IIRC) employee, and there answer echoed the second comment
from above. It's unlikely, but entirely possible to get torn sectors
after power outages. What's worse, if you get one it's entirely possible
that future *reads* will not just return torn contents, but an error.

Greetings,

Andres Freund

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [sqlsmith] Failed assertion in BecomeLockGroupLeader

2016-05-05 Thread Alvaro Herrera

Robert Haas wrote:
> On Thu, May 5, 2016 at 4:11 PM, Andreas Seltenreich  
> wrote:
> >> Amit Kapila writes:
> >>> Sounds good.  So can we assume that you will try to get us the new report
> >>> with more information?
> >
> > I don't see these crashes anymore in c1543a8.  By the amount of fuzzing
> > done it should have happened a dozen times, so it's highly likely
> > something in 23b09e15..c1543a8 fixed it.
> 
> Hmm, I'd guess c45bf5751b6338488bd79ce777210285531da373 to be the most
> likely candidate.

I thought so too, but then that patch change things in the planner side,
but it seems to me that the reported crash is in the executor, unless I'm
misreading.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Poorly-thought-out handling of double variables in pgbench

2016-05-05 Thread Fabien COELHO



Hello Tom,


While testing 7a622b273 I happened to notice this:

\set x greatest(3, 2, 4.9)
create table mytab (x numeric);
insert into mytab values(:x);
 x
--
4.900355

The reason for that is that the result of a "double" calculation
is coerced to text like this:

   sprintf(res, "%.18e", result.u.dval);

apparently on the theory that this will result in being able to convert it
back to double with no loss of low-order bits.


Yep. I did that for this very reason.

ISTM that the alternative was to reimplement the variable stuff to add & 
manage types, but that induces significant changes, and such change are 
likely to be rejected, or at least to raise long debates and questions, 
make the patch harder to check... so I just avoided them.


But of course the last two or three digits of such output are pretty 
much garbage to the naked eye.


Sure, it is binary to decimal conversion noise.

Then what gets interpolated as the variable value is something like 
'4.900355e+00'.


I think this is a very poor decision; it's going to cause surprises and 
probably bug reports.


Moreover, if we were testing this behavior in the buildfarm (which we 
are not, but only because the test coverage for pgbench is so shoddy),


I think that "none" is the right word?

we would be seeing failures, because those low-order digits are going to 
be platform dependent.


Possibly.


The only value of doing it like this would be if people chained multiple
floating-point calculations in successive pgbench \set commands and needed
full precision to be retained from one \set to the next ... but how likely
is that?


The constraint for me is that a stored double should not lose precision. 
Any solution which achieves this goal is fine with me.



A minimum-change fix would be to print only DBL_DIG digits here.


Probably 15, but there is some rounding implied I think (?). I'm not that 
sure of DBL_DIG+1, so I put 18 to be safer. I did not envision that 
someone would store that in a NUMERIC:-) Your example would work fine with 
a DOUBLE PRECISION attribute.


A better answer, perhaps, would be to store double-valued variables in 
double format to begin with, coercing to text only when and if the value 
is interpolated into a string.


Yep, but that was yet more changes for a limited benefit and would have 
increase the probability that the patch would have been rejected.


That's probably a bigger change than we want to be putting in right now, 
though I'm a bit tempted to go try it.



Thoughts?


My 0.02€:

I definitely agree that the text variable solution is pretty ugly, but it 
was the minimum change solution, and I do not have much time available.


I do not think that rounding values is desirable, on principle.

Now doubles are only really there because random_*() functions need one 
double argument, otherwise only integers make much sense as keys to select 
tuples in a performance bench. So this is not a key feature, just a side 
effect of having more realistic non uniform random generators and 
expressions. In an early submission I did not bother to include double 
variables because I cannot see much actual use to them, and there were 
implementation issues given how integer variables were managed.


Also I think that the above NUMERIC/DOUBLE case is a little contrived, so 
I would wait for someone to actually complain with a good use case before 
spending any significant time to change how variables are stored in 
pgbench.


I would expect a long review process for such a patch if I submitted it, 
because there are details and the benefit is quite limited.



BTW, just to add insult to injury, the debug() function prints double
values with "%.f", which evidently had even less thought put into it.


AFAICR the thought I put into it is that it is definitely useful to be 
able to get a simple trace for debugging purpose. The precision of this 
debug output is not very important, so I probably just put the simplest 
possible format.



That one should definitely be %g with DBL_DIG precision.


That would be fine as well.

--
Fabien.
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] quickdie doing memory allocations (was atomic pin/unpin causing errors)

2016-05-05 Thread Andres Freund

On 2016-05-05 16:32:38 -0400, Robert Haas wrote:
> On Thu, May 5, 2016 at 11:51 AM, Andres Freund  wrote:
> >> #7  0x00080149e3d6 in nsdispatch () from /lib/libc.so.7
> >> #8  0x0008014a41c6 in __cxa_finalize () from /lib/libc.so.7
> >> #9  0x00080144525c in exit () from /lib/libc.so.7
> >> #10 0x008e1bc2 in quickdie (postgres_signal_arg=3) at 
> >> postgres.c:2623
> 
> Eh, this doesn't this __cxa_finalize() stuff suggest that some C++
> code was linked into the backend?

IIRC __cxa_finalize also handles atexit() (and gcc
__attribute__((destructor))).

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] quickdie doing memory allocations (was atomic pin/unpin causing errors)

2016-05-05 Thread Robert Haas

On Thu, May 5, 2016 at 11:51 AM, Andres Freund  wrote:
>> #7  0x00080149e3d6 in nsdispatch () from /lib/libc.so.7
>> #8  0x0008014a41c6 in __cxa_finalize () from /lib/libc.so.7
>> #9  0x00080144525c in exit () from /lib/libc.so.7
>> #10 0x008e1bc2 in quickdie (postgres_signal_arg=3) at postgres.c:2623

Eh, this doesn't this __cxa_finalize() stuff suggest that some C++
code was linked into the backend?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Initial release notes created for 9.6

2016-05-05 Thread Andres Freund

On 2016-05-05 16:25:38 -0400, Robert Haas wrote:
> On Thu, May 5, 2016 at 1:32 PM, Tom Lane  wrote:
> > Please review and comment before Monday, if you can.
> 
> Overall, I think this looks pretty great.  Thanks for pulling it
> together so quickly.

+1

> +
> +   
> +Use atomic operations, rather than a spinlock, to protect an LWLock's
> +wait queue (Andres Freund)
> +   
> +  
> +
> +  
> 
> This was basically an attempt to cure a defect in 48354581a and could
> perhaps be lumped under that item.

It's also an independent performance improvement (sadly), and has the
potential for issues; so there's *some* benefits on keeping this as its
own entry.


Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [sqlsmith] Failed assertion in BecomeLockGroupLeader

2016-05-05 Thread Robert Haas

On Thu, May 5, 2016 at 4:11 PM, Andreas Seltenreich  wrote:
>> Amit Kapila writes:
>>> Sounds good.  So can we assume that you will try to get us the new report
>>> with more information?
>
> I don't see these crashes anymore in c1543a8.  By the amount of fuzzing
> done it should have happened a dozen times, so it's highly likely
> something in 23b09e15..c1543a8 fixed it.

Hmm, I'd guess c45bf5751b6338488bd79ce777210285531da373 to be the most
likely candidate.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Initial release notes created for 9.6

2016-05-05 Thread Robert Haas

On Thu, May 5, 2016 at 1:32 PM, Tom Lane  wrote:
> Please review and comment before Monday, if you can.

Overall, I think this looks pretty great.  Thanks for pulling it
together so quickly.  Various nitpicky comments below.

+   
+Extend relations multiple blocks at a time (Dilip Kumar)
+   
+
+   
+This reduces kernel traffic, and improves scalability when multiple
+processes are inserting into the same relation.
+   
+  

I think this should be rephrased so as to make it clear that we only
do this when there is contention.

+
+   
+Use atomic operations, rather than a spinlock, to protect an LWLock's
+wait queue (Andres Freund)
+   
+  
+
+  

This was basically an attempt to cure a defect in 48354581a and could
perhaps be lumped under that item.

+   
+Force backends to exit if the postmaster dies
+(Rajeev Rastogi and Robert Haas)
+   
+

Separator between names should probably be a comma rather than "and".

+   
+Ensure that trigonometric functions handle infinity and NaN inputs per
+spec (Dean Rasheed)
+   

"per spec" is pretty vague. What spec?  And how about spelling out
"specification"?

+   
+An example usage is CHECK(num_nonnulls(a,b,c) = 1) which
+asserts that exactly one of a,b,c isn't NULL.  These functions can
+also be pressed into service to count the number of null or nonnull
+elements in an array.
+   

"pressed into service"?  If it's a hack, let's not mention it at all.
If it's not a hack, let's just say the functions can do that, plain
and simple.

+   
+This function converts strings like those produced
+by pg_size_pretty() into sizes in bytes.  An example
+usage is WHERE pg_total_relation_size(oid) 
+pg_size_bytes('10 GB').
+   

I would be inclined to expend a few more bytes to turn that into a
complete SQL query, like "SELECT oid::regclass FROM pg_class
WHERE...".

+   
+In pg_size_pretty(), format negative numbers similarly to
+positive ones (Adrian Vondendriesch)
+   

Maybe add: "Previously, the suffix for a negative number was always
'bytes', never 'kB', 'MB', 'GB', or 'TB'".  Or something like that.

+   
+Add an optional missing_ok argument to
+the current_setting() function (David Christensen)
+   

Adjust text to clarify that it suppresses the error for a nonexisting setting?

+   
+Improve error reporting during initdb's post-bootstrap
+phase, by not reporting the entire input file as the failing
+query (Tom Lane)
+   

This reads oddly to me  How about "When initdb fails during the
post-bootstrap phase, report the failing query rather than the
entirety of the file that contained it"?

+   
+Consider performing sorts on the remote server (Ashutosh Bapat)
+   
+  
+
+  
+
+   
+Push down joins to the remote server when possible (Shigeru Hanada,
+Ashutosh Bapat)
+   

I think these could be worded the same way.  Like maybe "Consider
performing sorts on the remote server" and "Consider performing joins
on the remote server".

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] quickdie doing memory allocations (was atomic pin/unpin causing errors)

2016-05-05 Thread Andres Freund

Hi,

On 2016-05-05 15:56:45 -0400, Tom Lane wrote:
> Andres Freund  writes:
> >> #0  0x0008014321d7 in sbrk () from /lib/libc.so.7
> >> #1  0x000801431ddd in sbrk () from /lib/libc.so.7
> >> #2  0x00080142e5bb in sbrk () from /lib/libc.so.7
> >> #3  0x00080142e085 in sbrk () from /lib/libc.so.7
> >> #4  0x00080142de28 in sbrk () from /lib/libc.so.7
> >> #5  0x00080142e1cf in sbrk () from /lib/libc.so.7
> >> #6  0x000801439815 in free () from /lib/libc.so.7
> >> #7  0x00080149e3d6 in nsdispatch () from /lib/libc.so.7
> >> #8  0x0008014a41c6 in __cxa_finalize () from /lib/libc.so.7
> >> #9  0x00080144525c in exit () from /lib/libc.so.7
> >> #10 0x008e1bc2 in quickdie (postgres_signal_arg=3) at 
> >> postgres.c:2623
> >> #11 
> >> #12 0x000801431847 in sbrk () from /lib/libc.so.7
> 
> > That looks like independent issue, namely that we're trigger memory
> > allocations from a signal handler (see frames 12, 11, 10, 9). Presumably
> > due to system registered atexit handlers.  I suspect we should be using
> > _exit() here?  Tom?
> 
> I don't think that would improve matters.  In the first place, if we use
> _exit() here that might encourage third-party extension authors to believe
> they should use _exit(), which would be bad.

The sourcetree already has a number of _exit() calls, so I don't think
that'd make a meaningfull difference.


> In the second place, we don't know what it is we're skipping by not
> running atexit handlers, and again that could be bad.

I've a hard time coming up with a scenario where that'd be a problem in
a PANIC case. Isn't it pretty common to use _exit after fatal errors
(and forks)?


> In the third place, by the time we
> get to the exit() call we've already exposed ourselves to a whole lot of
> such hazards by running ereport() (including sending a message to the
> client!).

True. And that's not good. But the magic of ErrorContext shields us from
a fair amount of issues.


> In the fourth place, if we've received a quickdie interrupt,
> it doesn't actually matter if the process crashes; we just want it to
> quit ASAP.

If it always were crashing, that'd be somewhat fine. But sbrk internally
uses mutexes, so this can result in processes getting stuck. And that is
a problem.  There've actually been reports about that every now and then.

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [sqlsmith] Failed assertion in BecomeLockGroupLeader

2016-05-05 Thread Andreas Seltenreich

> Amit Kapila writes:
>> Sounds good.  So can we assume that you will try to get us the new report
>> with more information?

I don't see these crashes anymore in c1543a8.  By the amount of fuzzing
done it should have happened a dozen times, so it's highly likely
something in 23b09e15..c1543a8 fixed it.

regards,
andreas

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] [sqlsmith] Failed assertion in parallel worker (ExecInitSubPlan)

2016-05-05 Thread Andreas Seltenreich

Hi,

when fuzz testing master as of c1543a8, parallel workers trigger the
following assertion in ExecInitSubPlan every couple hours.

TRAP: FailedAssertion("!(list != ((List *) ((void *)0)))", File: "list.c", 
Line: 390)

Sample backtraces of a worker and leader below, plan of leader attached.
The collected queries don't seem to reproduce it.  Curiously, running
explain on them on the failed instance after crash recovery never shows
any gather nodes…

regards,
andreas

Core was generated by `postgres: bgworker: parallel worker for PID 28062
   '.
Program terminated with signal SIGABRT, Aborted.
(gdb) bt
#3  0x0061bad2 in list_nth_cell (list=0x0, n=) at 
list.c:390
#4  0x0061bb26 in list_nth (list=, n=) at 
list.c:413
#5  0x005fe566 in ExecInitSubPlan (subplan=subplan@entry=0x1522a08, 
parent=parent@entry=0x1538188) at nodeSubplan.c:705
#6  0x005e3b54 in ExecInitExpr (node=0x1522a08, 
parent=parent@entry=0x1538188) at execQual.c:4724
#7  0x005e415c in ExecInitExpr (node=, 
parent=parent@entry=0x1538188) at execQual.c:5164
#8  0x005ff3fc in ExecInitAlternativeSubPlan 
(asplan=asplan@entry=0x1522060, parent=parent@entry=0x1538188) at 
nodeSubplan.c:1185
#9  0x005e35c4 in ExecInitExpr (node=0x1522060, 
parent=parent@entry=0x1538188) at execQual.c:4740
#10 0x005e3f8a in ExecInitExpr (node=0x1522978, 
parent=parent@entry=0x1538188) at execQual.c:4845
#11 0x005e415c in ExecInitExpr (node=, 
parent=parent@entry=0x1538188) at execQual.c:5164
#12 0x005e3687 in ExecInitExpr (node=0x1522920, 
parent=parent@entry=0x1538188) at execQual.c:4648
#13 0x005e330f in ExecInitExpr (node=0x15228c8, 
parent=parent@entry=0x1538188) at execQual.c:5132
#14 0x005e326f in ExecInitExpr (node=0x1522870, 
parent=parent@entry=0x1538188) at execQual.c:5152
#15 0x005e415c in ExecInitExpr (node=, 
parent=parent@entry=0x1538188) at execQual.c:5164
#16 0x005fbb62 in ExecInitSeqScan (node=0x1522728, estate=0x15379b8, 
eflags=16) at nodeSeqscan.c:192
#17 0x005dd567 in ExecInitNode (node=0x1522728, 
estate=estate@entry=0x15379b8, eflags=eflags@entry=16) at execProcnode.c:192
#18 0x005f12a5 in ExecInitHashJoin (node=0x1522530, estate=0x15379b8, 
eflags=16) at nodeHashjoin.c:489
#19 0x005dd497 in ExecInitNode (node=node@entry=0x1522530, 
estate=estate@entry=0x15379b8, eflags=eflags@entry=16) at execProcnode.c:275
#20 0x005dae6c in InitPlan (eflags=16, queryDesc=) at 
execMain.c:959
#21 standard_ExecutorStart (queryDesc=, eflags=16) at 
execMain.c:238
#22 0x005dcac4 in ParallelQueryMain (seg=, 
toc=0x7f442d27b000) at execParallel.c:729
#23 0x004e631b in ParallelWorkerMain (main_arg=) at 
parallel.c:1033
#24 0x00683af2 in StartBackgroundWorker () at bgworker.c:726
#25 0x0068ec32 in do_start_bgworker (rw=0x14c4a20) at postmaster.c:5531
#26 maybe_start_bgworker () at postmaster.c:5706
#27 0x0068f686 in sigusr1_handler (postgres_signal_arg=) 
at postmaster.c:4967
#28 
#29 0x7f442c839ac3 in __select_nocancel () at 
../sysdeps/unix/syscall-template.S:81
#30 0x0046c144 in ServerLoop () at postmaster.c:1654
#31 0x00690aae in PostmasterMain (argc=argc@entry=4, 
argv=argv@entry=0x14a1560) at postmaster.c:1298
#32 0x0046d78d in main (argc=4, argv=0x14a1560) at main.c:228
(gdb) frame 5
#5  0x005fe566 in ExecInitSubPlan (subplan=subplan@entry=0x1522a08, 
parent=parent@entry=0x1538188) at nodeSubplan.c:705
(gdb) list
704 /* Link the SubPlanState to already-initialized subplan */
705 sstate->planstate = (PlanState *) 
list_nth(estate->es_subplanstates,
706subplan->plan_id - 
1);

(gdb) attach 28062
Attaching to program: /home/smith/postgres/inst/master/bin/postgres, process 
28062
(gdb) bt
#0  0x7f442c840e33 in __epoll_wait_nocancel () at 
../sysdeps/unix/syscall-template.S:81
#1  0x006d1dde in WaitEventSetWaitBlock (nevents=1, 
occurred_events=0x7fffdedd75a0, cur_timeout=-1, set=0xe3eedb8) at latch.c:981
#2  WaitEventSetWait (set=set@entry=0xe3eedb8, timeout=timeout@entry=-1, 
occurred_events=occurred_events@entry=0x7fffdedd75a0, nevents=nevents@entry=1) 
at latch.c:935
#3  0x006d2226 in WaitLatchOrSocket (latch=0x7f442c0d9644, 
wakeEvents=wakeEvents@entry=1, sock=sock@entry=-1, timeout=-1, timeout@entry=0) 
at latch.c:347
#4  0x006d22ed in WaitLatch (latch=, 
wakeEvents=wakeEvents@entry=1, timeout=timeout@entry=0) at latch.c:302
#5  0x005ef4e3 in gather_readnext (gatherstate=0xe3d7ce8) at 
nodeGather.c:384
#6  gather_getnext (gatherstate=0xe3d7ce8) at nodeGather.c:283
#7  ExecGather (node=node@entry=0xe3d7ce8) at nodeGather.c:229
#8  0x005dd728 in ExecProcNode (node=node@entry=0xe3d7ce8) at 
execProcnode.c:515
#9  0x005f995c in ExecNestLoop (node=node@entry=0x10ef9d90) at 
nodeNestloop.c:174
#10

Re: [HACKERS] quickdie doing memory allocations (was atomic pin/unpin causing errors)

2016-05-05 Thread Tom Lane

Andres Freund  writes:
>> #0  0x0008014321d7 in sbrk () from /lib/libc.so.7
>> #1  0x000801431ddd in sbrk () from /lib/libc.so.7
>> #2  0x00080142e5bb in sbrk () from /lib/libc.so.7
>> #3  0x00080142e085 in sbrk () from /lib/libc.so.7
>> #4  0x00080142de28 in sbrk () from /lib/libc.so.7
>> #5  0x00080142e1cf in sbrk () from /lib/libc.so.7
>> #6  0x000801439815 in free () from /lib/libc.so.7
>> #7  0x00080149e3d6 in nsdispatch () from /lib/libc.so.7
>> #8  0x0008014a41c6 in __cxa_finalize () from /lib/libc.so.7
>> #9  0x00080144525c in exit () from /lib/libc.so.7
>> #10 0x008e1bc2 in quickdie (postgres_signal_arg=3) at postgres.c:2623
>> #11 
>> #12 0x000801431847 in sbrk () from /lib/libc.so.7

> That looks like independent issue, namely that we're trigger memory
> allocations from a signal handler (see frames 12, 11, 10, 9). Presumably
> due to system registered atexit handlers.  I suspect we should be using
> _exit() here?  Tom?

I don't think that would improve matters.  In the first place, if we use
_exit() here that might encourage third-party extension authors to believe
they should use _exit(), which would be bad.  In the second place,
we don't know what it is we're skipping by not running atexit handlers,
and again that could be bad.  We don't like people trying to bypass our
on-exit code, why would anyone else?  In the third place, by the time we
get to the exit() call we've already exposed ourselves to a whole lot of
such hazards by running ereport() (including sending a message to the
client!).  In the fourth place, if we've received a quickdie interrupt,
it doesn't actually matter if the process crashes; we just want it to
quit ASAP.

So I'd say that this is just a cosmetic problem and that trying to fix
it is likely to make things worse.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Naming of new tsvector functions

2016-05-05 Thread Tom Lane

Stas Kelvich  writes:
>> On 04 May 2016, at 20:15, Tom Lane  wrote:
>> Also, I'd supposed that we'd rename to tsvector_something, since
>> the same patch also introduced tsvector_to_array() and
>> array_to_tsvector().  What's the motivation for using ts_ as the
>> prefix?

> There is already several functions named ts_* (ts_rank, ts_headline, 
> ts_rewrite) 
> and two named starting from tsvector_* (tsvector_update_trigger, 
> tsvector_update_trigger_column).

> Personally Iâd prefer ts_ over tsvector_ since it is shorter, and still 
> keeps semantics.

Yeah, I see we're already a bit inconsistent here.  The problem with using
a ts_ prefix, to my mind, is that it offers no option for distinguishing
tsvector from tsquery, should you need to do that.  Maybe this isn't a
problem for functions that have tsvector as input.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] atomic pin/unpin causing errors

2016-05-05 Thread Teodor Sigaev


Hm. And you're not seeing the asserts I reported in
http://archives.postgresql.org/message-id/20160505185246.2i7qftadwhzewykj%40alap3.anarazel.de
?

I see it a lot, but I think that is a result of ereport(FATAL) after 
FileWrite(BLCKSZ/3) added by Jeff.


Teodor Sigaev   E-mail: teo...@sigaev.ru
   WWW: http://www.sigaev.ru/


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] Poorly-thought-out handling of double variables in pgbench

2016-05-05 Thread Tom Lane

While testing 7a622b273 I happened to notice this:

\set x greatest(3, 2, 4.9)
create table mytab (x numeric);
insert into mytab values(:x);

results in this table:

  x   
--
 4.900355
(1 row)

The reason for that is that the result of a "double" calculation
is coerced to text like this:

sprintf(res, "%.18e", result.u.dval);

apparently on the theory that this will result in being able to convert it
back to double with no loss of low-order bits.  But of course the last two
or three digits of such output are pretty much garbage to the naked eye.
Then what gets interpolated as the variable value is something like
'4.900355e+00'.

I think this is a very poor decision; it's going to cause surprises and
probably bug reports.  Moreover, if we were testing this behavior in the
buildfarm (which we are not, but only because the test coverage for
pgbench is so shoddy), we would be seeing failures, because those
low-order digits are going to be platform dependent.

The only value of doing it like this would be if people chained multiple
floating-point calculations in successive pgbench \set commands and needed
full precision to be retained from one \set to the next ... but how likely
is that?

A minimum-change fix would be to print only DBL_DIG digits here.  A better
answer, perhaps, would be to store double-valued variables in double
format to begin with, coercing to text only when and if the value is
interpolated into a string.  That's probably a bigger change than we
want to be putting in right now, though I'm a bit tempted to go try it.

Thoughts?

BTW, just to add insult to injury, the debug() function prints double
values with "%.f", which evidently had even less thought put into it.
That one should definitely be %g with DBL_DIG precision.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] atomic pin/unpin causing errors

2016-05-05 Thread Andres Freund

On 2016-05-04 18:12:45 +0300, Teodor Sigaev wrote:
> > > I get the errors:
> > > 
> > > ERROR:  attempted to delete invisible tuple
> > > STATEMENT:  update foo set count=count+1,text_array=$1 where text_array 
> > > @> $2
> > > 
> > > And also:
> > > 
> > > ERROR:  unexpected chunk number 1 (expected 2) for toast value
> > > 85223889 in pg_toast_16424
> > > STATEMENT:  update foo set count=count+1 where text_array @> $1
> > 
> > Hm. I appear to have trouble reproducing this issue (continuing to try)
> > on master as of 8826d8507.  Is there any chance you could package up a
> > data directory after the issue hit?
> 
> I've got
> ERROR:  unexpected chunk number 0 (expected 1) for toast value 10192986 in
> pg_toast_16424
> 
> The test required 10 hours to run on my notebook. postgresql was compiled
> with -O0 --enable-debug --enable-cassert.

Hm. And you're not seeing the asserts I reported in
http://archives.postgresql.org/message-id/20160505185246.2i7qftadwhzewykj%40alap3.anarazel.de
?

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] atomic pin/unpin causing errors

2016-05-05 Thread Andres Freund

Hi Jeff,

On 2016-04-29 10:38:55 -0700, Jeff Janes wrote:
> I don't see the problem with an cassert-enabled, probably because it
> is just too slow to ever reach the point where the problem occurs.

Running the test with cassert enabled I actually get assertion failures,
due to the FATAL you added.

#1  0x00958dde in ExceptionalCondition (conditionName=0xb36c2a 
"!(RefCountErrors == 0)", errorType=0xb361af "FailedAssertion", 
fileName=0xb36170 
"/home/admin/src/postgresql/src/backend/storage/buffer/bufmgr.c", 
lineNumber=2506) at 
/home/admin/src/postgresql/src/backend/utils/error/assert.c:54
#2  0x007c9fc9 in CheckForBufferLeaks () at 
/home/admin/src/postgresql/src/backend/storage/buffer/bufmgr.c:2506
#3  0x007c9f09 in AtProcExit_Buffers (code=1, arg=0) at 
/home/admin/src/postgresql/src/backend/storage/buffer/bufmgr.c:2459
#4  0x007d927f in shmem_exit (code=1) at 
/home/admin/src/postgresql/src/backend/storage/ipc/ipc.c:261
#5  0x007d90dd in proc_exit_prepare (code=1) at 
/home/admin/src/postgresql/src/backend/storage/ipc/ipc.c:185
#6  0x007d904b in proc_exit (code=1) at 
/home/admin/src/postgresql/src/backend/storage/ipc/ipc.c:102
#7  0x0095958d in errfinish (dummy=0) at 
/home/admin/src/postgresql/src/backend/utils/error/elog.c:543
#8  0x0080214b in mdwrite (reln=0x2e8b4a8, forknum=MAIN_FORKNUM, 
blocknum=154, buffer=0x2e8e5a8 "", skipFsync=0 '\000')
at /home/admin/src/postgresql/src/backend/storage/smgr/md.c:832
#9  0x00804633 in smgrwrite (reln=0x2e8b4a8, forknum=MAIN_FORKNUM, 
blocknum=154, buffer=0x2e8e5a8 "", skipFsync=0 '\000')
at /home/admin/src/postgresql/src/backend/storage/smgr/smgr.c:650
#10 0x007ca548 in FlushBuffer (buf=0x7f0285955330, reln=0x2e8b4a8) at 
/home/admin/src/postgresql/src/backend/storage/buffer/bufmgr.c:2734
#11 0x007c9d5a in SyncOneBuffer (buf_id=2503, skip_recently_used=0 
'\000', wb_context=0x7ffe7305d290) at 
/home/admin/src/postgresql/src/backend/storage/buffer/bufmgr.c:2377
#12 0x007c964e in BufferSync (flags=64) at 
/home/admin/src/postgresql/src/backend/storage/buffer/bufmgr.c:1967
#13 0x007ca185 in CheckPointBuffers (flags=64) at 
/home/admin/src/postgresql/src/backend/storage/buffer/bufmgr.c:2561
#14 0x0052d497 in CheckPointGuts (checkPointRedo=382762776, flags=64) 
at /home/admin/src/postgresql/src/backend/access/transam/xlog.c:8644
#15 0x0052cede in CreateCheckPoint (flags=64) at 
/home/admin/src/postgresql/src/backend/access/transam/xlog.c:8430
#16 0x007706ac in CheckpointerMain () at 
/home/admin/src/postgresql/src/backend/postmaster/checkpointer.c:488
#17 0x0053e0d5 in AuxiliaryProcessMain (argc=2, argv=0x7ffe7305ea40) at 
/home/admin/src/postgresql/src/backend/bootstrap/bootstrap.c:429
#18 0x0078099f in StartChildProcess (type=CheckpointerProcess) at 
/home/admin/src/postgresql/src/backend/postmaster/postmaster.c:5227
#19 0x0077dcc3 in reaper (postgres_signal_arg=17) at 
/home/admin/src/postgresql/src/backend/postmaster/postmaster.c:2781
#20 
#21 0x7f028ebbdac3 in __select_nocancel () at 
../sysdeps/unix/syscall-template.S:81
#22 0x0077c049 in ServerLoop () at 
/home/admin/src/postgresql/src/backend/postmaster/postmaster.c:1654
#23 0x0077b7a9 in PostmasterMain (argc=4, argv=0x2e49f20) at 
/home/admin/src/postgresql/src/backend/postmaster/postmaster.c:1298
#24 0x006c5849 in main (argc=4, argv=0x2e49f20) at 
/home/admin/src/postgresql/src/backend/main/main.c:228

You didn't see those?


The trigger here appears to be that the checkpointer doesn't have
on-exit callback similar to a normal backend's ShutdownPostgres() et al,
and thus doesn't trigger a resource owner release.  The normal ERROR
path has
/* buffer pins are released here: */
ResourceOwnerRelease(CurrentResourceOwner,
 
RESOURCE_RELEASE_BEFORE_LOCKS,
 false, true);
/* we needn't bother with the other ResourceOwnerRelease phases 
*/


That clearly is a bug. But I'm not immediately seing how this could
trigger the corruption issue you observed.

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Initial release notes created for 9.6

2016-05-05 Thread Robert Haas

On Thu, May 5, 2016 at 1:32 PM, Tom Lane  wrote:
> * As is somewhat customary for early drafts of the notes, I've made no
> attempt to call out which are the most significant changes.  I've not
> tried to isolate the non-backwards-compatible items, either.

There was quite a bit of discussion of this on pgsql-advocacy, and the
press release we're going to put out surely must say something.  It
wouldn't hurt if the release notes at least somewhat matched that.  I
thought this was a decent list:

http://www.postgresql.org/message-id/571fb941.3020...@commandprompt.com

Although "phrase search" seems a bit less important to me than the
rest of those.  There's also a good, somewhat-more-inclusive list of
the big stuff here:

https://en.wikipedia.org/wiki/PostgreSQL#Upcoming_features

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Postgres 9.6 scariest patch tournament

2016-05-05 Thread Alvaro Herrera

This is raw, in case anyone wants to look more closely.

alvherre=# select level, count(*), patch, subject from scary left join commits 
on patch = sha1 group by level, patch, subject order by level asc, count(*) 
desc;
┌───┬───┬───┬┐
│ level │ count │ patch 
│  subject   │
├───┼───┼───┼┤
│ 1 │ 3 │ fd31cd265138019d9b5fe53043670898bc9f  
│ Don't vacuum all-frozen pages. │
│ 1 │ 2 │ 3fc6e2d7f5b652b417fa6937c34de2438d60fa9f  
│ Make the upper part of the planner work by generating and comparing Paths. │
│ 1 │ 1 │ Change the format of the VM fork to add a second bit per page 
││
│ 1 │ 1 │ Don't VACUUM all-frozen pages 
││
│ 1 │ 1 │ 65578341af1ae50e52e0f45e691ce88ad5a1b9b1  
│ Add Generic WAL interface  │
│ 1 │ 1 │ 9cd00c457e6a1ebb984167ac556a9961812a683c  
│ Checkpoint sorting and balancing.  │
│ 1 │ 1 │ 924bcf4f16d54c55310b28f77686608684734f42  
│ Create an infrastructure for parallel computation in PostgreSQL.   │
│ 1 │ 1 │ Freeze Map
││
│ 1 │ 1 │ parallelism, gather node...   
││
│ 1 │ 1 │ Parallel query
││
│ 1 │ 1 │ Snapshot too old  
││
│ 1 │ 1 │ Snapshot Too Old  
││
│ 2 │ 2 │ bb140506df605fab58f48926ee1db1f80bdafb59  
│ Phrase full text search.   │
│ 2 │ 1 │ \crosstabview 
││
│ 2 │ 1 │ 1c1a7cbd6a1600c97dfcd9b5dc78a23b5db9bbf6  
│ Sync our copy of the timezone library with IANA release tzcode2016c.   │
│ 2 │ 1 │ 3fc6e2d7f5b652b417fa6937c34de2438d60fa9f  
│ Make the upper part of the planner work by generating and comparing Paths. │
│ 2 │ 1 │ 428b1d6b29ca599c5700d4bc4f4ce4c5880369bf  
│ Allow to trigger kernel writeback after a configurable number of writes.   │
│ 2 │ 1 │ 48354581a49c30f5757c203415aa8412d85b0f70  
│ Allow Pin/UnpinBuffer to operate in a lockfree manner. │
│ 2 │ 1 │ 848ef42bb8c7909c9d7baa38178d4a209906e7c1  
│ Add the "snapshot too old" feature │
│ 2 │ 1 │ fd31cd265138019d9b5fe53043670898bc9f  
│ Don't vacuum all-frozen pages. │
│ 2 │ 1 │ Freeze map
││
│ 2 │ 1 │ LWLock tranches   
││
│ 2 │ 1 │ Parallelism Stuff (many patches)  
││
│ 2 │ 1 │ Parallel Query
││
│ 3 │ 1 │ 013ebc0a7b7ea9c1b1ab7a3d4dd75ea121ea8ba7  
│ Microvacuum for GIST   │
│ 3 │ 1 │ 0711803775a37e0bf39d7efdd1e34d9d7e640ea1  
│ Use quicksort, not replacement selection, for external sorting.│
│ 3 │ 1 │ 45be99f8cd5d606086e0a458c9c72910ba8a613d  
│ Support parallel joins, and make related improvements.

Re: [HACKERS] Postgres 9.6 scariest patch tournament

2016-05-05 Thread Alvaro Herrera

Alvaro Herrera wrote:

> "Parallel Query" got many mentions; some of them were specific commits
> (such as "parallel infrastructure", "parallel joins", "parallel
> aggregates") and others were more generic.  For the generic mentions I
> just chose a few of the most salient patches, but didn't include either
> parallel aggregates nor parallel joins in that list.  "LWLock tranches"
> and "Freeze Map" also resulted in several commits appearing in the list
> below.  This distorts the results somewhat.  I considered redoing the
> results once I noticed the problem, but didn't really want to invest
> *too* much time.

After looking at commits mentioning "parallel", I'm surprised that this
one didn't turn up specifically as a scary commit:

│ a1c1af2a1f6099c039f145c1edb52257f315be51 │ rh...@postgresql.org│ 
Introduce group locking to prevent parallel processes from deadlocking.  │


-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Postgres 9.6 scariest patch tournament

2016-05-05 Thread Alvaro Herrera

Noah Misch wrote:
> On Mon, Apr 18, 2016 at 03:37:21PM -0300, Alvaro Herrera wrote:
> > The RMT will publish aggregate, unattributed results after the poll
> > closes.

Here are some more detailed results.  We got 15 valid replies.  One
person voted twice, mentioning the same patches both times in slightly
different order; I considered the second reply only.

"Parallel Query" got many mentions; some of them were specific commits
(such as "parallel infrastructure", "parallel joins", "parallel
aggregates") and others were more generic.  For the generic mentions I
just chose a few of the most salient patches, but didn't include either
parallel aggregates nor parallel joins in that list.  "LWLock tranches"
and "Freeze Map" also resulted in several commits appearing in the list
below.  This distorts the results somewhat.  I considered redoing the
results once I noticed the problem, but didn't really want to invest
*too* much time.

I chose to ignore the scariness rating in this tabular report.

┌───┬──┬─┬─┐
│ count │   sha1   │committer│  
 subject   │
├───┼──┼─┼─┤
│ 8 │ fd31cd265138019d9b5fe53043670898bc9f │ rh...@postgresql.org│ 
Don't vacuum all-frozen pages.  │
│ 6 │ 924bcf4f16d54c55310b28f77686608684734f42 │ rh...@postgresql.org│ 
Create an infrastructure for parallel computation in PostgreSQL.│
│ 5 │ f0661c4e8c44c0ec7acd4ea7c82e85b265447398 │ rh...@postgresql.org│ 
Make sequential scans parallel-aware.   │
│ 5 │ ee7ca559fcf404f9a3bd99da85c8f4ea9fbc2e92 │ rh...@postgresql.org│ 
Add a C API for parallel heap scans.│
│ 3 │ a892234f830e832110f63fc0a2afce2fb21d1584 │ rh...@postgresql.org│ 
Change the format of the VM fork to add a second bit per page.  │
│ 3 │ 3fc6e2d7f5b652b417fa6937c34de2438d60fa9f │ t...@sss.pgh.pa.us   │ 
Make the upper part of the planner work by generating and comparing Paths.  │
│ 3 │ 848ef42bb8c7909c9d7baa38178d4a209906e7c1 │ kgri...@postgresql.org  │ 
Add the "snapshot too old" feature  │
│ 2 │ bb140506df605fab58f48926ee1db1f80bdafb59 │ teo...@sigaev.ru│ 
Phrase full text search.│
│ 2 │ 9cd00c457e6a1ebb984167ac556a9961812a683c │ and...@anarazel.de  │ 
Checkpoint sorting and balancing.   │
│ 1 │ 3bd909b220930f21d6e15833a17947be749e7fde │ rh...@postgresql.org│ 
Add a Gather executor node. │
│ 1 │ c319991bcad02a2e99ddac3f42762b0f6fa8d52a │ rh...@postgresql.org│ 
Use separate lwlock tranches for buffer, lock, and predicate lock managers. │
│ 1 │ 1c1a7cbd6a1600c97dfcd9b5dc78a23b5db9bbf6 │ t...@sss.pgh.pa.us   │ 
Sync our copy of the timezone library with IANA release tzcode2016c.│
│ 1 │ 65578341af1ae50e52e0f45e691ce88ad5a1b9b1 │ teo...@sigaev.ru│ 
Add Generic WAL interface   │
│ 1 │ c09b18f21c52cbcf8718d6c267c84fcfea3739a9 │ alvhe...@alvh.no-ip.org │ 
Support crosstabview in psql│
│ 1 │ e4106b2528727c4b48639c0e12bf2f70a766b910 │ rh...@postgresql.org│ 
postgres_fdw: Push down joins to remote servers.│
│ 1 │ 428b1d6b29ca599c5700d4bc4f4ce4c5880369bf │ and...@anarazel.de  │ 
Allow to trigger kernel writeback after a configurable number of writes.│
│ 1 │ 6150a1b08a9fe7ead2b25240be46dddeae9d98e1 │ rh...@postgresql.org│ 
Move buffer I/O and content LWLocks out of the main tranche.│
│ 1 │ 48354581a49c30f5757c203415aa8412d85b0f70 │ and...@anarazel.de  │ 
Allow Pin/UnpinBuffer to operate in a lockfree manner.  │
│ 1 │ 7a542700df25eaf97b794bff63606176433dcdda │ sfr...@snowman.net  │ 
Create default roles│
│ 1 │ 013ebc0a7b7ea9c1b1ab7a3d4dd75ea121ea8ba7 │ teo...@sigaev.ru│ 
Microvacuum for GIST│
│ 1 │ 53be0b1add7064ca5db3cd884302dfc3268d884e │ rh...@postgresql.org│ 
Provide much better wait information in pg_stat_activity.   │
│ 1 │ 0711803775a37e0bf39d7efdd1e34d9d7e640ea1 │ rh...@postgresql.org│ 
Use quicksort, not replacement selection, for external sorting. │
│ 1 │

Re: [HACKERS] Initial release notes created for 9.6

2016-05-05 Thread Robert Haas

On Thu, May 5, 2016 at 1:32 PM, Tom Lane  wrote:
> I've pushed a first cut at release notes for 9.6.  There's a good deal
> of work to do yet:
>
> * The section about parallel query could probably stand to be fleshed out,
> but I'm unsure what to say.  Somebody who's worked on that should provide
> some text.
>
> * Bruce usually likes to sprinkle the notes with a whole lot of links
> to the main docs.  I've only bothered with links for new GUCs and system
> views.
>
> * As is somewhat customary for early drafts of the notes, I've made no
> attempt to call out which are the most significant changes.  I've not
> tried to isolate the non-backwards-compatible items, either.
>
> Please review and comment before Monday, if you can.

As far as the parallel query stuff goes,
https://wiki.postgresql.org/wiki/Parallel_Query is a useful overview
of current restrictions which I'm still hoping to get incorporated
into the SGML documentation.  I suggest revising the text to something
like this:

===
With 9.6, PostgreSQL introduces initial support for
parallel execution of large queries.  Only strictly read-only queries
where the driving table is accessed via a sequential scan can be
parallelized.  Hash joins and nested loops can be performed in
parallel, as can aggregation for supported aggregates.  Much remains
to be done, but this is already a useful set of features.
===

If we put the documentation from that wiki page into the docs
someplace, then we could add a sentence "For an overview of the
current capabilities of parallel query, including relevant
restrictions, see ."

I'd probably mention David Rowley as a third author on parallel query.
It's true that the great bulk of the development work and design over
the last few years was done by Amit and I, but parallel aggregate
resulted in several large patches that were entirely written by David.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Reviewing freeze map code

2016-05-05 Thread Andres Freund

Hi,

On 2016-05-02 14:48:18 -0700, Andres Freund wrote:
> 7087166 pg_upgrade: Convert old visibility map format to new format.

+const char *
+rewriteVisibilityMap(const char *fromfile, const char *tofile, bool force)
...

+   while ((bytesRead = read(src_fd, buffer, BLCKSZ)) == BLCKSZ)
+   {
..

Uh, shouldn't we actually fail if we read incompletely? Rather than
silently ignoring the problem? Ok, this causes no corruption, but it
indicates that something went significantly wrong.

+   charnew_vmbuf[BLCKSZ];
+   char   *new_cur = new_vmbuf;
+   boolempty = true;
+   boolold_lastpart;
+
+   /* Copy page header in advance */
+   memcpy(new_vmbuf, , SizeOfPageHeaderData);

Shouldn't we zero out new_vmbuf? Afaics we're not necessarily zeroing it
with old_lastpart && !empty, right?


+   if ((dst_fd = open(tofile, O_RDWR | O_CREAT | (force ? 0 : O_EXCL), 
S_IRUSR | S_IWUSR)) < 0)
+   {
+   close(src_fd);
+   return getErrorText();
+   }

I know you guys copied this, but what's the force thing about?
Expecially as it's always set to true by the callers (i.e. what is the
parameter even about?)?  Wouldn't we at least have to specify O_TRUNC in
the force case?


+   old_cur += BITS_PER_HEAPBLOCK_OLD;
+   new_cur += BITS_PER_HEAPBLOCK;

I'm not sure I'm understanding the point of the BITS_PER_HEAPBLOCK_OLD
stuff - as long as it's hardcoded into rewriteVisibilityMap() we'll not
be able to have differing ones anyway, should we decide to add a third
bit?

- Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Initial release notes created for 9.6

2016-05-05 Thread Tom Lane

"Joshua D. Drake"  writes:
> On 05/05/2016 10:32 AM, Tom Lane wrote:
>> I've pushed a first cut at release notes for 9.6.  There's a good deal
>> of work to do yet:

> Just for the cheap seats, I assume they are pushed to git?

http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=c311f7887376f7f3ce24c4c0dac4f9cb6ad3bee3

Should appear at

http://www.postgresql.org/docs/devel/static/release-9-6.html

after awhile, though it looks like it'll be about three hours before
guaibasaurus gets around to it.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] New pgbench functions are misnamed

2016-05-05 Thread Tom Lane

Robert Haas  writes:
> On Wed, May 4, 2016 at 5:41 PM, Tom Lane  wrote:
>> I noticed that commit 7e137f846 added functions named max() and min()
>> to pgbench's expression syntax.  Unfortunately, these functions have
>> zilch to do with what max() and min() do in SQL.  They're actually more
>> like the greatest() and least() server-side functions.
>> 
>> While I can't imagine that we'd ever want to implement true aggregates
>> in pgbench expressions, it still seems like this is a recipe for
>> confusion.  Shouldn't we rename these to greatest() and least()?

> Yeah, that's probably a good idea.

The vote seems to be 2 to 1 in favor, so I'll go do this.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Initial release notes created for 9.6

2016-05-05 Thread Joshua D. Drake


On 05/05/2016 10:32 AM, Tom Lane wrote:

I've pushed a first cut at release notes for 9.6.  There's a good deal
of work to do yet:

* The section about parallel query could probably stand to be fleshed out,
but I'm unsure what to say.  Somebody who's worked on that should provide
some text.

* Bruce usually likes to sprinkle the notes with a whole lot of links
to the main docs.  I've only bothered with links for new GUCs and system
views.

* As is somewhat customary for early drafts of the notes, I've made no
attempt to call out which are the most significant changes.  I've not
tried to isolate the non-backwards-compatible items, either.

Please review and comment before Monday, if you can.


Just for the cheap seats, I assume they are pushed to git?

JD



--
Command Prompt, Inc.  http://the.postgres.company/
+1-503-667-4564
PostgreSQL Centered full stack support, consulting and development.
Everyone appreciates your honesty, until you are honest with them.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] Initial release notes created for 9.6

2016-05-05 Thread Tom Lane

I've pushed a first cut at release notes for 9.6.  There's a good deal
of work to do yet:

* The section about parallel query could probably stand to be fleshed out,
but I'm unsure what to say.  Somebody who's worked on that should provide
some text.

* Bruce usually likes to sprinkle the notes with a whole lot of links
to the main docs.  I've only bothered with links for new GUCs and system
views.

* As is somewhat customary for early drafts of the notes, I've made no
attempt to call out which are the most significant changes.  I've not
tried to isolate the non-backwards-compatible items, either.

Please review and comment before Monday, if you can.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] quickdie doing memory allocations (was atomic pin/unpin causing errors)

2016-05-05 Thread Andres Freund

Hi Teodor,

Thanks for analyzing this.

On 2016-05-05 13:50:09 +0300, Teodor Sigaev wrote:
> > I'll try to get a coredump after SIGSEGV, but it could take a time.
> 
> Got it!
> 
> #0  0x0008014321d7 in sbrk () from /lib/libc.so.7
> #1  0x000801431ddd in sbrk () from /lib/libc.so.7
> #2  0x00080142e5bb in sbrk () from /lib/libc.so.7
> #3  0x00080142e085 in sbrk () from /lib/libc.so.7
> #4  0x00080142de28 in sbrk () from /lib/libc.so.7
> #5  0x00080142e1cf in sbrk () from /lib/libc.so.7
> #6  0x000801439815 in free () from /lib/libc.so.7
> #7  0x00080149e3d6 in nsdispatch () from /lib/libc.so.7
> #8  0x0008014a41c6 in __cxa_finalize () from /lib/libc.so.7
> #9  0x00080144525c in exit () from /lib/libc.so.7
> #10 0x008e1bc2 in quickdie (postgres_signal_arg=3) at postgres.c:2623
> #11 
> #12 0x000801431847 in sbrk () from /lib/libc.so.7
> #13 0x000801431522 in sbrk () from /lib/libc.so.7
> #14 0x00080142d47f in sbrk () from /lib/libc.so.7
> #15 0x000801434628 in malloc () from /lib/libc.so.7
> #16 0x00aca278 in AllocSetAlloc (context=0x801c0bb88, size=24) at 
> aset.c:853
> #17 0x00acca0e in MemoryContextAlloc (context=0x801c0bb88, size=24)
> at mcxt.c:764
> #18 0x00aebdb8 in PushActiveSnapshot (snap=0xf4ae10) at snapmgr.c:652
> #19 0x008e54bd in exec_bind_message (input_message=0x7fffdf60)
> at postgres.c:1602
> #20 0x008e3957 in PostgresMain (argc=1, argv=0x801d3c968,
> dbname=0x801d3c948 "teodor", username=0x801d3c928 "teodor") at
> postgres.c:4105
> #21 0x00839744 in BackendRun (port=0x801c991c0) at postmaster.c:4258
> #22 0x00838d54 in BackendStartup (port=0x801c991c0) at 
> postmaster.c:3932
> #23 0x00835617 in ServerLoop () at postmaster.c:1690
> #24 0x00832c69 in PostmasterMain (argc=4, argv=0x7fffe420) at
> postmaster.c:1298
> #25 0x0075f228 in main (argc=4, argv=0x7fffe420) at main.c:228
> 
> Seems, we have some memory corruption, but it could either separate or the
> same problem.

That looks like independent issue, namely that we're trigger memory
allocations from a signal handler (see frames 12, 11, 10, 9). Presumably
due to system registered atexit handlers.  I suspect we should be using
_exit() here?  Tom?

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pg9.6 segfault using simple query (related to use fk for join estimates)

2016-05-05 Thread David Rowley

On 5 May 2016 at 16:04, David Rowley  wrote:
> I've started making some improvements to this, but need to talk to
> Tomas. It's currently in the middle of his night, but will try to
> catch him in his morning to discuss this with him.

Ok, so I spoke to Tomas about this briefly, and he's asked me to send
in this patch. He didn't get time to look over it due to some other
commitments he has today.

I do personally feel that if the attached is not good enough, or not
very close to good enough then probably the best course of action is
to revert the whole thing. I'd much rather have this out than to feel
some responsibility for someone's performance problems with this
feature. But I also do feel that the patch allows a solution to a big
problem that we currently have with PostgreSQL, which there's any easy
workarounds for... that's multi-column join estimates.

In the attached I've left the GUC remaining. The reason for the GUC is
for testing purposes and it should be removed before release. It
should likely be documented though, even if we're planning to remove
it later. I've not gotten around to that in this patch though.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

fkest_fixes_david.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] SET ROLE and reserved roles

2016-05-05 Thread Stephen Frost

* Stephen Frost (sfr...@snowman.net) wrote:
> * Robert Haas (robertmh...@gmail.com) wrote:
> > On Tue, Apr 26, 2016 at 7:39 PM, Robert Haas  wrote:
> > > On Mon, Apr 25, 2016 at 6:55 PM, Stephen Frost  wrote:
> > >> Based on our discussion at PGConf.US and the comments up-thread from
> > >> Tom, I'll work up a patch to remove those checks around SET ROLE and
> > >> friends which were trying to prevent default roles from possibly being
> > >> made to own objects.
> > >>
> > >> Should the checks, which have been included since nearly the start of
> > >> this version of the patch, to prevent users from GRANT'ing other rights
> > >> to the default roles remain?  Or should those also be removed?  I
> > >> *think* pg_dump/pg_upgrade would be fine with rights being added, and if
> > >> we aren't preventing ownership of objects then we aren't going to be
> > >> able to remove such roles in any case.
> > >
> > > It'd be good to test that that works.  If it does, I think we may as
> > > well allow it.
> > >
> > >> Of course, with these default roles, users can't REVOKE the rights which
> > >> are granted to them as that happens in C code, outside of the GRANT
> > >> system.
> > >
> > > I think you mean that they can't revoke the special magic rights, but
> > > they could revoke any additional privileges which were granted.
> > >
> > >> Working up a patch to remove these checks should be pretty quickly done
> > >> (iirc, I've actually got an independent patch around from when I added
> > >> them, just need to find it and then go through the committed patches to
> > >> make sure I take care of everything), but would like to make sure that
> > >> we're now all on the same page and that *all* of these checks should be
> > >> removed, making default roles just exactly like "regular" roles, except
> > >> that they're created at initdb time and have "special" rights provided
> > >> by C-level code checks.
> > >
> > > That's what I'm thinking.  I would welcome other views.
> > 
> > Ping!
> 
> Thanks.  I'm planning to post a patch tomorrow to remove these checks.

Apologies about not getting to this yesterday, was pretty busy finding
pre-existing issues in pg_dump.

Attached is a patch which removes the various special checks that I had
added to prevent using default roles like regular roles.  As noted in
the commit message, users are still prevented from creating roles in the
"pg_" namespace and from ALTER'ing those roles, but otherwise they're
very much like regular roles.

I've adjusted the regression tests accordingly, but I'm going to do more
testing to make sure that pg_dump handles them correctly (and will be
adding cases to my pg_dump TAP test suite to ensure that they stay
working) over the next day or so.

Barring objections or concerns, I'll push this sometime tomorrow
(probably after I get back to DC).

Thanks!

Stephen
From 4fae6e77eddc86360381e44f35d4da4a495cbdc1 Mon Sep 17 00:00:00 2001
From: Stephen Frost 
Date: Thu, 5 May 2016 09:52:26 -0400
Subject: [PATCH] Remove various special checks around default roles

Default roles really should be like regular roles, for the most part.
This removes a number of checks that were trying to make default roles
extra special by not allowing them to be used as regular roles.

We still prevent users from creating roles in the "pg_" namespace or
from altering roles which exist in that namespace via ALTER ROLE, as
we can't preserve such changes, but otherwise the roles are very much
like regular roles.

Based on discussion with Robert and Tom.
---
 src/backend/catalog/aclchk.c|  7 ---
 src/backend/commands/alter.c|  3 ---
 src/backend/commands/foreigncmds.c  | 13 -
 src/backend/commands/policy.c   |  5 -
 src/backend/commands/schemacmds.c   |  4 
 src/backend/commands/tablecmds.c|  2 --
 src/backend/commands/tablespace.c   |  4 
 src/backend/commands/user.c | 11 ---
 src/backend/commands/variable.c |  7 ---
 src/test/regress/expected/rolenames.out | 18 +-
 src/test/regress/sql/rolenames.sql  | 10 +-
 11 files changed, 10 insertions(+), 74 deletions(-)

diff --git a/src/backend/catalog/aclchk.c b/src/backend/catalog/aclchk.c
index 7d656d5..d074e85 100644
--- a/src/backend/catalog/aclchk.c
+++ b/src/backend/catalog/aclchk.c
@@ -423,9 +423,6 @@ ExecuteGrantStmt(GrantStmt *stmt)
 grantee_uid = ACL_ID_PUBLIC;
 break;
 			default:
-if (!IsBootstrapProcessingMode())
-	check_rolespec_name((Node *) grantee,
-			"Cannot GRANT or REVOKE privileges to or from a reserved role.");
 grantee_uid = get_rolespec_oid((Node *) grantee, false);
 break;
 		}
@@ -921,8 +918,6 @@ ExecAlterDefaultPrivilegesStmt(AlterDefaultPrivilegesStmt *stmt)
 grantee_uid = ACL_ID_PUBLIC;
 break;
 			default:
-check_rolespec_name((Node *) grantee,
-	"Cannot GRANT or REVOKE

Re: [HACKERS] Pg_stop_backup process does not run - Backup Intervals

2016-05-05 Thread Rodrigo Cavalcante

Hi,

Thanks for the feedback.

Log:

LOG:  connection authorized: user=postgres database=template1
LOG:  statement: select pg_start_backup('bkpfull',true);
ERROR:  a backup is already in progress
HINT:  Run pg_stop_backup() and try again.
STATEMENT:  select pg_start_backup('bkpfull',true);
LOG:  disconnection: session time: 0:00:00.165 user=postgres
database=template1 host=localhost port=45746


I will use the pg_basebackup.





--
View this message in context: 
http://postgresql.nabble.com/Pg-stop-backup-process-does-not-run-Backup-Intervals-tp5901538p5902082.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pg_dump vs. TRANSFORMs

2016-05-05 Thread Stephen Frost

* Peter Eisentraut (peter.eisentr...@2ndquadrant.com) wrote:
> On 5/4/16 11:23 PM, Tom Lane wrote:
> >Actually, I believe it will be dumped.  selectDumpableCast believes it
> >should dump casts with OID >= FirstNormalObjectId.  That's a kluge no
> >doubt, but reasonably effective; looks like we've been doing that since
> >9.0.
> >
> >pg_dump appears not to have a special-case selectDumpableTransform
> >routine.  Instead it falls back on the generic selectDumpableObject
> >for transforms.  That's a bad idea because the only useful bit of
> >knowledge selectDumpableObject has is to look at the containing
> >namespace ... and transforms don't have one, just as casts don't.
> >
> >My recommendation is to clone that cast logic for transforms.
> 
> Hmm.  The way I understand it is that Stephen wants to make dumping
> that test case work.  But note that that test case is bogus; it
> wouldn't actually do anything useful in practice.  There aren't any
> functions in the system catalog that could be used for actual
> transforms.  So making these changes in pg_dump isn't really of much
> practical value.  Perhaps it would be easier to change the test case
> or adjust the testing procedure?

I strongly disagree with the idea that this is only an issue with the
testing system.  What if we add functions in the future that are
created by initdb and *are* useful for transforms?  What about casts?
There are a lot of functions in pg_catalog that a user might wish to use
to define their own cast.  This also doesn't do anything about users
creating functions in pg_catalog.

Thanks!

Stephen


signature.asc
Description: Digital signature

Re: [HACKERS] pg_dump vs. TRANSFORMs

2016-05-05 Thread Stephen Frost

* Tom Lane (t...@sss.pgh.pa.us) wrote:
> Peter Eisentraut  writes:
> > On 5/4/16 2:39 PM, Stephen Frost wrote:
> >> These checks are looking for the functions used by the transform in the
> >> list of functions that pg_dump has loaded, but in 9.5, we don't load any
> >> of the function in pg_catalog, and even with my patches, we only dump
> >> the functions in pg_catalog that have an ACL which has been changed from
> >> the default.
> 
> > This issue is not specific to transforms.  For example, if you create a 
> > user-defined cast using a built-in function, the cast won't be dumped. 
> > Obviously, this is a problem, but it needs a more general solution.
> 
> Actually, I believe it will be dumped.  selectDumpableCast believes it
> should dump casts with OID >= FirstNormalObjectId.  That's a kluge no
> doubt, but reasonably effective; looks like we've been doing that since
> 9.0.

No, it isn't dumped.  An interesting example test case is this:

-
create function pg_catalog.toint(varchar) returns integer
  strict immutable language sql as
  'select cast($1 as integer);';

create cast (varchar as integer) with function toint(varchar);
-

Note that neither the function nor the cast is dumped, and this was with
9.5.  This is because:

getFuncs() intentionally skips all functions in pg_catalog (unless they
are members of extensions...).  dumpCast() attempts to find the function
referenced by the cast in the set of functions which getFuncs()
collected, and simply gives up and doesn't dump the cast if it can't
find the function.

For my 2c, this coding pattern:

--
funcInfo = findFuncByOid(cast->castfunc);
if (funcInfo == NULL)
return;
--

in pg_dump is really bad.  Thankfully, it looks like that's only
happening in dumpCast() and dumpTransform().  We used to have a similar
issue, it looks like, with extensions, which was fixed in 89c29c0.  It
looks like we need to either stop excluding the functions in pg_catalog
during getFuncs(), or add in more exceptions to the "don't collect info
about functions in pg_catalog" rule.

I'm also inclined to think that we should fix dumping of non-initdb
functions which have been created in pg_catalog.  I'm not sure how far
to take that though, as we have a similar issue for most object types
when it comes to pg_catalog.  Perhaps selectDumpableObject() should be
considering FirstNormalObjectId and constraining what component we dump
for from-initdb objects to only ACL, and pg_catalog, as a namespace,
should be set to dump all components (in HEAD, and just set to not dump
anything for from-initdb objects in back-branches, but everything for
user-defined objects).

> pg_dump appears not to have a special-case selectDumpableTransform
> routine.  Instead it falls back on the generic selectDumpableObject
> for transforms.  That's a bad idea because the only useful bit of
> knowledge selectDumpableObject has is to look at the containing
> namespace ... and transforms don't have one, just as casts don't.
> 
> My recommendation is to clone that cast logic for transforms.

We should do this also, if we don't change selectDumpableObject, but
we should do it because we might have from-initdb transforms one day
and the current logic would end up selecting those transforms to dump
out (though dumpTransform would end up skipping them currently because
they'd be referring to functions that we didn't get information about,
but hopefully we're going to fix that).

Thanks!

Stephen

signature.asc
Description: Digital signature

Re: [HACKERS] what to revert

2016-05-05 Thread Kevin Grittner

On Thu, May 5, 2016 at 3:39 AM, Ants Aasma  wrote:
> 5. mai 2016 6:14 AM kirjutas kuupäeval "Andres Freund" :
>>
>> On 2016-05-05 06:08:39 +0300, Ants Aasma wrote:
>>> On 5 May 2016 1:28 a.m., "Andres Freund"  wrote:
 On 2016-05-04 18:22:27 -0400, Robert Haas wrote:
> How would the semantics change?

 Right now the time for computing the snapshot is relevant, if
 maintenance of xids is moved, it'll likely be tied to the time xids
 are assigned. That seems perfectly alright, but it'll change behaviour.

Basically this feature allows pruning or vacuuming rows that would
still be visible to some snapshots, and then throws an error if one
of those snapshots is used for a scan that would generate incorrect
results because of the early cleanup.  There are already several
places that we relax the timings in such a way that it allows
better performance by not being as aggressive as theoretically
possible in the cleanup.  From my perspective, the performance
problems on NUMA when the feature is in use just show that this
approach wasn't taken far enough, and the solutions don't do
anything that isn't conceptually happening anyway.  Some rows that
currently get cleaned up in vacuum N will get cleaned up in vacuum
N+1 with the proposed changes, but I don't see that as a semantic
change.  In many of those cases we might be able to add some
locking and clean up the rows in vacuumm N-1, but nobody wants
that.

>>> FWIW moving the maintenance to a clock tick process will not change user
>>> visible semantics in any significant way. The change could easily be
>>> made in the next release.
>>
>> I'm not convinced of that - right now the timeout is computed as a
>> offset to the time a snapshot with a certain xmin horizon is
>> taken. Moving the computation to GetNewTransactionId() or a clock tick
>> process will make it relative to the time an xid has been generated
>> (minus a fuzz factor).  That'll behave differently in a number of cases,
>> no?

Not in what I would consider any meaningful way.  This feature is
not about trying to provoke the error, it is about preventing bloat
while minimizing errors.  I have gotten many emails off-list from
people asking whether the feature was broken because they had a
case which was running with correct results but not generating any
errors, and it often came down to such things as cursor use which
had materialized a result set -- correct results were returned from
the materialized cursor results, so no error was needed.  As long
as bloat is limited to something near the old_snapshot_threshold
and incorrect results are never returned the contract of this
feature is maintained.  It's reaching a little bit as a metaphore,
but we don't say that the semantics of autovacuum are changed in
any significant way based slight variations in the timing of
vacuums.

> Timeout is calculated in TransactionIdLimitedForOldSnapshots() as
> GetSnapshotCurrentTimestamp() minus old_snapshot_timeout rounded down to
> previous minute. If the highest seen xmin in that minute is useful in
> raising cleanup xmin the threshold is bumped. Snapshots switch behavior when
> their whenTaken is past this threshold. Xid generation time has no effect on
> this timing, it's completely based on passage of time.
>
> The needed invariant is, that no snapshot having whenTaken after timeout
> timestamp can have xmin less than calculated bound. Moving the xmin
> calculation and storage to clock tick actually makes the bound tighter. The
> ordering between xmin calculation and clok update/read needs to be correct
> to ensure the invariant.

Right.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] Delete query on materialized view

2016-05-05 Thread hari.prasath

Hi all

  I am trying to delete/insert a row on materialized view which has join 
from a UDF by using SPI_execute.

  

  Materialized views are not allowed to do any DML changes once created,so 
by bypassed that check by enabling MatViewIncrementalMaintenanceIsEnabled. so 
DML queries can be executed in MV. But i dont want to bypass that check i tried 
to change the relkind for the materialized view to 'r' (normal table relation) 
in pg_class table and tried the same query but its throwing an error.

## ERROR:  could not find attribute -1 in subquery targetlist ##



Is there a way to enable DML queries on materialized views which has join 
without overwriting matview_maintenance_depth to 1 as the method was local in 
matview.c?







cheers

- Harry

Re: [HACKERS] pg_dump vs. TRANSFORMs

2016-05-05 Thread Peter Eisentraut


On 5/4/16 11:23 PM, Tom Lane wrote:

Actually, I believe it will be dumped.  selectDumpableCast believes it
should dump casts with OID >= FirstNormalObjectId.  That's a kluge no
doubt, but reasonably effective; looks like we've been doing that since
9.0.

pg_dump appears not to have a special-case selectDumpableTransform
routine.  Instead it falls back on the generic selectDumpableObject
for transforms.  That's a bad idea because the only useful bit of
knowledge selectDumpableObject has is to look at the containing
namespace ... and transforms don't have one, just as casts don't.

My recommendation is to clone that cast logic for transforms.


Hmm.  The way I understand it is that Stephen wants to make dumping that 
test case work.  But note that that test case is bogus; it wouldn't 
actually do anything useful in practice.  There aren't any functions in 
the system catalog that could be used for actual transforms.  So making 
these changes in pg_dump isn't really of much practical value.  Perhaps 
it would be easier to change the test case or adjust the testing procedure?


--
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] atomic pin/unpin causing errors

2016-05-05 Thread Teodor Sigaev


I'll try to get a coredump after SIGSEGV, but it could take a time.


Got it!

#0  0x0008014321d7 in sbrk () from /lib/libc.so.7
#1  0x000801431ddd in sbrk () from /lib/libc.so.7
#2  0x00080142e5bb in sbrk () from /lib/libc.so.7
#3  0x00080142e085 in sbrk () from /lib/libc.so.7
#4  0x00080142de28 in sbrk () from /lib/libc.so.7
#5  0x00080142e1cf in sbrk () from /lib/libc.so.7
#6  0x000801439815 in free () from /lib/libc.so.7
#7  0x00080149e3d6 in nsdispatch () from /lib/libc.so.7
#8  0x0008014a41c6 in __cxa_finalize () from /lib/libc.so.7
#9  0x00080144525c in exit () from /lib/libc.so.7
#10 0x008e1bc2 in quickdie (postgres_signal_arg=3) at postgres.c:2623
#11 
#12 0x000801431847 in sbrk () from /lib/libc.so.7
#13 0x000801431522 in sbrk () from /lib/libc.so.7
#14 0x00080142d47f in sbrk () from /lib/libc.so.7
#15 0x000801434628 in malloc () from /lib/libc.so.7
#16 0x00aca278 in AllocSetAlloc (context=0x801c0bb88, size=24) at 
aset.c:853
#17 0x00acca0e in MemoryContextAlloc (context=0x801c0bb88, size=24) at 
mcxt.c:764

#18 0x00aebdb8 in PushActiveSnapshot (snap=0xf4ae10) at snapmgr.c:652
#19 0x008e54bd in exec_bind_message (input_message=0x7fffdf60) at 
postgres.c:1602
#20 0x008e3957 in PostgresMain (argc=1, argv=0x801d3c968, 
dbname=0x801d3c948 "teodor", username=0x801d3c928 "teodor") at postgres.c:4105

#21 0x00839744 in BackendRun (port=0x801c991c0) at postmaster.c:4258
#22 0x00838d54 in BackendStartup (port=0x801c991c0) at postmaster.c:3932
#23 0x00835617 in ServerLoop () at postmaster.c:1690
#24 0x00832c69 in PostmasterMain (argc=4, argv=0x7fffe420) at 
postmaster.c:1298

#25 0x0075f228 in main (argc=4, argv=0x7fffe420) at main.c:228

Seems, we have some memory corruption, but it could either separate or the same 
problem.


Another one:

#0  0x0008014321d7 in sbrk () from /lib/libc.so.7
#1  0x000801431ddd in sbrk () from /lib/libc.so.7
#2  0x00080142e5bb in sbrk () from /lib/libc.so.7
#3  0x00080142e085 in sbrk () from /lib/libc.so.7
#4  0x00080142de28 in sbrk () from /lib/libc.so.7
#5  0x00080142e1cf in sbrk () from /lib/libc.so.7
#6  0x000801439815 in free () from /lib/libc.so.7
#7  0x00080149e3d6 in nsdispatch () from /lib/libc.so.7
#8  0x0008014a41c6 in __cxa_finalize () from /lib/libc.so.7
#9  0x00080144525c in exit () from /lib/libc.so.7
#10 0x008e1bc2 in quickdie (postgres_signal_arg=3) at postgres.c:2623
#11 
#12 0x00080143277a in sbrk () from /lib/libc.so.7
#13 0x0008014318b5 in sbrk () from /lib/libc.so.7
#14 0x00080142e483 in sbrk () from /lib/libc.so.7
#15 0x00080142e75b in sbrk () from /lib/libc.so.7
#16 0x0008014398bd in free () from /lib/libc.so.7
#17 0x00aca676 in AllocSetFree (context=0x801e710d0, 
pointer=0x801e65038) at aset.c:976

#18 0x00acbe93 in pfree (pointer=0x801e65038) at mcxt.c:1015
#19 0x004a7986 in ginendscan (scan=0x801e61de0) at ginscan.c:445
#20 0x00504818 in index_endscan (scan=0x801e61de0) at indexam.c:339
#21 0x00719d21 in ExecEndBitmapIndexScan (node=0x801e619c8) at 
nodeBitmapIndexscan.c:183

#22 0x006fce9e in ExecEndNode (node=0x801e619c8) at execProcnode.c:685
#23 0x00719195 in ExecEndBitmapHeapScan (node=0x801d63700) at 
nodeBitmapHeapscan.c:508

#24 0x006fceaf in ExecEndNode (node=0x801d63700) at execProcnode.c:689
#25 0x0072b64a in ExecEndModifyTable (node=0x801d632a0) at 
nodeModifyTable.c:1978

#26 0x006fcde3 in ExecEndNode (node=0x801d632a0) at execProcnode.c:638
#27 0x006f6ed9 in ExecEndPlan (planstate=0x801d632a0, 
estate=0x801d63038) at execMain.c:1451
#28 0x006f6e56 in standard_ExecutorEnd (queryDesc=0x801e42af0) at 
execMain.c:468
#29 0x0008020038f2 in pgss_ExecutorEnd (queryDesc=0x801e42af0) at 
pg_stat_statements.c:938

#30 0x006f6d3c in ExecutorEnd (queryDesc=0x801e42af0) at execMain.c:437
#31 0x008ea387 in ProcessQuery (plan=0x801e43898, sourceText=0x801e42838 
"update foo set count=count+1 where text_array @> $1", params=0x801e428b8, 
dest=0xf3fcc8,

completionTag=0x7fffdd00 "UPDATE 1") at pquery.c:230
#32 0x008e9540 in PortalRunMulti (portal=0x801dc5038, isTopLevel=1 
'\001', dest=0xf3fcc8, altdest=0xf3fcc8, completionTag=0x7fffdd00 "UPDATE 
1") at pquery.c:1267
#33 0x008e8cd6 in PortalRun (portal=0x801dc5038, 
count=9223372036854775807, isTopLevel=1 '\001', dest=0x801c96450, 
altdest=0x801c96450,

completionTag=0x7fffdd00 "UPDATE 1") at pquery.c:813
#34 0x008e61ef in exec_execute_message (portal_name=0x801c96038 "", 
max_rows=9223372036854775807) at postgres.c:1979
#35 0x008e39ae in PostgresMain (argc=1, argv=0x801d56bc8, 
dbname=0x801d56ba8 "teodor", username=0x801d56b88 "teodor") at postgres.c:4122

#36 0x00839744 in BackendRun (port=0x801d571c0)

Re: [HACKERS] Naming of new tsvector functions

2016-05-05 Thread Gavin Flower


On 05/05/16 21:20, Stas Kelvich wrote:

On 04 May 2016, at 20:15, Tom Lane  wrote:

Stas Kelvich  writes:

On 04 May 2016, at 16:58, Tom Lane  wrote:
The other ones are not so problematic because they do not conflict with
SQL keywords.  It's only delete() and filter() that scare me.

Okay, so changed functions to ts_setweight, ts_delete, ts_unnest, ts_filter.

Somehow, I don't think you read what I wrote.

Renaming the pre-existing setweight() function to ts_setweight() is
not going to happen; it's been like that for half a dozen years now.
It would make no sense to call the new variant ts_setweight() while
keeping setweight() for the existing function, either.

Oh, I accidentally renamed one of the old functions, my mistake.


I also don't see that much point in ts_unnest(), since unnest()
in our implementation is a function not a keyword.  I don't have
a strong opinion about that one, though.

Just to keep some level of uniformity in function names. But also i’m
not insisting.


Also, I'd supposed that we'd rename to tsvector_something, since
the same patch also introduced tsvector_to_array() and
array_to_tsvector().  What's the motivation for using ts_ as the
prefix?

There is already several functions named ts_* (ts_rank, ts_headline, ts_rewrite)
and two named starting from tsvector_* (tsvector_update_trigger, 
tsvector_update_trigger_column).

Personally I’d prefer ts_ over tsvector_ since it is shorter, and still keeps 
semantics.


regards, tom lane


I've not been involved in doing any tsvector stuff, nor likely to in the 
near future - but if i was, I think I'd find simpler to get into if 
tsvector specific functions followed a common pattern of naming, like 
Stas is suggesting.



Cheers,
Gavin



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] More inaccurate results from numeric pow()

2016-05-05 Thread Dean Rasheed

On 2 May 2016 at 18:38, Tom Lane  wrote:
> I don't much care for the hardwired magic number here, especially since
> exp_var() does not have its limit expressed as "6000" but as
> "NUMERIC_MAX_RESULT_SCALE * 3".  I think you should rephrase the limit
> to use that expression, and also add something like this in exp_var():

OK, that makes sense. Done that way.

Regards,
Dean


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] atomic pin/unpin causing errors

2016-05-05 Thread Teodor Sigaev


Any chance you could package up that data directory for me to download?


Sent by personal email to Alexander, Andres and Jeff

In /var/log/message I found

May  4 22:04:07 xor kernel: pid 14010 (postgres), uid 1001: exited on signal 6 
(core dumped)
May  4 22:04:25 xor kernel: pid 14032 (postgres), uid 1001: exited on signal 11 
(core dumped)
May  4 22:04:52 xor kernel: pid 14037 (postgres), uid 1001: exited on signal 6 
(core dumped)



Sometimes postgres is crashed with SIGSEGV signal instead of SIGABRT (which 
comes form abort() in assert)


I'll try to get a coredump after SIGSEGV, but it could take a time.
--
Teodor Sigaev   E-mail: teo...@sigaev.ru
   WWW: http://www.sigaev.ru/


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Naming of new tsvector functions

2016-05-05 Thread Stas Kelvich

> On 04 May 2016, at 20:15, Tom Lane  wrote:
> 
> Stas Kelvich  writes:
>>> On 04 May 2016, at 16:58, Tom Lane  wrote:
>>> The other ones are not so problematic because they do not conflict with
>>> SQL keywords.  It's only delete() and filter() that scare me.
> 
>> Okay, so changed functions to ts_setweight, ts_delete, ts_unnest, ts_filter.
> 
> Somehow, I don't think you read what I wrote.
> 
> Renaming the pre-existing setweight() function to ts_setweight() is
> not going to happen; it's been like that for half a dozen years now.
> It would make no sense to call the new variant ts_setweight() while
> keeping setweight() for the existing function, either.

Oh, I accidentally renamed one of the old functions, my mistake.

> I also don't see that much point in ts_unnest(), since unnest()
> in our implementation is a function not a keyword.  I don't have
> a strong opinion about that one, though.

Just to keep some level of uniformity in function names. But also i’m
not insisting.

> Also, I'd supposed that we'd rename to tsvector_something, since
> the same patch also introduced tsvector_to_array() and
> array_to_tsvector().  What's the motivation for using ts_ as the
> prefix?

There is already several functions named ts_* (ts_rank, ts_headline, 
ts_rewrite) 
and two named starting from tsvector_* (tsvector_update_trigger, 
tsvector_update_trigger_column).

Personally I’d prefer ts_ over tsvector_ since it is shorter, and still keeps 
semantics.

> 
>   regards, tom lane


-- 
Stas Kelvich
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] what to revert

2016-05-05 Thread Ants Aasma

5. mai 2016 6:14 AM kirjutas kuupäeval "Andres Freund" :
>
> On 2016-05-05 06:08:39 +0300, Ants Aasma wrote:
> > On 5 May 2016 1:28 a.m., "Andres Freund"  wrote:
> > > On 2016-05-04 18:22:27 -0400, Robert Haas wrote:
> > > > How would the semantics change?
> > >
> > > Right now the time for computing the snapshot is relevant, if
> > > maintenance of xids is moved, it'll likely be tied to the time xids
are
> > > assigned. That seems perfectly alright, but it'll change behaviour.
> >
> > FWIW moving the maintenance to a clock tick process will not change user
> > visible semantics in any significant way. The change could easily be
made
> > in the next release.
>
> I'm not convinced of that - right now the timeout is computed as a
> offset to the time a snapshot with a certain xmin horizon is
> taken. Moving the computation to GetNewTransactionId() or a clock tick
> process will make it relative to the time an xid has been generated
> (minus a fuzz factor).  That'll behave differently in a number of cases,
no?

Timeout is calculated in TransactionIdLimitedForOldSnapshots() as
GetSnapshotCurrentTimestamp() minus old_snapshot_timeout rounded down to
previous minute. If the highest seen xmin in that minute is useful in
raising cleanup xmin the threshold is bumped. Snapshots switch behavior
when their whenTaken is past this threshold. Xid generation time has no
effect on this timing, it's completely based on passage of time.

The needed invariant is, that no snapshot having whenTaken after timeout
timestamp can have xmin less than calculated bound. Moving the xmin
calculation and storage to clock tick actually makes the bound tighter. The
ordering between xmin calculation and clok update/read needs to be correct
to ensure the invariant.

Regards,
Ants Aasma

Re: [HACKERS] New pgbench functions are misnamed

2016-05-05 Thread Fabien COELHO




I noticed that commit 7e137f846 added functions named max() and min()
to pgbench's expression syntax.  Unfortunately, these functions have
zilch to do with what max() and min() do in SQL.  They're actually more
like the greatest() and least() server-side functions.


Yep.

While I can't imagine that we'd ever want to implement true aggregates 
in pgbench expressions, it still seems like this is a recipe for 
confusion.  Shouldn't we rename these to greatest() and least()?


My 0,02€: I like the simplicity of min/max names and I think that anyone 
would manage to deal with this level of confusion in a pgbench script, so 
I would not bother.


But if it is to be changed, best do it now!

--
Fabien.
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [BUGS] Breakage with VACUUM ANALYSE + partitions

2016-05-05 Thread Fabien COELHO



Hello Tom,


I understood the point and I do not see real disadvantages. The C standard
really says that an enum is an int, and compilers just do that.


No, it doesn't say that, and compilers don't just do that.
A compiler is specifically allowed to store an enum in char or short if 
the enum's declared values would all fit into that width.


Hmm. That is a bit of a subtle one:

Spec says that enum *constants* are ints "The identifiers in an enumerator 
list are declared as constants that have type int and may appear wherever 
such are permitted."  But indeed, as you point out, per spec the storage 
could be made smaller.


However, I'm yet to meet a compiler which does not do ints:

  typedef enum { false, true } boolean;
  assert(sizeof(boolean) == sizeof(int)); // ok with gcc & clang

I can guess why: such a compiler would not be able to work with libraries 
compiled with gcc or clang, which would be an pretty annoying feature. Now 
it does not mean that such a compiler does not exist in some realm (8/16 
bits architectures maybe? but then ints would be 16 bits on these...).


(Admittedly, if you're choosing the values as powers of 2, an OR of them 
would still fit;


Yep.


but if you think "oh, an enum is just an int", you will get burned.)


In theory, yes. In practice, the compiler writer would have get burned 
before me:-).



* compiler warnings if you forget some members of the enum in a switch


Sure. Using a switch on a bitfield is an uncommon pattern, though.


* debugger ability to print variables symbolically


Yep.

Names are lost by defines, which is my preliminary concern to try to avoid 
them, even at the price of beting against the spec letter:-)



At that point you might as well use the #defines rather than playing
language lawyer about whether what you're doing meets the letter of
the spec.


Hmmm... the compilers are the real judges, the spec is just the law:-)

I note that C99 specifically mentions this as something a compiler might 
warn about: [...]


Indeed. Neither gcc nor clang emit such warnings... but they might some 
day, which would be a blow for my suggestion!


Another option would have been explicit bitfields, something like:

  typedef struct {
int create : 1,
return_null : 1,
fail : 1,
create_in_recovery : 1,
// ...
;
  } extension_bitfield_t;

  void foo(extension_bitfield_t behavior)
  {
if (behavior.create) printf("create...\n");
if (behavior.return_null) printf("null...\n");
if (behavior.fail) printf("fail...\n");
if (behavior.create_in_recovery) printf("recovery...\n");
  }

  void bla(void)
  {
foo((extension_bitfield_t) { .fail = 1, .create_in_recovery = 1 });
  }

With gdb it is quite readable:

  // (gdb) p behavior
  // $1 = {create = 0, return_null = 0, fail = -1, create_in_recovery = -1}

Anyway, thanks for the discussion!

--
Fabien.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Is pg_control file crashsafe?

2016-05-05 Thread Amit Kapila

On Thu, May 5, 2016 at 11:52 AM, Thomas Munro 
wrote:
>
> On Thu, May 5, 2016 at 4:32 PM, Tom Lane  wrote:
> > Amit Kapila  writes:
> >> How about using 512 bytes as a write size and perform direct writes
rather
> >> than going via OS buffer cache for control file?
> >
> > Wouldn't that fail outright under a lot of implementations of direct
write;
> > ie the request needs to be page-aligned, for some not-very-determinate
> > value of page size?
> >

Right, it should be atleast page size.

>
> > To repeat, I'm pretty hesitant to change this logic.  While this is not
> > the first report we've ever heard of loss of pg_control, I believe I
could
> > count those reports without running out of fingers on one hand --- and
> > that's counting since the last century. It will take quite a lot of
> > evidence to convince me that some other implementation will be more
> > reliable.  If you just come and present a patch to use direct write, or
> > rename, or anything else for that matter, I'm going to reject it out of
> > hand unless you provide very strong evidence that it's going to be more
> > reliable than the current code across all the systems we support.
>
> I'm not sure how those ideas address the reported problem anyway: the
> *length* was unexpectedly zero after a crash.  UpdateControlFile
> doesn't change the length of the control file, since it doesn't
> specify O_TRUNC or O_APPEND and it always writes the same size.  So it
> seems like a pretty weird failure mode affecting filesystem metadata
> (which I wouldn't expect to change anyway, but I would expect to be
> journaled if it did), not a file-contents-atomicity problem.  Whether
> or not the page cache is involved in a write to a preallocated file
> doesn't seem relevant to a case of unexpected truncation, and the
> atomic rename trick doesn't seem relevant either unless someone with
> expert knowledge of NTFS could explain how a crash could lead to
> truncation in the first place, and how rename would help.
>

I think the real reason for truncation is not known or not discussed here.
It seems to me that the ideas are being discussed on the mere speculation
that current way of writing can lead to corruption in certain cases.  I
think it would be better to first dig into the actual reason of problem.


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: [HACKERS] Segmentation fault when max_parallel degree is very High

2016-05-05 Thread Amit Kapila

On Wed, May 4, 2016 at 8:31 PM, Tom Lane  wrote:
>
> Dilip Kumar  writes:
> > When parallel degree is set to very high say 7, there is a
segmentation
> > fault in parallel code,
> > and that is because type casting is missing in the code..
>
> I'd say the cause is not having a sane range limit on the GUC.
>

I think it might not be advisable to have this value more than the number
of CPU cores, so how about limiting it to 512 or 1024?

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: [HACKERS] Is pg_control file crashsafe?

2016-05-05 Thread Thomas Munro

On Thu, May 5, 2016 at 4:32 PM, Tom Lane  wrote:
> Amit Kapila  writes:
>> How about using 512 bytes as a write size and perform direct writes rather
>> than going via OS buffer cache for control file?
>
> Wouldn't that fail outright under a lot of implementations of direct write;
> ie the request needs to be page-aligned, for some not-very-determinate
> value of page size?
>
> To repeat, I'm pretty hesitant to change this logic.  While this is not
> the first report we've ever heard of loss of pg_control, I believe I could
> count those reports without running out of fingers on one hand --- and
> that's counting since the last century. It will take quite a lot of
> evidence to convince me that some other implementation will be more
> reliable.  If you just come and present a patch to use direct write, or
> rename, or anything else for that matter, I'm going to reject it out of
> hand unless you provide very strong evidence that it's going to be more
> reliable than the current code across all the systems we support.

I'm not sure how those ideas address the reported problem anyway: the
*length* was unexpectedly zero after a crash.  UpdateControlFile
doesn't change the length of the control file, since it doesn't
specify O_TRUNC or O_APPEND and it always writes the same size.  So it
seems like a pretty weird failure mode affecting filesystem metadata
(which I wouldn't expect to change anyway, but I would expect to be
journaled if it did), not a file-contents-atomicity problem.  Whether
or not the page cache is involved in a write to a preallocated file
doesn't seem relevant to a case of unexpected truncation, and the
atomic rename trick doesn't seem relevant either unless someone with
expert knowledge of NTFS could explain how a crash could lead to
truncation in the first place, and how rename would help.

-- 
Thomas Munro
http://www.enterprisedb.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

81 matches

Mail list logo