Re: [HACKERS] Quorum commit for multiple synchronous replication.

2017-08-23 Thread Josh Berkus
On 08/22/2017 11:04 PM, Masahiko Sawada wrote:
> WARNING:  what you did is ok, but you might have wanted to do something else
> 
> First of all, whether or not that can properly be called a warning is
> highly debatable.  Also, if you do that sort of thing to your spouse
> and/or children, they call it "nagging".  I don't think users will
> like it any more than family members do.

Realistically, we'll support the backwards-compatible syntax for 3-5
years.  Which is fine.

I suggest that we just gradually deprecate the old syntax from the docs,
and then around Postgres 16 eliminate it.  I posit that that's better
than changing the meaning of the old syntax out from under people.

-- 
Josh Berkus
Containers & Databases Oh My!


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Quorum commit for multiple synchronous replication.

2017-08-10 Thread Josh Berkus
On 08/09/2017 10:49 PM, Michael Paquier wrote:
> On Fri, Aug 4, 2017 at 8:19 AM, Masahiko Sawada <sawada.m...@gmail.com> wrote:
>> On Fri, Jul 28, 2017 at 2:24 PM, Noah Misch <n...@leadboat.com> wrote:
>>> This item appears under "decisions to recheck mid-beta".  If anyone is going
>>> to push for a change here, now is the time.
>>
>> It has been 1 week since the previous mail. I thought that others had
>> argued for changing the behavior of the old-style setting so that
>> quorum commit is chosen. If nobody is going to push for a change, can
>> we live with the current behavior?
> 
> FWIW, I still see no harm in keeping backward-compatibility here, so I
> am in favor of the status quo.
> 

I am vaguely in favor of making quorum the default over "ordered".
However, given that anybody using sync commit without
understanding/customizing the setup is going to be sorry regardless,
keeping backwards compatibility is acceptable.
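
To make the difference between the two modes concrete, here is a minimal
sketch in Python (not PostgreSQL code).  It assumes the PostgreSQL 10
semantics of synchronous_standby_names: the priority-based ("ordered")
form waits for the first N connected standbys in list order, and the
quorum form waits for any N of the listed standbys.  The function names
and standby names are hypothetical.

```python
def release_ordered(listed, connected, acked, num_sync):
    """Priority ("ordered") mode: only the first num_sync connected
    standbys, taken in list order, count as synchronous."""
    sync = [s for s in listed if s in connected][:num_sync]
    return len(sync) == num_sync and all(s in acked for s in sync)


def release_quorum(listed, acked, num_sync):
    """Quorum mode: any num_sync of the listed standbys may confirm."""
    return sum(1 for s in listed if s in acked) >= num_sync


listed = ["s1", "s2", "s3"]
connected = {"s1", "s2", "s3"}
acked = {"s2", "s3"}  # s1 is connected but has not confirmed yet

print(release_ordered(listed, connected, acked, 2))  # False
print(release_quorum(listed, acked, 2))              # True
```

With the same standby list, a single slow high-priority standby holds up
commits in ordered mode but not in quorum mode, which is why the choice
of default matters to anyone who has not customized the setup.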

-- 
Josh Berkus
Containers & Databases Oh My!


[HACKERS] Better error message for trying to drop a DB with open subscriptions?

2017-07-20 Thread Josh Berkus
All:

The problem:

postgres=# drop database bookdata;
ERROR:  database "bookdata" is being accessed by other users
DETAIL:  There is 1 other session using the database.
postgres=# \c bookdata
You are now connected to database "bookdata" as user "postgres".
bookdata=# drop subscription wholedb;
NOTICE:  dropped replication slot "wholedb" on publisher
DROP SUBSCRIPTION
bookdata=# \c postgres
You are now connected to database "postgres" as user "postgres".
postgres=# drop database bookdata;
DROP DATABASE

Is there any easy way for us to detect that the "user" accessing the
target database is actually a logical replication subscription, and give
the DBA a better error message (e.g. "database 'bookdata' still has open
subscriptions")?

-- 
Josh Berkus
Containers & Databases Oh My!


Re: [HACKERS] Notes on testing Postgres 10b1

2017-06-10 Thread Josh Berkus
On 06/09/2017 07:54 PM, Greg Stark wrote:
> On 7 June 2017 at 01:01, Josh Berkus <j...@berkus.org> wrote:
>> P3: apparently jsonb_to_tsvector with lang parameter isn't immutable?
>> This means that it can't be used for indexing:
>>
>> libdata=# create index bookdata_fts on bookdata using gin ((
>> to_tsvector('english',bookdata)));
>> ERROR:  functions in index expression must be marked IMMUTABLE
> 
> I don't have a machine handy to check on but isn't this a strange
> thing to do? Isn't there a GIN opclass on jsonb itself which would be
> the default if you didn't have that to_tsvector() call  -- and which
> would also work properly with the jsonb operators?
> 

The above is the documented way to create an FTS index on a JSONB field.

-- 
Josh Berkus
Containers & Databases Oh My!


[HACKERS] jsonb_to_tsvector should be immutable

2017-06-08 Thread Josh Berkus
Wanted to pull this out of my general report, because nobody seems to
have seen it:

P3: apparently jsonb_to_tsvector with lang parameter isn't immutable?
This means that it can't be used for indexing:

libdata=# create index bookdata_fts on bookdata using gin ((
to_tsvector('english',bookdata)));
ERROR:  functions in index expression must be marked IMMUTABLE

... and indeed it's not:

select proname, prosrc, proargtypes, provolatile from pg_proc where
proname = 'to_tsvector';
   proname   |         prosrc         | proargtypes | provolatile
-------------+------------------------+-------------+-------------
 to_tsvector | jsonb_to_tsvector      | 3802        | s
 to_tsvector | to_tsvector_byid       | 3734 25     | i
 to_tsvector | to_tsvector            | 25          | s
 to_tsvector | json_to_tsvector       | 114         | s
 to_tsvector | jsonb_to_tsvector_byid | 3734 3802   | s
 to_tsvector | json_to_tsvector_byid  | 3734 114    | s

Both of the _byid functions should be marked immutable, no?  Otherwise
how can users use the new functions for indexing?
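
A small Python sketch of the check being made above, using the catalog
rows as listed.  The only added facts are the type OIDs (3734 is
regconfig, 25 is text, 114 is json, 3802 is jsonb); the variable names
are illustrative.

```python
REGCONFIG_OID = "3734"

# (prosrc, proargtypes, provolatile) copied from the pg_proc output above
rows = [
    ("jsonb_to_tsvector", "3802", "s"),
    ("to_tsvector_byid", "3734 25", "i"),
    ("to_tsvector", "25", "s"),
    ("json_to_tsvector", "114", "s"),
    ("jsonb_to_tsvector_byid", "3734 3802", "s"),
    ("json_to_tsvector_byid", "3734 114", "s"),
]

# Overloads that take an explicit configuration but are not IMMUTABLE ('i'),
# and so are rejected in an index expression.
blocked = [src for src, args, vol in rows
           if args.split()[0] == REGCONFIG_OID and vol != "i"]

print(blocked)  # ['jsonb_to_tsvector_byid', 'json_to_tsvector_byid']
```

Only the text variant with an explicit configuration is marked 'i'; the
json and jsonb variants with the same first argument are the odd ones out.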



-- 
Josh Berkus
Containers & Databases Oh My!


Re: [HACKERS] Notes on testing Postgres 10b1

2017-06-08 Thread Josh Berkus
On 06/07/2017 06:37 PM, Peter Eisentraut wrote:
> On 6/7/17 21:19, Josh Berkus wrote:
>> The user's first thought is going to be a network issue, or a bug, or
>> some other problem, not a missing PK.  Yeah, they can find that
>> information in the logs, but only if they think to look for it in the
>> first place, and in some environments (AWS, containers, etc.) logs can
>> be very hard to access.
> 
> You're not going to get very far with using this feature if you are not
> looking in the logs for errors.  These are asynchronously operating
> background workers, so the only way they can communicate problems is
> through the log.

Well, we *could* provide a system view, as we now do for archiving, and
for the same reasons.

The issue isn't that the error detail is in the log.  It's somehow
letting the user know that they need to look at the log, as opposed to
somewhere else.  Consider that this is asynchronous for the user as well;
they are likely to find out about the broken replication well after it
happens, and thus have a lot of log to search through.

Activity logs are a *terrible* UI for debugging systems problems.  I
realize that there is information it's hard for us to provide any other
way.  But the logs should be our "monitoring of last resort", where we
put stuff after we've run out of ideas on where else to put it, because
they are the hardest thing to access for a user.

-- 
Josh Berkus
Containers & Databases Oh My!


Re: [HACKERS] Notes on testing Postgres 10b1

2017-06-08 Thread Josh Berkus
On 06/07/2017 07:01 PM, Petr Jelinek wrote:
> On 08/06/17 03:50, Josh Berkus wrote:
>> On 06/07/2017 06:25 PM, Petr Jelinek wrote:
>>> On 08/06/17 03:19, Josh Berkus wrote:
>>>>
>>>> Peter and Petr:
>>>>
>>>> On 06/07/2017 05:24 PM, Peter Eisentraut wrote:
>>>>> On 6/7/17 01:01, Josh Berkus wrote:
>>>>>> * Having defaults on the various _workers all devolve from max_workers
>>>>>> is also great.
>>>>>
>>>>> I'm not aware of anything like that happening.
>>>>>
>>>>>> P1. On the publishing node, logical replication relies on the *implied*
>>>>>> correspondence of the application_name and the replication_slot both
>>>>>> being named the same as the publication in order to associate a
>>>>>> particular publication with a particular replication connection.
>>>>>> However, there's absolutely nothing preventing me from also creating a
>>>>>> binary replication connection by the same name.  It really seems like we
>>>>>> need a field in pg_stat_replication or pg_replication_slots which lists
>>>>>> the publication.
>>>>>
>>>>> I'm not quite sure what you are getting at here.  The application_name
>>>>> seen on the publisher side is the subscription name.  You can create a
>>>>> binary replication connection using the same application_name, but
>>>>> that's already been possible before.  But the publications don't care
>>>>> about any of this.
>>>>
>>>> My point is that there is no system view where I can see, on the origin
>>>> node, what subscribers are subscribing to which publications.  You can
>>>> kinda guess that from pg_stat_replication etc., but it's not dependable
>>>> information.
>>>>
>>>
>>> That's like wanting the foreign server to show you which foreign tables
>>> exist on the local server. This is not a tightly coupled system and you
>>> are able to set up both sides without them being connected to each other
>>> at the time of setup, so there is no way publisher can know anything.
>>
>> Why wouldn't the publisher know who's connected once the replication
>> connection has been made and the subscription has started?  Or is it just
>> a log position, and the publisher really has no idea how many
>> publications are being consumed?
>>
> 
> Plugin knows while the connection exists, but that's the thing, it goes
> through pluggable interface (that can be used by other plugins, without
> publications) so there would have to be some abstracted way for plugins
> to give some extra information for the pg_stat_replication or similar
> view. I am afraid it's a bit too late to design something like that in
> PG10 cycle.

OK, consider it a feature request for PG11, then.


-- 
Josh Berkus
Containers & Databases Oh My!


Re: [HACKERS] Notes on testing Postgres 10b1

2017-06-07 Thread Josh Berkus
On 06/07/2017 06:25 PM, Petr Jelinek wrote:
> On 08/06/17 03:19, Josh Berkus wrote:
>>
>> Peter and Petr:
>>
>> On 06/07/2017 05:24 PM, Peter Eisentraut wrote:
>>> On 6/7/17 01:01, Josh Berkus wrote:
>>>> * Having defaults on the various _workers all devolve from max_workers
>>>> is also great.
>>>
>>> I'm not aware of anything like that happening.
>>>
>>>> P1. On the publishing node, logical replication relies on the *implied*
>>>> correspondence of the application_name and the replication_slot both
>>>> being named the same as the publication in order to associate a
>>>> particular publication with a particular replication connection.
>>>> However, there's absolutely nothing preventing me from also creating a
>>>> binary replication connection by the same name.  It really seems like we
>>>> need a field in pg_stat_replication or pg_replication_slots which lists
>>>> the publication.
>>>
>>> I'm not quite sure what you are getting at here.  The application_name
>>> seen on the publisher side is the subscription name.  You can create a
>>> binary replication connection using the same application_name, but
>>> that's already been possible before.  But the publications don't care
>>> about any of this.
>>
>> My point is that there is no system view where I can see, on the origin
>> node, what subscribers are subscribing to which publications.  You can
>> kinda guess that from pg_stat_replication etc., but it's not dependable
>> information.
>>
> 
> That's like wanting the foreign server to show you which foreign tables
> exist on the local server. This is not a tightly coupled system and you
> are able to set up both sides without them being connected to each other
> at the time of setup, so there is no way publisher can know anything.

Why wouldn't the publisher know who's connected once the replication
connection has been made and the subscription has started?  Or is it just
a log position, and the publisher really has no idea how many
publications are being consumed?


-- 
Josh Berkus
Containers & Databases Oh My!


Re: [HACKERS] Notes on testing Postgres 10b1

2017-06-07 Thread Josh Berkus

Peter and Petr:

On 06/07/2017 05:24 PM, Peter Eisentraut wrote:
> On 6/7/17 01:01, Josh Berkus wrote:
>> * Having defaults on the various _workers all devolve from max_workers
>> is also great.
> 
> I'm not aware of anything like that happening.
> 
>> P1. On the publishing node, logical replication relies on the *implied*
>> correspondence of the application_name and the replication_slot both
>> being named the same as the publication in order to associate a
>> particular publication with a particular replication connection.
>> However, there's absolutely nothing preventing me from also creating a
>> binary replication connection by the same name.  It really seems like we
>> need a field in pg_stat_replication or pg_replication_slots which lists
>> the publication.
> 
> I'm not quite sure what you are getting at here.  The application_name
> seen on the publisher side is the subscription name.  You can create a
> binary replication connection using the same application_name, but
> that's already been possible before.  But the publications don't care
> about any of this.

My point is that there is no system view where I can see, on the origin
node, what subscribers are subscribing to which publications.  You can
kinda guess that from pg_stat_replication etc., but it's not dependable
information.


>> P2: If I create a subscription on a table with no primary key, I do not
>> receive a warning.  There should be a warning, since in most cases such
>> a subscription will not work.  I suggest the text:
>>
>> "logical replication target relation "public.fines" has no primary key.
>> Either create one, or set REPLICA IDENTITY index and set the published
>> relation to REPLICA IDENTITY FULL."
> 
> At that point, we don't know what is being published.  If only inserts
> are being published or REPLICA IDENTITY FULL is set, then it will work.
> We don't want to give warnings about things that might not be true.
> 
> More guidance on some of the potential failure cases would be good, but
> it would need more refinement.

Hmmm, yah, I see.  Let me explain why this is a UX issue as-is though:

1. User forgets to create a PK on the subscriber node.

2. User starts a subscription to the tables.

3. Subscription is successful.

4. First update hits the publisher node.

5. Subscription fails and disconnects.

The user's first thought is going to be a network issue, or a bug, or
some other problem, not a missing PK.  Yeah, they can find that
information in the logs, but only if they think to look for it in the
first place, and in some environments (AWS, containers, etc.) logs can
be very hard to access.

We really need the subscription to fail at step (2), not wait for the
first update to fail.  And if it doesn't fail at step 2, then we should
at least give a warning.
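
As a rough illustration, the check requested at step 2 could follow a
rule like the one below.  This is a Python sketch, not how the
subscription code works: the function name, its arguments, and the exact
message are hypothetical.  It folds in Peter's two caveats, that
insert-only publications and REPLICA IDENTITY FULL are fine.

```python
def subscription_warning(table, has_primary_key, replica_identity_full,
                         published_ops):
    """Return a warning string, or None when no warning is needed."""
    # Only updates and deletes need a way to identify the target row.
    needs_identity = bool({"update", "delete"} & set(published_ops))
    if needs_identity and not (has_primary_key or replica_identity_full):
        return ('logical replication target relation "%s" has no '
                'primary key' % table)
    return None


# No PK, updates published: warn when the subscription is created.
print(subscription_warning("public.fines", False, False,
                           {"insert", "update"}))
# Insert-only publication: nothing to warn about.
print(subscription_warning("public.fines", False, False, {"insert"}))
```

The hard part Peter points out remains: at subscription time the
subscriber would first have to learn what the publication publishes.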

-- 
Josh Berkus
Containers & Databases Oh My!


[HACKERS] Notes on testing Postgres 10b1

2017-06-06 Thread Josh Berkus
Folks,

I've put together some demos on PostgreSQL 10beta1.  Here are a few
feedback notes based on my experience with it.

Things I tested
---------------

* Logical replication pub/sub with replicating only two tables out of a
12-table FK hierarchy, including custom data types

* Partitioning a log-structured table, including a range type, exclusion
constraint, and foreign key.

* Various Parallel index queries on a 100m-row pgbench table

* Full text JSON search in a books database

* SCRAM authentication for local connections and replication

Positive changes beyond the obvious
-----------------------------------

* Yay defaults with replication on!

* Having defaults on the various _workers all devolve from max_workers
is also great.

* Constraint exclusion + partitioning Just Worked.

Questions
---------

Q1. Why does wal_level default to "replica" and not "logical"?

Q2: I thought we were going to finally change the pg_dump default to
"custom" format in this release?  No?

Problems
--------

P1. On the publishing node, logical replication relies on the *implied*
correspondence of the application_name and the replication_slot both
being named the same as the publication in order to associate a
particular publication with a particular replication connection.
However, there's absolutely nothing preventing me from also creating a
binary replication connection by the same name.  It really seems like we
need a field in pg_stat_replication or pg_replication_slots which lists
the publication.


P2: If I create a subscription on a table with no primary key, I do not
receive a warning.  There should be a warning, since in most cases such
a subscription will not work.  I suggest the text:

"logical replication target relation "public.fines" has no primary key.
Either create one, or set REPLICA IDENTITY index and set the published
relation to REPLICA IDENTITY FULL."


P3: apparently jsonb_to_tsvector with lang parameter isn't immutable?
This means that it can't be used for indexing:

libdata=# create index bookdata_fts on bookdata using gin ((
to_tsvector('english',bookdata)));
ERROR:  functions in index expression must be marked IMMUTABLE

... and indeed it's not:

select proname, prosrc, proargtypes, provolatile from pg_proc where
proname = 'to_tsvector';
   proname   |         prosrc         | proargtypes | provolatile
-------------+------------------------+-------------+-------------
 to_tsvector | jsonb_to_tsvector      | 3802        | s
 to_tsvector | to_tsvector_byid       | 3734 25     | i
 to_tsvector | to_tsvector            | 25          | s
 to_tsvector | json_to_tsvector       | 114         | s
 to_tsvector | jsonb_to_tsvector_byid | 3734 3802   | s
 to_tsvector | json_to_tsvector_byid  | 3734 114    | s

... can we fix that?


-- 
Josh Berkus
Containers & Databases Oh My!


[HACKERS] Missing feature in Phrase Search?

2017-04-21 Thread Josh Berkus
Oleg, Teodor, folks:

I was demo'ing phrase search for a meetup yesterday, and the user
feedback I got showed that there's a missing feature with phrase search.
 Let me explain by example:


'fix <-> error' will match 'fixed error', 'fixing error'
but not 'fixed language error' or 'fixed a small error'

'fix <2> error' will match 'fixed language error',
but not 'fixing error' or 'fixed a small error'

'fix <3> error' will match 'fixed a small error',
but not any of the other strings.


This is because the # in <#> is an exact match.

Seems like we could really use a way for users to indicate that they
want a range of word gaps.  Like, in the example above, users could
search on:

'fix <1:3> error'

... which would search for any phrase where "error" followed "fix" by
between 1 and 3 words.

Not wedded to any particular syntax for that, of course.
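
For illustration only, here is a minimal Python sketch of the requested
range behaviour, outside of PostgreSQL.  It assumes lexeme positions are
plain word offsets and uses a deliberately naive suffix-stripper in place
of a real stemmer; normalize and followed_within are hypothetical names.

```python
def normalize(word):
    """Naive stand-in for a stemmer: strips a trailing 'ing' or 'ed'."""
    for suffix in ("ing", "ed"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word


def followed_within(text, first, second, lo, hi):
    """True if `second` follows `first` by between lo and hi positions."""
    lexemes = [normalize(w) for w in text.lower().split()]
    first_pos = [i for i, w in enumerate(lexemes) if w == first]
    second_pos = [i for i, w in enumerate(lexemes) if w == second]
    return any(lo <= s - f <= hi for f in first_pos for s in second_pos)


for phrase in ("fixing error", "fixed language error", "fixed a small error"):
    # Equivalent of the proposed 'fix <1:3> error'
    print(phrase, followed_within(phrase, "fix", "error", 1, 3))
```

All three phrases match the 1-to-3 range, whereas an exact distance
(lo equal to hi) matches only one of them, as in the examples above.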

-- 
Josh Berkus
Containers & Databases Oh My!


Re: [HACKERS] 2017-03 CF Closed

2017-04-08 Thread Josh Berkus
On 04/08/2017 09:12 AM, Tom Lane wrote:
> Michael Paquier <michael.paqu...@gmail.com> writes:
>> On Sat, Apr 8, 2017 at 11:27 PM, Andres Freund <and...@anarazel.de> wrote:
>>> Thanks for your work on managing the fest!
> 
>> +1. Great work!
> 
> Seconded.  That's a huge amount of generally-underappreciated work.
> 

And only a week over schedule.  Good work.


-- 
Josh Berkus
Containers & Databases Oh My!


Re: [HACKERS] SQL/JSON in PostgreSQL

2017-03-10 Thread Josh Berkus
On 03/09/2017 10:12 AM, Sven R. Kunze wrote:
> On 08.03.2017 20:52, Magnus Hagander wrote:
>> On Wed, Mar 8, 2017 at 11:48 AM, Peter van Hardenberg <p...@pvh.ca> wrote:
>>
>> Small point of order: YAML is not strictly a super-set of JSON.
>>
>> Editorializing slightly, I have not seen much interest in the
>> world for YAML support though I'd be interested in evidence to the
>> contrary.
>>
>>
>> The world of configuration management seems to for some reason run off
>> YAML, but that's the only place I've seen it recently (ansible,
>> puppet etc).
> 
> SaltStack uses YAML for their tools, too. I personally can empathize
> with them (as a user of configuration management) about this as writing
> JSON would be nightmare with all the quoting, commas, curly braces etc.
> But that's my own preference maybe.
> 
> (Btw. does "run off" mean like or avoid? At least my dictionaries tend
> to the latter.)

Yes, but automated tools can easily convert between JSON and
newline-delimited YAML and back.

-- 
Josh Berkus
Containers & Databases Oh My!


Re: [HACKERS] Proposal for changes to recovery.conf API

2017-03-07 Thread Josh Berkus
On 03/02/2017 07:13 AM, David Steele wrote:
> Hi Simon,
> 
> On 2/25/17 2:43 PM, Simon Riggs wrote:
>> On 25 February 2017 at 13:58, Michael Paquier <michael.paqu...@gmail.com> wrote:
>>
>>> - trigger_file is removed.
>>> FWIW, my only complaint is about the removal of trigger_file; it is
>>> useful for detecting a trigger file on a different partition than PGDATA!
>>> Keeping it also costs nothing.
>>
>> Sorry, that is just an error of implementation, not intention. I had
>> it on my list to keep, at your request.
>>
>> New version coming up.
> 
> Do you have an idea when the new version will be available?

Please?  Having yet another PostgreSQL release go by without fixing
recovery.conf would make me very sad.


-- 
Josh Berkus
Containers & Databases Oh My!


Re: [HACKERS] Proposal for changes to recovery.conf API

2017-02-26 Thread Josh Berkus
On 02/26/2017 12:55 AM, Robert Haas wrote:
> On Wed, Jan 11, 2017 at 11:23 PM, Simon Riggs <si...@2ndquadrant.com> wrote:
>>> I think the issue was that some people didn't want configuration files
>>> in the data directory.  By removing recovery.conf we accomplish that.
>>> Signal/trigger files are not configuration (or at least it's much easier
>>> to argue that), so I think having them in the data directory is fine.
>>
>> There were a considerable number of people that pushed to make the
>> data directory non-user writable, which is where the signal directory
>> came from.
> 
> Specifically, it's a problem for Debian's packaging conventions,
> right?  The data directory can contain anything that the server itself
> will write, but configuration files that are written for the server to
> read are supposed to go in some external location dictated by Debian's
> packaging policy.
> 
> Things like trigger files aren't configuration files per se, so maybe
> it's OK if those still get written into the data directory.  Even if
> not, that seems like a separate patch.  In my view, based on Michael's
> description of what the current patch version does, it's a clear step
> forward.  Other steps can be taken at another time, if required.
> 

From the perspective of containerized Postgres, you want config files to
go into one (non-writeable) directory, and anything which is writeable
by the DB server to go into another directory (and preferably, a single
directory).

A trigger file (that is, assuming an empty one, and recovery config
merged with postgresql.conf) would thus be writeable, non-configuration data
which goes in the data directory.

Users manually writing the trigger file doesn't show up as a problem
since, in a containerized environment, they can't.  It's either written
by postgres itself, or by management software which runs as the postgres
user.

-- 
Josh Berkus
Containers & Databases Oh My!


Re: [HACKERS] Official adoption of PGXN

2017-02-14 Thread Josh Berkus
On 02/14/2017 12:05 PM, Tom Lane wrote:
> Jim Nasby <jim.na...@bluetreble.com> writes:
>> First, just to clarify: my reasons for proposing "core adoption" of PGXN 
>> are not technical in nature.
> 
> What do you think "core adoption" means?  Surely not that anything
> associated with PGXN would be in the core distro.

One part of this would need to be having a designated committee of the
Postgres community pick a set of "blessed" extensions for packagers to
package.  Right now, contrib serves that purpose (badly).  One of the
reasons we haven't dealt with the extension distribution problem is that
nobody wanted to take on the issue of picking a list of blessed extensions.

> 
>> Right now contrib is serving two completely separate purposes:
> 
>> 1) location for code that (for technical reasons) should be tied to 
>> specific PG versions
>> 2) indication of "official endorsement" of a module by the community
> 
> This argument ignores what I think is the real technical reason for
> keeping contrib, which is to have a set of close-at-hand test cases
> for extension and hook mechanisms.  Certainly, not every one of the
> existing contrib modules is especially useful for that purpose, but
> quite a few of them are.

Yes.  But there's a bunch that aren't, and those are the ones which we
previously discussed, the ones with indifferent maintenance like ISN and
Intarray.

You have to admit that it seems really strange in the eyes of a new user
that ISN is packaged with PostgreSQL, whereas better-written and more
popular extensions (like plv8, pg_partman or pgq) are not.


-- 
Josh Berkus
Containers & Databases Oh My!


Re: [HACKERS] Removal of deprecated views pg_user, pg_group, pg_shadow

2017-02-13 Thread Josh Berkus
On 02/13/2017 11:00 AM, Robert Haas wrote:
> My big objection to removing these views is that it will break pgAdmin
> 3, which uses all three of these views.  I understand that the pgAdmin
> community is now moving away from pgAdmin 3 and toward pgAdmin 4, but
> I bet that pgAdmin 3 still has significant usage and will continue to
> have significant usage for at least a year or three.  When it's
> thoroughly dead, then I think we can go ahead and do this, unless
> there are other really important tools still depending on those views,
> but it's only been 3 months since the final pgAdmin 3 release.

How long would you suggest is appropriate?  Postgres 11? 12?  Let's set
a target date; that way we can communicate it more than once.

-- 
Josh Berkus
Containers & Databases Oh My!


Re: [HACKERS] removing tsearch2

2017-02-10 Thread Josh Berkus
On 02/10/2017 10:18 AM, Peter Geoghegan wrote:
> On Fri, Feb 10, 2017 at 3:28 AM, Robert Haas <robertmh...@gmail.com> wrote:
>>>> Works for me.
>>>
>>> +1
>>
>> OK, that's three votes in favor of removing tsearch2 (from core,
>> anyone who wants it can maintain a copy elsewhere).
> 
> +1.
> 
> I'd also be in favor of either removing contrib/isn, or changing it so
> that the ISBN country code prefix enforcement went away. That would
> actually not imply any real loss of functionality from a practical
> perspective, since you can still enforce that the check digit is
> correct without any of that. I think that the existing design of some
> parts of contrib/isn is just horrible.

+1 to quick-fix it, -1 to just delete it.

There's a bunch of these things in /contrib which really ought to be
PGXN extensions (also CUBE, earthdistance, etc.).  However, one of the
steps in that would be getting the mainstream platforms to package them
so that users have a reasonable upgrade path, so I would not propose
doing it for 10.
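
On the quick-fix option: validating the check digit needs no
country-prefix tables at all.  A minimal Python sketch for ISBN-13,
assuming the standard alternating 1/3 weighting; this is not the
contrib/isn code, and the function name is hypothetical.

```python
def isbn13_check_digit_ok(isbn):
    """Validate only the ISBN-13 check digit; hyphens and spaces are ignored."""
    digits = [int(c) for c in isbn if c in "0123456789"]
    if len(digits) != 13:
        return False
    # Weights alternate 1, 3, 1, 3, ... across all 13 digits; a valid
    # number sums to a multiple of 10.
    total = sum(d * (3 if i % 2 else 1) for i, d in enumerate(digits))
    return total % 10 == 0


print(isbn13_check_digit_ok("978-0-306-40615-7"))  # True
print(isbn13_check_digit_ok("978-0-306-40615-8"))  # False
```

Nothing here depends on which prefixes or registration groups have been
assigned, so it would not go stale the way a hard-coded prefix table does.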

-- 
Josh Berkus
Containers & Databases Oh My!


Re: [HACKERS] removing tsearch2

2017-02-10 Thread Josh Berkus
On 02/10/2017 06:41 AM, David Steele wrote:
> On 2/10/17 6:28 AM, Robert Haas wrote:
>> On Thu, Feb 9, 2017 at 7:37 PM, Andres Freund <and...@anarazel.de> wrote:
>>> On 2017-02-09 19:19:21 -0500, Stephen Frost wrote:
>>>> * Robert Haas (robertmh...@gmail.com) wrote:
>>>>> On Thu, Feb 9, 2017 at 4:24 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
>>>>>> Also, our experience with contrib/tsearch2 suggests that the extension
>>>>>> shouldn't be part of contrib, because we have zero track record of
>>>>>> getting rid of stuff in contrib, no matter how dead it is.
>>>>>
>>>>> Let's nuke tsearch2 to remove this adverse precedent, and then add the
>>>>> new thing.
>>>>>
>>>>> Anybody who still wants tsearch2 can go get it from an old version, or
>>>>> somebody can maintain a fork on github.
>>>>
>>>> Works for me.
>>>
>>> +1
>>
>> OK, that's three votes in favor of removing tsearch2 (from core,
>> anyone who wants it can maintain a copy elsewhere).  Starting a new
>> thread to make sure we collect all the relevant votes, but I really,
>> really think it's past time for this to go away.  The last actual
>> change to tsearch2 which wasn't part of a wider cleanup was
>> 3ca7eddbb7c4803729d385a0c9535d8a972ee03f in January 2009, so it's been
>> 7 years since there's been any real work done on this -- and the
>> release where we brought tsearch into core is now EOL, plus three more
>> releases besides.
> 
> +1
> 

+1

-- 
Josh Berkus
Containers & Databases Oh My!


Re: [HACKERS] [PATCH] Rename pg_switch_xlog to pg_switch_wal

2017-02-09 Thread Josh Berkus
On 02/09/2017 05:19 PM, Michael Paquier wrote:
> On Fri, Feb 10, 2017 at 10:16 AM, Stephen Frost <sfr...@snowman.net> wrote:
>>> As someone mentioned, forcing a user to install an extension makes
>>> the deprecation visible. Another option would be to have the backend
>>> spit out a WARNING the first time you access anything that's
>>> deprecated. Both of those are pertinent reminders to people that
>>> they need to change their tools.
>>
>> Ugh.  Please, no.  Hacking up the backend to recognize that a given
>> query is referring to a deprecated view and then throwing a warning on
>> it is just plain ugly.
>>
>> Let's go one step further, and throw an ERROR if someone tries to query
>> these views instead.
> 
> FWIW, I am of the opinion to just nuke them as the "sort of"
> deprecation period has been very long. Applications should have
> switched to pg_authid and pg_roles long ago already.
> 

We will definitely break a lot of client code by removing these -- I
know that, deprecated or not, a lot of infrequently-updated
driver/orm/GUI code still refers to pg_shadow/pg_user.

I think Postgres 10 is the right time to break that code (I mean, we
have to do it someday, and we're already telling people about breakage
in 10), but be aware that there will be shouting and carrying on.

-1 on a warning.  Very little code today which references the deprecated
code is interactive, so who's going to see the warning?

-- 
Josh Berkus
Containers & Databases Oh My!


Re: [HACKERS] [PATCH] Rename pg_switch_xlog to pg_switch_wal

2017-02-09 Thread Josh Berkus
On 02/09/2017 12:53 PM, Stephen Frost wrote:
> * Josh Berkus (j...@berkus.org) wrote:
>> On 02/09/2017 12:42 PM, Stephen Frost wrote:
>>> * Josh Berkus (j...@berkus.org) wrote:
>>>> On 02/09/2017 11:08 AM, Tom Lane wrote:
>>>>> Agreed, let's just get it done.
>>>>>
>>>>> Although this doesn't really settle whether we ought to do 3a (with
>>>>> backwards-compatibility function aliases in core) or 3b (without 'em).
>>>>> Do people want to re-vote, understanding that those are the remaining
>>>>> choices?
>>>>
>>>> Does 3a) mean keeping the aliases more-or-less forever?
>>>>
>>>> If not, I vote for 3b.  If we're going to need to break stuff, let's
>>>> just do it.
>>>>
>>>> If we can keep the aliases for 6-10 years, then I see no reason not to
>>>> have them (3a).  They're not exactly likely to conflict with user-chosen
>>>> names.
>>>
>>> When we remove pg_shadow, then I'll be willing to agree that maybe we
>>> can start having things in PG for a couple releases that are just for
>>> backwards-compatibility and will actually be removed later.
>>>
>>> History has shown that's next to impossible, however.
>>
>> That's why I said 6-10 years.  If we're doing 3a, realistically we're
>> supporting it until PostgreSQL 16, at least, and more likely 20.  I'm OK
>> with that.
> 
> Uh, to be clear, I think it's an entirely bad thing that we've had those
> views and various other cruft hang around for over 10 years.
> 
> And removing them today will probably still have people crying about how
> pgAdmin3 and other things still use them.
> 
>> What I'm voting against is the idea that we'll have aliases in core, but
>> remove them in two releases.  Either that's unrealistic, or it's just
>> prolonging the pain.
> 
> Waiting 10+ years doesn't make the pain go away when it comes to
> removing things like that.

Sure it does.  That's two whole generations of client tools.  For
example, at that point, pgAdmin3 won't reliably run on any supported
platform, so it won't be a problem if we break it.

If we clearly mark the old function names as deprecated aliases, client
tools will gradually move to the new names.

Counter-argument: moving the directory is going to break many tools
anyway, so why bother with function aliases?

-- 
Josh Berkus
Containers & Databases Oh My!


Re: [HACKERS] [PATCH] Rename pg_switch_xlog to pg_switch_wal

2017-02-09 Thread Josh Berkus
On 02/09/2017 12:42 PM, Stephen Frost wrote:
> * Josh Berkus (j...@berkus.org) wrote:
>> On 02/09/2017 11:08 AM, Tom Lane wrote:
>>> Agreed, let's just get it done.
>>>
>>> Although this doesn't really settle whether we ought to do 3a (with
>>> backwards-compatibility function aliases in core) or 3b (without 'em).
>>> Do people want to re-vote, understanding that those are the remaining
>>> choices?
>>
>> Does 3a) mean keeping the aliases more-or-less forever?
>>
>> If not, I vote for 3b.  If we're going to need to break stuff, let's
>> just do it.
>>
>> If we can keep the aliases for 6-10 years, then I see no reason not to
>> have them (3a).  They're not exactly likely to conflict with user-chosen
>> names.
> 
> When we remove pg_shadow, then I'll be willing to agree that maybe we
> can start having things in PG for a couple releases that are just for
> backwards-compatibility and will actually be removed later.
> 
> History has shown that's next to impossible, however.

That's why I said 6-10 years.  If we're doing 3a, realistically we're
supporting it until PostgreSQL 16, at least, and more likely 20.  I'm OK
with that.

What I'm voting against is the idea that we'll have aliases in core, but
remove them in two releases.  Either that's unrealistic, or it's just
prolonging the pain.

-- 
Josh Berkus
Containers & Databases Oh My!


Re: [HACKERS] [PATCH] Rename pg_switch_xlog to pg_switch_wal

2017-02-09 Thread Josh Berkus
On 02/09/2017 11:08 AM, Tom Lane wrote:
> Agreed, let's just get it done.
> 
> Although this doesn't really settle whether we ought to do 3a (with
> backwards-compatibility function aliases in core) or 3b (without 'em).
> Do people want to re-vote, understanding that those are the remaining
> choices?

Does 3a) mean keeping the aliases more-or-less forever?

If not, I vote for 3b.  If we're going to need to break stuff, let's
just do it.

If we can keep the aliases for 6-10 years, then I see no reason not to
have them (3a).  They're not exactly likely to conflict with user-chosen
names.

-- 
Josh Berkus
Containers & Databases Oh My!


[HACKERS] Retiring from the Core Team

2017-01-11 Thread Josh Berkus
Hackers:

You will have noticed that I haven't been very active for the past year.
My new work on Linux containers and Kubernetes has been even more
absorbing than I anticipated, and I just haven't had a lot of time for
PostgreSQL work.

For that reason, as of today, I am stepping down from the PostgreSQL
Core Team.

I joined the PostgreSQL Core Team in 2003.  I decided to take on project
advocacy, with the goal of making PostgreSQL one of the top three
databases in the world.  Thanks to the many contributions by both
advocacy volunteers and developers -- as well as the efforts by
companies like EnterpriseDB and Heroku -- we've achieved that goal.
Along the way, we proved that community ownership of an OSS project can
compete with, and ultimately outlast, venture-funded startups.

Now we need new leadership who can take PostgreSQL to the next phase of
world domination.  So I am joining Vadim, Jan, Thomas, and Marc in
clearing the way for others.

I'll still be around and still contributing to PostgreSQL in various
ways, mostly around running the database in container clouds.  It'll
take a while for me to hand off all of my PR responsibilities for the
project (assuming that I ever hand all of them off).

It's been a long, fun ride, and I'm proud of the PostgreSQL we have
today: both the database, and the community.  Thank you for sharing it
with me.

-- 
--
Josh Berkus
Red Hat OSAS
(any opinions are my own)


Re: [HACKERS] RustgreSQL

2017-01-10 Thread Josh Berkus
On 01/09/2017 05:54 PM, Joel Jacobson wrote:
> On Mon, Jan 9, 2017 at 3:22 PM, Jim Nasby <jim.na...@bluetreble.com> wrote:
>> I do wonder if there are parts of the codebase that would be much better
>> suited to a language other than C, and could reasonably be ported.
>> Especially if that could be done in such a way that the net result is still
>> C code so we're not adding dependencies to non developers (similar to
>> bison).
>>
>> Extensions are a step in that direction, but they're ultimately not core
>> Postgres (which is a different issue).
> 
> I think this is a great idea!
> 
> That way the amount of C code could be reduced over time,
> while safely extending the official version with new functionality on
> the surface,
> without increasing the amount of C code.

Even if you don't ever end up touching core Postgres, being able to
write extensions in languages other than C (like, full-featured
extensions) would be its own benefit.

Why not start there?  That is, assuming that Joel has gobs of time to
work on this?  For that matter, I know that Jeff Davis is quite fond of
Rust.

-- 
--
Josh Berkus
Red Hat OSAS
(any opinions are my own)


Re: [HACKERS] Proposal for changes to recovery.conf API

2017-01-03 Thread Josh Berkus
On 01/03/2017 08:47 AM, Robert Haas wrote:
> On Tue, Jan 3, 2017 at 11:21 AM, Simon Riggs <si...@2ndquadrant.com> wrote:
>> On 3 January 2017 at 15:50, Robert Haas <robertmh...@gmail.com> wrote:
>>> On Sun, Jan 1, 2017 at 4:14 PM, Simon Riggs <si...@2ndquadrant.com> wrote:
>>>> Trying to fit recovery targets into one parameter was really not
>>>> feasible, though I tried.
>>>
>>> What was the problem?
>>
>> There are 5 different parameters that affect the recovery target, 3 of
>> which are optional. That is much more complex than the two compulsory
>> parameters suggested when the proposal was made and in my view tips
>> the balance over the edge into pointlessness. Michael's suggestion for
>> 2 parameters works well for this case.
> 
> I still think merging recovery_target_type and recovery_target_value
> into a single parameter could make sense, never mind the other three.
> But I don't really want to argue about it any more.
> 

Either solution works for me.

-- 
--
Josh Berkus
Red Hat OSAS
(any opinions are my own)


Re: [HACKERS] Migration Docs WAS: Proposal for changes to recovery.conf API

2016-12-20 Thread Josh Berkus
On 12/19/2016 01:29 PM, Peter Eisentraut wrote:
> On 12/16/16 8:52 PM, Robert Haas wrote:
>> > If the explanation is just a few sentences long, I see no reason not
>> > to include it in the release notes.
> As far as I can tell from the latest posted patch, the upgrade
> instructions are
> 
> - To trigger recovery, create an empty file recovery.trigger instead of
> recovery.conf.
> 
> - All parameters formerly in recovery.conf are now regular
> postgresql.conf parameters.  For backward compatibility, recovery.conf
> is read after postgresql.conf, but the parameters can now be put into
> postgresql.conf if desired.

Aren't we changing how some of the parameters work?

> 
> Some of that might be subject to patch review, but it's probably not
> going to be much longer than that.  So I think that will fit well into
> the usual release notes section.

Changed the subject line of this thread because people are becoming
confused about the new topic.

I'm not talking about *just* the recovery.conf changes.  We're making a
lot of changes to 10 which will require user action, and there may be
more before 10 is baked.  For example, dealing with the version
numbering change.  I started a list of the things we're breaking for 10,
but I don't have it with me at the moment.  There are more than 3 things
on it.

And then there's docs for stuff which isn't *required* by upgrading, but
would be a good idea.  For example, we'll eventually want a doc on how
to migrate old-style partitioned tables to new-style partitioned tables.

In any case, Peter's response shows *exactly* why I don't want to put
this documentation into the release notes: because people are going to
complain it's too long and want to truncate it. Writing the docs will be
hard enough; if I (or anyone else) have to argue about whether or not
they're too long, I'm just going to drop the patch and walk away.
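
To show how small the two upgrade steps quoted above are on their own,
here is a minimal Python sketch of them.  It follows the proposal as
described in this thread (an empty recovery.trigger file, parameters
copied into postgresql.conf), which may not match what any release
ends up shipping.  The function names and the data-directory layout are
assumptions, and it ignores any change in how individual parameters
behave.

```python
from pathlib import Path


def settings_from_recovery_conf(text: str) -> list:
    """Return the setting lines of a recovery.conf, skipping blanks and comments."""
    lines = []
    for raw in text.splitlines():
        line = raw.strip()
        if line and not line.startswith("#"):
            lines.append(line)
    return lines


def migrate(data_dir: Path) -> None:
    """Copy recovery.conf settings into postgresql.conf and create the trigger file."""
    settings = settings_from_recovery_conf((data_dir / "recovery.conf").read_text())
    with (data_dir / "postgresql.conf").open("a") as conf:
        conf.write("\n# copied from recovery.conf (illustrative marker)\n")
        for line in settings:
            conf.write(line + "\n")
    # Per the proposal, an empty recovery.trigger file (rather than the
    # presence of recovery.conf) is what triggers recovery.
    (data_dir / "recovery.trigger").touch()
```

The old recovery.conf is left in place here, since the quoted
instructions say it would still be read for backward compatibility.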

-- 
--
Josh Berkus
Red Hat OSAS
(any opinions are my own)


Re: [HACKERS] Proposal for changes to recovery.conf API

2016-12-15 Thread Josh Berkus
On 12/15/2016 12:54 PM, Tom Lane wrote:
> Magnus Hagander <mag...@hagander.net> writes:
>> On Thu, Dec 15, 2016 at 1:11 AM, Bruce Momjian <br...@momjian.us> wrote:
>>> You are saying this is more massive than any other change we have made
>>> in the past?  In general, what needs to be documented?
> 
>> I don't necessarily think it's because it's more massive than any change we
>> have made before. I think it's more that this is something that we probably
>> should've had before, and just didn't.
> 
>> Right now we basically have a bulletpoint list of things that are new, with
>> a section about things that are incompatible.  Having an actual section
>> with more detailed descriptions of how to handle these changes would
>> definitely be a win. it shouldn't *just* be for these changes of course, it
>> should be for any other changes that are large enough to benefit from more
>> than a oneliner about the fact that they've changed.
> 
> Yeah, it seems to me that where this is ending up is "we may need to
> write more in the compatibility entries than we have in the past".
> I don't see any problem with that, particularly if someone other than
> Bruce or me is volunteering to write it ;-)

I'm up for writing it (with help from feature owners), provided that I
don't have to spend time arguing that it's not too long, or that I
should put it somewhere different.  So can we settle the "where"
question first?

-- 
--
Josh Berkus
Red Hat OSAS
(any opinions are my own)


Re: [HACKERS] Proposal for changes to recovery.conf API

2016-12-14 Thread Josh Berkus
On 12/14/2016 08:06 AM, Bruce Momjian wrote:
> On Fri, Dec  9, 2016 at 09:46:44AM +0900, Michael Paquier wrote:
>>>> My own take on it is that the release notes are already a massive
>>>> amount of work, and putting duplicative material in a bunch of other
>>>> places isn't going to make things better, it'll just increase the
>>>> maintenance burden.
>>>
>>> This would mean adding literally pages of material to the release notes.
>>> In the past, folks have been very negative on anything which would make
>>> the release notes longer.  Are you sure?
>>
>> As that's a per-version information, that seems adapted to me. There
>> could be as well in the release notes a link to the portion of the
>> docs holding this manual. Definitely this should be self-contained in
>> the docs, and not mention the wiki. My 2c.
> 
> Yes, that is the usual approach.
> 

So where in the docs should these go, then?  We don't (currently) have a
place for this kind of doc.  Appendices?

-- 
--
Josh Berkus
Red Hat OSAS
(any opinions are my own)


Re: [HACKERS] Proposal for changes to recovery.conf API

2016-12-08 Thread Josh Berkus
On 12/08/2016 04:16 PM, Tom Lane wrote:
> Josh Berkus <j...@agliodbs.com> writes:
>> On 12/01/2016 05:58 PM, Magnus Hagander wrote:
>>> And in fairness, having such a "guide to changes" chapter in each
>>> release probably *would* be a good idea. But it would take resources to
>>> make that. The release notes are good, but having a more hand-holding
>>> version explaining incompatible changes in "regular sentences" would
>>> probably be quite useful to users.
> 
>> We will have enough major changes in 10.0 to warrant writing one of
>> these.  Maybe not as part of the official docs, but as a set of wiki
>> pages or similar.
> 
> Seems to me this is exactly the release notes' turf.  If you think the
> release notes aren't clear enough, step right up and help improve them.
> 
> My own take on it is that the release notes are already a massive
> amount of work, and putting duplicative material in a bunch of other
> places isn't going to make things better, it'll just increase the
> maintenance burden.

This would mean adding literally pages of material to the release notes.
In the past, folks have been very negative on anything which would make
the release notes longer.  Are you sure?

-- 
--
Josh Berkus
Red Hat OSAS
(any opinions are my own)


Re: [HACKERS] Proposal for changes to recovery.conf API

2016-12-08 Thread Josh Berkus
On 12/01/2016 05:58 PM, Magnus Hagander wrote:
> >> * Add docs: "Guide to changes in recovery.conf in 10.0"
> >
> > Hmm, we don't usually write the docs in terms of how things are
> > different from a previous version.  Might seem strange in 5 years.
> > Not sure what's best, here.
> 
> A good chunk in the release notes would make sense as well?
> 
> 
> It would.
> 
> And in fairness, having such a "guide to changes" chapter in each
> release probably *would* be a good idea. But it would take resources to
> make that. The release notes are good, but having a more hand-holding
> version explaining incompatible changes in "regular sentences" would
> probably be quite useful to users.

We will have enough major changes in 10.0 to warrant writing one of
these.  Maybe not as part of the official docs, but as a set of wiki
pages or similar.

-- 
--
Josh Berkus
Red Hat OSAS
(any opinions are my own)


Re: [HACKERS] Proposal for changes to recovery.conf API

2016-12-08 Thread Josh Berkus
On 12/04/2016 07:21 PM, Michael Paquier wrote:
> On Mon, Dec 5, 2016 at 11:34 AM, Haribabu Kommi
> <kommi.harib...@gmail.com> wrote:
>> There was feedback from others on the patch, and I can't find any
>> update from the author.
>>
>> Closed in 2016-11 commitfest with "returned with feedback" status.
>> Please feel free to update the status once you submit the updated patch or
>> if the current status doesn't reflect the actual status of the patch.
> 
> Having a consensus here is already a great step forward. I am sure
> that a new version will be sent for the next CF.
> 

Please let's make sure this gets done for 10.  We're planning on
breaking a lot of config things in 10, so it would be MUCH easier on
users if we do the recovery.conf changes in that version as well.

-- 
--
Josh Berkus
Red Hat OSAS
(any opinions are my own)


Re: [HACKERS] pg_hba_file_settings view patch

2016-10-26 Thread Josh Berkus
On 10/26/2016 12:24 PM, Tom Lane wrote:
> Robert Haas <robertmh...@gmail.com> writes:
>> FWIW, I'm -1 on using JSON here.  I don't believe that we should start
>> using JSON all over the place just because we can.  If we do that,
>> we'll end up with a mishmash of styles, and maybe look silly when JSON
>> is replaced by the new and much better SDGJHSDR format.
> 
> I concur.  JSON isn't a core datatype and I don't want to see it treated
> as one.  We should redesign this view so that it doesn't rely on anything
> more advanced than arrays.

Huh?  Sure it is.   Ships in PostgreSQL-core.

I mean, I'm not particularly in favor of using JSON for this (arrays
seem OK), but that seems like an invalid reason not to.

-- 
--
Josh Berkus
Red Hat OSAS
(any opinions are my own)


Re: [HACKERS] Default setting for autovacuum_freeze_max_age

2016-10-26 Thread Josh Berkus
On 10/21/2016 10:29 AM, Robert Haas wrote:
> On Fri, Oct 21, 2016 at 1:17 PM, Josh Berkus <j...@agliodbs.com> wrote:
>> Particularly, with 9.6's freeze map, point (2) is even stronger reason
>> to *lower* autovacuum_freeze_max_age.  Since there's little duplicate
>> work in a freeze scan, a lot of users will find that frequent freezing
>> benefits them a lot ...
> 
> That's a very good point, although I hope that vacuum is mostly being
> triggered by vacuum_freeze_table_age rather than
> autovacuum_freeze_max_age.

Well, depends on the nature of writes to the table.  For insert-mostly
tables, vacuum_freeze_table_age is pretty much never triggered.  Isn't
there a patch for that somewhere?

> 
> On Bruce's original question, there is an answer written into our
> documentation: "Vacuum also allows removal of old files from the
> pg_clog subdirectory, which is why the default is a relatively low 200
> million transactions."

Point.


-- 
--
Josh Berkus
Red Hat OSAS
(any opinions are my own)


Re: [HACKERS] Proposal for changes to recovery.conf API

2016-10-21 Thread Josh Berkus
On 09/28/2016 10:13 AM, Robert Haas wrote:
> On Tue, Sep 6, 2016 at 10:11 AM, David Steele <da...@pgmasters.net> wrote:
>> On 9/6/16 8:07 AM, Robert Haas wrote:
>>> On Wed, Aug 31, 2016 at 9:45 PM, Simon Riggs <si...@2ndquadrant.com>
>>> wrote:
>>>> Related cleanup
>>>> * Promotion signal file is now called "promote.trigger" rather than
>>>> just "promote"
>>>> * Remove user configurable "trigger_file" mechanism - use
>>>> "promote.trigger" for all cases
>>>
>>>
>>> I'm in favor of this.  I don't think that it's very hard for authors
>>> of backup tools to adapt to this new world, and I don't see that
>>> allowing configurability here does anything other than create more
>>> cases to worry about.
>>
>> +1 from a backup tool author.
> 
> It's time to wrap up this CommitFest, and this thread doesn't seem to
> contain anything that looks like a committable patch.  So, I'm marking
> this "Returned with Feedback".  I hope that the fact that there's been
> no discussion for the last three weeks doesn't mean this effort is
> dead; I would like very much to see it move forward.

Has this gone anywhere?  Given that we're in "break all the things" mode
for PostgreSQL 10, it would be the ideal time to consolidate
recovery.conf with postgresql.conf.


-- 
--
Josh Berkus
Red Hat OSAS
(any opinions are my own)


Re: [HACKERS] Default setting for autovacuum_freeze_max_age

2016-10-21 Thread Josh Berkus
On 10/21/2016 07:44 AM, Tom Lane wrote:
> Bruce Momjian <br...@momjian.us> writes:
>> Why is autovacuum_freeze_max_age's default set to 200 million, rather
>> than something like 2 billion?  It seems 2 billion is half way to
>> wrap-around and would be a better default.  Right now, the default seems
>> to freeze 10x more often than it has to.
> 
> Please see the archives.  I do not remember the reasoning, but there
> was some, and you need to justify why it was wrong not just assert
> that you think it's silly.

IIRC, there were a couple reasons (and I think they're still good
reasons, which is why I haven't asked to change the default):

1. By setting it to 10% of the max space, we give users plenty of room
to raise the number if they need to without getting into crisis territory.

2. Raising this threshold isn't an unalloyed good.  The longer you wait
to freeze, the more work you'll need to do when autovac freeze rolls
around.  There are actually situations where you want to make this
threshold *lower*, although generally scheduled manual vacuum freezes
serve that.

Particularly, with 9.6's freeze map, point (2) is even stronger reason
to *lower* autovacuum_freeze_max_age.  Since there's little duplicate
work in a freeze scan, a lot of users will find that frequent freezing
benefits them a lot ... especially if they can take advantage of
index-only scans.
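
As a rough check of the "10% of the max space" figure in point 1, here
is a small Python sketch.  It assumes 32-bit transaction IDs, so that
the wraparound horizon is 2^31 (about 2.1 billion) transactions, and it
uses the 200 million default quoted in this thread.  The function names
are made up for illustration.

```python
# Rough arithmetic only; assumes 32-bit XIDs with a wraparound
# horizon of 2**31 transactions.
XID_SPACE = 2**32
WRAPAROUND_HORIZON = XID_SPACE // 2      # 2_147_483_648
DEFAULT_FREEZE_MAX_AGE = 200_000_000     # default quoted in the thread


def fraction_of_horizon(freeze_max_age: int) -> float:
    """Share of the wraparound horizon used up before a forced freeze."""
    return freeze_max_age / WRAPAROUND_HORIZON


def headroom_factor(freeze_max_age: int) -> float:
    """How many times the setting could grow before reaching the horizon."""
    return WRAPAROUND_HORIZON / freeze_max_age


if __name__ == "__main__":
    print(f"{fraction_of_horizon(DEFAULT_FREEZE_MAX_AGE):.1%} of the horizon")
    print(f"{headroom_factor(DEFAULT_FREEZE_MAX_AGE):.1f}x headroom")
```

Under those assumptions the default sits a little under a tenth of the
horizon, which is roughly in line with both the "10%" above and Bruce's
"10x more often" estimate.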

-- 
--
Josh Berkus
Red Hat OSAS
(any opinions are my own)


Re: [HACKERS] Disable autovacuum guc?

2016-10-20 Thread Josh Berkus
On 10/20/2016 06:34 AM, Joshua D. Drake wrote:
> On 10/19/2016 07:22 PM, Josh Berkus wrote:
>> On 10/19/2016 06:27 PM, Joshua D. Drake wrote:
>>> Hello,
>>>
>>> After all these years, we are still regularly running into people who
>>> say, "performance was bad so we disabled autovacuum". I am not talking
>>> about once in a while, it is often. I would like us to consider removing
>>> the autovacuum option. Here are a few reasons:
>>>
>>> 1. It does not hurt anyone
>>> 2. It removes a foot gun
>>> 3. Autovacuum is *not* optional, we shouldn't let it be
>>> 4. People could still disable it at the table level for those tables
>>> that do fall into the small window of, no maintenance is o.k.
>>> 5. People would still have the ability to decrease the max_workers to 1
>>> (although I could argue about that too).
>>
>> People who run data warehouses where all of the data comes in as batch
>> loads regularly disable autovacuum, and should do so.  For the DW/batch
>> load use-case, it makes far more sense to do batch loads interspersed
>> with ANALYZEs and VACUUMs of loaded/updated tables.
> 
> Hrm, true although that is by far a minority of our users. What if we
> made it so we disabled the autovacuum guc but made it so you could
> disable autovacuum per database (ALTER DATABASE SET or some such
> thing)?

Well, that wouldn't fix the problem; people would just disable it per
database, even if it was a bad idea.

If I can't get rid of vacuum_defer_cleanup_age, you're not going to be
able to get rid of autovacuum.

Now, if you want to "fix" this issue, one thing which would help a lot
is making it possible to disable/enable Autovacuum on one table without
exclusive locking (for the ALTER statement), and create a how-to doc
somewhere which explains how and why to disable autovac per-table.  Most
of our users don't know that it's even possible to adjust this per-table.
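
To make the per-table option concrete, here is a hedged Python sketch
that builds the statements for a batch-load window: turn autovacuum off
for one table, vacuum and analyze it by hand after the load, then turn
autovacuum back on.  autovacuum_enabled is the existing per-table
storage parameter.  The helper names and the sales_facts table are
hypothetical, and nothing here avoids the lock that ALTER TABLE takes.

```python
def quote_ident(name: str) -> str:
    """Double-quote an SQL identifier, doubling any embedded quotes."""
    return '"' + name.replace('"', '""') + '"'


def set_autovacuum(table: str, enabled: bool) -> str:
    """Build an ALTER TABLE that sets the per-table autovacuum_enabled flag."""
    value = "true" if enabled else "false"
    return f"ALTER TABLE {quote_ident(table)} SET (autovacuum_enabled = {value});"


def batch_load_statements(table: str) -> list:
    """Statements to run around a batch load (the load itself is not shown)."""
    return [
        set_autovacuum(table, False),             # before the load
        f"VACUUM ANALYZE {quote_ident(table)};",  # after the load
        set_autovacuum(table, True),              # turn autovacuum back on
    ]


if __name__ == "__main__":
    for stmt in batch_load_statements("sales_facts"):
        print(stmt)
```

Generating the statements rather than executing them keeps the sketch
independent of any particular driver; they could be pasted into psql or
run from whatever client the load job already uses.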


-- 
--
Josh Berkus
Red Hat OSAS
(any opinions are my own)


Re: [HACKERS] Disable autovacuum guc?

2016-10-19 Thread Josh Berkus
On 10/19/2016 06:27 PM, Joshua D. Drake wrote:
> Hello,
> 
> After all these years, we are still regularly running into people who
> say, "performance was bad so we disabled autovacuum". I am not talking
> about once in a while, it is often. I would like us to consider removing
> the autovacuum option. Here are a few reasons:
> 
> 1. It does not hurt anyone
> 2. It removes a foot gun
> 3. Autovacuum is *not* optional, we shouldn't let it be
> 4. People could still disable it at the table level for those tables
> that do fall into the small window of, no maintenance is o.k.
> 5. People would still have the ability to decrease the max_workers to 1
> (although I could argue about that too).

People who run data warehouses where all of the data comes in as batch
loads regularly disable autovacuum, and should do so.  For the DW/batch
load use-case, it makes far more sense to do batch loads interspersed
with ANALYZEs and VACUUMs of loaded/updated tables.

-- 
--
Josh Berkus
Red Hat OSAS
(any opinions are my own)


Re: [HACKERS] Remove vacuum_defer_cleanup_age

2016-10-19 Thread Josh Berkus
On 10/19/2016 09:59 AM, Bruce Momjian wrote:
> On Wed, Oct 19, 2016 at 09:00:06AM -0400, Robert Haas wrote:
>> On Wed, Oct 19, 2016 at 8:47 AM, Bruce Momjian <br...@momjian.us> wrote:
>>> On Wed, Oct 19, 2016 at 08:33:20AM -0400, Robert Haas wrote:

>>>> Actually, I think vacuum_defer_cleanup_age is, and always has been, an
>>>> ugly hack.  But for some people it may be the ugly hack that is
>>>> letting them continue to use PostgreSQL.
>>>
>>> I see vacuum_defer_cleanup_age as old_snapshot_threshold for standby
>>> servers --- it cancels transactions rather than delaying cleanup.
>>
>> I think it's the opposite, isn't it?  vacuum_defer_cleanup_age
>> prevents cancellations.
> 
> Uh, vacuum_defer_cleanup_age sets an upper limit on how long, in terms
> of xids, that a standby query can run before cancel, like
> old_snapshot_threshold, no?  After that, we can cancel standby queries. 
> I see hot_standby_feedback as our current behavior on the master where
> we never cancel standby queries.
> 
> To me, hot_standby_feedback extends no-cleanup-no-cancel from the
> standby to the master, while vacuum_defer_cleanup_age behaves like
> old_snapshot_threshold in that it causes cancel for long-running
> queries.

See Andres' response on this thread.  He's already covered why the
setting is still useful, and also why we might want to remove it anyway.


-- 
--
Josh Berkus
Red Hat OSAS
(any opinions are my own)


Re: [HACKERS] Remove vacuum_defer_cleanup_age

2016-10-18 Thread Josh Berkus
On 10/18/2016 01:37 PM, Andres Freund wrote:
> Hi,
> 
> On 2016-10-09 21:51:07 -0700, Josh Berkus wrote:
>> Given that hot_standby_feedback is pretty bulletproof now, and given all
>> the work on reducing replay conflicts, I think the utility of
>> vacuum_defer_cleanup_age is at an end.  I really meant to submit a patch
>> to remove it for 9.6, but it got away from me.
> 
> HS feedback doesn't e.g. work well with delayed and/or archived replay,
> whereas defer_cleanup does.

Oh, point!  See, that's why I polled; I knew there was something I was
forgetting about.

> On the other hand, removing it would make some of the reasoning around
> GetOldestXmin() a bit easier.

Enough to make it worth breaking the above?

-- 
--
Josh Berkus
Red Hat OSAS
(any opinions are my own)


Re: [HACKERS] Remove vacuum_defer_cleanup_age

2016-10-18 Thread Josh Berkus
On 10/18/2016 01:28 PM, Robert Haas wrote:
> On Tue, Oct 18, 2016 at 4:18 PM, Josh Berkus <j...@agliodbs.com> wrote:
>> On 10/12/2016 05:00 PM, Robert Haas wrote:
>>> On Sun, Oct 9, 2016 at 9:51 PM, Josh Berkus <j...@agliodbs.com> wrote:
>>>> Given that hot_standby_feedback is pretty bulletproof now, and given all
>>>> the work on reducing replay conflicts, I think the utility of
>>>> vacuum_defer_cleanup_age is at an end.  I really meant to submit a patch
>>>> to remove it for 9.6, but it got away from me.
>>>>
>>>> Any objections to removing the option in 10?
>>>
>>> I'm not sure I see the point.
>>
>> Reducing the number of configuration variables is an a-priori good.  In
>> aggregate, the more knobs we have, the harder it is to learn how to
>> admin Postgres.  Therefore any time a config variable becomes obsolete,
>> we should remove it.
> 
> Meh.  I agree that more configuration knobs makes it harder to learn
> to configure the system, but we've got enough of them that removing
> exactly one isn't going to make a material difference.  Against that,
> if you are wrong about it being obsolete and there are actually people
> relying on it heavily, those people will be very sad if we remove it,
> and unless they read this mailing list, we probably won't find out
> until it's too late.

Based on that argument, we would never be able to remove any
configuration parameter ever.

-- 
--
Josh Berkus
Red Hat OSAS
(any opinions are my own)


Re: [HACKERS] Remove vacuum_defer_cleanup_age

2016-10-18 Thread Josh Berkus
On 10/12/2016 05:00 PM, Robert Haas wrote:
> On Sun, Oct 9, 2016 at 9:51 PM, Josh Berkus <j...@agliodbs.com> wrote:
>> Given that hot_standby_feedback is pretty bulletproof now, and a lot of
>> the work in reducing replay conflicts, I think the utility of
>> vacuum_defer_cleanup_age is at an end.  I really meant to submit a patch
>> to remove it in 9.6, but it got away from me.
>>
>> Any objections to removing the option in 10?
> 
> I'm not sure I see the point.

Reducing the number of configuration variables is an a-priori good.  In
aggregate, the more knobs we have, the harder it is to learn how to
admin Postgres.  Therefore any time a config variable becomes obsolete,
we should remove it.

-- 
--
Josh Berkus
Red Hat OSAS
(any opinions are my own)




[HACKERS] Remove vacuum_defer_cleanup_age

2016-10-10 Thread Josh Berkus
Folks,

Given that hot_standby_feedback is pretty bulletproof now, and given all
the work done on reducing replay conflicts, I think the utility of
vacuum_defer_cleanup_age is at an end.  I really meant to submit a patch
to remove it in 9.6, but it got away from me.

Any objections to removing the option in 10?

-- 
--
Josh Berkus
Red Hat OSAS
(any opinions are my own)




[HACKERS] Congrats on the on-time release!

2016-09-29 Thread Josh Berkus
Hackers,

I wanted to congratulate everyone involved (and it's a long list of
people) in having our first on-schedule major release since 9.3.
Especially the RMT, which did a lot to make that happen.

Getting the release train to run on time is a major accomplishment, and
will help both development and adoption.

-- 
--
Josh Berkus
Red Hat OSAS
(any opinions are my own)




Re: [HACKERS] Quorum commit for multiple synchronous replication.

2016-09-06 Thread Josh Berkus
On 08/29/2016 06:52 AM, Fujii Masao wrote:
> Also I like the following Simon's idea.
> 
> https://www.postgresql.org/message-id/canp8+jlhfbvv_pw6grasnupw+bdk5dctu7gwpnap-+-zwvk...@mail.gmail.com
> ---
> * first k (n1, n2, n3) – does the same as k (n1, n2, n3) does now
> * any k (n1, n2, n3) – would release waiters as soon as we have the
> responses from k out of N standbys. “any k” would be faster, so is
> desirable for performance and resilience

What are we going to do for backwards compatibility, here?

So, here's the dilemma:

If we want to keep backwards compatibility with 9.6, then:

"k (n1, n2, n3)" == "first k (n1, n2, n3)"

However, "first k" is not what most users will want, most of the time;
users of version 13, years from now, will be getting constantly confused
by "first k" behavior when they wanted quorum.  So the sensible default
would be:

"k (n1, n2, n3)" == "any k (n1, n2, n3)"

... however, that will break backwards compatibility.  Thoughts?

My $0.02 is that we break backwards compat somehow and document the heck
out of it.
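
To make the difference between the two readings concrete, here is a minimal Python sketch.  It is an illustration only, not PostgreSQL code: the function names are made up, and it assumes every listed standby is connected, so the "first k" set is simply the first k names in the list.

```python
def release_after_first_k(k, standbys, ack_order):
    """Number of acks received before a commit is released under "first k".

    Assumption: the k highest-priority standbys are the first k names in
    the list, and all of them must reply before waiters are released.
    """
    required = set(standbys[:k])
    acked = set()
    for count, name in enumerate(ack_order, start=1):
        acked.add(name)
        if required <= acked:
            return count
    return None  # this ack sequence never releases the commit


def release_after_any_k(k, standbys, ack_order):
    """Number of acks received before a commit is released under "any k".

    Any k distinct listed standbys are enough, regardless of list position.
    """
    listed = set(standbys)
    acked = set()
    for count, name in enumerate(ack_order, start=1):
        if name in listed:
            acked.add(name)
        if len(acked) >= k:
            return count
    return None


standbys = ["n1", "n2", "n3"]
ack_order = ["n3", "n2", "n1"]  # hypothetical: n1 is the slowest to reply
print(release_after_first_k(2, standbys, ack_order))  # 3
print(release_after_any_k(2, standbys, ack_order))    # 2
```

With n1 as the slowest standby, "first 2" has to wait for all three replies, while "any 2" releases after the second one.  That is the performance argument for quorum in the quoted proposal.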

-- 
--
Josh Berkus
Red Hat OSAS
(any opinions are my own)




Re: [HACKERS] increasing the default WAL segment size

2016-08-25 Thread Josh Berkus
On 08/25/2016 01:12 PM, Robert Haas wrote:
>> I agree that #4 is best. I'm not sure it's worth the cost. I'm not worried
>> at all about the risk of master/slave sync thing, per previous statement.
>> But if it does have performance implications, per Andres suggestion, then
>> making it configurable at initdb time probably comes with a cost that's not
>> worth paying.
> At this point it's hard to judge, because we don't have any idea what
> the cost might be.  I guess if we want to pursue this approach,
> somebody will have to code it up and benchmark it.  But what I'm
> inclined to do for starters is put together a patch to go from 16MB ->
> 64MB.  Committing that early this cycle will give us time to
> reconsider if that turns out to be painful for reasons we haven't
> thought of yet.  And give tool authors time to make adjustments, if
> any are needed.

The one thing I'd be worried about with the increase in size is folks
using PostgreSQL for very small databases.  If your database is only
30MB or so in size, the increase in size of the WAL will be pretty
significant (+144MB for the base 3 WAL segments).  I'm not sure this is
a real problem which users will notice (in today's scales, 144MB ain't
much), but if it turns out to be, it would be nice to have a way to
switch it back *just for them* without recompiling.
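
For reference, the +144MB figure is just the per-segment growth times the three base segments.  A tiny Python sketch of that arithmetic (the function name is made up for illustration):

```python
def extra_wal_mb(old_segment_mb, new_segment_mb, base_segments):
    """Extra disk used by the base WAL segments after a segment size change."""
    return (new_segment_mb - old_segment_mb) * base_segments


# 16MB -> 64MB segments, 3 base segments, as in the message above.
print(extra_wal_mb(16, 64, 3))  # 144
```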

-- 
--
Josh Berkus
Red Hat OSAS
(any opinions are my own)




Re: [HACKERS] PSA: Systemd will kill PostgreSQL

2016-08-15 Thread Josh Berkus
On 08/15/2016 05:18 PM, Tom Lane wrote:
> Josh Berkus <j...@agliodbs.com> writes:
>> On 08/15/2016 02:43 PM, Tom Lane wrote:
>>> Last I heard, there's an exclusion for "system" accounts, so an
>>> installation that's using the Fedora-provided pgsql account isn't
>>> going to have a problem.  It's homebrew installs running under
>>> ordinary-user accounts that are at risk.
> 
>> Presumably people just need to add the system account tag to the unit
>> file, no?
> 
> Well, yeah, it's easy to fix once you know you need to do so.  The
> complaint is basically that out-of-the-box, it's broken, and it's
> not very clear what was gained by breaking it.

You're welcome to argue with Lennart about that.  I'm not personally
supporting the feature, I just don't think it's that hard to work around.

-- 
--
Josh Berkus
Red Hat OSAS
(any opinions are my own)




Re: [HACKERS] PSA: Systemd will kill PostgreSQL

2016-08-15 Thread Josh Berkus
On 08/15/2016 02:43 PM, Tom Lane wrote:
> Josh Berkus <j...@agliodbs.com> writes:
>> On 07/10/2016 10:56 AM, Joshua D. Drake wrote:
>>> tl;dr; Systemd 212 defaults to remove all IPC (including SYSV memory)
>>> when a user "fully" logs out.
> 
>> That looks like it was under discussion in April, though.  Do we have
>> confirmation it was never fixed?  I'm not seeing systemd killing
>> Postgres under Fedora24.
> 
> Last I heard, there's an exclusion for "system" accounts, so an
> installation that's using the Fedora-provided pgsql account isn't
> going to have a problem.  It's homebrew installs running under
> ordinary-user accounts that are at risk.

Presumably people just need to add the system account tag to the unit
file, no?


-- 
--
Josh Berkus
Red Hat OSAS
(any opinions are my own)




Re: [HACKERS] PSA: Systemd will kill PostgreSQL

2016-08-15 Thread Josh Berkus
On 07/10/2016 10:56 AM, Joshua D. Drake wrote:
> Hackers,
> 
> This just came across my twitter feed:
> 
> https://lists.freedesktop.org/archives/systemd-devel/2014-April/018373.html
> 
> tl;dr; Systemd 212 defaults to remove all IPC (including SYSV memory)
> when a user "fully" logs out.

That looks like it was under discussion in April, though.  Do we have
confirmation it was never fixed?  I'm not seeing systemd killing
Postgres under Fedora24.


-- 
--
Josh Berkus
Red Hat OSAS
(any opinions are my own)




Re: [HACKERS] Why we lost Uber as a user

2016-07-28 Thread Josh Berkus
On 07/28/2016 03:58 AM, Geoff Winkless wrote:
> On 27 July 2016 at 17:04, Bruce Momjian <br...@momjian.us> wrote:
> 
> Well, their big complaint about binary replication is that a bug can
> spread from a master to all slaves, which doesn't happen with statement
> level replication.  
> 
> 
> I'm not sure that that makes sense to me. If there's a database bug
> that occurs when you run a statement on the master, it seems there's a
> decent chance that that same bug is going to occur when you run the same
> statement on the slave.
> 
> Obviously it depends on the type of bug and how identical the slave is,
> but statement-level replication certainly doesn't preclude such a bug
> from propagating.

That's correct, which is why I ignored that part of their post.

However, we did have issues for a couple of years where replication
accuracy was poorly tested, and did have several bugs associated with
that.  It's not surprising that a few major users got hit hard by those
bugs and decided to switch.

-- 
--
Josh Berkus
Red Hat OSAS
(any opinions are my own)




Re: [HACKERS] Why we lost Uber as a user

2016-07-27 Thread Josh Berkus
On 07/26/2016 08:45 PM, Robert Haas wrote:
> That's why I found Josh's restatement useful - I am assuming without
> proof that his restatement is accurate

FWIW, my restatement was based on some other sites rather than Uber.
Including folks who didn't abandon Postgres.

-- 
--
Josh Berkus
Red Hat OSAS
(any opinions are my own)




Re: [HACKERS] Why we lost Uber as a user

2016-07-26 Thread Josh Berkus
On 07/26/2016 03:07 PM, Tom Lane wrote:
> Josh Berkus <j...@agliodbs.com> writes:

>> That's a recipe for runaway table bloat; VACUUM can't do much because
>> there's always some minutes-old transaction hanging around (and SNAPSHOT
>> TOO OLD doesn't really help, we're talking about minutes here), and
>> because of all of the indexes HOT isn't effective.
> 
> Hm, I'm not following why this is a disaster.  OK, you have circa 100%
> turnover of the table in the lifespan of the slower transactions, but I'd
> still expect vacuuming to be able to hold the bloat to some small integer
> multiple of the minimum possible table size.

Not in practice.  Don't forget that you also have bloat of the indexes
as well.  I encountered multiple cases of this particular failure case,
and often bloat ended up at something like 100X of the clean table/index
size, with no stable size (that is, it always kept growing).  This was
the original impetus for wanting REINDEX CONCURRENTLY, but really that's
kind of a workaround.

> (And if the table is small,
> that's still small.)  I suppose really long transactions (pg_dump?) could
> be pretty disastrous, but there are ways around that, like doing pg_dump
> on a slave.

You'd need a dedicated slave for the pg_dump, otherwise you'd hit query
cancel.

> Or in short, this seems like an annoyance, not a time-for-a-new-database
> kind of problem.

It's considerably more than an annoyance for the people who suffer from
it; for some databases I dealt with, this one issue was responsible for
80% of administrative overhead (cron jobs, reindexing, timeouts ...).

But no, it's not a database-switcher *by itself*.  But it is a chronic,
and serious, problem.  I don't have even a suggestion of a real solution
for it without breaking something else, though.

-- 
--
Josh Berkus
Red Hat OSAS
(any opinions are my own)




Re: [HACKERS] Why we lost Uber as a user

2016-07-26 Thread Josh Berkus
On 07/26/2016 01:53 PM, Josh Berkus wrote:
> The write amplification issue, and its corollary in VACUUM, certainly
> continues to plague some users, and doesn't have any easy solutions.

To explain this in concrete terms, which the blog post does not:

1. Create a small table, but one with enough rows that indexes make
sense (say 50,000 rows).

2. Make this table used in JOINs all over your database.

3. To support these JOINs, index most of the columns in the small table.

4. Now, update that small table 500 times per second.

That's a recipe for runaway table bloat; VACUUM can't do much because
there's always some minutes-old transaction hanging around (and SNAPSHOT
TOO OLD doesn't really help, we're talking about minutes here), and
because of all of the indexes HOT isn't effective.  Removing the indexes
is equally painful because it means less efficient JOINs.
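
A rough back-of-the-envelope sketch of the recipe above, in Python.  The 5-minute transaction age and the count of 8 indexes are assumptions picked for illustration, not measurements, and the model only counts heap row versions that VACUUM cannot yet remove.  It says nothing about index bloat.

```python
def dead_versions_pinned(updates_per_sec, oldest_xact_seconds):
    """Row versions created while the oldest transaction has been open.

    Assumption: none of these can be vacuumed until that transaction ends.
    """
    return updates_per_sec * oldest_xact_seconds


def bloat_factor(live_rows, dead_versions):
    """Heap size relative to the minimum possible (live rows only)."""
    return (live_rows + dead_versions) / live_rows


def index_entries_per_sec(updates_per_sec, index_count):
    """New index entries written per second when no update can use HOT."""
    return updates_per_sec * index_count


live_rows = 50_000        # from step 1
updates_per_sec = 500     # from step 4
oldest_xact_seconds = 300 # assumed: a 5-minute-old transaction
index_count = 8           # assumed: "most of the columns" indexed

dead = dead_versions_pinned(updates_per_sec, oldest_xact_seconds)
print(dead)                                                # 150000
print(bloat_factor(live_rows, dead))                       # 4.0
print(index_entries_per_sec(updates_per_sec, index_count)) # 4000
```

Even under these mild assumptions the heap is pinned at several times its clean size, and every update pays for one new entry in each index.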

The Uber guy is right that InnoDB handles this better as long as you
don't touch the primary key (primary key updates in InnoDB are really bad).

This is a common problem case we don't have an answer for yet.

-- 
--
Josh Berkus
Red Hat OSAS
(any opinions are my own)




Re: [HACKERS] Why we lost Uber as a user

2016-07-26 Thread Josh Berkus
On 07/26/2016 09:54 AM, Joshua D. Drake wrote:
> Hello,
> 
> The following article is a very good look at some of our limitations and
> highlights some of the pains many of us have been working "around" since
> we started using the software.

They also had other reasons to switch to MySQL, particularly around
changes of staffing (the switch happened after they got a new CTO).  And
they encountered that 9.2 bug literally the week we released a fix, per
one of the mailing lists. Even if they switched off, it's still a nice
testimonial that they once ran their entire worldwide fleet off a single
Postgres cluster.

However, the issues they cite as limitations of our current replication
system are real, or we wouldn't have so many people working on
alternatives.  We could really use pglogical in 10.0, as well as
OLTP-friendly MM replication.

The write amplification issue, and its corollary in VACUUM, certainly
continues to plague some users, and doesn't have any easy solutions.

I do find it interesting that they mention schema changes in passing,
without actually saying anything about them -- given that schema changes
have been one of MySQL's major limitations.  I'll also note that they
don't mention any of MySQL's corresponding weak spots, such as
limitations on table size due to primary key sorting.

One wonders what would have happened if they'd adopted a sharding model
on top of Postgres?

I would like to see someone blog about our testing for replication
corruption issues now, in response to this.

-- 
--
Josh Berkus
Red Hat OSAS
(any opinions are my own)




Re: [HACKERS] 10.0

2016-06-18 Thread Josh Berkus
On 06/16/2016 11:01 PM, Craig Ringer wrote:
> 
> I thought about raising this, but I think in the end it's replacing one
> confusing and weird versioning scheme for another confusing and weird
> versioning scheme.
> 
> It does have the advantage that that compare a two-part major like
> 090401 vs 090402 won't be confused when they compare 100100 and 100200,
> since it'll be 11 and 12. So it's more backward-compatible. But
> ugly.
> 

Realistically, though, we're more likely to end up with 10.0.1 than
10.1.  I don't think we're anywhere near plumbing the depths of the
stuff which will break because folks are parsing our version numbers
with regexes.  In more major software, this will break nagios
check_postgres.
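
As an illustration of the kind of breakage meant here, a small Python sketch.  Both parsers are hypothetical and are not taken from check_postgres or any other real tool: the first assumes exactly three numeric parts, the second accepts one to three.

```python
import re

# Brittle: insists on exactly major.minor.patch, as 9.x versions look.
BRITTLE = re.compile(r"^(\d+)\.(\d+)\.(\d+)$")
# Tolerant: one to three numeric parts, ignoring any trailing suffix.
TOLERANT = re.compile(r"^(\d+)(?:\.(\d+))?(?:\.(\d+))?")


def parse_brittle(version):
    match = BRITTLE.match(version)
    if match is None:
        raise ValueError(f"unrecognized version: {version!r}")
    return tuple(int(part) for part in match.groups())


def parse_tolerant(version):
    match = TOLERANT.match(version)
    if match is None:
        raise ValueError(f"unrecognized version: {version!r}")
    return tuple(int(part) for part in match.groups() if part is not None)


print(parse_tolerant("9.4.1"))  # (9, 4, 1)
print(parse_tolerant("10.1"))   # (10, 1)
try:
    parse_brittle("10.1")
except ValueError as exc:
    print("brittle parser failed:", exc)
```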

I'm not happy with it, but I believe that's where we'll end up.

-- 
--
Josh Berkus
Red Hat OSAS
(any opinions are my own)




Re: [HACKERS] 10.0

2016-06-17 Thread Josh Berkus
On 06/17/2016 10:04 AM, Alvaro Herrera wrote:
> Merlin Moncure wrote:
> 
>> Ugliness is a highly subjective qualifier.  OTOH, Backwards
>> compatibility, at least when the checks are properly written :-), is a
>> very objective benefit.
> 
> This is the argument that made us keep the PostgreSQL name instead of
> renaming back to Postgres.  I'm not a fan of it.
> 

Well ... no.

We kept the PostgreSQL name for three reasons.

Back in 2005, which was the last time we could have reasonably changed
it, nobody had the time/energy to do all of the
search-and-replace-and-contact-every-packager required.  The folks who
were most enthusiastic about the change wanted someone else to do the
work.  Plus, our Japanese community, which was like 40% of our worldwide
community at the time, was opposed to the change.

The third reason is that we have a registered trademark on "PostgreSQL",
but "postgres" is public domain.

-- 
--
Josh Berkus
Red Hat OSAS
(any opinions are my own)




Re: [HACKERS] Rename max_parallel_degree?

2016-06-08 Thread Josh Berkus
On 06/07/2016 11:01 PM, Robert Haas wrote:
> On Fri, Jun 3, 2016 at 9:39 AM, Tom Lane <t...@sss.pgh.pa.us> wrote:
>> Robert Haas <robertmh...@gmail.com> writes:
>>> I think we should just go with max_parallel_workers for a limit on
>>> total parallel workers within max_worker_processes, and
>>> max_parallel_workers_per_gather for a per-Gather limit.  It's slightly
>>> annoying that we may end up renaming the latter GUC, but not as
>>> annoying as spending another three weeks debating this and missing
>>> beta2.
>>
>> +1.  I'm not as convinced as you are that we'll replace the GUC later,
>> but in any case this is an accurate description of the current
>> semantics.  And I'm really *not* in favor of fudging the name with
>> the intent of changing the GUC's semantics later --- that would fail
>> all sorts of compatibility expectations.
> 
> Here's a patch to change max_parallel_degree to
> max_parallel_workers_per_gather, and also changing parallel_degree to
> parallel_workers.  I haven't tackled adding a separate
> max_parallel_workers, at least not yet.  Are people OK with this?

+1

> 
> Note that there is a dump/restore hazard if people have set the
> parallel_degree reloption on a beta1 install, or used ALTER { USER |
> DATABASE } .. SET parallel_degree.  Can everybody live with that?
> Should I bump catversion when applying this?

IMHO, we just need to call it out in the beta2 announcement.


-- 
--
Josh Berkus
Red Hat OSAS
(any opinions are my own)




Re: [HACKERS] [BUGS] Routine analyze of single column prevents standard autoanalyze from running at all

2016-06-06 Thread Josh berkus
On 06/06/2016 01:38 PM, Tom Lane wrote:

> Also, I'd be a bit inclined to disable the counter reset whenever a column
> list is specified, disregarding the corner case where a list is given but
> it includes all the table's analyzable columns.  It doesn't really seem
> worth the effort to account for that case specially (especially after
> you consider that index expressions should count as analyzable columns).
> 
> Thoughts?

+1.  Better to err on the side of duplicate analyzes than none at all.

Also, I'm not surprised this took so long to discover; I doubt most
users are aware that you *can* analyze individual columns.

-- 
--
Josh Berkus
Red Hat OSAS
(any opinions are my own)




Re: [HACKERS] Rename max_parallel_degree?

2016-06-03 Thread Josh berkus
On 06/02/2016 09:33 PM, Peter Eisentraut wrote:
> On 6/3/16 12:21 AM, Petr Jelinek wrote:
>> On 01/06/16 17:55, David G. Johnston wrote:
>>> On Wed, Jun 1, 2016 at 11:45 AM, Petr Jelinek <p...@2ndquadrant.com> wrote:
>>>
>>> That GUC also controls worker processes that are started by
>>> extensions, not just ones that parallel query starts. This is btw
>>> one thing I don't like at all about how the current limits work, the
>>> parallel query will fight for workers with extensions because they
>>> share the same limit.
>>>
>>>
>>> Given that this models reality the GUC is doing its job.  Now, maybe we
>>> need additional knobs to give the end-user the ability to influence how
>>> those fights will turn out.
>>
>> Agreed, my point is that I think we do need an additional knob.
> 
> We need one knob to control how many process slots to create at server
> start, and then a bunch of sliders to control how to allocate those
> between regular connections, superuser connections, replication,
> autovacuum, parallel workers, background workers (by tag/label/group),
> and so on.

Now that's crazy talk.  I mean, next thing you'll be saying that we need
the ability to monitor this, or even change it at runtime.  Where does
the madness end?  ;-)

Seriously, you have a point here; it's maybe time to stop tackling
process management per server piecemeal.  Question is, who wants to work
on this?

-- 
--
Josh Berkus
Red Hat OSAS
(any opinions are my own)




Re: [HACKERS] Rename max_parallel_degree?

2016-06-02 Thread Josh berkus
On 06/02/2016 01:42 PM, David G. Johnston wrote:
> Are you referring to right now or if we move the goal posts to making
> > this a per-statement reservation?
> 
> I was assuming that we would have *both* per-operation and per-statement
> limits.  I can see reasons for having both, I can see why power users
> would want both, but it's going to be overwhelming to casual users.
> 
> 
> Got that.  The only problem on that front with the current setup is
> that right now we are saying: "at most use 3 of the 8 available
> processes": i.e., we tie ourselves to a fixed number when I think a
> better algorithm would be: "on/off/auto - default auto" and we detect at
> runtime whatever values we feel are most appropriate based upon the
> machine we are running on.  If the user doesn't like our choices they
> can specify their own values.  But the only specified values in the
> configurations would be those placed there automatically by the user. 
> If value isn't specified but is required it gets set at startup and can
> be seen in pg_settings.
> 

Well, the hard part here is that the appropriate value is based on the
level of concurrency in the database as a whole.  For example, if it's a
website database with 200 active connections on a 32-core machine
already, you want zero parallelism.  Whereas if it's the only current
statement on a 16-core VM, you want like 8.

That sounds like a heuristic, except that the number of concurrent
statements can change in milliseconds.  So we'd really want to base this
on some sort of moving average, set conservatively.  This will be some
interesting code, and will probably need to be revised several times
before we get it right.  Particularly since this would involve scanning
some of the global catalogs we've been trying to move activity off of.
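
To show what "a moving average, set conservatively" could look like, here is a minimal Python sketch.  Everything in it is an assumption for illustration: the class and function names, the smoothing factor, and the "never more than half the cores" cap are not from any patch.  The estimate jumps up immediately when load rises and decays slowly when it falls.

```python
import math


class ConcurrencyEstimate:
    """Asymmetric moving average of concurrently active statements."""

    def __init__(self, alpha=0.2):
        self.alpha = alpha  # assumed smoothing factor for the slow decay
        self.value = 0.0

    def observe(self, active_statements):
        if active_statements >= self.value:
            # Conservative: believe an increase in load right away.
            self.value = float(active_statements)
        else:
            # Only slowly believe that the load has gone away.
            self.value = (self.alpha * active_statements
                          + (1 - self.alpha) * self.value)
        return self.value


def suggested_workers(cores, estimated_active):
    """Workers to allow: idle cores, capped at half the machine (assumed)."""
    idle_cores = max(0, cores - math.ceil(estimated_active))
    return min(cores // 2, idle_cores)


quiet = ConcurrencyEstimate()
quiet.observe(1)                           # the only statement on the box
print(suggested_workers(16, quiet.value))  # 8

busy = ConcurrencyEstimate()
busy.observe(200)                          # 200 active connections
print(suggested_workers(32, busy.value))   # 0
busy.observe(1)                            # one quiet sample is not enough
print(suggested_workers(32, busy.value))   # 0
```

The two printed cases match the examples above: 8 workers for the lone statement on a 16-core VM, and zero for the busy 32-core website database, even right after a single quiet sample.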

-- 
--
Josh Berkus
Red Hat OSAS
(any opinions are my own)




Re: [HACKERS] Rename max_parallel_degree?

2016-06-02 Thread Josh berkus
On 06/02/2016 01:08 PM, David G. Johnston wrote:
> On Thu, Jun 2, 2016 at 3:52 PM, Josh berkus <j...@agliodbs.com> wrote:
> 
> On 06/02/2016 08:53 AM, Tom Lane wrote:
> > Josh berkus <j...@agliodbs.com> writes:
> >> On 06/02/2016 04:58 AM, Robert Haas wrote:
> >>> Well, I think we could drop node, if you like.  I think parallel
> >>> wouldn't be good to drop, though, because it sounds like we want a
> >>> global limit on parallel workers also, and that can't be just
> >>> max_workers.  So I think we should keep parallel in there for all of
> >>> them, and have max_parallel_workers and
> >>> max_parallel_workers_per_gather(_node).  The reloption and the Path
> >>> struct field can be parallel_workers rather than parallel_degree.
> >
> >> So does that mean we'll rename it if you manage to implement a parameter
> >> which controls the number of workers for the whole statement?
> >
> > That would fit in as something like max_parallel_workers_per_statement.
> 
> ETOOMANYKNOBS
> 
> I'm trying to think of some way we can reasonably automate this for
> users ...
> 
> 
> Are you referring to right now or if we move the goal posts to making
> this a per-statement reservation?

I was assuming that we would have *both* per-operation and per-statement
limits.  I can see reasons for having both, I can see why power users
would want both, but it's going to be overwhelming to casual users.


-- 
--
Josh Berkus
Red Hat OSAS
(any opinions are my own)




Re: [HACKERS] Rename max_parallel_degree?

2016-06-02 Thread Josh berkus
On 06/02/2016 08:53 AM, Tom Lane wrote:
> Josh berkus <j...@agliodbs.com> writes:
>> On 06/02/2016 04:58 AM, Robert Haas wrote:
>>> Well, I think we could drop node, if you like.  I think parallel
>>> wouldn't be good to drop, though, because it sounds like we want a
>>> global limit on parallel workers also, and that can't be just
>>> max_workers.  So I think we should keep parallel in there for all of
>>> them, and have max_parallel_workers and
>>> max_parallel_workers_per_gather(_node).  The reloption and the Path
>>> struct field can be parallel_workers rather than parallel_degree.
> 
>> So does that mean we'll rename it if you manage to implement a parameter
>> which controls the number of workers for the whole statement?
> 
> That would fit in as something like max_parallel_workers_per_statement.

ETOOMANYKNOBS

I'm trying to think of some way we can reasonably automate this for
users ...

-- 
--
Josh Berkus
Red Hat OSAS
(any opinions are my own)




Re: [HACKERS] Rename max_parallel_degree?

2016-06-02 Thread Josh berkus
On 06/02/2016 04:58 AM, Robert Haas wrote:

> Well, I think we could drop node, if you like.  I think parallel
> wouldn't be good to drop, though, because it sounds like we want a
> global limit on parallel workers also, and that can't be just
> max_workers.  So I think we should keep parallel in there for all of
> them, and have max_parallel_workers and
> max_parallel_workers_per_gather(_node).  The reloption and the Path
> struct field can be parallel_workers rather than parallel_degree.

So does that mean we'll rename it if you manage to implement a parameter
which controls the number of workers for the whole statement?


-- 
--
Josh Berkus
Red Hat OSAS
(any opinions are my own)




Re: [HACKERS] Rename max_parallel_degree?

2016-06-01 Thread Josh berkus
On 06/01/2016 02:21 PM, Robert Haas wrote:
> If you lined up ten people in a room all of whom were competent
> database professionals and none of whom knew anything about PostgreSQL
> and asked them to guess what a setting called work_mem does and what a
> setting called max_parallel_degree does, I will wager you $5 that
> they'd do better on the second one.  Likewise, I bet the guesses for
> max_parallel_degree would be closer to the mark than the guesses for
> maintenance_work_mem or replacement_sort_tuples or commit_siblings or
> bgwriter_lru_multiplier.

Incidentally, the reason I didn't jump into this thread until the
patches showed up is that I don't think it actually matters what the
parameters are named.  They're going to require documentation
regardless; parallelism just isn't something people grok instinctively.

I care about how the parameters *work*, and whether that's consistent
across our various resource management settings.

-- 
--
Josh Berkus
Red Hat OSAS
(any opinions are my own)




Re: [HACKERS] Rename max_parallel_degree?

2016-05-31 Thread Josh berkus
On 05/31/2016 01:04 PM, Tom Lane wrote:
> The name should be closely related to what we use for #3.  I could go for
> max_total_parallel_workers for #2 and max_parallel_workers for #3.
> Or maybe max_parallel_workers_total?

How about parallel_worker_pool?

-- 
--
Josh Berkus
Red Hat OSAS
(any opinions are my own)




Re: [HACKERS] Logic behind parallel default? WAS: Rename max_parallel_degree?

2016-05-31 Thread Josh berkus
On 05/31/2016 11:10 AM, Tom Lane wrote:
> Josh berkus <j...@agliodbs.com> writes:
>> Is there a thread on how we determined this default of 2?  I can't find
>> one under likely search terms.
> 
> The 9.6 open-items list cites
> 
> https://www.postgresql.org/message-id/flat/20160420174631.3qjjhpwsvvx5b...@alap3.anarazel.de

Looks like we didn't decide for the release, just the beta.

I can see two ways to go for the final release:

1. Ship with max_parallel_X = 2 (or similar) and a very low
max_worker_processes (e.g. 4).

2. Ship with max_parallel_X = 1 (or 0, depending), and with a generous
max_worker_processes (e.g. 16).

Argument in favor of (1): we want parallelism to work out of the gate
for users running on low-concurrency systems.  These settings would let
some parallelism happen immediately, without overwhelming a 4-to-8-core
system/vm.  Tuning for the user would then be fairly easy, as we could
just tell them "set max_worker_processes to half the number of cores you
have".

Argument in favor of (2): parallelism is potentially risky for .0, and
as a result we want it disabled unless users choose to enable it.
Also, defaulting to off lets users make more use of the parallel_degree
table attribute to just enable parallelism on select tables.

Thoughts?

-- 
--
Josh Berkus
Red Hat OSAS
(any opinions are my own)




Re: [HACKERS] Rename max_parallel_degree?

2016-05-31 Thread Josh berkus
On 05/31/2016 11:29 AM, Tom Lane wrote:
> Josh berkus <j...@agliodbs.com> writes:
>> One more consistency question: what's the effect of running out of
>> max_parallel_workers?
> 
> ITYM max_worker_processes (ie, the cluster-wide pool size)?

Yes.  Sorry for contributing to the confusion.  Too many
similar-sounding parameter names.


-- 
--
Josh Berkus
Red Hat OSAS
(any opinions are my own)




Re: [HACKERS] Rename max_parallel_degree?

2016-05-31 Thread Josh berkus
On 05/31/2016 11:27 AM, Peter Geoghegan wrote:
> On Tue, May 31, 2016 at 11:22 AM, Josh berkus <j...@agliodbs.com> wrote:
>>> I think we can hope that developers are going to be less confused about
>>> that than users.
>>
>> Makes sense.
> 
> Maybe EXPLAIN doesn't have to use the term parallel worker at all. It
> can instead use a slightly broader terminology, possibly including the
> term "core".
> 
>> One more consistency question: what's the effect of running out of
>> max_parallel_workers?
>>
>> That is, say max_parallel_workers is set to 10, and 8 are already
>> allocated.  If I ask for max_parallel_X = 4, how many cores do I use?
> 
> Well, it depends on the planner, of course. But when constrained only
> by the availability of worker processes, then your example could use 3
> cores -- the 2 remaining parallel workers, plus the leader itself.
> 
>> Presumably the leader isn't counted towards max_parallel_workers?
 (oops, I meant max_worker_processes above)
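
Peter's arithmetic above can be written out as a small sketch. It assumes the worker pool is the only constraint (the planner is ignored) and that the leader is not drawn from the pool; the function and parameter names are hypothetical.

```python
def cores_used(pool_size, already_allocated, requested_workers):
    """Return (workers_granted, cores_used) when only the pool limits the query."""
    free_slots = max(0, pool_size - already_allocated)
    granted = min(requested_workers, free_slots)
    # Assumption: the leader does not occupy a pool slot, so it adds one core.
    return granted, granted + 1


# Pool of 10, 8 already allocated, 4 requested.
print(cores_used(10, 8, 4))
```

With the numbers from the example this gives 2 workers and 3 cores.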


So, there are six things we'd like to make consistent to limit user confusion:

1. max_parallel_X and number of cores used

2. max_parallel_X and EXPLAIN output

3. max_parallel_X and gatekeeping via max_worker_processes

4. max_parallel_X and parallelism of operations (i.e. 2 == 2 parallel
scanners)

5. settings in other similar databases (does someone have specific
citations for this?)

6. consistency with other GUCs (0 or -1 to disable settings)

We can't actually make all six things consistent, as some of them are
contradictory; for example, (1) and (3) cannot be reconciled.  So we
need to evaluate which things are more likely to cause confusion.

My general evaluation would be to make the GUC be the number of
additional workers used, not the total number (including the leader).
From my perspective, that makes (2), (3) and (6) consistent, and (4)
cannot be made consistent because different types of parallel nodes
behave in different ways (i.e., some are parallel with only 1 additional
worker and some are not).

However, I'm resigned to the fact that user confusion is inevitable
whichever way we choose, and could be persuaded the other way.

-- 
--
Josh Berkus
Red Hat OSAS
(any opinions are my own)




Re: [HACKERS] Rename max_parallel_degree?

2016-05-31 Thread Josh berkus
On 05/31/2016 11:17 AM, Peter Eisentraut wrote:
> On 5/31/16 2:02 PM, Josh berkus wrote:
>> I get where you're coming from, but I think Haas's query plan output is
>> going to show us the confusion we're going to get.  So we need to either
>> change the parameter, the explain output, or brace ourselves for endless
>> repeated questions.
> 
> Changing the explain output doesn't sound so bad to me.
> 
> The users' problem is that the parameter setting ought to match the
> EXPLAIN output.
> 
> The developers' problem is that the EXPLAIN output actually corresponds
> to leader + (N-1) workers internally.
> 
> I think we can hope that developers are going to be less confused about
> that than users.

Makes sense.

One more consistency question: what's the effect of running out of
max_parallel_workers?

That is, say max_parallel_workers is set to 10, and 8 are already
allocated.  If I ask for max_parallel_X = 4, how many cores do I use?

Presumably the leader isn't counted towards max_parallel_workers?

-- 
--
Josh Berkus
Red Hat OSAS
(any opinions are my own)




Re: [HACKERS] Logic behind parallel default? WAS: Rename max_parallel_degree?

2016-05-31 Thread Josh berkus
On 05/31/2016 11:00 AM, Tom Lane wrote:
> !  If this occurs, the plan will run with fewer workers than expected,
> !  which may be inefficient.  The default value is 2.  Setting this
> !  value to 0 disables parallel query execution.

Is there a thread on how we determined this default of 2?  I can't find
one under likely search terms.

I'm concerned about the effect of overallocating parallel workers on
systems which are already running out of cores (e.g. AWS instances), and
run with default settings.  Possibly max_parallel_workers takes care of
this, which is why I want to understand the logic here.

-- 
--
Josh Berkus
Red Hat OSAS
(any opinions are my own)




Re: [HACKERS] Rename max_parallel_degree?

2016-05-31 Thread Josh berkus
On 05/31/2016 10:51 AM, Peter Geoghegan wrote:
> On Tue, May 31, 2016 at 10:46 AM, Josh berkus <j...@agliodbs.com> wrote:
>> In parallel seq scan and join, do the "masters" behave as workers as well?
> 
> It depends. They will if they can. If the parallel seq scan leader
> isn't getting enough work to do from workers (enough tuples to process
> from the shared memory queue), it will start acting as a worker fairly quickly.
> With parallel aggregate, and some other cases, that will always happen.
> 
> Even when the leader is consuming input from workers, that's still perhaps
> pegging one CPU core. So, it doesn't really invalidate what I said about
> the number of cores being the primary consideration.
> 

I get where you're coming from, but I think Haas's query plan output is
going to show us the confusion we're going to get.  So we need to either
change the parameter, the explain output, or brace ourselves for endless
repeated questions.

-- 
--
Josh Berkus
Red Hat OSAS
(any opinions are my own)




Re: [HACKERS] Rename max_parallel_degree?

2016-05-31 Thread Josh berkus
On 05/31/2016 10:38 AM, Peter Geoghegan wrote:
> On Tue, May 31, 2016 at 10:23 AM, Josh berkus <j...@agliodbs.com> wrote:
>> It's still WAY simpler to understand "max_parallel is the number of
>> parallel workers I requested".
> 
> (Sorry Josh, somehow hit reply, not reply-all)
> 
> Yes, it is. But as long as parallel workers are not really that
> distinct to the leader-as-worker when executing a parallel query, then
> you have another consideration. Which is that you need to care about
> how many cores your query uses first and foremost, and not the number
> of parallel workers used. I don't think that having only one worker
> will cause too much confusion, because users will trust that we won't
> allow something that simply makes no sense to happen.
> 
> In my parallel create index patch, the leader participates as a worker
> to scan and sort runs. It's identical to a worker, practically
> speaking, at least until time comes to merge those runs. Similarly,
> parallel aggregate does not really have much for the leader process to
> do other than act as a worker.

In parallel seq scan and join, do the "masters" behave as workers as well?


-- 
--
Josh Berkus
Red Hat OSAS
(any opinions are my own)




Re: [HACKERS] Rename max_parallel_degree?

2016-05-31 Thread Josh berkus
On 05/31/2016 10:16 AM, Peter Geoghegan wrote:
> On Tue, May 31, 2016 at 10:10 AM, Josh berkus <j...@agliodbs.com> wrote:
>> "max_parallel_degree is the amount of parallelism in the query, with the
>> understanding that the original parent process counts as 1, which means
>> that if you set it to 1 you get no parallelism, and if you want 4
>> parallel workers you need to set it to 5."
>>
>> Which one of those is going to require more explanations on -general and
>> -novice?  Bets?
>>
>> Let's not be complicated for the sake of being complicated.
> 
> But the distinction between parallel workers and backends that can
> participate in parallel query does need to be user-visible. Worker
> processes are a commodity (i.e. the user must consider
> max_worker_processes).

It's still WAY simpler to understand "max_parallel is the number of
parallel workers I requested".

Any system where you set it to 2 and get only 1 worker on an idle system
is going to cause endless queries on the mailing lists.

-- 
--
Josh Berkus
Red Hat OSAS
(any opinions are my own)




Re: [HACKERS] Rename max_parallel_degree?

2016-05-31 Thread Josh berkus
On 05/31/2016 10:03 AM, Tom Lane wrote:
> Josh berkus <j...@agliodbs.com> writes:
>> I realize there's a lot of water under the bridge here, but I think
>> we're going to get 1000 questions on -general of the type:  "I asked for
>> 8 parallel workers, why did I only get 7?".  I believe we will regret
>> this change.
>> So, one vote from me to revert.
> 
> Well, that gets back to the question of whether average users will
> understand the "degree" terminology.  For the record, while I do not
> like the current behavior either, this was not the solution I favored.
> I thought we should rename the GUC and keep it as meaning the number
> of additional worker processes.

I will happily bet anyone a nice dinner in Ottawa that most users will
not understand it.

Compare this:

"max_parallel is the maximum number of parallel workers which will work
on each stage of the query which is parallelizable.  If you set it to 4,
you get up to 4 workers."

with this:

"max_parallel_degree is the amount of parallelism in the query, with the
understanding that the original parent process counts as 1, which means
that if you set it to 1 you get no parallelism, and if you want 4
parallel workers you need to set it to 5."

Which one of those is going to require more explanations on -general and
-novice?  Bets?

Let's not be complicated for the sake of being complicated.
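
The two readings compared above can be put side by side in a short sketch. Both function names are hypothetical; they only encode the two quoted definitions.

```python
def workers_if_count(setting):
    """Reading 1: the setting is the number of additional parallel workers."""
    return max(0, setting)


def workers_if_degree(setting):
    """Reading 2: the setting is a degree of parallelism, leader counted as 1."""
    return max(0, setting - 1)


# Print: setting, workers under reading 1, workers under reading 2.
for setting in (0, 1, 2, 4, 5):
    print(setting, workers_if_count(setting), workers_if_degree(setting))
```

Under the second reading a setting of 1 yields no workers, and it takes 5 to get 4 workers.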

-- 
--
Josh Berkus
Red Hat OSAS
(any opinions are my own)




Re: [HACKERS] Rename max_parallel_degree?

2016-05-31 Thread Josh berkus
On 05/31/2016 09:15 AM, Robert Haas wrote:
> On Sun, May 29, 2016 at 1:33 AM, Noah Misch <n...@leadboat.com> wrote:
>> On Fri, May 06, 2016 at 02:52:30PM -0400, Robert Haas wrote:
>>> OK, my reading of this thread is that there is a consensus to
>>> redefine max_parallel_degree=1 as "no parallelism" and
>>> max_parallel_degree>1 as "parallelism using a leader plus N-1
>>> workers", and along with that, to keep the names unchanged.  However,
>>> I don't think I can get that done before beta1, at least not without a
>>> serious risk of breaking stuff.  I can look at this post-beta1.
>>
>> [This is a generic notification.]
>>
>> The above-described topic is currently a PostgreSQL 9.6 open item.  Robert,
>> since you committed the patch believed to have created it, you own this open
>> item.  If some other commit is more relevant or if this does not belong as a
>> 9.6 open item, please let us know.  Otherwise, please observe the policy on
>> open item ownership[1] and send a status update within 72 hours of this
>> message.  Include a date for your subsequent status update.  Testers may
>> discover new open items at any time, and I want to plan to get them all fixed
>> well in advance of shipping 9.6rc1.  Consequently, I will appreciate your
>> efforts toward speedy resolution.  Thanks.
> 
> Here is a patch.  Note that I still don't agree with this change, but
> I'm bowing to the will of the group.
> 
> I think that some of the people who were in favor of this change
> should review this patch, including especially the language I wrote
> for the documentation.  If that happens, and the reviews are positive,
> then I will commit this.  If that does not happen, then I will
> interpret that to mean that there isn't actually all that much
> interest in changing this after all and will accordingly recommend
> that this open item be removed without further action.
> 
> Here is a test which shows how it works:
> 
> rhaas=# set max_parallel_degree = 100;
> SET
> rhaas=# alter table pgbench_accounts set (parallel_degree = 10);
> ALTER TABLE
> rhaas=# explain (analyze) select count(*) from pgbench_accounts;
> 
> QUERY PLAN
> 
>  Finalize Aggregate  (cost=177436.04..177436.05 rows=1 width=8)
> (actual time=383.244..383.244 rows=1 loops=1)
>    ->  Gather  (cost=177435.00..177436.01 rows=10 width=8) (actual
> time=383.040..383.237 rows=9 loops=1)
>          Workers Planned: 9
>          Workers Launched: 8


I realize there's a lot of water under the bridge here, but I think
we're going to get 1000 questions on -general of the type:  "I asked for
8 parallel workers, why did I only get 7?".  I believe we will regret
this change.

So, one vote from me to revert.
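
For reference, the numbers in the quoted plan can be reproduced with a rough sketch. It assumes the table-level parallel_degree is clamped by max_parallel_degree and that one unit of the degree is the leader, as the patch description says. The figure of 8 free worker slots is purely an assumption to match "Workers Launched: 8"; the thread does not state it.

```python
def workers_planned(table_parallel_degree, max_parallel_degree):
    """Degree counts the leader, so planned workers are degree - 1."""
    degree = min(table_parallel_degree, max_parallel_degree)
    return max(0, degree - 1)


def workers_launched(planned, free_worker_slots):
    """Launched workers cannot exceed the free slots in the pool."""
    return min(planned, free_worker_slots)


planned = workers_planned(10, 100)   # values from the quoted session
print(planned)
print(workers_launched(planned, 8))  # 8 free slots is an assumed figure
```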

-- 
--
Josh Berkus
Red Hat OSAS
(any opinions are my own)




Re: [HACKERS] Adding an alternate syntax for Phrase Search

2016-05-26 Thread Josh berkus
On 05/22/2016 06:53 PM, Teodor Sigaev wrote:
> 
>> to_tsquery(' Berkus & "PostgreSQL Version 10.0" ')
>>
>> ... would be equivalent to:
>>
>> to_tsquery(' Berkus & ( PostgreSQL <-> version <-> 10.0 )')
> 
> select to_tsquery('Berkus') && phraseto_tsquery('PostgreSQL Version 10.0');
> does it as you wish

Aha, you didn't mention this in your presentation.  That seems plenty
good enough for 9.6.

-- 
--
Josh Berkus
Red Hat OSAS
(any opinions are my own)




[HACKERS] Adding an alternate syntax for Phrase Search

2016-05-22 Thread Josh berkus
Folks,

This came up at pgCon.

The 'word <-> word <-> word' syntax for phrase search is not
developer-friendly.  While we need the <-> operator for SQL and for the
sophisticated cases, it would be really good to support an alternate
syntax for the simplest case of "words next to each other".  My proposal
is enclosing the phrase in double-quotes, which would be intuitive to
users and familiar from search engines.  Thus:

to_tsquery(' Berkus & "PostgreSQL Version 10.0" ')

... would be equivalent to:

to_tsquery(' Berkus & ( PostgreSQL <-> version <-> 10.0 )')

I realize we're already in beta, but pgCon was actually the first time I
saw the new syntax.  I think if we don't do this now, we'll be doing it
for 10.0.
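
As an illustration of the proposed mapping only, here is a sketch of a client-side rewrite from the double-quoted form to the <-> form. The function name is hypothetical, this is not how to_tsquery is implemented, and it does no case folding or escaping.

```python
import re


def expand_quoted_phrases(query):
    """Rewrite each "double-quoted phrase" as ( w1 <-> w2 <-> ... )."""
    def to_phrase(match):
        words = match.group(1).split()
        return "( " + " <-> ".join(words) + " )"

    return re.sub(r'"([^"]*)"', to_phrase, query)


print(expand_quoted_phrases(' Berkus & "PostgreSQL Version 10.0" '))
```

Unlike the hand-written example above, the sketch leaves the case of each word untouched.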

-- 
--
Josh Berkus
Red Hat OSAS
(any opinions are my own)




Re: [HACKERS] Reviewing freeze map code

2016-05-18 Thread Josh berkus
On 05/18/2016 03:51 PM, Peter Geoghegan wrote:
> On Wed, May 18, 2016 at 8:52 AM, Jeff Janes <jeff.ja...@gmail.com> wrote:
>> How about going with something that says more about why we are doing
>> it, rather than trying to describe in one or two words what it is
>> doing?
>>
>> VACUUM (FORENSIC)
>>
>> VACUUM (DEBUG)
>>
>> VACUUM (LINT)
> 
> +1

Maybe this is the wrong perspective.  I mean, is there a reason we even
need this option, other than a lack of any other way to do a full table
scan to check for corruption, etc.?  If we're only doing this for
integrity checking, then maybe it's better if it becomes a function,
which could be later extended with additional forensic features?

-- 
--
Josh Berkus
Red Hat OSAS
(any opinions are my own)




Re: [HACKERS] 10.0

2016-05-14 Thread Josh berkus
On 05/13/2016 07:18 PM, Mark Dilger wrote:
> My main concern is that a commitment to never, ever break backwards
> compatibility is a commitment to obsolescence.  It therefore makes sense to
> reserve room in the numbering scheme to be clear and honest about when
> backwards compatibility has been broken.  The major number is the normal
> place to do that.

The problem with that idea is that *minor* backwards compatibility
breakage is much more likely in each-and-every version than major
breakage is at any time in the foreseeable future.  The last major
breakage we really had was version 8.3, which if we'd been going by
compatibility should have been 9.0 (also for other reasons).

And if we adopt the "backwards compatibility" approach, then we'll just
be switching from the argument we're having now to the argument of "is
this enough breakage to rate a .0?  Yes/No?".  Which argument will be
just as long as this one.

So, my vote is now +1 to go to the 2-part numbering scheme.

-- 
--
Josh Berkus
Red Hat OSAS
(any opinions are my own)




Re: [HACKERS] 10.0

2016-05-13 Thread Josh berkus
On 05/13/2016 05:22 PM, Mark Dilger wrote:
>>> >> Any project that starts inflating its numbering scheme sends a message to
>>> >> users of the form, "hey, we've just been taken over by marketing people,
>>> >> and software quality will go down from now on."
>> > 
>> > I don't think this is about version number inflation, but actually more
>> > the opposite.  What you're calling the major number is really a marketing
>> > number.  There is not a technical distinction between major releases where
>> > we choose to bump the first number and those where we choose to bump the
>> > second.  It's all about marketing.  So to me, merging those numbers would
>> > be an anti-marketing move.  I think it's a good move: it would be more
>> > honest and transparent about what the numbers mean, not less so.
> I find your argument persuasive if there is no possibility of ever needing
> a major number to bump.  But if anything like what I described above could
> someday happen, it seems the major.minor.micro format would come in
> handy.  Perhaps the problem (from my perspective) is that the major number
> has been used for purely marketing purposes in the past, and I've tried to
> avert my eyes to that.  But going forward, my vote (worth less than half a
> cent I'm sure) is to stop using it for marketing reasons.

Per a long discussion on -advocacy, nobody has any specific plans to do
substantial breakage of backwards compatibility.  Theoretically we might
someday want to change the on-disk format, but nobody has plans to do so
in the immediate future.  How long should we hold out for that?  Until 9.27?

And I don't find dropping the "money" type to be substantial breakage.

-- 
--
Josh Berkus
Red Hat OSAS
(any opinions are my own)




Re: [HACKERS] 10.0

2016-05-13 Thread Josh berkus
On 05/13/2016 02:00 PM, Tom Lane wrote:
> I still don't like that much, and just thought of another reason why:
> it would foreclose doing two major releases per year.  We have debated
> that sort of schedule in the past.  While I don't see any reason to
> think we'd try to do it in the near future, it would be sad if we
> foreclosed the possibility by a poor choice of versioning scheme.

Well, we have done two major releases in a year before, mostly due to
one release being late and the succeeding one being on time.

-- 
--
Josh Berkus
Red Hat OSAS
(any opinions are my own)




Re: [HACKERS] Lets (not) break all the things. Was: [pgsql-advocacy] 9.6 -> 10.0

2016-05-13 Thread Josh berkus
On 05/13/2016 01:04 PM, Joshua D. Drake wrote:
> On 05/13/2016 12:03 PM, Josh berkus wrote:
>> On 05/13/2016 11:48 AM, Robert Haas wrote:
>>> On Fri, May 13, 2016 at 12:12 PM, Joshua D. Drake
>>> <j...@commandprompt.com> wrote:
> 
>> Anyway, all of this is a moot point, because nobody has the power to
>> tell the various companies what to do.  We're just lucky that everyone
>> is still committed to writing stuff which adds to PostgreSQL.
> 
> Lucky? No. We earned it. We earned it through years and years of hard
> work. Should we be thankful? Absolutely. Should we be grateful that we
> have such a powerful and engaged commercial contribution base? 100%.

Lucky.  Sure there was work and personal integrity involved, but like
any success story, there was luck.

But we've also been fortunate in not spawning hostile-but-popular forks
by people who left the project, and that none of the companies who
created hostile forks were very successful with them, and that nobody
has seriously tried using lawyers to control/ruin the project.

And, most importantly, we've been lucky that a lot of competing projects
have self-immolated instead of being successful and brain-draining our
contributors (MySQL, ANTS, MonetDB, etc.)

-- 
--
Josh Berkus
Red Hat OSAS
(any opinions are my own)




Re: [HACKERS] 10.0

2016-05-13 Thread Josh berkus
On 05/13/2016 01:15 PM, Robert Haas wrote:
> On Fri, May 13, 2016 at 2:49 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
>> So I think we should solve these problems at a stroke, and save ourselves
>> lots of breath in the future, by getting rid of the whole "major major"
>> idea and going over to a two-part version numbering scheme.  To be
>> specific:
>>
>> * This year's major release will be 9.6.0, with minor updates 9.6.1,
>> 9.6.2, etc.  It's too late to do otherwise for this release cycle.
>>
>> * Next year's major release will be 10.0, with minor updates 10.1,
>> 10.2, etc.
>>
>> * The year after, 11.0.  Et cetera.
>>
>> No confusion, no surprises, no debate ever again about what the next
>> version number is.
>>
>> This is by no means a new idea, but I think its time has come.
> 
> Man, I hate version number inflation.  I'm running Firefox 45.0.2, and
> I think that's crazy.  It hit 1.0 when we were at version 7.4!  Granted,
> this wouldn't be that bad, but I have always thought that burning
> through a first digit a few times a decade is much more sensible than
> doing it every year.  We just have to remember to bump the first digit
> occasionally.

Well, FF has this issue because they release a new version every 6
weeks.  Even bumping once per year, we wouldn't hit version 20 until 2027.
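
The arithmetic behind that date, as a tiny sketch. It assumes 10.0 ships in 2017 (next year's release, per Tom's proposal) and exactly one major release per year after that; the function name is hypothetical.

```python
def release_year(major, first_major=10, first_year=2017):
    """Year a given major version would ship at one major release per year."""
    return first_year + (major - first_major)


print(release_year(20))
```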

> If we don't want to stick with the current practice of debating when
> to bump the same digit, then let's agree that 10.0 will follow 9.6 and
> after that we'll bump the first digit after X.4, as we did with 7.X
> and 8.X.

Why X.4?  Seems arbitrary.

-- 
--
Josh Berkus
Red Hat OSAS
(any opinions are my own)




Re: [HACKERS] 10.0

2016-05-13 Thread Josh berkus
On 05/13/2016 11:49 AM, Tom Lane wrote:
> Alvaro Herrera <alvhe...@2ndquadrant.com> writes:
>> Josh berkus wrote:
>>> Anyway, can we come up with a consensus of some minimum changes it will
>>> take to make the next version 10.0?
> 
>> I think the next version should be 10.0 no matter what changes we put
>> in.
> 
> Here's my two cents: we called 8.0 that on the basis of the native Windows
> port, and 9.0 that on the basis of getting in-core replication support.
> Both of those were game-changing features that deserved a "major major"
> version number bump.  But as the project matures, it gets harder and
> harder to come up with game-changing features in the span of a single
> release.  Parallel query is a great example: a fully mature parallel query
> feature would, IMO, easily justify a 10.0 moniker.  But what we have today
> is more like half or two-thirds of a feature.  If we call this release
> 10.0 on the strength of that, we could justifiably be accused of
> overselling it.  On the other hand, if we wait till next year when
> parallelism presumably will be much more fully baked, it'll be a bit
> anticlimactic to call it 10.0 then.  This isn't going to get better with
> other major features that can be expected to appear in future.  So we can
> expect to continue to waste lots of time debating the "what to call it"
> question, in pretty much every year except for the one or two immediately
> after a "major major" bump.
> 
> There's also the problem that the current numbering scheme confuses
> people who aren't familiar with the project.  How many times have
> you seen people say "I'm using Postgres 8" or "Postgres 9" when asked
> what version they're on?
> 
> So I think we should solve these problems at a stroke, and save ourselves
> lots of breath in the future, by getting rid of the whole "major major"
> idea and going over to a two-part version numbering scheme.  To be
> specific:
> 
> * This year's major release will be 9.6.0, with minor updates 9.6.1,
> 9.6.2, etc.  It's too late to do otherwise for this release cycle.
> 
> * Next year's major release will be 10.0, with minor updates 10.1,
> 10.2, etc.
> 
> * The year after, 11.0.  Et cetera.
> 
> No confusion, no surprises, no debate ever again about what the next
> version number is.
> 
> This is by no means a new idea, but I think its time has come.

I'm for it.

Note that we will need to do a *bunch* of education around the change in
version numbering schemes.  And a bunch of people and packagers will
need to change their version comparison scripts (while everyone should
be using the sortable version numbers, not everyone does).

So if we're going to make that change, I suggest doing it *now* to get
the word out.
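
To illustrate the kind of comparison script that would need attention, here is a sketch that maps both numbering schemes onto one sortable integer, in the style of server_version_num. It assumes two-part releases keep the same six-digit layout with the minor number in the last digits; the function name is hypothetical.

```python
def version_num(version):
    """Map a release string to a sortable integer for either scheme."""
    parts = [int(p) for p in version.split(".")]
    if parts[0] >= 10:
        # Two-part scheme: MAJOR.MINOR (assumed layout, see note above).
        minor = parts[1] if len(parts) > 1 else 0
        return parts[0] * 10000 + minor
    # Three-part scheme: the first two numbers form the major version.
    parts += [0] * (3 - len(parts))
    return parts[0] * 10000 + parts[1] * 100 + parts[2]


print(version_num("9.6.2"), version_num("10.1"))
# A plain string comparison gets the order wrong:
print("10.0" < "9.6.2", version_num("10.0") < version_num("9.6.2"))
```

Scripts that compare the dotted strings directly, or that treat the first two components as the major version, are the ones that would break.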

-- 
--
Josh Berkus
Red Hat OSAS
(any opinions are my own)




Re: [HACKERS] Lets (not) break all the things. Was: [pgsql-advocacy] 9.6 -> 10.0

2016-05-13 Thread Josh berkus
On 05/13/2016 11:48 AM, Robert Haas wrote:
> On Fri, May 13, 2016 at 12:12 PM, Joshua D. Drake <j...@commandprompt.com> wrote:
>> Singular point contribution is not the point of my argument. My point is
>> that if three people from EDB and three people from Citus got together and
>> worked on a project in full collaboration it would be more beneficial to the
>> project.
> 
> Well, the scalability work in 9.6 went almost exactly like this,
> assuming you count Andres as three people (which is entirely
> reasonable) and Dilip, Mithun, Amit, and myself as three people (which
> is maybe less reasonable, since I don't really want any of us counted
> as less than a whole person).

Frankly, PostgreSQL is practically a wonderland of inter-company
collaboration.  Yeah, there's some "does not play nice with others"
which happens, but that's pretty much inevitable.

Plus, it's also useful to have some companies go in different directions
sometimes; the best approach to certain problems isn't always well
defined.  We might have a little more of that than is completely ideal,
but it's rather hard to determine that.

Anyway, all of this is a moot point, because nobody has the power to
tell the various companies what to do.  We're just lucky that everyone
is still committed to writing stuff which adds to PostgreSQL.

-- 
--
Josh Berkus
Red Hat OSAS
(any opinions are my own)




Re: [HACKERS] 10.0

2016-05-13 Thread Josh berkus
On 05/13/2016 08:19 AM, Bruce Momjian wrote:
>> > Thoughts?  Is it crazy to go from 9.6beta1 to 10.0beta2?  What would
>> > actually be involved in making the change?
> Someone mentioned how Postgres 8.5 became 9.0, but then someone else
> said the change was made during alpha releases, not beta.  Can someone
> dig up the details?

/me digs through the announcement archives ...

We changed it to 9.0 for Alpha4.  By Beta1, it was already 9.0.

-- 
--
Josh Berkus
Red Hat OSAS
(any opinions are my own)




Re: [HACKERS] 10.0

2016-05-13 Thread Josh berkus
On 05/13/2016 11:31 AM, Alvaro Herrera wrote:
> Josh berkus wrote:
>  
>> Anyway, can we come up with a consensus of some minimum changes it will
>> take to make the next version 10.0?
> 
> I think the next version should be 10.0 no matter what changes we put
> in.
> 

Well, if we adopt 2-part version numbers, it will be.  Maybe that's the
easiest thing?  Then we never have to have this discussion again, which
certainly appeals to me ...

-- 
--
Josh Berkus
Red Hat OSAS
(any opinions are my own)




Re: [HACKERS] 10.0

2016-05-13 Thread Josh berkus
On 05/13/2016 09:30 AM, Tom Lane wrote:

> More generally, rebranding after beta1 sends a very public signal that
> we're a bunch of losers who couldn't make up our minds in a timely
> fashion.  We should have discussed this last month; now I think we're
> stuck with a decision by default.

Although maybe we could use some controversy to get us back in the press ;-)

Anyway, can we come up with a consensus of some minimum changes it will
take to make the next version 10.0?  Here's my thinking:

1. pglogical is accepted into core, with docs/scripts to make it a hot
upgrade option.

2. parallel continues to make progress

My argument is that even if we get nothing else, the above two are
enough to "bump" it to 10.0.  And if we can have agreement on that now,
then we can avoid a month-long argument about version numbers next year.


-- 
--
Josh Berkus
Red Hat OSAS
(any opinions are my own)




Re: [HACKERS] Academic help for Postgres

2016-05-11 Thread Josh berkus
On 05/11/2016 07:54 AM, Bruce Momjian wrote:
> On Wed, May 11, 2016 at 05:41:21PM +0300, Heikki Linnakangas wrote:
>> On 11/05/16 17:32, Bruce Momjian wrote:
>>> On Wed, May 11, 2016 at 05:31:10PM +0300, Konstantin Knizhnik wrote:
>>>> On 11.05.2016 17:20, Bruce Momjian wrote:
>>>>> I am giving a keynote at an IEEE database conference in Helsinki next
>>>>> week (http://icde2016.fi/).  (Yes, I am not attending PGCon Ottawa
>>>>> because I accepted the Helsinki conference invitation before the PGCon
>>>>> Ottawa date was changed from June to May).
>>>>>
>>>>> As part of the keynote, I would like to mention areas where academia can
>>>>> help us.  The topics I can think of are:
>>>>>
>>>>>   Query optimization
>>>>>   Optimizer statistics
>>>>>   Indexing structures
>>>>>   Reducing function call overhead
>>>>>   CPU locality
>>>>>   Sorting
>>>>>   Parallelism
>>>>>   Sharding
>>>>>
>>>>> Any others?
>>>>>
>>>> Incremental materialized views?
>>>
>>> I don't know.  Is that something academics would research?
>>
>> Absolutely! There are plenty of papers on how to keep materialized views
>> up-to-date.
> 
> Oh, OK. I will add it.
> 

Together with that, automated substitution of materialized views for
query clauses.

Also: optimizing for new hardware, like persistent memory.

-- 
--
Josh Berkus
Red Hat OSAS
(any opinions are my own)




[HACKERS] Please help update the "How to beta Test" page

2016-05-09 Thread Josh berkus
Folks,

Please help update the wiki page on how to beta test.  In particular,
please add the specific things we'd like to see users test for,
like data corruption related to freezing (with some notes on how to test
for this).

https://wiki.postgresql.org/wiki/HowToBetaTest

Thanks!

-- 
--
Josh Berkus
Red Hat OSAS
(any opinions are my own)




Re: [HACKERS] Reviewing freeze map code

2016-05-06 Thread Josh berkus
On 05/06/2016 02:12 PM, Andres Freund wrote:
> On 2016-05-06 14:10:04 -0700, Josh berkus wrote:
>> On 05/06/2016 02:08 PM, Andres Freund wrote:
>>
>>> It bothers me more than it probably should: Nobody tests, reviews,
>>> whatever a complex patch with significant data-loss potential. But as
>>> soon as somebody dares to mention an option name...
>>
>> Definitely more than it should, because it's gonna happen *every* time.
>>
>> https://en.wikipedia.org/wiki/Law_of_triviality
> 
> Doesn't mean it should not be frowned upon.

Or made light of, hence my post.  Personally I don't care what the
option is called, as long as we have docs for it.

For the serious testing, does anyone have a good technique for creating
loads which would stress-test vacuum freezing?  It's hard for me to come
up with anything which wouldn't be very time- and resource-intensive
(like running at 10,000 TPS for a week).
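
For a rough sense of scale, here is a back-of-the-envelope sketch (Python)
of how many transaction IDs such a load would burn through.  It assumes
every transaction writes and so consumes exactly one XID, and it uses
200 million, the default autovacuum_freeze_max_age; the function names
are made up for illustration.

```python
# Back-of-the-envelope sketch: transaction IDs (XIDs) consumed by a sustained load.
# Assumption: every transaction writes, so each one consumes exactly one XID.

SECONDS_PER_WEEK = 7 * 24 * 60 * 60        # 604,800 seconds
DEFAULT_FREEZE_MAX_AGE = 200_000_000       # default autovacuum_freeze_max_age


def xids_consumed(tps: int, seconds: int) -> int:
    """Total XIDs consumed at a constant rate of `tps` over `seconds`."""
    return tps * seconds


def hours_until(xid_target: int, tps: int) -> float:
    """Hours of constant load needed to consume `xid_target` XIDs."""
    return xid_target / tps / 3600


if __name__ == "__main__":
    week = xids_consumed(10_000, SECONDS_PER_WEEK)
    print(f"XIDs consumed in one week at 10,000 TPS: {week:,}")
    print(f"Hours to consume {DEFAULT_FREEZE_MAX_AGE:,} XIDs: "
          f"{hours_until(DEFAULT_FREEZE_MAX_AGE, 10_000):.1f}")
```

Under those assumptions a week at 10,000 TPS is about 6 billion XIDs, and
the default freeze threshold is crossed after roughly five and a half
hours, so the expensive part is sustaining the rate, not the duration.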

-- 
--
Josh Berkus
Red Hat OSAS
(any opinions are my own)




Re: [HACKERS] Reviewing freeze map code

2016-05-06 Thread Josh berkus
On 05/06/2016 02:08 PM, Andres Freund wrote:

> It bothers me more than it probably should: Nobody tests, reviews,
> whatever a complex patch with significant data-loss potential. But as
> soon as somebody dares to mention an option name...

Definitely more than it should, because it's gonna happen *every* time.

https://en.wikipedia.org/wiki/Law_of_triviality

-- 
--
Josh Berkus
Red Hat OSAS
(any opinions are my own)




Re: [HACKERS] Reviewing freeze map code

2016-05-06 Thread Josh berkus
On 05/06/2016 01:58 PM, Andres Freund wrote:
> On 2016-05-06 13:54:09 -0700, Joshua D. Drake wrote:
>> On 05/06/2016 01:50 PM, Andres Freund wrote:

>>> There already is FREEZE - meaning something different - so I doubt it.
>>
>> Yeah I thought about that, it is the word "FORCE" that bothers me. When you
>> use FORCE there is an assumption that no matter what, it plows through
>> (think rm -f). So if we don't use FROZEN, that's cool but FORCE doesn't work
>> either.
> 
> SCANALL?
> 

VACUUM THEWHOLEDAMNTHING


-- 
--
Josh Berkus
Red Hat OSAS
(any opinions are my own)




Re: [HACKERS] Postgres 9.6 scariest patch tournament

2016-05-04 Thread Josh berkus
On 05/04/2016 06:56 PM, Robert Haas wrote:
> On Wed, May 4, 2016 at 9:41 PM, Noah Misch <n...@leadboat.com> wrote:
>> On Mon, Apr 18, 2016 at 03:37:21PM -0300, Alvaro Herrera wrote:
>>> The RMT will publish aggregate, unattributed results after the poll
>>> closes.
>>
>> Thanks for voting.  Join me in congratulating our top finishers:
>>
>> 1. fd31cd2 Don't vacuum all-frozen pages.
>> 2. "Parallel Query"
>> 3(tie). 3fc6e2d Make the upper part of the planner work by generating and 
>> comparing Paths.
>> 3(tie). 848ef42 Add the "snapshot too old" feature
> 
> Congratulations Kevin, Tom, me, and me!
> 
> I feel like I went to the Olympics and won both the gold *and* silver
> medals in the same event.  Beat that!
> 

Maybe we *should* call this 10.0.  That way people will be ready for
lots of breakage. ;-b

-- 
--
Josh Berkus
Red Hat OSAS
(any opinions are my own)




Re: [HACKERS] Postgres 9.6 scariest patch tournament

2016-04-18 Thread Josh berkus
On 04/18/2016 11:37 AM, Alvaro Herrera wrote:
> Hackers, lurkers,
> 
> The PostgreSQL Project needs you!
> 
> The Release Management Team would like your input regarding the patch or
> patches which, in your opinion, are the most likely sources of major
> bugs or instabilities in PostgreSQL 9.6.
> 
> Please submit your answers before May 1st using this form:
> https://docs.google.com/forms/d/1xNNqhXC116wCMnomqGz9RQ7OuVwZqAcEre7iiU6pT20/viewform
> 
> If, for some reason, you prefer not to fill that form or have further
> input on the topic, you can correspond via private email to one or more
> members of the RMT,
> 
>   Robert Haas <robertmh...@gmail.com>
>   Alvaro Herrera <alvhe...@alvh.no-ip.org>
>   Noah Misch <nmi...@leadboat.com>
> 
> The RMT will publish aggregate, unattributed results after the poll
> closes.

We should send the owner of the scariest patch something as a prize.
Maybe a plastic skeleton or something ...

-- 
--
Josh Berkus
Red Hat OSAS
(any opinions are my own)




Re: [HACKERS] Lets (not) break all the things. Was: [pgsql-advocacy] 9.6 -> 10.0

2016-04-12 Thread Josh berkus
On 04/12/2016 01:07 PM, Oleg Bartunov wrote:
> 
> Our roadmap http://www.postgresql.org/developer/roadmap/ is the problem.
> We don't have a clear roadmap and that's why we cannot plan a future
> feature-full release.

As someone who's worked at multiple proprietary software companies,
having a roadmap doesn't magically make code happen.

> There are several postgres-centric companies, which have
> most of the developers, who do all major contributions. All these companies
> have their roadmaps, but not the community. I think the 9.6 release is
> an inflection point, where we should combine our roadmaps and release
> one for the community. Then we could plan releases and our customers
> will see what to expect. I can't say for other companies, but we have
> big demand for many features from Russian customers and we have to
> compete with other databases. Having a community roadmap will help us to
> work with customers and plan our resources.

It would be good to have a place where the companies who do PostgreSQL
feature work could publish their current efforts and timelines, so we at
least have a go-to place for "here's what someone's working on".  But
only if that information is going to be *updated*, something we're very
bad at.  And IMHO, a "roadmap" which is less than 50% accurate is a
waste of time.

There's an easy way for you to kick this off though: have PostgresPro
publish a wiki page or Trello board or github repo or whatever with your
roadmap and invite other full-time PostgreSQL contributors to add their
pieces.

-- 
--
Josh Berkus
Red Hat OSAS
(any opinions are my own)




Re: [HACKERS] Lets (not) break all the things. Was: [pgsql-advocacy] 9.6 -> 10.0

2016-04-12 Thread Josh berkus
On 04/12/2016 10:43 AM, Robert Haas wrote:
> 1. Large backward compatibility breaks are bad.  Therefore, if any of
> these things are absolutely impossible to do without major
> compatibility breaks, we shouldn't do them at all.

+1

> 2. Small backward compatibility breaks are OK, but don't require doing
> anything special to the version number.

+1

> 3. There's no value in aggregating many small backward compatibility
> breaks into a single release.  That increases pain for users, rather
> than decreasing it, and slows down development, too, because you have
> to wait for the special magic release where it's OK to hose users.  We
> typically have a few small backward compatibility breaks in each
> release, and that's working fine, so I see little reason to change it.

+1

> 4. To the extent that I can guess what the things on Simon's list
> mean from what he wrote, and that's a little difficult because his
> descriptions were very short, I think that everything on that list is
> either (a) a bad idea or (b) something that we can do without any
> compatibility break at all.

+1

Here are the features I can imagine being worth major backwards
compatibility breaks:

1. Fully pluggable storage with a clean API.

2. Total elimination of VACUUM or XID freezing

3. Fully transparent-to-the-user MM replication/clustering or sharding.

4. Perfect partitioning (i.e. transparent to the user, supports keys &
joins, supports expressions on partition key, etc.)

5. Transparent upgrade-in-place (i.e. allowing 10.2 to use 10.1's tables
without pg_upgrade or other modification).

6. Fully pluggable parser/executor with a good API

That's pretty much it.  I can't imagine anything else which would
justify imposing a huge upgrade barrier on users.  And, I'll point out,
that in the above list:

* nobody is currently working on anything in core except #4.

* we don't *know* that any of the above items will require a backwards
compatibility break.

People keep talking about "we might want to break compatibility/file
format one day".  But nobody is working on anything which would both
require and justify it.

-- 
--
Josh Berkus
Red Hat OSAS
(any opinions are my own)




Re: [HACKERS] Please correct/improve wiki page about abbreviated keys bug

2016-03-30 Thread Josh berkus
On 03/30/2016 02:47 PM, Josh berkus wrote:
> On 03/29/2016 07:43 PM, Peter Geoghegan wrote:
>> Do you think it would be okay if the SQL query to detect potentially
>> affected indexes only considered the leading attribute? Since that's
>> the only attribute that could use abbreviated keys, it ought to be
>> safe to not require users to REINDEX indexes that happen to have a
>> second-or-subsequent text/varchar(n) attribute that doesn't use the C
>> locale. Maybe it's not worth worrying about.
> 
> I think that's a great idea.

Based on that concept, I wrote a query which is now on the wiki page.
Please fix it if it's not showing what we want it to show.

-- 
--
Josh Berkus
Red Hat OSAS
(any opinions are my own)




[HACKERS] So, can we stop supporting Windows native now?

2016-03-30 Thread Josh berkus
http://www.zdnet.com/article/microsoft-and-canonical-partner-to-bring-ubuntu-to-windows-10/

... could be good news for us ...

-- 
--
Josh Berkus
Red Hat OSAS
(any opinions are my own)




Re: [HACKERS] Please correct/improve wiki page about abbreviated keys bug

2016-03-30 Thread Josh berkus
On 03/29/2016 07:43 PM, Peter Geoghegan wrote:
> On Tue, Mar 29, 2016 at 7:31 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
>> A corrupt index could easily fail to detect uniqueness violations (because
>> searches fail to find entries they should find).  Not sure I believe that
>> it would make false reports of a uniqueness conflict that's not really
>> there.

I meant failing to detect a violation, and thus letting the user insert
a duplicate key.

> Sure. But looking at how texteq() is implemented, it certainly seems
> impossible that that could happen. Must have been a miscommunication
> somewhere. I'll fix it.

There was speculation on this in the -bugs thread, and nobody
contradicted it.  If you're fairly sure that it wouldn't happen, your
knowledge of the issue is definitely superior to mine.

> Do you think it would be okay if the SQL query to detect potentially
> affected indexes only considered the leading attribute? Since that's
> the only attribute that could use abbreviated keys, it ought to be
> safe to not require users to REINDEX indexes that happen to have a
> second-or-subsequent text/varchar(n) attribute that doesn't use the C
> locale. Maybe it's not worth worrying about.

I think that's a great idea.

-- 
--
Josh Berkus
Red Hat OSAS
(any opinions are my own)



